From murdoch.duncan at gmail.com Tue Nov 1 00:16:03 2011
From: murdoch.duncan at gmail.com (Duncan Murdoch)
Date: Mon, 31 Oct 2011 19:16:03 -0400
Subject: [R] 3D Graph Surface and single points (eg wireframe with points)
In-Reply-To: <1320099095.30547.YahooMailNeo@web29718.mail.ird.yahoo.com>
References: <1320099095.30547.YahooMailNeo@web29718.mail.ird.yahoo.com>
Message-ID: <4EAF2C33.5070603@gmail.com>

On 11-10-31 6:11 PM, Karl Knoblick wrote:
> Hello!
>
> I just want to make a 3D plot of a surface of a cone and want to plot some single points around.
>
> I tried wireframe but cannot find how to plot single points
>
> I tried scatterplot3d but there the surface is not simple to plot. And: How can I rotate the point of view by the z-axis
>
> I tried persp3d but how can I add some single points?

After drawing the surface with persp3d, just use points3d to add the points. You say it didn't work for you, but you don't show what you did, so I have no idea what went wrong.

Duncan Murdoch

>
> Example:
>
> library(lattice)
> library(scatterplot3d)
>
> #data
> d<- expand.grid(x = 1:10, y = 5:15)
> d$z<- sqrt((d$x-4)^2 + (d$y-10)^2)
>
> wireframe(z ~ x * y, data = d, drape=T)
> # How to plot points??
>
> scatterplot3d(d$x, d$y, d$z)
> # How to plot nice surface?
> # How to rotate point of view by z axis?
> # BTW: points3d should add some points (does not work for me)
>
> # persp3d no example
>
>
> Can anybody help? Has anybody such an example?
>
> Actually, I think it is possible with R to draw such 3D plots - the question is how? Even possible to animate the 3D plot? Or rotate interactive?
>
>
> Best regards
> Karl
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
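
A minimal sketch of the approach described in the reply above: draw the surface with persp3d, then add points with points3d. It assumes the rgl package is installed. The z matrix is rebuilt from the cone formula in the question, and the three points in pts are invented purely for illustration.

```r
# Rebuild the cone from the question as x/y vectors plus a z matrix,
# which is the form persp3d() accepts.
x <- 1:10
y <- 5:15
z <- outer(x, y, function(x, y) sqrt((x - 4)^2 + (y - 10)^2))

# Hypothetical points to overlay (not from the thread).
pts <- data.frame(x = c(2, 6, 9), y = c(7, 12, 14), z = c(5, 6, 7))

if (requireNamespace("rgl", quietly = TRUE)) {
  # Without a display (e.g. in a batch run), ask rgl for its null device.
  if (!interactive()) options(rgl.useNULL = TRUE)
  rgl::persp3d(x, y, z, col = "lightblue", alpha = 0.7)      # the surface
  rgl::points3d(pts$x, pts$y, pts$z, col = "red", size = 8)  # the points
}
```

In an interactive session the rgl window can be rotated with the mouse.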
From dwinsemius at comcast.net Tue Nov 1 01:04:47 2011
From: dwinsemius at comcast.net (Comcast)
Date: Mon, 31 Oct 2011 20:04:47 -0400
Subject: [R] lapply and Two TimeStamps as input
In-Reply-To: <1320100645.19608.YahooMailNeo@web120106.mail.ne1.yahoo.com>
References: <1320100645.19608.YahooMailNeo@web120106.mail.ne1.yahoo.com>
Message-ID: <081116CA-B467-434B-9B12-88520263E766@comcast.net>

Provide some sample data. For instance, a data frame with two columns and the function.

On Oct 31, 2011, at 6:37 PM, Alaios wrote:

> Dear all,
>
> I have a function that recognizes the following format for timestamps
> "%Y-%m-%d %H:%M:%S"
>
> my function takes two input arguments the TimeStart and TimeEnd
> I would like you to help me create the right list with pairs of TimeStart and TimeEnd which I can feed to lapply (I am using mclapply actually).
> For every lapply I want two inputs to be fed to my function. I only know how to feed one input to the lapply
>
> Could you please help me with that?
> I would like to thank you in advance for your help
>
> B.R
> Alex

From dwinsemius at comcast.net Tue Nov 1 01:18:02 2011
From: dwinsemius at comcast.net (Comcast)
Date: Mon, 31 Oct 2011 20:18:02 -0400
Subject: [R] oversampling code
In-Reply-To:
References:
Message-ID: <16D6EF48-C49E-4B48-95E3-D51EA0865198@comcast.net>

On Oct 31, 2011, at 1:54 PM, loubna ibn majdoub hassani wrote:

> Hi
> I have an unbalanced data set where I want to predict a binary variable Y.
> I want to do an under sampling by keeping all the 1 and taking just some of
> the 0 such as I'll have 90% of 0 and 10% of 1.
You haven't given much detail, but something like this will take all of the 1's and 10% of the 0's:

zeros <- rownames(dfrm[dfrm$Y == 0, ])
dfrm[c(rownames(dfrm[dfrm$Y == 1, ]), sample(zeros, round(0.10 * length(zeros)))), ]

> Can u help me do that
> Thank u

From Peter.Brecknock at bp.com Tue Nov 1 01:21:12 2011
From: Peter.Brecknock at bp.com (Pete Brecknock)
Date: Mon, 31 Oct 2011 17:21:12 -0700 (PDT)
Subject: [R] lapply and Two TimeStamps as input
In-Reply-To: <1320100645.19608.YahooMailNeo@web120106.mail.ne1.yahoo.com>
References: <1320100645.19608.YahooMailNeo@web120106.mail.ne1.yahoo.com>
Message-ID: <1320106872368-3962139.post@n4.nabble.com>

alaios wrote:
>
> Dear all,
>
> I have a function that recognizes the following format for timestamps
> "%Y-%m-%d %H:%M:%S"
>
> my function takes two input arguments the TimeStart and TimeEnd
> I would like you to help me create the right list with pairs of TimeStart and
> TimeEnd which I can feed to lapply (I am using mclapply actually).
> For every lapply I want two inputs to be fed to my function. I only know
> how to feed one input to the lapply
>
> Could you please help me with that?
> I would like to thank you in advance for your help
>
> B.R
> Alex
>

Is this of any use?
# Data in list - first time is start time, second time is end time
myListStartEnd = list(c(strptime("2010-01-01 09:00:00","%Y-%m-%d %H:%M:%S"),
                        strptime("2011-01-01 11:30:00","%Y-%m-%d %H:%M:%S")),
                      c(strptime("2010-12-01 10:00:00","%Y-%m-%d %H:%M:%S"),
                        strptime("2010-12-25 06:00:00","%Y-%m-%d %H:%M:%S")))

lapply(myListStartEnd,function(x) x[2]-x[1])

# Output
[[1]]
Time difference of 365.1042 days

[[2]]
Time difference of 23.83333 days

HTH

Pete

--
View this message in context: http://r.789695.n4.nabble.com/lapply-and-Two-TimeStamps-as-input-tp3961939p3962139.html
Sent from the R help mailing list archive at Nabble.com.

From anopheles123 at gmail.com Tue Nov 1 01:42:41 2011
From: anopheles123 at gmail.com (Weidong Gu)
Date: Mon, 31 Oct 2011 20:42:41 -0400
Subject: [R] oversampling code
In-Reply-To: <16D6EF48-C49E-4B48-95E3-D51EA0865198@comcast.net>
References: <16D6EF48-C49E-4B48-95E3-D51EA0865198@comcast.net>
Message-ID:

You should figure out how many samples you want for Y=1 and 0, then sample from the relevant subset, e.g. dfrm[dfrm$Y==1, ], by sampling row.names(dfrm[dfrm$Y==1, ]) using replace=FALSE.

?sample

On Mon, Oct 31, 2011 at 8:18 PM, Comcast wrote:
>
>
> On Oct 31, 2011, at 1:54 PM, loubna ibn majdoub hassani wrote:
>
>> Hi
>> I have an unbalanced data set where I want to predict a binary variable Y.
>> I want to do an under sampling by keeping all the 1 and taking just some of
>> the 0 such as I'll have 90% of 0 and 10% of 1.
>
> You haven't given much detail, but something like this will take all of the 1's and 10% of the 0's:
>
> zeros <- rownames(dfrm[dfrm$Y == 0, ])
> dfrm[c(rownames(dfrm[dfrm$Y == 1, ]), sample(zeros, round(0.10 * length(zeros)))), ]
>
>> Can u help me do that
>> Thank u
From dwinsemius at comcast.net Tue Nov 1 02:05:47 2011
From: dwinsemius at comcast.net (Comcast)
Date: Mon, 31 Oct 2011 21:05:47 -0400
Subject: [R] Vectorize 'eol' characters
In-Reply-To: <5F844D2C953ACA48849C700A12752B940260432B@colhpaexc010.HPA.org.uk>
References: <5F844D2C953ACA48849C700A12752B9402604329@colhpaexc010.HPA.org.uk> <5F844D2C953ACA48849C700A12752B940260432B@colhpaexc010.HPA.org.uk>
Message-ID: <12A151C6-87DC-4F1A-BC36-A83CFBBD7F4E@comcast.net>

On Oct 31, 2011, at 2:01 PM, "Stefano Conti" wrote:

> Thanks to Dr Shepard and Prof Ripley for their helpful replies.
>
> In my original query I should have also specified that I have tried the trick, also suggested by Prof Ripley, of appending the extra-column to the original matrix before dumping to text; however, in cases where the field separator string (argument of the 'sep' option in write.table) is non-null, I'd then have it also between the original matrix's last column and the appended text column -- which is not what I want.
>
> Any additional suggestion / follow-up on this? With continued thanks,
>

Sometimes it would help to answer the question, "why bother?" I can imagine, and have even tested as a concept, the possibility of intercepting the output of capture.output(write.table(...)) so that the last separator could be removed.
Before posting any code I would like to see if the effort would be on target and worth the further effort. What separator are you thinking would be used and how complex is this rolling eol??? (it's a bit like an actor saying to the director ... What's my motivation?)

--
David.

>
> --
> Dr Stefano Conti
> Statistics Unit (room #2A19)
> Health Protection Services
> HPA Colindale
> 61 Colindale Avenue
> London NW9 5EQ, UK
> tel: +44 (0)208-3277825
> fax: +44 (0)208-2007868
>
>
>
> -----Original Message-----
> From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk]
> Sent: Mon 31/10/2011 16:45
> To: Stefano Conti
> Cc: r-help at r-project.org
> Subject: Re: [R] Vectorize 'eol' characters
>
> On Mon, 31 Oct 2011, Stefano Conti wrote:
>
>> Dear R users,
>>
>> When dumping an R matrix object into a file -- typically via the
>> 'write.table' function -- the 'eol' option can be used to specify
>> the end-of-line character(s) which should appear at the end of each
>> row.
>>
>> However the argument to 'eol' seems to be restricted to have length
>> 1, whereas ideally I would like different rows to be written to file
>> each with its own end character string. For instance:
>
> That's not what 'eol' means. It is the indicator of the end of line,
> so of course it is the same for every line.
>
>>> test <- matrix(1:12, nrow=4); test
>>      [,1] [,2] [,3]
>> [1,]    1    5    9
>> [2,]    2    6   10
>> [3,]    3    7   11
>> [4,]    4    8   12
>>
>>> write.table(test, file="test.txt", sep=" ", eol=paste(" test", 1:4, "\n", sep=""))
>>
>>> read.table(file="test.txt", sep=" ")
>>   V1 V2 V3 test1
>> 1  1  5  9 test1
>> 2  2  6 10 test1
>> 3  3  7 11 test1
>> 4  4  8 12 test1
>>
>> whereas I would like the last column of the dump file to be "test1",
>> "test2", "test3", "test4". Is there a way this could be achieved?
>
> Hmm, you said it: 'the last column'.
> Create what you want as the last column of your data frame: it will
> then be written to the file as the last column.
>
> The author of write.table.
>
>> With many thanks in advance for your help, kind regards,
>>
>> --
>> Dr Stefano Conti
>
> --
> Brian D. Ripley, ripley at stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595
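
One way to get a different trailing string on every row without a stray field separator in front of it, sketched under the assumption that building the lines by hand is acceptable: paste the cells of each row together, append the row-specific ending, and write the result with writeLines instead of write.table. The names endings, rows, file_lines and out are illustrative and do not come from the thread.

```r
test <- matrix(1:12, nrow = 4)

# Row-specific trailing strings, as in the question.
endings <- paste(" test", 1:4, sep = "")

# Join the cells of each row with the field separator ...
rows <- apply(test, 1, paste, collapse = " ")

# ... then glue the ending on directly, with no separator in between.
file_lines <- paste(rows, endings, sep = "")

out <- tempfile(fileext = ".txt")
writeLines(file_lines, out)   # writeLines adds the newline itself
readLines(out)
```

With a different field separator, only the collapse argument needs to change.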
From vwmoriarty at gmail.com Tue Nov 1 04:27:16 2011
From: vwmoriarty at gmail.com (Vinny Moriarty)
Date: Mon, 31 Oct 2011 20:27:16 -0700
Subject: [R] Multiple time series with zoo
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From richard.derozario at gmail.com Tue Nov 1 04:34:46 2011
From: richard.derozario at gmail.com (RdR)
Date: Mon, 31 Oct 2011 20:34:46 -0700 (PDT)
Subject: [R] low sigma in lognormal fit of gamlss
Message-ID: <1320118486511-3962480.post@n4.nabble.com>

Hi,

I'm playing around with gamlss and don't entirely understand the sigma result from an attempted lognormal fit.

In the example below, I've created lognormal data with mu=10 and sigma=2. When I try a gamlss fit, I get an estimated mu=9.947 and sigma=0.69.

The mu estimate seems in the ballpark, but sigma is very low. I get similar results on repeated trials and with Normal and standard normal distributions.

How should I understand sigma in these results?

cheers,
RdR

######### Example #########
library(gamlss)

# enable reproduction
set.seed(1234)

# create some lognormal data
X <- rlnorm(1000,meanlog=10,sdlog=2)

# try gamlss fit
gLNO <- gamlss(X~1,family=LNO)

summary(gLNO)
*******************************************************************
Family:  c("LNO", "Box-Cox")

Call:  gamlss(formula = X ~ 1, family = LNO)

Fitting method: RS()

-------------------------------------------------------------------
Mu link function:  identity
Mu Coefficients:
             Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)     9.947     0.06305    157.8         0

-------------------------------------------------------------------
Sigma link function:  log
Sigma Coefficients:
             Estimate  Std. Error  t value   Pr(>|t|)
(Intercept)      0.69     0.02236    30.86  2.19e-147

-------------------------------------------------------------------
No. of observations in the fit:  1000
Degrees of Freedom for the fit:  2
      Residual Deg. of Freedom:  998
                      at cycle:  2

Global Deviance:     24111.45
            AIC:     24115.45
            SBC:     24125.27
*******************************************************************
Warning message:
In summary.gamlss(gLNO) :
  summary: vcov has failed, option qr is used instead

--
View this message in context: http://r.789695.n4.nabble.com/low-sigma-in-lognormal-fit-of-gamlss-tp3962480p3962480.html

From richard.derozario at gmail.com Tue Nov 1 04:38:35 2011
From: richard.derozario at gmail.com (RdR)
Date: Mon, 31 Oct 2011 20:38:35 -0700 (PDT)
Subject: [R] low sigma in lognormal fit of gamlss
In-Reply-To: <1320118486511-3962480.post@n4.nabble.com>
References: <1320118486511-3962480.post@n4.nabble.com>
Message-ID: <1320118715469-3962487.post@n4.nabble.com>

Actually, I think I've spotted it: the link function for sigma is "log" and exp(0.69) is nearly 2 -- the original sigma

-RdR

--
View this message in context: http://r.789695.n4.nabble.com/low-sigma-in-lognormal-fit-of-gamlss-tp3962480p3962487.html

From mmstat at comcast.net Tue Nov 1 05:49:48 2011
From: mmstat at comcast.net (mmstat at comcast.net)
Date: Tue, 1 Nov 2011 04:49:48 +0000
Subject: [R] drawing ellipses in R
Message-ID: <68176232.1578217.1320122988281.JavaMail.root@sz0115a.westchester.pa.mail.comcast.net>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From ata.sonu at gmail.com Tue Nov 1 07:38:36 2011
From: ata.sonu at gmail.com (ATANU)
Date: Mon, 31 Oct 2011 23:38:36 -0700 (PDT)
Subject: [R] quantmod package
In-Reply-To:
References: <1319094206975-3921071.post@n4.nabble.com> <1319211044012-3925863.post@n4.nabble.com> <1319782887275-3947026.post@n4.nabble.com>
Message-ID: <1320129516858-3962665.post@n4.nabble.com>

Dear Michael,

Thanks for your help. I figured out the fault.
Actually I was running the code for a very short time (less than one minute) and was then trying to make a chart out of that. The code works well if I run it for more than one minute.

Thanks again for your help.

Atanu

--
View this message in context: http://r.789695.n4.nabble.com/quantmod-package-tp3921071p3962665.html

From paul.hiemstra at knmi.nl Tue Nov 1 09:21:48 2011
From: paul.hiemstra at knmi.nl (Paul Hiemstra)
Date: Tue, 01 Nov 2011 08:21:48 +0000
Subject: [R] troubles installing R
In-Reply-To: <39B5ED61E7BFC24FA8277B6DE92A9A3F04204341@fkimlki01.enterprise.afmc.ds.af.mil>
References: <39B5ED61E7BFC24FA8277B6DE92A9A3F04204341@fkimlki01.enterprise.afmc.ds.af.mil>
Message-ID: <4EAFAC1C.8010507@knmi.nl>

On 10/31/2011 06:30 PM, Cable, Sam B Civ USAF AFMC AFRL/RVBXI wrote:
> I am trying to install R on a pretty up-to-date CentOS system. I have
> tried installing 2.14.0 and 2.13.2. In both cases, the configure step
> fails, and does not produce a Makefile. So, of course, I can't issue
> "make".
>
> The errors that I can find in config.log are, first, several missing
> files: ac_nonexistent.h, minix/config.h., readline/history.h, and
> readline/readline.h.
>
> Then, what seems to finally bring configure to a halt is the final
> error:
>
> configure:20796: error: --with-readline=yes (default) and headers/libs
> are not available

Hi,

It seems that the header files/libs are not available... You probably need to install some development libraries from the CentOS repositories, like readline-dev or such. After installing these libraries, the header files will be available.

regards,
Paul

> After this error, configure dumps long lists of shell variables, output
> variables, and confdefs.h -- none of which tell a whole lot, as far as I
> can see -- and then exits with a status of "1".
>
> Anyone know what this means? I can give bigger snippets of the
> config.log, if anyone wants to see them.
>
> I have installed R on several Linux machines. This is the first time
> I've been stumped.
>
> Thanks.
>
> --Sam

--
Paul Hiemstra, Ph.D.
Global Climate Division
Royal Netherlands Meteorological Institute (KNMI)
Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39
P.O. Box 201 | 3730 AE | De Bilt
tel: +31 30 2206 494

http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770

From alaios at yahoo.com Tue Nov 1 09:24:41 2011
From: alaios at yahoo.com (Alaios)
Date: Tue, 1 Nov 2011 01:24:41 -0700 (PDT)
Subject: [R] Nested lapply? Is this allowed?
Message-ID: <1320135881.4481.YahooMailNeo@web120111.mail.ne1.yahoo.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From b.rowlingson at lancaster.ac.uk Tue Nov 1 09:44:04 2011
From: b.rowlingson at lancaster.ac.uk (Barry Rowlingson)
Date: Tue, 1 Nov 2011 08:44:04 +0000
Subject: [R] drawing ellipses in R
In-Reply-To: <68176232.1578217.1320122988281.JavaMail.root@sz0115a.westchester.pa.mail.comcast.net>
References: <68176232.1578217.1320122988281.JavaMail.root@sz0115a.westchester.pa.mail.comcast.net>
Message-ID:

On Tue, Nov 1, 2011 at 4:49 AM, wrote:
> Hello,
>
> I have been following the thread dated Monday, October 9, 2006 when Kamila Naxerova asked a question about plotting elliptical shapes. Can you explain the equations for X and Y. I believe they used the parametric form of x and y (x=r cos(theta), y=r sin(theta)). I don't know what r is here ? Can you explain 1) the origin of these equations and 2) what is r?
>

The origin of these equations? Probably this guy:

http://en.wikipedia.org/wiki/File:God_the_Geometer.jpg

Or maybe Euclid.

What is r (as opposed to R)?
Mathematical questions are nicely answered by Mathworld:

http://mathworld.wolfram.com/Circle.html

- a fantastic resource for maths and stats information.

Barry

From muhammad.rahiz at ouce.ox.ac.uk Tue Nov 1 09:47:17 2011
From: muhammad.rahiz at ouce.ox.ac.uk (Muhammad Rahiz)
Date: Tue, 1 Nov 2011 08:47:17 +0000 (GMT)
Subject: [R] Significance of trend
In-Reply-To:
References:
Message-ID:

Thanks to those who replied.

I know what a p-value is and the links given reaffirm my understanding. The code below gives p=0.26. This is more than 0.05, hence the null hypothesis is not rejected, i.e. the trend can be attributed to chance. But I did not specify the significance level at 0.05. So I'm wondering if any part of the code states so. Or does R define the significance level at 0.05 by default?

Thanks

--
Muhammad

On Mon, 31 Oct 2011, R. Michael Weylandt wrote:

> I'm thinking you need to read up on what a p-value is:
> https://secure.wikimedia.org/wikipedia/en/wiki/P-value
>
> Michael Weylandt
>
> On Mon, Oct 31, 2011 at 2:51 PM, Muhammad Rahiz
> wrote:
>> Hi everyone,
>>
>> I'm trying to determine the significance of a trendline. From my internet
>> search months ago, I came across the following post. I modified tim and
>> dat for simplicity.
>>
>> tim <- 1:10
>> dat <- c(0.17, 1.09 ,0.11, 0.82, 0.23, 0.38 ,2.47 ,0.41 ,0.75, 1.44)
>>
>> fstat <- summary(lm(dat~tim))$fstatistic
>> p.val <- 1-pf(fstat["value"],fstat["numdf"],fstat["dendf"])
>>
>> The resultant p.value is 0.261. But I want to know if this
>> value is significant and at what significance level. How do I check for
>> it?
>>
>> Thanks
>>
>> --
>> Muhammad
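
A sketch of how the decision is usually made, using the data from the thread. Nothing in the code from the thread fixes a significance level: it only computes a p-value. The threshold alpha below is an assumption chosen by the analyst (0.05 is a common convention), and the names fit, alpha and p.slope are illustrative.

```r
tim <- 1:10
dat <- c(0.17, 1.09, 0.11, 0.82, 0.23, 0.38, 2.47, 0.41, 0.75, 1.44)

fit   <- lm(dat ~ tim)
fstat <- summary(fit)$fstatistic

# Upper-tail probability of the F statistic: the p-value for the trend.
p.val <- unname(pf(fstat["value"], fstat["numdf"], fstat["dendf"],
                   lower.tail = FALSE))

# With a single predictor this equals the p-value of the slope's t-test.
p.slope <- summary(fit)$coefficients["tim", "Pr(>|t|)"]

# The significance level is picked by the analyst, then compared by hand.
alpha <- 0.05
p.val < alpha   # TRUE would mean "significant at level alpha"
```

Changing alpha changes the verdict, not the p-value, which is why the level has to be stated alongside the result.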
From tramni at abv.bg Tue Nov 1 10:05:39 2011
From: tramni at abv.bg (Martin Ivanov)
Date: Tue, 1 Nov 2011 11:05:39 +0200 (EET)
Subject: [R] triangles point left, filled?
Message-ID: <239734346.246556.1320138339838.JavaMail.apache@mail21.abv.bg>

Dear R users,

I want to plot not only triangles pointing up and triangles pointing down, which is easy using the "pch" argument to "points". I want to plot left- and right-pointing triangles as well. They must be fillable with colour. I browsed a little in the documentation and tried rotating the up- and down-pointing triangles, but to no avail.

Any suggestions will be appreciated.

Regards,
Martin Ivanov

From ripley at stats.ox.ac.uk Tue Nov 1 10:50:43 2011
From: ripley at stats.ox.ac.uk (Prof Brian Ripley)
Date: Tue, 1 Nov 2011 09:50:43 +0000 (GMT)
Subject: [R] troubles installing R
In-Reply-To: <4EAFAC1C.8010507@knmi.nl>
References: <39B5ED61E7BFC24FA8277B6DE92A9A3F04204341@fkimlki01.enterprise.afmc.ds.af.mil> <4EAFAC1C.8010507@knmi.nl>
Message-ID:

On Tue, 1 Nov 2011, Paul Hiemstra wrote:

> On 10/31/2011 06:30 PM, Cable, Sam B Civ USAF AFMC AFRL/RVBXI wrote:
>> I am trying to install R on a pretty up-to-date CentOS system. I have
>> tried installing 2.14.0 and 2.13.2. In both cases, the configure step
>> fails, and does not produce a Makefile. So, of course, I can't issue
>> "make".
>>
>> The errors that I can find in config.log are, first, several missing
>> files: ac_nonexistent.h, minix/config.h., readline/history.h, and
>> readline/readline.h.
>>
>> Then, what seems to finally bring configure to a halt is the final
>> error:
>>
>> configure:20796: error: --with-readline=yes (default) and headers/libs
>> are not available
>
> Hi,
>
> It seems that the header files/libs are not available... You probably
> need to install some development libraries from the CentOS repositories,
> like readline-dev or such. After installing these libraries, the header

readline-devel on CentOS.

> files will be available.

From the INSTALL file:

'The main source of information on installation is the `R Installation
and Administration Manual', an HTML copy of which is available as file
`doc/html/R-admin.html'. Please read that before installing R. But
if you are impatient, read on but please refer to the manual to
resolve any problems.'

and this *is* covered in the manual. The volunteers who give you the free gift of R can (and do) write the manual for you, but we cannot read it for you.

>
> regards,
> Paul
>
>> After this error, configure dumps long lists of shell variables, output
>> variables, and confdefs.h -- none of which tell a whole lot, as far as I
>> can see -- and then exits with a status of "1".
>>
>> Anyone know what this means? I can give bigger snippets of the
>> config.log, if anyone wants to see them.
>>
>> I have installed R on several Linux machines. This is the first time
>> I've been stumped.
>>
>> Thanks.
>>
>> --Sam

--
Brian D. Ripley, ripley at stats.ox.ac.uk

From paul.hiemstra at knmi.nl Tue Nov 1 11:36:49 2011
From: paul.hiemstra at knmi.nl (Paul Hiemstra)
Date: Tue, 01 Nov 2011 10:36:49 +0000
Subject: [R] troubles installing R
In-Reply-To:
References: <39B5ED61E7BFC24FA8277B6DE92A9A3F04204341@fkimlki01.enterprise.afmc.ds.af.mil> <4EAFAC1C.8010507@knmi.nl>
Message-ID: <4EAFCBC1.4020603@knmi.nl>

On 11/01/2011 09:50 AM, Prof Brian Ripley wrote:
> On Tue, 1 Nov 2011, Paul Hiemstra wrote:
>
>> On 10/31/2011 06:30 PM, Cable, Sam B Civ USAF AFMC AFRL/RVBXI wrote:
>>> I am trying to install R on a pretty up-to-date CentOS system. I have
>>> tried installing 2.14.0 and 2.13.2. In both cases, the configure step
>>> fails, and does not produce a Makefile. So, of course, I can't issue
>>> "make".
>>>
>>> The errors that I can find in config.log are, first, several missing
>>> files: ac_nonexistent.h, minix/config.h., readline/history.h, and
>>> readline/readline.h.
>>>
>>> Then, what seems to finally bring configure to a halt is the final
>>> error:
>>>
>>> configure:20796: error: --with-readline=yes (default) and headers/libs
>>> are not available
>>
>> Hi,
>>
>> It seems that the header files/libs are not available... You probably
>> need to install some development libraries from the CentOS repositories,
>> like readline-dev or such.
>> After installing these libraries, the header
>
> readline-devel on CentOS.
>
>> files will be available.
>
> From the INSTALL file:
>
> 'The main source of information on installation is the `R Installation
> and Administration Manual', an HTML copy of which is available as file
> `doc/html/R-admin.html'. Please read that before installing R. But
> if you are impatient, read on but please refer to the manual to
> resolve any problems.'
>
> and this *is* covered in the manual. The volunteers who give you the
> free gift of R can (and do) write the manual for you, but we cannot
> read it for you.

fortune('rtfm') :)

Paul

>
>>
>> regards,
>> Paul
>>
>>> After this error, configure dumps long lists of shell variables, output
>>> variables, and confdefs.h -- none of which tell a whole lot, as far
>>> as I
>>> can see -- and then exits with a status of "1".
>>>
>>> Anyone know what this means? I can give bigger snippets of the
>>> config.log, if anyone wants to see them.
>>>
>>> I have installed R on several Linux machines. This is the first time
>>> I've been stumped.
>>>
>>> Thanks.
>>>
>>> --Sam

From gulabrani.ritika at gmail.com Tue Nov 1 08:37:32 2011
From: gulabrani.ritika at gmail.com (Ritz)
Date: Tue, 1 Nov 2011 00:37:32 -0700 (PDT)
Subject: [R] Read/Write textbox in R
Message-ID: <1320133052436-3962727.post@n4.nabble.com>

I am writing a GUI for my R script. It is a very basic form consisting of textboxes and buttons. I tried to run the following example to learn how to read the value currently entered into the textbox (it requires the tcltk/tcltk2 packages):

# Create the widgets
base <- tktoplevel()
list <- tklistbox(base, width = 20, height = 5)
entry <- tkentry(base)
text <- tktext(base, width = 20, height = 5)
tkpack(list, entry, text)

# Write and read from the widgets
writeList(list, c("Option1", "Option2", "Option3"))
writeList(entry, "An Entry box")
writeText(text, "A text box")

# Will be NULL if not selected
getListValue(list)
getTextValue(text)
getEntryValue(entry)

# Destroy toplevel widget
# tkdestroy(base)
## End(Not run)

I took it from: http://svitsrv25.epfl.ch/R-doc/library/widgetTools/html/writeText.html.

When I run the example I get the following error:

"Error in function () : could not find function "getTextValue"

What's missing here? Is the function getTextValue() inactive/not supported anymore? Any alternatives?

Thanks!
Regards,
Ritika

--
View this message in context: http://r.789695.n4.nabble.com/Read-Write-textbox-in-R-tp3962727p3962727.html

From vogeltho at staff.hu-berlin.de Tue Nov 1 09:29:51 2011
From: vogeltho at staff.hu-berlin.de (Thorsten Vogel)
Date: Tue, 1 Nov 2011 09:29:51 +0100
Subject: [R] Question on estimating standard errors with noisy signals using the quantreg package
In-Reply-To: <92112359-E586-4FB4-90D7-F37DA67823C1@illinois.edu>
References: <004e01cc97c8$e97f7700$bc7e6500$@hu-berlin.de> <92112359-E586-4FB4-90D7-F37DA67823C1@illinois.edu>
Message-ID: <000301cc9871$a98f29c0$fcad7d40$@hu-berlin.de>

Many thanks for your comments.

The median of the r_i is something around 1000. And for the time being there are no covariates, though this might change in the future. We are only starting to exploit a very nice data set.

Regarding the probability of being in the data, p, I would say it is indeed constant across doctors. The data set is a subset of a larger administrative data set. While the administrative data cover all patients, the data we use cover all patients born on one of four days of the month (which are specified a priori). Since I regard this sampling procedure as akin to drawing patients at random from the complete administrative data set, I think p=4/30 is constant across doctors.

Again, I very much appreciate any comments or suggestions.

Regards,
Thorsten

-----Original Message-----
From: Roger Koenker [mailto:rkoenker at illinois.edu]
Sent: Monday, 31 October 2011 21:24
To: Thorsten Vogel
Cc: r-help at r-project.org help
Subject: Re: [R] Question on estimating standard errors with noisy signals using the quantreg package

On Oct 31, 2011, at 7:30 AM, Thorsten Vogel wrote:

> Dear all,
>
> My question might be more of a statistics question than a question on R,
> although it's on how to apply the 'quantreg' package. Please accept my
> apologies if you believe I am strongly misusing this list.
> > To be very brief, the problem is that I have data on only a random draw, not > all of doctors' patients. I am interested in the, say, median number of > patients of doctors. Does it suffice to use the "nid" option in summary.rq? > > More specifically, if the model generating the number of patients, say, r_i, > of doctor i is > r_i = const + u_i, > then I think I would obtain the median of the number of doctors' patients > using rq(r~1, ...) and plugging this into summary.rq() using the option > se="iid". How big are the r_i? I presume that they are big enough so that you don't want to worry about the integer "features" of the data? Are there really no covariates? If so then you are fine with the iid option, but if not, probably better to use "nid". If the r_i can be small, it is worth considering the dithering approach of Machado and Santos-Silva (JASA, 2005). > > Unfortunately, I don't observe r_i in the data but, instead, in the data I > only have a fraction p of these r_i patients. In fact, with (known) > probability p a patient is included in the data. Thus, for each doctor i the > number of patients IN THE DATA follows a binomial distribution with > parameters r_i and p. For each i I now have s_i patients in the data where > s_i is a draw from this binomial distribution. That is, the problem with the > data is that I don't observe r_i but s_i. Is it reasonable to assume that the p is the same across doctors? This seems to be some sort of compound Poisson problem to me, but I may misunderstand your description. > > Simple montecarlo experiments confirm my intuition that standard errors > should be larger when using the "noisy" information s_i/p instead of (the > unobserved) r_i. > > My guess is that I can consistently estimate any quantile of the number of > doctors' patients AND THEIR STANDARD ERRORS using the quantreg's rq command: > rq(I(s/p)~1, ...) and the summary.rq() command with option se="nid". > > Am I correct? 
I am grateful for any help on this issue. > > Best regards, > Thorsten Vogel > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From Stefano.Conti at hpa.org.uk Tue Nov 1 10:18:08 2011 From: Stefano.Conti at hpa.org.uk (Stefano Conti) Date: Tue, 1 Nov 2011 09:18:08 -0000 Subject: [R] Vectorize 'eol' characters References: <5F844D2C953ACA48849C700A12752B9402604329@colhpaexc010.HPA.org.uk> <5F844D2C953ACA48849C700A12752B940260432B@colhpaexc010.HPA.org.uk> <12A151C6-87DC-4F1A-BC36-A83CFBBD7F4E@comcast.net> Message-ID: <5F844D2C953ACA48849C700A12752B940260432C@colhpaexc010.HPA.org.uk> Dear David, My ultimate purpose is to generate a text file encoding a LaTeX table for later inclusion in a report; while I'm aware of, and familiar with, Sweave, such a table would feature _some_ whole or partial crossing horizontal lines (\hline or \cline with varying arguments, which need to be placed after the tabular end-line mark '\\'), making it amenable to neither Sweave (at least as I understand it) nor similar R functions (like Frank Harrell's latex command). Hence why I'd be bothering with differentiating end-of-line characters: ideally I require in my write.table statement the options sep="\t&\t" and eol="\\\\\n", yet with more flexibility after the LaTeX newline command '\\'. In the meantime I've managed to resolve my problem, albeit not as elegantly as I'd initially wished: I've replaced the last column of my original R matrix with an edited (through appropriate use of the paste function) version, which now incorporates all correct end-of-line strings, and then dumped to file via write.table with quote=FALSE, sep="\t&\t" and eol="\n". I'd be happy to stick with the above fix so long as I'm still missing some better solution.
With many thanks for the pointers so far, all the best, -- Dr Stefano Conti Statistics Unit (room #2A19) Health Protection Services HPA Colindale 61 Colindale Avenue London NW9 5EQ, UK tel: +44 (0)208-3277825 fax: +44 (0)208-2007868 -----Original Message----- From: Comcast [mailto:dwinsemius at comcast.net] Sent: Tue 01/11/2011 01:05 To: Stefano Conti Cc: Prof Brian Ripley; r-help at r-project.org Subject: Re: [R] Vectorize 'eol' characters On Oct 31, 2011, at 2:01 PM, "Stefano Conti" wrote: > Thanks to Dr Shepard and Prof Ripley for their helpful replies. > > In my original query I should have also specified that I have tried the trick, also suggested by Prof Ripley, of appending the extra-column to the original matrix before dumping to text; however, in cases where the field separator string (argument of the 'sep' option in write.table) is non-null, I'd then have it also between the original matrix's last column and the appended text column -- which is not what I want. > > Any additional suggestion / follow-up on this? With continued thanks, > Sometimes it would help to answer the question, "why bother?" I can imagine and have even tested as a concept the possibility of intercepting the output of capture.output(write.table(...)) so that the last separator could be removed. Before posting any code I would like to see if the effort would be on target and worth the further effort. What separator are you thinking would be used and how complex is this rolling eol??? (it's a bit like an actor saying to the director ... What's my motivation?) -- David.
> > -- > Dr Stefano Conti > Statistics Unit (room #2A19) > Health Protection Services > HPA Colindale > 61 Colindale Avenue > London NW9 5EQ, UK > tel: +44 (0)208-3277825 > fax: +44 (0)208-2007868 > > > > -----Original Message----- > From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk] > Sent: Mon 31/10/2011 16:45 > To: Stefano Conti > Cc: r-help at r-project.org > Subject: Re: [R] Vectorize 'eol' characters > > On Mon, 31 Oct 2011, Stefano Conti wrote: > >> Dear R users, >> >> When dumping an R matrix object into a file -- typically via the >> 'write.table' function -- the 'eol' option can be used to specify >> the end-of-line character(s) which should appear at the end of each >> row. >> >> However the argument to 'eol' seems to be restricted to have length >> 1, whereas ideally I would like different rows to be written to file >> each with its own end character string. For instance: > > That's not what 'eol' means. It is the indicator of the end of line, > so of course it is the same for every line. > >>> test <- matrix(1:12, nrow=4); test >> [,1] [,2] [,3] >> [1,] 1 5 9 >> [2,] 2 6 10 >> [3,] 3 7 11 >> [4,] 4 8 12 >> >>> write.table(test, file="test.txt", sep=" ", eol=paste(" test", 1:4, "\n", sep="")) >> >>> read.table(file="test.txt", sep=" ") >> V1 V2 V3 test1 >> 1 1 5 9 test1 >> 2 2 6 10 test1 >> 3 3 7 11 test1 >> 4 4 8 12 test1 >> >> whereas I would like the last column of the dump file to be "test1", >> "test2", "test3", "test4". Is there a way this could be achieved? > > Hmn, you said it: 'the last column'. > Create what you want as the last column of your data frame: it wil > then be written to the file as the last column. > > The author of write.table. 
> >> With many thanks in advance for your help, kind regards, >> >> >> -- >> Dr Stefano Conti >> Statistics Unit (room #2A19) >> Health Protection Services >> HPA Colindale >> 61 Colindale Avenue >> London NW9 5EQ, UK >> tel: +44 (0)208-3277825 >> fax: +44 (0)208-2007868 >> >> ----------------------------------------- >> ************************************************************************** >> The information contained in the EMail and any attachments is >> confidential and intended solely and for the attention and use of >> the named addressee(s). It may not be disclosed to any other person >> without the express authority of the HPA, or the intended >> recipient, or both. If you are not the intended recipient, you must >> not disclose, copy, distribute or retain this message or any part >> of it. This footnote also confirms that this EMail has been swept >> for computer viruses, but please re-sweep any attachments before >> opening or saving. HTTP://www.HPA.org.uk >> ************ >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > -- > Brian D. Ripley, ripley at stats.ox.ac.uk > Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ > University of Oxford, Tel: +44 1865 272861 (self) > 1 South Parks Road, +44 1865 272866 (PA) > Oxford OX1 3TG, UK Fax: +44 1865 272595 > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
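A minimal sketch of the workaround discussed in this thread: build each output line yourself, so that every row gets its own terminator and no field separator appears before it, then write the lines with writeLines. The row terminators in row_end and the temporary output file are assumptions chosen purely for illustration; they are not taken from the original posts. Only base R functions (apply, paste, paste0, writeLines) are used.

```r
# Example matrix from the thread.
test <- matrix(1:12, nrow = 4)

# One terminator per row (hypothetical LaTeX rules, for illustration only).
row_end <- c(" \\\\ \\hline", " \\\\", " \\\\ \\cline{2-3}", " \\\\ \\hline")

# Join the cells of each row with the column separator; the separator
# only ever appears between cells, never before the terminator.
cells <- apply(test, 1, paste, collapse = "\t&\t")

# Append each row's own terminator.
tex_lines <- paste0(cells, row_end)

# writeLines adds the ordinary newline after every element.
out_file <- tempfile(fileext = ".tex")
writeLines(tex_lines, out_file)
```

Because the lines are assembled with paste before writing, the choice of terminator per row is just an ordinary character vector and can be computed however the table layout requires.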
From alexandre.chausson at unil.ch Tue Nov 1 10:43:50 2011 From: alexandre.chausson at unil.ch (AlexC) Date: Tue, 1 Nov 2011 02:43:50 -0700 (PDT) Subject: [R] Correlation Matrix in R In-Reply-To: <64059A92-D1AA-4D5A-94DB-C7B78B536EF6@mcmaster.ca> References: <1319576996685-3938274.post@n4.nabble.com> <1319627025864-3940170.post@n4.nabble.com> <64059A92-D1AA-4D5A-94DB-C7B78B536EF6@mcmaster.ca> Message-ID: <1320140630142-3962939.post@n4.nabble.com> Hello, Thank you for your replies. I cannot run the function rcor.test even when having loaded package ltm. Perhaps it has to do with the fact that I am using the latest version of R and this package wasn't created under that version The function corr.test in package psych works fine. Is there anyway to export the results in a txt or csv file? Since it isn't in a data frame format it cannot simply be exported using write.table Alexandre -- View this message in context: http://r.789695.n4.nabble.com/Correlation-Matrix-in-R-tp3938274p3962939.html Sent from the R help mailing list archive at Nabble.com. From a.salucci at yahoo.com Tue Nov 1 12:21:58 2011 From: a.salucci at yahoo.com (Anera Salucci) Date: Tue, 1 Nov 2011 04:21:58 -0700 (PDT) Subject: [R] multivariate random variable Message-ID: <1320146518.55127.YahooMailNeo@web120302.mail.ne1.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From djandrija at gmail.com Tue Nov 1 12:29:49 2011 From: djandrija at gmail.com (andrija djurovic) Date: Tue, 1 Nov 2011 12:29:49 +0100 Subject: [R] Correlation Matrix in R In-Reply-To: <1320140630142-3962939.post@n4.nabble.com> References: <1319576996685-3938274.post@n4.nabble.com> <1319627025864-3940170.post@n4.nabble.com> <64059A92-D1AA-4D5A-94DB-C7B78B536EF6@mcmaster.ca> <1320140630142-3962939.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From msuzen at mango-solutions.com Tue Nov 1 12:38:24 2011 From: msuzen at mango-solutions.com (Mehmet Suzen) Date: Tue, 1 Nov 2011 11:38:24 -0000 Subject: [R] multivariate random variable In-Reply-To: <1320146518.55127.YahooMailNeo@web120302.mail.ne1.yahoo.com> References: <1320146518.55127.YahooMailNeo@web120302.mail.ne1.yahoo.com> Message-ID: <3CBFCFB1FEFFA841BA83ADF2F2A9C6FA01C9B5CE@mango-data1.Mango.local> You can use mvtnorm package http://cran.r-project.org/web/packages/mvtnorm/index.html >-----Original Message----- >From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] >On Behalf Of Anera Salucci >Sent: 01 November 2011 11:22 >To: r-help at r-project.org >Subject: [R] multivariate random variable > >Dear All, > >How can I generate multivariate random variable (not multivariate normal >) > >I am in urgent > [[alternative HTML version deleted]] LEGAL NOTICE This message is intended for the use o...{{dropped:10}} From ggrothendieck at gmail.com Tue Nov 1 12:56:11 2011 From: ggrothendieck at gmail.com (Gabor Grothendieck) Date: Tue, 1 Nov 2011 07:56:11 -0400 Subject: [R] Multiple time series with zoo In-Reply-To: References: Message-ID: On Mon, Oct 31, 2011 at 11:27 PM, Vinny Moriarty wrote: > Thanks for everyone's input so far, it is greatly appreciated. 
But I've got > one last task I could use some advice on > > > Here are the first few lines of my data set: > > site,time_local,time_utc,reef_type_code,sensor_type,sensor_depth_m,temperature_c > 06,2006-04-09 10:20:00,2006-04-09 20:20:00,BAK,sb39, 2, 29.63 > 06,2006-04-09 10:40:00,2006-04-09 20:40:00,BAK,sb39, 2, 29.56 > 06,2006-04-09 11:00:00,2006-04-09 21:00:00,BAK,sb39, 2, 29.51 > 06,2006-04-09 11:20:00,2006-04-09 21:20:00,BAK,sb39, 10, 29.53 > 06,2006-04-09 11:40:00,2006-04-09 21:40:00,BAK,sb39, 2, 29.57 > 06,2006-04-09 12:00:00,2006-04-09 22:00:00,BAK,sb39, 2, 29.60 > 06,2006-04-09 12:20:00,2006-04-09 22:20:00,BAK,sb39, 2, 29.66 > 06,2006-04-09 12:40:00,2006-04-09 22:40:00,BAK,sb39, 2, 29.68 > 06,2006-04-09 13:00:00,2006-04-09 23:00:00,BAK,sb39, 10, 29.68 > 06,2006-04-09 13:20:00,2006-04-09 23:20:00,BAK,sb39, 2, 29.71 > 06,2006-04-09 13:40:00,2006-04-09 23:40:00,BAK,sb39, 2, 29.68 > 06,2006-04-09 14:00:00,2006-04-10 00:00:00,BAK,sb39, 10, 29.49 > 06,2006-04-09 14:20:00,2006-04-10 00:20:00,BAK,sb39, 2, 29.31 > 06,2006-04-09 14:40:00,2006-04-10 00:40:00,BAK,sb39, 10, 29.27 > > My goal was to extract all of the 10m data (all of the "10"'s from the > "sensor_depth_m" column) and then calculate daily averages. > > > With the help from this forum I came up with this: > > library(zoo) > Data=read.table("06_BottomMountThermistors.csv",sep=",",header=TRUE,as.is=TRUE) > Ten=subset(Data,sensor_depth_m==10L) > z=zoo(Ten$temperature_c,Ten$time_local) > Warning message: > In zoo(Ten$temperature_c, Ten$time_local) : > some methods for 'zoo' objects do not work if the index entries in > 'order.by' are not unique > ag=aggregate(z,as.Date,mean) > write.csv(ag,file="LTER_6_10m.csv") > > > Which works fine. I'm not sure why I get the error concerning unique > entries as all of the 10m "time local" data is sequential and thus unique.
> Certainly some of the "temperature_c" data is repeated, but my > understanding of a zoo object is that I have the "time_local" column set up > as the order.by index. > So any thoughts on the warning message and my understanding of zoo objects > would be appreciated. > > But the last task I have for this data is to average several of these data > sets together. My thoughts were to run the code as above for 6 different > sites (column 1 is the "site" index). I still think in Excel, so I was > planning on lining up all 6 sites in a spreadsheet so that the dates (daily > means from the above code) line up, and then just averaging the data like > so: > > > Date,Site1,Site2,Site3,Site4,Site5,Site6,Average > 2006-04-09,20,19,20,19,14,12,average(Site1-6) > 2006-04-10,12,13,14,15,16,12,average(Site1-6) > 2006-04-11,12,12,12,13,12,12,average(Site1-6) > 2006-04-12,12,13,13,12,12,12,average(Site1-6) > > > But I figure R can do this for me, so why bother going back to Excel when R > is turning out to be a way better way to work with this kind of data. I > tried using merge, but I don't think this is the right command. > So is there any way I can have R take 6 different data sets, line them up > by date, and pull a grand average by day for all 6 sites? > > The solution is basically only three lines: read.zoo, grep out the required columns and append the average. A fourth statement to clean up the column names would be optional. 1. The read.zoo command below reads all the files indicated by the glob, ignoring the "NULL" columns (colClasses=...), splits each by sensor_depth_m (split=2), takes only the Date part of the date/time (FUN=as.Date) and aggregates (aggregate=mean) using mean over the rows with the same Date. It then arranges the result, z, with daily rows that look like this so that there is one column for each sensor depth/file combination: > z 2.data1.txt 10.data1.txt 2.data2.txt 10.data2.txt 2006-04-09 29.591 29.4925 29.591 29.4925 2.
Then we reduce that to just the 10 columns using grep yielding z10 3. Append an overall average which looks like this: > z10 10.data1.txt 10.data2.txt Average 2006-04-09 29.4925 29.4925 29.4925 4. It would be possible to add an optional fourth statement which would be as follows in this sample case in order to get prettier column names after creating z10. The statement will vary depending on the precise form of your file names so this line will work with the sample here since we are using filenames of data1.txt and data2.txt but you may need to adjust it to your file names if they are substantially different. Here we look for column names that start with 10 followed by a dot [.] and then some characters .* and then a dot [.] and other characters .* and replace that with just the parenthesized portion \\1: colnames(z10) <- sub("10[.](.*)[.].*", "\\1", colnames(z10)) # generate test data Lines <- "site,time_local,time_utc,reef_type_code,sensor_type,sensor_depth_m,temperature_c 06,2006-04-09 10:20:00,2006-04-09 20:20:00,BAK,sb39, 2, 29.63 06,2006-04-09 10:40:00,2006-04-09 20:40:00,BAK,sb39, 2, 29.56 06,2006-04-09 11:00:00,2006-04-09 21:00:00,BAK,sb39, 2, 29.51 06,2006-04-09 11:20:00,2006-04-09 21:20:00,BAK,sb39, 10, 29.53 06,2006-04-09 11:40:00,2006-04-09 21:40:00,BAK,sb39, 2, 29.57 06,2006-04-09 12:00:00,2006-04-09 22:00:00,BAK,sb39, 2, 29.60 06,2006-04-09 12:20:00,2006-04-09 22:20:00,BAK,sb39, 2, 29.66 06,2006-04-09 12:40:00,2006-04-09 22:40:00,BAK,sb39, 2, 29.68 06,2006-04-09 13:00:00,2006-04-09 23:00:00,BAK,sb39, 10, 29.68 06,2006-04-09 13:20:00,2006-04-09 23:20:00,BAK,sb39, 2, 29.71 06,2006-04-09 13:40:00,2006-04-09 23:40:00,BAK,sb39, 2, 29.68 06,2006-04-09 14:00:00,2006-04-10 00:00:00,BAK,sb39, 10, 29.49 06,2006-04-09 14:20:00,2006-04-10 00:20:00,BAK,sb39, 2, 29.31 06,2006-04-09 14:40:00,2006-04-10 00:40:00,BAK,sb39, 10, 29.27" cat(Lines, "\n", file = "data1.txt") cat(Lines, "\n", file = "data2.txt") library(zoo) z <- read.zoo(Sys.glob("data*.txt"), header
= TRUE, sep = ",", split = 2, colClasses = c("NULL", NA, "NULL", "NULL", "NULL", NA, NA), FUN = as.Date, aggregate = mean) z10 <- z[, grep("^10[.]", colnames(z))] z10$Average <- rowMeans(z10) -- Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com From ligges at statistik.tu-dortmund.de Tue Nov 1 13:34:46 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Tue, 01 Nov 2011 13:34:46 +0100 Subject: [R] Significance of trend In-Reply-To: References: Message-ID: <4EAFE766.8030602@statistik.tu-dortmund.de> On 01.11.2011 09:47, Muhammad Rahiz wrote: > Thanks for those who replied. > > I know what a p-value is and the links given reaffirm my understanding. > The code below gives p=0.26. This is more than 0.05 - hence null > hypothesis i.e. attributed by chance. > > But I did not specify the significance level at 0.05. So I'm wondering > if any part of the code states so. Or does R define significance level > at 0.05 by default? That's why you have been told to read what a p-value is! It is independent of the level of significance (which is not true for critical values). Just compare. Uwe Ligges > > Thanks > From ligges at statistik.tu-dortmund.de Tue Nov 1 13:44:00 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Tue, 01 Nov 2011 13:44:00 +0100 Subject: [R] Nested lapply? Is this allowed? In-Reply-To: <1320135881.4481.YahooMailNeo@web120111.mail.ne1.yahoo.com> References: <1320135881.4481.YahooMailNeo@web120111.mail.ne1.yahoo.com> Message-ID: <4EAFE990.7070203@statistik.tu-dortmund.de> For short: 1. I wonder if you cannot make the two steps one and hence only need one lapply anyway. 2. Why not just try out? 3. If it does not work, ordinary loops may be able to. best, Uwe Ligges On 01.11.2011 09:24, Alaios wrote: > Dear all, > > I want for a given data set to call a function many times. That brings us all to the notion of lapply (or any other variant).
My problem is that this data set should also be created from another lapply > > As for example > > > > DataToAnalyse<- return_selected_time_interval(TimeStamps,Time[[1]], Time[[2]]) # Where Time[[1]] and Time[[2]] are lists > return(lapply(do_analysis(DataToAnalyse))) > > > as you can see the DataToAnalyse should be also an lapply. For the list of given Time (Time[[1]] and Time[[2]]) should create a list DataToAnalyse. > > My concern though is that the DataToAnalyse for all the Time is huge so I was looking for something memory efficient that can do something like that > > Create Data To Analyse-->feed it into do_analysis-->store result into a list(The result is small). Repeat that sentence for all Time > > How can I do something like that efficiently in R? > > B.R > Alex > [[alternative HTML version deleted]] > > > > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From JRadinger at gmx.at Tue Nov 1 13:52:18 2011 From: JRadinger at gmx.at (Johannes Radinger) Date: Tue, 01 Nov 2011 13:52:18 +0100 Subject: [R] Combine variables of different length Message-ID: <20111101125218.56010@gmx.net> Hi, I have got a dataset with the variables Y,X1,X2,X3. Some of these variables contain NAs. Therefore incomplete datasets aren't recognized when I am doing a regression like: model <- lm(Y~X1+X2+X3) so the resulting vector of resid(model) is obviously shorter than the original variables. How can I combine the residuals-vector with the original dataset (Y,Xi,...). I recognize that the residuals give also a kind of an index like from the row names (2,5,7,8,9,12)... Is it possible to use in such a case the data.frame command? Or what is the best way to attach the resulting residuals back to the original dataframe with incomplete datasets?
thanks /Johannes -- From marion.wenty at gmail.com Tue Nov 1 14:29:53 2011 From: marion.wenty at gmail.com (Marion Wenty) Date: Tue, 1 Nov 2011 14:29:53 +0100 Subject: [R] package descr: create weighted cross tabulation Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From rharlow86 at gmail.com Tue Nov 1 12:45:43 2011 From: rharlow86 at gmail.com (Bob) Date: Tue, 1 Nov 2011 04:45:43 -0700 (PDT) Subject: [R] Interesting Memory Management Problem (Windows) In-Reply-To: <1319718994147-3944243.post@n4.nabble.com> References: <1319718994147-3944243.post@n4.nabble.com> Message-ID: <1320147943572-3963242.post@n4.nabble.com> Just to close my issue: When changing the data types from xts to numeric, the problem went away and the speed of processing increased dramatically. (seems obvious in hindsight) -- View this message in context: http://r.789695.n4.nabble.com/Interesting-Memory-Management-Problem-Windows-tp3944243p3963242.html Sent from the R help mailing list archive at Nabble.com. From someguy235 at gmail.com Tue Nov 1 13:15:24 2011 From: someguy235 at gmail.com (ethan.shepherd) Date: Tue, 1 Nov 2011 05:15:24 -0700 (PDT) Subject: [R] weibull fitdistr problem: optimization failed In-Reply-To: References: <1319810526004-3947997.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From pmassicotte at hotmail.com Tue Nov 1 13:20:22 2011 From: pmassicotte at hotmail.com (Filoche) Date: Tue, 1 Nov 2011 05:20:22 -0700 (PDT) Subject: [R] Greek letter Message-ID: <1320150022095-3963311.post@n4.nabble.com> Hi everyone. I'm trying to use small letter phi in a graph produced in R. However, the small letter phi does not look as it should. In fact, it looks like this: http://r.789695.n4.nabble.com/file/n3963311/Untitled.png instead of what is here http://en.wikipedia.org/wiki/Phi Here's the code I use: expression(phi [1]) Anyone has an idea? 
With regards, Phil -- View this message in context: http://r.789695.n4.nabble.com/Greek-letter-tp3963311p3963311.html Sent from the R help mailing list archive at Nabble.com. From natalia.vizcaino.palomar at gmail.com Tue Nov 1 13:57:46 2011 From: natalia.vizcaino.palomar at gmail.com (=?ISO-8859-1?Q?Natalia_Vizca=EDno_Palomar?=) Date: Tue, 1 Nov 2011 12:57:46 +0000 Subject: [R] predict lmer Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From ligges at statistik.tu-dortmund.de Tue Nov 1 15:09:43 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Tue, 01 Nov 2011 15:09:43 +0100 Subject: [R] Greek letter In-Reply-To: <1320150022095-3963311.post@n4.nabble.com> References: <1320150022095-3963311.post@n4.nabble.com> Message-ID: <4EAFFDA7.3090803@statistik.tu-dortmund.de> On 01.11.2011 13:20, Filoche wrote: > Hi everyone. > > I'm trying to use small letter phi in a graph produced in R. However, the > small letter phi does not look as it should. > > In fact, it looks like this: > > http://r.789695.n4.nabble.com/file/n3963311/Untitled.png Yes, an excellent phi. > instead of what is here http://en.wikipedia.org/wiki/Phi > > Here's the code I use: > > expression(phi [1]) > > Anyone has an idea? Yes, read ?plotmath and try "phi1", if that better matches your taste. Uwe Ligges > With regards, > Phil > > > > > -- > View this message in context: http://r.789695.n4.nabble.com/Greek-letter-tp3963311p3963311.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From michael.weylandt at gmail.com Tue Nov 1 15:10:42 2011 From: michael.weylandt at gmail.com (R. 
Michael Weylandt) Date: Tue, 1 Nov 2011 10:10:42 -0400 Subject: [R] Greek letter In-Reply-To: <1320150022095-3963311.post@n4.nabble.com> References: <1320150022095-3963311.post@n4.nabble.com> Message-ID: Try changing phi to varphi. Michael On Tue, Nov 1, 2011 at 8:20 AM, Filoche wrote: > Hi everyone. > > I'm trying to use small letter phi in a graph produced in R. However, the > small letter phi does not look as it should. > > In fact, it looks like this: > > http://r.789695.n4.nabble.com/file/n3963311/Untitled.png > > instead of what is here http://en.wikipedia.org/wiki/Phi > > Here's the code I use: > > expression(phi [1]) > > Anyone has an idea? > > With regards, > Phil > > > > > -- > View this message in context: http://r.789695.n4.nabble.com/Greek-letter-tp3963311p3963311.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From pdalgd at gmail.com Tue Nov 1 15:15:51 2011 From: pdalgd at gmail.com (peter dalgaard) Date: Tue, 1 Nov 2011 15:15:51 +0100 Subject: [R] Greek letter In-Reply-To: <1320150022095-3963311.post@n4.nabble.com> References: <1320150022095-3963311.post@n4.nabble.com> Message-ID: On Nov 1, 2011, at 13:20 , Filoche wrote: > Hi everyone. > > I'm trying to use small letter phi in a graph produced in R. However, the > small letter phi does not look as it should. > > In fact, it looks like this: > > http://r.789695.n4.nabble.com/file/n3963311/Untitled.png > > instead of what is here http://en.wikipedia.org/wiki/Phi Actually, the former _is_ in the latter... However, you probably want TeX's \varphi. > > Here's the code I use: > > expression(phi [1]) > > Anyone has an idea? demo(plotmath) # 2nd page, 2nd column, 6th entry from end. 
-- Peter Dalgaard, Professor Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com From lists at revelle.net Tue Nov 1 15:56:12 2011 From: lists at revelle.net (William Revelle) Date: Tue, 1 Nov 2011 09:56:12 -0500 Subject: [R] Correlation Matrix in R In-Reply-To: <1320140630142-3962939.post@n4.nabble.com> References: <1319576996685-3938274.post@n4.nabble.com> <1319627025864-3940170.post@n4.nabble.com> <64059A92-D1AA-4D5A-94DB-C7B78B536EF6@mcmaster.ca> <1320140630142-3962939.post@n4.nabble.com> Message-ID: Alexandre, The output from corr.test is a list of matrices. To export one of those matrices, simply specify which one you want: Using the example from my previous note: > library(psych) > examp <- corr.test(sat.act) > mat.c.p <- lower.tri(examp$r)*examp$r + t(lower.tri(examp$p)*examp$p) > mat.c.p mat.cp is a matrix and can be directly written using write.table (if you want). To find out what are the elements of the list produced by corr.test, use the str command; str(examp) will produce List of 5 $ r : num [1:6, 1:6] 1 0.0873 -0.0209 -0.0365 -0.0188 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr [1:6] "gender" "education" "age" "ACT" ... .. ..$ : chr [1:6] "gender" "education" "age" "ACT" ... $ n : num [1:6, 1:6] 700 700 700 700 700 687 700 700 700 700 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr [1:6] "gender" "education" "age" "ACT" ... .. ..$ : chr [1:6] "gender" "education" "age" "ACT" ... $ t : num [1:6, 1:6] Inf 2.314 -0.551 -0.965 -0.498 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr [1:6] "gender" "education" "age" "ACT" ... .. ..$ : chr [1:6] "gender" "education" "age" "ACT" ... $ p : num [1:6, 1:6] 0 0.0209 0.5818 0.3349 0.6187 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr [1:6] "gender" "education" "age" "ACT" ... .. ..$ : chr [1:6] "gender" "education" "age" "ACT" ... 
$ Call: language corr.test(x = sat.act) - attr(*, "class")= chr [1:2] "psych" "corr.test" To export one of those as a text file you could just copy the output, or you can write.table one element. e.g., write.table(examp$r) Bill On Nov 1, 2011, at 4:43 AM, AlexC wrote: > Hello, > > Thank you for your replies. I cannot run the function rcor.test even when > having loaded package ltm. Perhaps it has to do with the fact that I am > using the latest version of R and this package wasn't created under that > version > > The function corr.test in package psych works fine. Is there anyway to > export the results in a txt or csv file? Since it isn't in a data frame > format it cannot simply be exported using write.table > > Alexandre > > -- > View this message in context: http://r.789695.n4.nabble.com/Correlation-Matrix-in-R-tp3938274p3962939.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > William Revelle http://personality-project.org/revelle.html Professor http://personality-project.org Department of Psychology http://www.wcas.northwestern.edu/psych/ Northwestern University http://www.northwestern.edu/ Use R for psychology http://personality-project.org/r It is 6 minutes to midnight http://www.thebulletin.org From marion.wenty at gmail.com Tue Nov 1 16:02:54 2011 From: marion.wenty at gmail.com (Marion Wenty) Date: Tue, 1 Nov 2011 16:02:54 +0100 Subject: [R] writing data from several matrices in R into one excel-file with several sheets In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From michael.weylandt at gmail.com Tue Nov 1 16:14:51 2011 From: michael.weylandt at gmail.com (R. 
Michael Weylandt) Date: Tue, 1 Nov 2011 11:14:51 -0400 Subject: [R] Combine variables of different length In-Reply-To: <20111101125218.56010@gmx.net> References: <20111101125218.56010@gmx.net> Message-ID: Perhaps something like this: X = jitter(1:10) Y = jitter(3*X-5, factor = 3) X[3] = NA m = lm(Y~X)$fitted.values fits <- rep(NA, length(X)); fits[as.numeric(names(m))] <- m; cbind(X,Y,fits) Michael On Tue, Nov 1, 2011 at 8:52 AM, Johannes Radinger wrote: > Hi, > > I have got a dataset with > the variables Y,X1,X2,X3. > Some of these variables contain NAs. Therefore > incomplete datasets aren't recognized when > I am doing a regression like: > > model <- lm(Y~X1+X2+X3) > > so the resulting vector of resid(model) is > obviousely shorter then the original variables. > > How can I combine the residuals-vector with > the original dataset (Y,Xi,...). I recognize > that the residuals give also a kind of an index > like from the row names (2,5,7,8,9,12)... > > Is it possible to use in such a case the > data.frame command? Or what is the best > way to attach the resulting residuals back to > the original dataframe with incomplete datasets? > > thanks > > /Johannes > -- > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From wendy2.qiao at gmail.com Tue Nov 1 16:22:17 2011 From: wendy2.qiao at gmail.com (Wendy) Date: Tue, 1 Nov 2011 08:22:17 -0700 (PDT) Subject: [R] annotate histogram Message-ID: <1320160937019-3963960.post@n4.nabble.com> Hi all, I want to make a histogram like the one show http://nar.oxfordjournals.org/content/39/suppl_1/D1011/F1.expansion.html here , but I did not figure out how to add the red marks at the bottom of the bars. Could anybody help? 
Thank you very much -- View this message in context: http://r.789695.n4.nabble.com/annotate-histogram-tp3963960p3963960.html Sent from the R help mailing list archive at Nabble.com. From cof at qualityexcellence.es Tue Nov 1 17:14:49 2011 From: cof at qualityexcellence.es (Carlos Ortega) Date: Tue, 1 Nov 2011 17:14:49 +0100 Subject: [R] annotate histogram In-Reply-To: <1320160937019-3963960.post@n4.nabble.com> References: <1320160937019-3963960.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dwinsemius at comcast.net Tue Nov 1 17:27:12 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Tue, 1 Nov 2011 12:27:12 -0400 Subject: [R] Vectorize 'eol' characters In-Reply-To: <5F844D2C953ACA48849C700A12752B940260432C@colhpaexc010.HPA.org.uk> References: <5F844D2C953ACA48849C700A12752B9402604329@colhpaexc010.HPA.org.uk> <5F844D2C953ACA48849C700A12752B940260432B@colhpaexc010.HPA.org.uk> <12A151C6-87DC-4F1A-BC36-A83CFBBD7F4E@comcast.net> <5F844D2C953ACA48849C700A12752B940260432C@colhpaexc010.HPA.org.uk> Message-ID: <17E0A716-0756-4D3C-A07E-F93392D6F56A@comcast.net> On Nov 1, 2011, at 5:18 AM, Stefano Conti wrote: > Dear David, > > My ultimate purpose is to generate a text file encoding a LaTeX > table for later inclusion in a report; while I'm aware of, and > familiar with, Sweave, such table would feature _some_ whole or > partial crossing horizontal lines (\hline or \cline with varying > arguments, which need to be placed after the tabular end-line mark '\ > \'), making it amenable to neither Sweave (at least as I understand > it) nor similar R functions (like Frank Harrel's latex command). > > Hence why I'd be bothering with differentiating end-of-line > characters: ideally I require in my write.table statement the > options sep="\t&\t" and eol="\\\\\n", yet with more flexibility > after the LaTeX newline command '\\'. 
> > In the meantime I've managed to resolve my problem, albeit not as > elegantly as I'd initially wished: I've replaced the last column of > my original R matrix with an edited (through appropriate use of the > paste function) version, which now incorporates all correct end-of- > line strings, and then dumped to file via write.table with > quote=FALSE, sep="\t&\t" and eol="\n". (Sounds like what I arrived at.) > > I'd be happy to stick with the above fix so long as I'm still > missing some better solution. With many thanks for the pointers so > far, all the best, Here is what I cobbled together to as a work-around to get rid of the separator before your pseudo-EOL. Seems that some parts of it might apply in your situation, but I'm not sure it's any better than what you have constructed. Perhaps it will give you further ideas about how to encapsulate behaviors you desire in a function: # Take a dataframe: dfrm <- data.frame(a=rnorm(5), b=rnorm(5), cc =paste("tt", 1:5) ) # You can remove the last separator (in this example a comma) before the # varying "eol string" (in this example "tt") as long as it is unique on each line. sub(',tt', 'tt', capture.output(write.table(dfrm , file="", sep=",", quote=FALSE)) ) # Then this can be written to file with: mod.df <- sub(',tt', 'tt', capture.output( write.table(dfrm , file="", sep=",", col.names=FALSE, quote=FALSE)) ) writeLines(mod.df, con=file("test.txt") ) (It will still have a regular "eol" == "\n" unless you change that it the writeLines call.) -- David. > > -- > Dr Stefano Conti > > -----Original Message----- > From: Comcast [mailto:dwinsemius at comcast.net] > Sent: Tue 01/11/2011 01:05 > To: Stefano Conti > Cc: Prof Brian Ripley; r-help at r-project.org > Subject: Re: [R] Vectorize 'eol' characters > > > > On Oct 31, 2011, at 2:01 PM, "Stefano Conti" > wrote: > >> Thanks to Dr Shepard and Prof Riply for their helpful replies. 
>> >> In my original query I should have also specified that I have tried >> the trick, also suggested by Prof Ripley, of appending the extra- >> column to the original matrix before dumping to text; however, in >> cases where the field separator string (argument of the 'sep' >> option in write.table) is non-null, I'd then have it also between >> the original matrix's last column and the appended text column -- >> which is not what I want. >> >> Any additional suggestion / follow-up on this? With continued >> thanks, >> > > Sometimes it would help to answer the question, "why bother?" > > I can imagine and have have even tested as a concept the possibility > of intercepting the output of capture.output(write.table(...)) so > that the last separator could be removed. Before posting any code I > would like to see if the effort would be on target and worth the > further effort. What separator are you thinking would be used and > how complex is this rolling eol??? > > (it's a bit like an actor saying to the director ... What's my > motivation?) > > -- > David. >> >> -- >> Dr Stefano Conti >> Statistics Unit (room #2A19) >> >> >> -----Original Message----- >> From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk] >> Sent: Mon 31/10/2011 16:45 >> To: Stefano Conti >> Cc: r-help at r-project.org >> Subject: Re: [R] Vectorize 'eol' characters >> >> On Mon, 31 Oct 2011, Stefano Conti wrote: >> >>> Dear R users, >>> >>> When dumping an R matrix object into a file -- typically via the >>> 'write.table' function -- the 'eol' option can be used to specify >>> the end-of-line character(s) which should appear at the end of each >>> row. >>> >>> However the argument to 'eol' seems to be restricted to have length >>> 1, whereas ideally I would like different rows to be written to file >>> each with its own end character string. For instance: >> >> That's not what 'eol' means. It is the indicator of the end of line, >> so of course it is the same for every line. 
>> >>>> test <- matrix(1:12, nrow=4); test >>> [,1] [,2] [,3] >>> [1,] 1 5 9 >>> [2,] 2 6 10 >>> [3,] 3 7 11 >>> [4,] 4 8 12 >>> >>>> write.table(test, file="test.txt", sep=" ", eol=paste(" test", >>>> 1:4, "\n", sep="")) >>> >>>> read.table(file="test.txt", sep=" ") >>> V1 V2 V3 test1 >>> 1 1 5 9 test1 >>> 2 2 6 10 test1 >>> 3 3 7 11 test1 >>> 4 4 8 12 test1 >>> >>> whereas I would like the last column of the dump file to be "test1", >>> "test2", "test3", "test4". Is there a way this could be achieved? >> >> Hmn, you said it: 'the last column'. >> Create what you want as the last column of your data frame: it wil >> then be written to the file as the last column. >> >> The author of write.table. (B. D. Ripley.) >> >>> With many thanks in advance for your help, kind regards, >>> >>> >>> -- David Winsemius, MD Heritage Laboratories West Hartford, CT From tom.fletcher.mp7e at statefarm.com Tue Nov 1 17:27:40 2011 From: tom.fletcher.mp7e at statefarm.com (Tom Fletcher) Date: Tue, 1 Nov 2011 16:27:40 +0000 Subject: [R] annotate histogram In-Reply-To: <1320160937019-3963960.post@n4.nabble.com> References: <1320160937019-3963960.post@n4.nabble.com> Message-ID: See rug() and use col=2 to get red. So, as an example ... x <- rchisq(100, df=2) hist(x) abline(v=median(x), lty=2) rug(x, col=2) TF -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Wendy Sent: Tuesday, November 01, 2011 10:22 AM To: r-help at r-project.org Subject: [R] annotate histogram Hi all, I want to make a histogram like the one show http://nar.oxfordjournals.org/content/39/suppl_1/D1011/F1.expansion.html here , but I did not figure out how to add the red marks at the bottom of the bars. Could anybody help? Thank you very much -- View this message in context: http://r.789695.n4.nabble.com/annotate-histogram-tp3963960p3963960.html Sent from the R help mailing list archive at Nabble.com. 
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. From amredd at gmail.com Tue Nov 1 17:33:57 2011 From: amredd at gmail.com (Andrew Redd) Date: Tue, 1 Nov 2011 10:33:57 -0600 Subject: [R] Sample size calculations for one sided binomial exact test Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From erskinepeter at hotmail.com Tue Nov 1 17:31:17 2011 From: erskinepeter at hotmail.com (Peter Erskine) Date: Tue, 1 Nov 2011 11:31:17 -0500 Subject: [R] Marketing Mix Model using PDL Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From loubn181 at gmail.com Tue Nov 1 17:32:25 2011 From: loubn181 at gmail.com (loubna181) Date: Tue, 1 Nov 2011 09:32:25 -0700 (PDT) Subject: [R] oversampling code In-Reply-To: <16D6EF48-C49E-4B48-95E3-D51EA0865198@comcast.net> References: <16D6EF48-C49E-4B48-95E3-D51EA0865198@comcast.net> Message-ID: <1320165145510-3964240.post@n4.nabble.com> Hi, Thanks all for your responses, but as I m a new user of R while trying to apply what David suggests I dont know what *"dorm" *refers to. dfrm[c(rownames(dfrm[*dorm*$Y==1,]), sample(rownames(dfrm[dfrm$Y==0]), 0.10)) , ] But to give you more details , I'm working on a table calles balance from UCI machine learning I do have a variable called class and takes 3 values : B, L and R. B represents 8% of the total and L and R 46% each one. The purpose is to have a data set with 10% of B, 40% of L and 40% of R. Thank u -- View this message in context: http://r.789695.n4.nabble.com/oversampling-code-tp3956664p3964240.html Sent from the R help mailing list archive at Nabble.com. From kevin.thorpe at utoronto.ca Tue Nov 1 17:59:14 2011 From: kevin.thorpe at utoronto.ca (Kevin E. 
Thorpe) Date: Tue, 01 Nov 2011 12:59:14 -0400 Subject: [R] oversampling code In-Reply-To: <1320165145510-3964240.post@n4.nabble.com> References: <16D6EF48-C49E-4B48-95E3-D51EA0865198@comcast.net> <1320165145510-3964240.post@n4.nabble.com> Message-ID: <4EB02562.4090009@utoronto.ca> On 11/01/2011 12:32 PM, loubna181 wrote: > Hi, > Thanks all for your responses, but as I m a new user of R while trying to > apply what David suggests I dont know what *"dorm" *refers to. > > dfrm[c(rownames(dfrm[*dorm*$Y==1,]), sample(rownames(dfrm[dfrm$Y==0]), > 0.10)) , ] I suspect that dorm was a typo and that dfrm is what was meant. > > But to give you more details , I'm working on a table calles balance from > UCI machine learning > I do have a variable called class and takes 3 values : B, L and R. > B represents 8% of the total and L and R 46% each one. > The purpose is to have a data set with 10% of B, 40% of L and 40% of R. > Thank u > > -- > View this message in context: http://r.789695.n4.nabble.com/oversampling-code-tp3956664p3964240.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Kevin E. Thorpe Biostatistician/Trialist, Applied Health Research Centre (AHRC) Li Ka Shing Knowledge Institute of St. Michael's Assistant Professor, Dalla Lana School of Public Health University of Toronto email: kevin.thorpe at utoronto.ca Tel: 416.864.5776 Fax: 416.864.3016 From dwinsemius at comcast.net Tue Nov 1 18:59:07 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Tue, 1 Nov 2011 13:59:07 -0400 Subject: [R] triangles point left, filled? 
In-Reply-To: <239734346.246556.1320138339838.JavaMail.apache@mail21.abv.bg> References: <239734346.246556.1320138339838.JavaMail.apache@mail21.abv.bg> Message-ID: <408B81B0-855B-46BD-98BE-4FC7676C6934@comcast.net>

On Nov 1, 2011, at 5:05 AM, Martin Ivanov wrote:
> Dear R users,
>
> I want to plot not only triangles point up and triangles point down,
> which is easy using the "pch" argument to "points". I want to plot
> left and right pointing triangles as well. They must be fillable
> with colour.
>
> I browsed a little in the documentation, tried rotating the up and
> down pointing triangles, but of no avail. Any suggestions will be
> appreciated.
> plot(1:10)
> polygon(x=c(1.5,1.6,1.6), y=c(1.5,1.6,1.4))
> polygon(x=c(1.5,1.4,1.4)+1, y=c(1.5,1.6,1.4)+1)
> polygon(x=c(1.5,1.7,1.7)+1, y=c(1.5,1.7,1.3)+2, col="red")
>
> Regards,
> Martin Ivanov
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

From michael.weylandt at gmail.com Tue Nov 1 19:18:39 2011 From: michael.weylandt at gmail.com (R. Michael Weylandt) Date: Tue, 1 Nov 2011 14:18:39 -0400 Subject: [R] help with means using tail() In-Reply-To: <1319646816.92077.YahooMailNeo@web161908.mail.bf1.yahoo.com> References: <1319646816.92077.YahooMailNeo@web161908.mail.bf1.yahoo.com> Message-ID:

I think a lag() command will work for you -- but a more helpful piece of advice is probably to use xts or zoo packages rather than the base ts. Most people find ts very hard to work with and find that all their difficulties magically go away when they use those classes.
In xts it would be something like:

mean(tail(rp, 9))
mean(tail(lag(rp), 9))

Michael

On Wed, Oct 26, 2011 at 12:33 PM, Iara Faria wrote:
> Hi all,
>
> I have 5 series (5 ts objects: rp, igpm, ereal, jurosreal, crescpib), and want to create a vector with the means of the last values of each variable.
> What I did was this:
>
> mrp1<-mean(tail(rp,9))
> migpm1<-mean(tail(igpm,9))
> mereal1<-mean(tail(ereal,9))
> mjr1<-mean(tail(jurosreal,9))
> mcp1<-mean(tail(crescpib,9))
> means=rbind(mrp1,migpm1,mereal1,mjr1,mcp1)
>
> They are monthly series, from 1995.1 to 2011.6.
> So what I did was generate the mean of each variable for [2010.10 to 2011.6] (9 months, as I wanted).
> But now I want to create a vector with the means of the last 9 values [2010.10 to 2011.6] AND the means of 9 months but shifted by one month, that is, [2010.9 to 2011.5].
>
> I tried to find examples of this but with no help.
> Can anyone give a hand?
>
> Thanks in advance.
> Regards,
> Iara
> [[alternative HTML version deleted]]
>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>

From Stefano.Conti at hpa.org.uk Tue Nov 1 18:08:21 2011 From: Stefano.Conti at hpa.org.uk (Stefano Conti) Date: Tue, 1 Nov 2011 17:08:21 -0000 Subject: [R] Vectorize 'eol' characters References: <5F844D2C953ACA48849C700A12752B9402604329@colhpaexc010.HPA.org.uk> <5F844D2C953ACA48849C700A12752B940260432B@colhpaexc010.HPA.org.uk> <12A151C6-87DC-4F1A-BC36-A83CFBBD7F4E@comcast.net> <5F844D2C953ACA48849C700A12752B940260432C@colhpaexc010.HPA.org.uk> <17E0A716-0756-4D3C-A07E-F93392D6F56A@comcast.net> Message-ID: <5F844D2C953ACA48849C700A12752B9402604335@colhpaexc010.HPA.org.uk>

Dear David, Thank you for your follow-up and extra-suggestion.
Your additional stab at my problem indeed looks like it'd work too; as I previously wrote, I had already devised a work-around but was nonetheless left wondering whether a more elegant and compact solution was still escaping my knowledge. Thank you as well to all R-helpers who've provided me with feedback and insights; all the best, -- Dr Stefano Conti Statistics Unit (room #2A19) Health Protection Services HPA Colindale 61 Colindale Avenue London NW9 5EQ, UK tel: +44 (0)208-3277825 fax: +44 (0)208-2007868
From wendy2.qiao at gmail.com Tue Nov 1 18:07:10 2011 From: wendy2.qiao at gmail.com (Wendy) Date: Tue, 1 Nov 2011 10:07:10 -0700 (PDT) Subject: [R] round up a number to 10^4 Message-ID: <1320167230550-3964394.post@n4.nabble.com> Hi all, I have a list of numbers, e.g., X = c(60593.23, 71631.17, 75320.1), and want to round them so the output is Y = c(60000, 80000, 80000). I tried Y<-round(X,-4), but it gives me Y = c(60000, 70000, 80000). Do anybody know how to round up a number to 10^4? Thank you in advance. Wendy -- View this message in context: http://r.789695.n4.nabble.com/round-up-a-number-to-10-4-tp3964394p3964394.html Sent from the R help mailing list archive at Nabble.com.
From godina at Dal.Ca Tue Nov 1 18:16:23 2011 From: godina at Dal.Ca (Aurelie Cosandey Godin) Date: Tue, 1 Nov 2011 14:16:23 -0300 Subject: [R] Removal/selecting specific rows in a dataframe conditional on 2 columns Message-ID: <7F02926F-A8B4-42B8-89F1-864CE20B5648@dal.ca> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From rpeak.99 at gmail.com Tue Nov 1 19:37:29 2011 From: rpeak.99 at gmail.com (peak99) Date: Tue, 1 Nov 2011 11:37:29 -0700 (PDT) Subject: [R] help with unequal variances Message-ID: <1320172649279-3964779.post@n4.nabble.com> Hello, I have some patient data for my masters thesis with three groups (n=16, 19 & 20) I have completed compiling the results of 7 tests, for which one of these tests the variances are unequal. I wish to perform an ANOVA between the three groups but for the one test with unequal variance (<0.001 by both bartlett and levene's test) I am not sure what to do. I thought i would run ANOVA with bonferonni post-test for groups with equal variances, then for the test with unequal variance i would use the welch correction and games-howell post-test. Does this sound reasonable? Someone has also recommended to me to use Kruskal-wallis ANOVA, then use Wilcoxon sign rank test pairwise to determine which groups are significantly different (ON ALL DATA, both equal and unequal variance tests). I don't think this is right, for two reasons: 1) Kruskal-wallis is for non-gaussian data, and i have no reason to believe they are not normal. - I have run normality tests which say they are normal, although perhaps my sample sizes are too small for a normality test? 2) i believe running pairwise Wilcoxon sign rank test is not acceptable unless there is a post-test correction for multiple comparisons (i am not aware of one); also on the wiki page for this test one of the assumptions says "Under the null hypothesis the distributions of both groups are equal" which i read to say that the variances must be equal. 
So I think there recommendations were based more on sample size and normality, and not my issue with variance? Ultimately i would like to know if i am going about this right with my deduction (ANOVA/Bonferonni of the test results, but welch correction and games-howell for the test with significantly different variances). and if not why and/or what you think is a better option. most appreciated to any help received! -- View this message in context: http://r.789695.n4.nabble.com/help-with-unequal-variances-tp3964779p3964779.html Sent from the R help mailing list archive at Nabble.com. From bps0002 at auburn.edu Tue Nov 1 20:39:04 2011 From: bps0002 at auburn.edu (B77S) Date: Tue, 1 Nov 2011 12:39:04 -0700 (PDT) Subject: [R] help with unequal variances In-Reply-To: <1320172649279-3964779.post@n4.nabble.com> References: <1320172649279-3964779.post@n4.nabble.com> Message-ID: <1320176344008-3965046.post@n4.nabble.com> the following is a more appropriate forum for your question, seeing as this has nothing to do with R (per se). http://stats.stackexchange.com/questions good luck. peak99 wrote: > > Hello, > > I have some patient data for my masters thesis with three groups (n=16, 19 > & 20) > > I have completed compiling the results of 7 tests, for which one of these > tests the variances are unequal. > > > I wish to perform an ANOVA between the three groups but for the one test > with unequal variance (<0.001 by both bartlett and levene's test) I am not > sure what to do. > > I thought i would run ANOVA with bonferonni post-test for groups with > equal variances, then for the test with unequal variance i would use the > welch correction and games-howell post-test. Does this sound reasonable? > > Someone has also recommended to me to use Kruskal-wallis ANOVA, then use > Wilcoxon sign rank test pairwise to determine which groups are > significantly different (ON ALL DATA, both equal and unequal variance > tests). 
I don't think this is right, for two reasons: > > 1) Kruskal-wallis is for non-gaussian data, and i have no reason to > believe they are not normal. > - I have run normality tests which say they are normal, although > perhaps my sample sizes are too small for a normality test? > 2) i believe running pairwise Wilcoxon sign rank test is not acceptable > unless there is a post-test correction for multiple comparisons (i am not > aware of one); also on the wiki page for this test one of the assumptions > says "Under the null hypothesis the distributions of both groups are > equal" which i read to say that the variances must be equal. > > So I think there recommendations were based more on sample size and > normality, and not my issue with variance? > > > Ultimately i would like to know if i am going about this right with my > deduction (ANOVA/Bonferonni of the test results, but welch correction and > games-howell for the test with significantly different variances). and if > not why and/or what you think is a better option. > > most appreciated to any help received! > -- View this message in context: http://r.789695.n4.nabble.com/help-with-unequal-variances-tp3964779p3965046.html Sent from the R help mailing list archive at Nabble.com. From dwinsemius at comcast.net Tue Nov 1 20:42:10 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Tue, 1 Nov 2011 15:42:10 -0400 Subject: [R] round up a number to 10^4 In-Reply-To: <1320167230550-3964394.post@n4.nabble.com> References: <1320167230550-3964394.post@n4.nabble.com> Message-ID: <9E8C9CA0-22EE-4C0A-AB64-D83F839897FF@comcast.net> On Nov 1, 2011, at 1:07 PM, Wendy wrote: > Hi all, > > I have a list of numbers, e.g., X = c(60593.23, 71631.17, 75320.1), > and want > to round them so the output is Y = c(60000, 80000, 80000). Under what notion of "rounding" would that be the result? > I tried > Y<-round(X,-4), but it gives me Y = c(60000, 70000, 80000). Do > anybody know > how to round up a number to 10^4? 
> > Thank you in advance. -- David Winsemius, MD Heritage Laboratories West Hartford, CT From michael.weylandt at gmail.com Tue Nov 1 20:48:42 2011 From: michael.weylandt at gmail.com (R. Michael Weylandt ) Date: Tue, 1 Nov 2011 15:48:42 -0400 Subject: [R] round up a number to 10^4 In-Reply-To: <9E8C9CA0-22EE-4C0A-AB64-D83F839897FF@comcast.net> References: <1320167230550-3964394.post@n4.nabble.com> <9E8C9CA0-22EE-4C0A-AB64-D83F839897FF@comcast.net> Message-ID: <29992AF2-412A-4E6A-8CA1-1107C01712FD@gmail.com> Could you divide by your desired order of magnitude, use ceiling and then re-multiply? Michael On Nov 1, 2011, at 3:42 PM, David Winsemius wrote: > > On Nov 1, 2011, at 1:07 PM, Wendy wrote: > >> Hi all, >> >> I have a list of numbers, e.g., X = c(60593.23, 71631.17, 75320.1), and want >> to round them so the output is Y = c(60000, 80000, 80000). > > Under what notion of "rounding" would that be the result? > >> I tried >> Y<-round(X,-4), but it gives me Y = c(60000, 70000, 80000). Do anybody know >> how to round up a number to 10^4? >> >> Thank you in advance. > > > -- > David Winsemius, MD > Heritage Laboratories > West Hartford, CT > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From clint at ecy.wa.gov Tue Nov 1 20:55:18 2011 From: clint at ecy.wa.gov (Clint Bowman) Date: Tue, 1 Nov 2011 12:55:18 -0700 (PDT) Subject: [R] round up a number to 10^4 In-Reply-To: <9E8C9CA0-22EE-4C0A-AB64-D83F839897FF@comcast.net> References: <1320167230550-3964394.post@n4.nabble.com> <9E8C9CA0-22EE-4C0A-AB64-D83F839897FF@comcast.net> Message-ID: Or does the middle number have two digits switched? 76131.17 would round up to 80000 very nicely. 
--
Clint Bowman			INTERNET:	clint at ecy.wa.gov
Air Quality Modeler		INTERNET:	clint at math.utah.edu
Department of Ecology		VOICE:		(360) 407-6815
PO Box 47600			FAX:		(360) 407-7534
Olympia, WA 98504-7600
USPS: PO Box 47600, Olympia, WA 98504-7600
Parcels: 300 Desmond Drive, Lacey, WA 98503-1274

On Tue, 1 Nov 2011, David Winsemius wrote:
>
> On Nov 1, 2011, at 1:07 PM, Wendy wrote:
>
>> Hi all,
>>
>> I have a list of numbers, e.g., X = c(60593.23, 71631.17, 75320.1), and
>> want
>> to round them so the output is Y = c(60000, 80000, 80000).
>
> Under what notion of "rounding" would that be the result?
>
>> I tried
>> Y<-round(X,-4), but it gives me Y = c(60000, 70000, 80000). Do anybody know
>> how to round up a number to 10^4?
>>
>> Thank you in advance.
>
>
>

From gunter.berton at gene.com Tue Nov 1 21:13:02 2011 From: gunter.berton at gene.com (Bert Gunter) Date: Tue, 1 Nov 2011 13:13:02 -0700 Subject: [R] round up a number to 10^4 In-Reply-To: <9E8C9CA0-22EE-4C0A-AB64-D83F839897FF@comcast.net> References: <1320167230550-3964394.post@n4.nabble.com> <9E8C9CA0-22EE-4C0A-AB64-D83F839897FF@comcast.net> Message-ID:

An embedded and charset-unspecified text was scrubbed... Name: not available URL:

From jholtman at gmail.com Tue Nov 1 21:38:26 2011 From: jholtman at gmail.com (jim holtman) Date: Tue, 1 Nov 2011 16:38:26 -0400 Subject: [R] round up a number to 10^4 In-Reply-To: References: <1320167230550-3964394.post@n4.nabble.com> <9E8C9CA0-22EE-4C0A-AB64-D83F839897FF@comcast.net> Message-ID:

Also be aware of the IEEE standard of rounding to even:

> round(61000, -4)
[1] 60000
> round(62000, -4)
[1] 60000
> round(65000, -4)
[1] 60000
> round(66000, -4)
[1] 70000
> round(76000, -4)
[1] 80000
> round(75000, -4)
[1] 80000
>

notice what 65000 rounds to and what 75000 rounds to.

On Tue, Nov 1, 2011 at 4:13 PM, Bert Gunter wrote:
> Yes, I agree with David that this looks like an error.
> > However, for fun, one might ask: what is the fewest number of R elementary > math operations that would produce such a result -- this might be good for > clever 6th or 7th graders, for example. > > For here, I leave this as an exercise for the reader. > > -- Bert > > On Tue, Nov 1, 2011 at 12:42 PM, David Winsemius wrote: > >> >> On Nov 1, 2011, at 1:07 PM, Wendy wrote: >> >> Hi all, >>> >>> I have a list of numbers, e.g., X = c(60593.23, 71631.17, 75320.1), and >>> want >>> to round them so the output is Y = c(60000, 80000, 80000). >>> >> >> Under what notion of "rounding" would that be the result? >> >> I tried >>> Y<-round(X,-4), but it gives me Y = c(60000, 70000, 80000). Do anybody >>> know >>> how to round up a number to 10^4? >>> >>> Thank you in advance. >>> >> >> >> -- >> David Winsemius, MD >> Heritage Laboratories >> West Hartford, CT >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > > > -- > > Bert Gunter > Genentech Nonclinical Biostatistics > > Internal Contact Info: > Phone: 467-7374 > Website: > http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it.
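A minimal sketch of the divide, ceiling, re-multiply idea from earlier in this thread, assuming "round up" means "go to the next multiple of 10^4". The helper name round_up_to is made up for illustration and is not a base R function. None of the three variants below reproduces the requested Y = c(60000, 80000, 80000), which fits the suspicion that the example contains a typo.

```r
X <- c(60593.23, 71631.17, 75320.1)

# Hypothetical helper: round x UP to the next multiple of `unit`.
round_up_to <- function(x, unit = 1e4) {
  ceiling(x / unit) * unit
}

round_up_to(X)          # 70000 80000 80000  (always up)
round(X, -4)            # 60000 70000 80000  (nearest multiple)
floor(X / 1e4) * 1e4    # 60000 70000 70000  (always down)
```

If the middle value really was 76131.17, as suggested in the thread, plain round(76131.17, -4) already gives 80000.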
From NordlDJ at dshs.wa.gov Tue Nov 1 21:57:17 2011 From: NordlDJ at dshs.wa.gov (Nordlund, Dan (DSHS/RDA)) Date: Tue, 1 Nov 2011 13:57:17 -0700 Subject: [R] round up a number to 10^4 In-Reply-To: References: <1320167230550-3964394.post@n4.nabble.com> <9E8C9CA0-22EE-4C0A-AB64-D83F839897FF@comcast.net> Message-ID: <941871A13165C2418EC144ACB212BDB0021B1B96@dshsmxoly1504g.dshs.wa.lcl> Bert, How do you define "elementary"? And, do we play this like "Name That Tune"? "Bert, I can do that calculation in X operations." Or, maybe like Jeopardy, "What is the number X?" Or maybe we could play "Are You Smarter Than a Seventh-Grader?" I'm just asking. Dan Daniel J. Nordlund Washington State Department of Social and Health Services Planning, Performance, and Accountability Research and Data Analysis Division Olympia, WA 98504-5204 > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r- > project.org] On Behalf Of Bert Gunter > Sent: Tuesday, November 01, 2011 1:13 PM > To: David Winsemius > Cc: r-help at r-project.org; Wendy > Subject: Re: [R] round up a number to 10^4 > > Yes, I agree with David that this looks like an error. > > However, for fun, one might ask: what is the fewest number of R > elementary > math operations that would produce such a result -- this might be good > for > clever 6th or 7th graders, for example. > > For here, I leave this as an exercise for the reader. > > -- Bert > > On Tue, Nov 1, 2011 at 12:42 PM, David Winsemius > wrote: > > > > > On Nov 1, 2011, at 1:07 PM, Wendy wrote: > > > > Hi all, > >> > >> I have a list of numbers, e.g., X = c(60593.23, 71631.17, 75320.1), > and > >> want > >> to round them so the output is Y = c(60000, 80000, 80000). > >> > > > > Under what notion of "rounding" would that be the result? > > > > I tried > >> Y<-round(X,-4), but it gives me Y = c(60000, 70000, 80000). Do > anybody > >> know > >> how to round up a number to 10^4? > >> > >> Thank you in advance. 
> >> > > > > > > -- > > David Winsemius, MD > > Heritage Laboratories > > West Hartford, CT > > > > ______________________________**________________ > > R-help at r-project.org mailing list > > https://stat.ethz.ch/mailman/**listinfo/r- > help > > PLEASE do read the posting guide http://www.R-project.org/** > > posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > > > > -- > > Bert Gunter > Genentech Nonclinical Biostatistics > > Internal Contact Info: > Phone: 467-7374 > Website: > http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb- > biostatistics/pdb-ncb-home.htm > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. From michael.weylandt at gmail.com Tue Nov 1 22:16:00 2011 From: michael.weylandt at gmail.com (R. Michael Weylandt) Date: Tue, 1 Nov 2011 17:16:00 -0400 Subject: [R] Removal/selecting specific rows in a dataframe conditional on 2 columns In-Reply-To: <7F02926F-A8B4-42B8-89F1-864CE20B5648@dal.ca> References: <7F02926F-A8B4-42B8-89F1-864CE20B5648@dal.ca> Message-ID: Perhaps use tapply() to split by the survey unit and write a little identity function that returns only those rows you want, then patch them all back together with something like simplify2array(). Michael On Tue, Nov 1, 2011 at 1:16 PM, Aurelie Cosandey Godin wrote: > Dear list, > > After reading different mails, blogs, and tried a few different codes without any success, I am asking your help! > I have the following data frame where each row represent a survey unit with the following variables: > >> names(RV09) > ?[1] "record.t" ?"trip" ? ? ?"set" ? ? ? "month" ? ? "stratum" ? "NAFO" > ?[7] "unit.area" "time" ? ? ?"dur.set" ? 
"distance"  "operation" "mean.d" > [13] "min.d"     "max.d"     "temp.d"    "slat"      "slong"     "spp" > [19] "number"    "weight"    "elat"      "elong" > > Each survey unit generates one set record, denoted by a 5 in column "record.t". Each species identified in this particular survey unit generates an additional set record, denoted by a 6. > >> unique(RV09$record.t) > [1] 5 6 > > Each survey unit is identified by a specific "trip" and "set" number, so if there is a 5 record type with no associated 6 records, it means that no species were observed in that survey unit. I would like to be able to select all and only these survey units, which represent my zeros. > > So as an example, in this trip number 913, sets 1, 3, and 4 would be part of my "zeros" data.frame as they appear with no record.t 6, such that no species were observed in these survey units. > >> head(RV09) >     record.t trip set month stratum NAFO unit.area time dur.set distance > 585        5  913   1    10     351   3O       R31 1044      17        9 > 586        5  913   2    10     351   3O       R31 1440      17        9 > 587        6  913   2    10     351   3O       R31 1440      17        9 > 588        5  913   3    10     340   3O       Q31 1800      18        9 > 589        5  913   4    10     340   3O       Q32 2142      17        9 > > Any tips on how to extract this "zero" data.frame in R? > Thank you very much in advance! > > Best, > ~Aurelie > > > Aurelie Cosandey-Godin > Ph.D. student, Department of Biology > Industrial Graduate Fellow, WWF-Canada > Dalhousie University | Email: godina at dal.ca > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.
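One possible base R sketch for the "zeros" question, offered as an illustration under assumptions rather than as the approach suggested in this reply: count the type-6 records within each trip/set unit and keep the units where that count is zero. The toy data frame copies only record.t, trip and set from the quoted head(RV09) output; the names n_species and zeros are made up here.

```r
# Toy stand-in for RV09 with only the columns needed for the selection.
RV09 <- data.frame(
  record.t = c(5, 5, 6, 5, 5),
  trip     = c(913, 913, 913, 913, 913),
  set      = c(1, 2, 2, 3, 4)
)

# For every row, the number of species (type-6) records in its trip/set unit.
n_species <- ave(as.integer(RV09$record.t == 6),
                 RV09$trip, RV09$set, FUN = sum)

# Units with no species records: only their type-5 row remains.
zeros <- RV09[n_species == 0, ]
zeros$set    # 1 3 4
```

On the full data the same two lines should apply unchanged, since ave() groups on the trip and set columns whatever other columns are present.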
> From Stefan.Schreiber at ales.ualberta.ca Tue Nov 1 22:28:47 2011 From: Stefan.Schreiber at ales.ualberta.ca (Schreiber, Stefan) Date: Tue, 1 Nov 2011 15:28:47 -0600 Subject: [R] factor level issue after subsetting Message-ID: <70F02259E17B6242B15D81E58EB7EB11064C97C0@afhe-ex.afhe.ualberta.ca> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From saschaview at gmail.com Tue Nov 1 22:29:53 2011 From: saschaview at gmail.com (Sascha Vieweg) Date: Tue, 1 Nov 2011 22:29:53 +0100 (CET) Subject: [R] Imputing Missing Data: A Good Starting Point? Message-ID: Hello I am working on my first attempt to impute missing data of a data set with systematically incomplete answers (school performance tests). I was googling around for some information and found Amelia (Honaker et al., 2010) and the mi package (Yu-Sung et al., n.d.). However, since I am new to this field, I was wondering whether some experts could give a good recommendation of a starting point for me, that is a point that combines theory as well as practical examples. Of course, My primary interest is to complete the task in time (1 week), however, I want to acquire skills for a program that provides some future, and of course I want some background on what I am doing (and what not). Could you help with some hints, experiences, and recommendations? Thank you. Regards *S* -- Sascha Vieweg, saschaview at gmail.com From murdoch.duncan at gmail.com Tue Nov 1 21:03:48 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Tue, 1 Nov 2011 16:03:48 -0400 Subject: [R] Windows binary of 2.14.0 Message-ID: <4EB050A4.2080803@gmail.com> To all Windows users: The binary build of 2.14.0 that was uploaded yesterday was missing Cairo support. I have rebuilt it, and uploaded a new copy. You can tell which one you have by running "svg()", which works on the new one, but not the old one. 
You can tell which one is on your CRAN mirror by looking at the last changed date: it is today (November 1) for the corrected build. Sorry for the inconvenience.... Duncan Murdoch _______________________________________________ R-announce at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-announce From vicvoncastle at gmail.com Tue Nov 1 22:48:14 2011 From: vicvoncastle at gmail.com (Ken) Date: Tue, 1 Nov 2011 17:48:14 -0400 Subject: [R] Imputing Missing Data: A Good Starting Point? In-Reply-To: References: Message-ID: <119D0C77-CA40-47BD-BDE1-FF8267C0087E@gmail.com> Hope this helps: http://rss.acs.unt.edu/Rdoc/library/randomForest/html/rfImpute.html Ken Hutchison On Nov 1, 2554 BE, at 5:29 PM, Sascha Vieweg wrote: > Hello > > I am working on my first attempt to impute missing data of a data set with systematically incomplete answers (school performance tests). I was googling around for some information and found Amelia (Honaker et al., 2010) and the mi package (Yu-Sung et al., n.d.). However, since I am new to this field, I was wondering whether some experts could give a good recommendation of a starting point for me, that is a point that combines theory as well as practical examples. Of course, My primary interest is to complete the task in time (1 week), however, I want to acquire skills for a program that provides some future, and of course I want some background on what I am doing (and what not). Could you help with some hints, experiences, and recommendations? > > Thank you. > > Regards > *S* > > -- > Sascha Vieweg, saschaview at gmail.com > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
From NordlDJ at dshs.wa.gov Tue Nov 1 22:51:58 2011 From: NordlDJ at dshs.wa.gov (Nordlund, Dan (DSHS/RDA)) Date: Tue, 1 Nov 2011 14:51:58 -0700 Subject: [R] factor level issue after subsetting In-Reply-To: <70F02259E17B6242B15D81E58EB7EB11064C97C0@afhe-ex.afhe.ualberta.ca> References: <70F02259E17B6242B15D81E58EB7EB11064C97C0@afhe-ex.afhe.ualberta.ca> Message-ID: <941871A13165C2418EC144ACB212BDB0021B1BC0@dshsmxoly1504g.dshs.wa.lcl> > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r- > project.org] On Behalf Of Schreiber, Stefan > Sent: Tuesday, November 01, 2011 2:29 PM > To: r-help at r-project.org > Subject: [R] factor level issue after subsetting > > Dear list, > > I cannot figure out why, after sub-setting my data, that particular > item > which I don't want to plot is still in the newly created subset (please > see example below). R somehow remembers what was in the original data > set. That is the nature of factors. Once created, unused levels must be explicitly dropped: plot(droplevels(dat.sub$treat),dat.sub$yield) Hope this is helpful, Dan Daniel J. Nordlund Washington State Department of Social and Health Services Planning, Performance, and Accountability Research and Data Analysis Division Olympia, WA 98504-5204 > A work around is exporting and importing the new subset. Then it's > all fine; but I don't like this idea and was wondering what am I > missing > here? > > Thanks! > Stefan > > P.S. I am using R 2.13.2 for Mac.
> > > dat<-read.csv("~/MyFiles/data.csv") > > class(dat$treat) > [1] "factor" > > dat > treat yield > 1 cont 98.7 > 2 cont 97.2 > 3 cont 96.1 > 4 cont 98.1 > 5 10 103.0 > 6 10 101.3 > 7 10 102.1 > 8 10 101.9 > 9 30 121.1 > 10 30 123.1 > 11 30 119.7 > 12 30 118.9 > 13 60 109.9 > 14 60 110.1 > 15 60 113.1 > 16 60 112.3 > > plot(dat$treat,dat$yield) > > dat.sub<-dat[which(dat$treat!='cont')] > > class(dat.sub$treat) > [1] "factor" > > dat.sub > treat yield > 5 10 103.0 > 6 10 101.3 > 7 10 102.1 > 8 10 101.9 > 9 30 121.1 > 10 30 123.1 > 11 30 119.7 > 12 30 118.9 > 13 60 109.9 > 14 60 110.1 > 15 60 113.1 > 16 60 112.3 > > plot(dat.sub$treat,dat.sub$yield) > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. From jtor14 at gmail.com Tue Nov 1 22:52:53 2011 From: jtor14 at gmail.com (Justin Haynes) Date: Tue, 1 Nov 2011 14:52:53 -0700 Subject: [R] factor level issue after subsetting In-Reply-To: <70F02259E17B6242B15D81E58EB7EB11064C97C0@afhe-ex.afhe.ualberta.ca> References: <70F02259E17B6242B15D81E58EB7EB11064C97C0@afhe-ex.afhe.ualberta.ca> Message-ID: first of all, the subsetting line is overly complicated. dat.sub<-dat[dat$treat!='cont',] will work just fine. R does exactly what you're describing. It knows the levels of the factor. Once you remove 'cont' from the data, that doesn't mean that the level is removed from the factor: > df<-data.frame(let=factor(sample(letters[1:5],100,replace=T)),num=rnorm(100)) > str(df) 'data.frame': 100 obs. of 2 variables: $ let: Factor w/ 5 levels "a","b","c","d",..: 1 5 1 4 3 5 2 2 1 3 ... $ num: num 0.224 -0.523 0.974 -0.268 -0.61 ... > df.sub<-df[df$let!='a',] > str(df.sub) 'data.frame': 82 obs. 
of 2 variables: $ let: Factor w/ 5 levels "a","b","c","d",..: 5 4 3 5 2 2 3 3 5 3 ... $ num: num -0.523 -0.268 -0.61 -1.383 -0.193 ... > unique(df.sub$let) [1] e d c b Levels: a b c d e > df.sub$let<-factor(df.sub$let) > unique(df.sub$let) [1] e d c b Levels: b c d e > str(df.sub$let) Factor w/ 4 levels "b","c","d","e": 4 3 2 4 1 1 2 2 4 2 ... > by redefining your factor you can eliminate the problem. The other option, if you don't want factors to begin with, is: options(stringsAsFactors=FALSE) # to set the global option or dat<-read.csv("~/MyFiles/data.csv",stringsAsFactors=FALSE) # to set the option locally for this single read.csv call. On Tue, Nov 1, 2011 at 2:28 PM, Schreiber, Stefan wrote: > Dear list, > > I cannot figure out why, after sub-setting my data, that particular item > which I don't want to plot is still in the newly created subset (please > see example below). R somehow remembers what was in the original data > set. A work around is exporting and importing the new subset. Then it's > all fine; but I don't like this idea and was wondering what am I missing > here? > > Thanks! > Stefan > > P.S. I am using R 2.13.2 for Mac. > >> dat<-read.csv("~/MyFiles/data.csv") >> class(dat$treat) > [1] "factor" >> dat >    treat yield > 1   cont  98.7 > 2   cont  97.2 > 3   cont  96.1 > 4   cont  98.1 > 5     10 103.0 > 6     10 101.3 > 7     10 102.1 > 8     10 101.9 > 9     30 121.1 > 10    30 123.1 > 11    30 119.7 > 12    30 118.9 > 13    60 109.9 > 14    60 110.1 > 15    60 113.1 > 16    60 112.3 >> plot(dat$treat,dat$yield) >> dat.sub<-dat[which(dat$treat!='cont')] >> class(dat.sub$treat) > [1] "factor" >> dat.sub >    treat yield > 5     10 103.0 > 6     10 101.3 > 7     10 102.1 > 8     10 101.9 > 9     30 121.1 > 10    30 123.1 > 11    30 119.7 > 12    30 118.9 > 13    60 109.9 > 14    60 110.1 > 15    60 113.1 > 16    60 112.3 >> plot(dat.sub$treat,dat.sub$yield) > > 
?[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From mazatlanmexico at yahoo.com Tue Nov 1 22:53:32 2011 From: mazatlanmexico at yahoo.com (Felipe Carrillo) Date: Tue, 1 Nov 2011 14:53:32 -0700 (PDT) Subject: [R] factor level issue after subsetting In-Reply-To: <70F02259E17B6242B15D81E58EB7EB11064C97C0@afhe-ex.afhe.ualberta.ca> References: <70F02259E17B6242B15D81E58EB7EB11064C97C0@afhe-ex.afhe.ualberta.ca> Message-ID: <1320184412.87115.YahooMailNeo@web125517.mail.ne1.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From Stefan.Schreiber at ales.ualberta.ca Tue Nov 1 22:59:48 2011 From: Stefan.Schreiber at ales.ualberta.ca (Schreiber, Stefan) Date: Tue, 1 Nov 2011 15:59:48 -0600 Subject: [R] factor level issue after subsetting In-Reply-To: <1320184412.87115.YahooMailNeo@web125517.mail.ne1.yahoo.com> References: <70F02259E17B6242B15D81E58EB7EB11064C97C0@afhe-ex.afhe.ualberta.ca> <1320184412.87115.YahooMailNeo@web125517.mail.ne1.yahoo.com> Message-ID: <70F02259E17B6242B15D81E58EB7EB11064C97C1@afhe-ex.afhe.ualberta.ca> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From michael.weylandt at gmail.com Tue Nov 1 23:06:25 2011 From: michael.weylandt at gmail.com (R. 
Michael Weylandt) Date: Tue, 1 Nov 2011 18:06:25 -0400 Subject: [R] NROW doesn't equal length(x) In-Reply-To: <1320183447.31848.YahooMailNeo@web45210.mail.sp1.yahoo.com> References: <1319741070.3140.YahooMailNeo@web45209.mail.sp1.yahoo.com> <3C2DE2C7-0DD4-47B2-A402-BFBA4A503982@gmail.com> <1320183447.31848.YahooMailNeo@web45210.mail.sp1.yahoo.com> Message-ID: Your first problem is that you aren't using paste() properly: print out paste(ct3[, 1:2]) and take a look at it. This works: apply(head(ct3[,1:2]),1,paste,collapse = " ") You also have the format argument to POSIXct wrong. See ?strptime for details. So the whole line (if you want to do it in one) would be something like this: v = xts(ct3[,3], as.POSIXct(apply(ct3[, 1:2],1, paste, collapse = " "), format = "%m/%d/%Y %H:%M:%S")) head(v) 2011-02-22 09:31:13 19.46 2011-02-22 09:31:28 19.50 2011-02-22 09:31:43 19.55 2011-02-22 09:31:58 19.59 2011-02-22 09:32:13 19.67 2011-02-22 09:32:28 19.68 Michael PS -- It's best practice to cc the list as well as me in replies so that this gets archived. On Tue, Nov 1, 2011 at 5:37 PM, Muhammad Abuizzah wrote: > I attached a dput of my data file which I am trying to transform to xts. > > The name of my object is ct3, I am putting the generated info into "v" > The code I used to convert it to xts is as follows: > > v = xts(ct3[,3], as.POSIXct(paste (ct3 [,1:2]), format = "%MM/%DD/%YYYY %H:%M:%:S") > > I would appreciate any help in converting this data frame into xts. ?I am not sure is the NROW issue is the reason behind the failure or is it the data formate > > thanks > > > ----- Original Message ----- > From: "R. Michael Weylandt " > To: Muhammad Abuizzah > Cc: "r-help at R-project.org" > Sent: Thursday, October 27, 2011 3:23 PM > Subject: Re: [R] NROW doesn't equal length(x) > > Data frame is list internally so length(df) = ncol(df) > > M > > On Oct 27, 2011, at 2:44 PM, Muhammad Abuizzah wrote: > >> Hi, >> >> I am converting a data.frame to xts.? 
the data.frame is 4 columns and 1000 rows. I get a message that "NROW (x) must match length(order.by)" >> class is data.frame, mode is list >> >> when I run >> dim(x)   # I get >> 1000     4   #which is consistent with 1000 rows and 4 columns >> >> NROW (x)  # I get >> >> 1000  # which is the right answer >> >> When I run length on each of columns in x separately using the "$" I get 1000, which is the right number too. >> So length on each of the columns individually gives me the right answer, but length on the data.frame gives me the number of columns instead of the number of rows, is there an explanation >> >> >> thanks >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > From michael.weylandt at gmail.com Tue Nov 1 23:15:00 2011 From: michael.weylandt at gmail.com (R. Michael Weylandt) Date: Tue, 1 Nov 2011 18:15:00 -0400 Subject: [R] NROW doesn't equal length(x) In-Reply-To: References: <1319741070.3140.YahooMailNeo@web45209.mail.sp1.yahoo.com> <3C2DE2C7-0DD4-47B2-A402-BFBA4A503982@gmail.com> <1320183447.31848.YahooMailNeo@web45210.mail.sp1.yahoo.com> Message-ID: I perhaps made that a little too complicated: this will also work: v2 = xts(ct3[,3], as.POSIXct(paste(ct3[,1], ct3[,2]), format = "%m/%d/%Y %H:%M:%S")) identical(v, v2) TRUE On Tue, Nov 1, 2011 at 6:06 PM, R. Michael Weylandt wrote: > Your first problem is that you aren't using paste() properly: print > out paste(ct3[, 1:2]) and take a look at it. > > This works: > > apply(head(ct3[,1:2]),1,paste,collapse = " ") > > You also have the format argument to POSIXct wrong. See ?strptime for details.
> > So the whole line (if you want to do it in one) would be something like this: > > v = xts(ct3[,3], as.POSIXct(apply(ct3[, 1:2],1, paste, collapse = " > "), format = "%m/%d/%Y %H:%M:%S")) > > head(v) > > 2011-02-22 09:31:13 19.46 > 2011-02-22 09:31:28 19.50 > 2011-02-22 09:31:43 19.55 > 2011-02-22 09:31:58 19.59 > 2011-02-22 09:32:13 19.67 > 2011-02-22 09:32:28 19.68 > > Michael > > PS -- It's best practice to cc the list as well as me in replies so > that this gets archived. > > On Tue, Nov 1, 2011 at 5:37 PM, Muhammad Abuizzah wrote: >> I attached a dput of my data file which I am trying to transform to xts. >> >> The name of my object is ct3, I am putting the generated info into "v" >> The code I used to convert it to xts is as follows: >> >> v = xts(ct3[,3], as.POSIXct(paste (ct3 [,1:2]), format = "%MM/%DD/%YYYY %H:%M:%:S") >> >> I would appreciate any help in converting this data frame into xts. I am not sure is the NROW issue is the reason behind the failure or is it the data formate >> >> thanks >> >> >> ----- Original Message ----- >> From: "R. Michael Weylandt " >> To: Muhammad Abuizzah >> Cc: "r-help at R-project.org" >> Sent: Thursday, October 27, 2011 3:23 PM >> Subject: Re: [R] NROW doesn't equal length(x) >> >> Data frame is list internally so length(df) = ncol(df) >> >> M >> >> On Oct 27, 2011, at 2:44 PM, Muhammad Abuizzah wrote: >> >>> Hi, >>> >>> I am converting a data.frame to xts. the data.frame is 4 columns and 1000 rows. I get a message that "NROW (x) must match length(order.by)" >>> class is data.frame, mode is list >>> >>> when I run >>> dim(x)   # I get >>> 1000     4   #which is consistent with 1000 rows and 4 columns >>> >>> NROW (x)  # I get >>> >>> 1000  # which is the right answer >>> >>> When I run length on each of columns in x separately using the "$" I get 1000, which is the right number too.
>>> So length on each of the columns individually gives me the right answer, but length on the data.frame gives me the number of columns instead of the number of rows, is there an explanation >>> >>> >>> thanks >>> >>> ______________________________________________ >>> R-help at r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >> > From jim.silverton at gmail.com Tue Nov 1 23:42:12 2011 From: jim.silverton at gmail.com (Jim Silverton) Date: Tue, 1 Nov 2011 18:42:12 -0400 Subject: [R] Superimpose xversus y plot on a histogram of x Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From shahab.mokari at gmail.com Tue Nov 1 23:58:08 2011 From: shahab.mokari at gmail.com (shahab) Date: Tue, 1 Nov 2011 23:58:08 +0100 Subject: [R] How to interpret Spearman Correlation Message-ID: Hi, I am not really familiar with the foundations of correlation, although I have read a lot. So maybe someone could kindly help me interpret the following results. I had the following R commands: correlation <-cor( vector_CitationProximity , vector_Impact, method = "spearman", use="na.or.complete") cor_test<-cor.test(vector_CitationProximity, vector_Impact, method="spearman") and the results are: "correlation" Correlation = 0.04715686 "cor_test" Spearman's rank correlation rho data: vector_CitationProximity and vector_Impact S = 5581032104, p-value = 0.008736 alternative hypothesis: true rho is not equal to 0 sample estimates: rho 0.04582115 So apparently, there is a positive correlation between the two given variables since Correlation = 0.04715686 > 0. However, I couldn't interpret the significance; what does "rho" say? Is there any simple example that I can read and try to understand? I am so confused about how significance can be interpreted.
Thanks, /Shahab From bps0002 at auburn.edu Wed Nov 2 00:06:48 2011 From: bps0002 at auburn.edu (B77S) Date: Tue, 1 Nov 2011 16:06:48 -0700 (PDT) Subject: [R] Subsampling-oversampling from a data frame In-Reply-To: <1320187546520-3965771.post@n4.nabble.com> References: <1320187546520-3965771.post@n4.nabble.com> Message-ID: <1320188808068-3965827.post@n4.nabble.com> If no one has a better solution, split it, take a sample of size X from both and put it back together. hgwelec wrote: > > Dear members, > > Consider the following data frame (first 4 rows shown) > > > age sex class > 15 m low > 20 f high > 15 f low > 10 m low > > in my original data set i have 1200 rows and a class distribution of > low=0.3 and high=0.7 > > > My question : how can i create a new data frame as the one shown above but > with the 'high' class subsampled so that in the new data frame the class > distribution is low=0.5 and high=0.5? > > I tried looking at the sample function and prob option but all examples i > seen do not use an imbalanced class problem as the one shown above > > > Thank you in advance > > > Thank you in advance > -- View this message in context: http://r.789695.n4.nabble.com/Subsampling-oversampling-from-a-data-frame-tp3965771p3965827.html Sent from the R help mailing list archive at Nabble.com. From jthayn at ilstu.edu Tue Nov 1 20:40:15 2011 From: jthayn at ilstu.edu (Jonathan Thayn) Date: Tue, 1 Nov 2011 14:40:15 -0500 Subject: [R] Discrepancy with p.value from t.test Message-ID: <27092F59-A204-407A-9710-E711B740995D@ilstu.edu> An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From phytophthorasb at yahoo.com Tue Nov 1 21:05:11 2011 From: phytophthorasb at yahoo.com (Empty Empty) Date: Tue, 1 Nov 2011 13:05:11 -0700 (PDT) Subject: [R] Counting entries to create a new table In-Reply-To: <1320177707.16311.YahooMailNeo@web125418.mail.ne1.yahoo.com> References: <1320177707.16311.YahooMailNeo@web125418.mail.ne1.yahoo.com> Message-ID: <1320177911.64198.YahooMailNeo@web125402.mail.ne1.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From michellev.tran at gmail.com Tue Nov 1 21:45:02 2011 From: michellev.tran at gmail.com (M. Tran) Date: Tue, 1 Nov 2011 13:45:02 -0700 (PDT) Subject: [R] condition has length > 1 for LL denominator Message-ID: <1320180302428-3965365.post@n4.nabble.com> I have a dataset called "results" that looks like this: arrive depart intercept 1 1 1 1 2 1 1 3 1 1 2 2 1 3 2 1 3 3 2 2 2 2 3 2 3 3 3 where arrive is the period of arrival, depart is the period of departure, and intercept is the period in which that person was counted. I'm trying to construct the denominator for a likelihood function using the following function. For the first row in "results", for example, I want the denominator to be the sum of all possible arrive/depart combinations an interceptor in period 1 could observe: exp(P_1_1) + exp(P_1_2) + exp(P_1_3) (i.e. P_arrive_depart). 
get_denominator = function(intercept, periods_per_day) { denominator = array("(", nrow(results)) for (arrival in 1:periods_per_day) { for (departure in arrival:periods_per_day) { while (arrival <= intercept & intercept <= departure) { addition_to_denom = paste("P", arrival, departure, sep = "_") if (nchar(denominator) == 1) { denominator = paste(denominator, "exp(", addition_to_denom, ")", sep = "") } else { denominator = paste(denominator, " + exp(", addition_to_denom, ")", sep = "") } } } } denominator = paste(denominator, ")") return(denominator) } denominator = get_denominator(intercept = results[,"intercept"], periods_per_day = 3) I'm getting the following warning message: In if (arrival <= intercept & intercept <= departure) { ... : the condition has length > 1 and only the first element will be used. As written, the code gives me the denominator for a period 1 interceptor for every single row! I'm having trouble figuring out how I should re-write this code. Any suggestions would be greatly appreciated. -- View this message in context: http://r.789695.n4.nabble.com/condition-has-length-1-for-LL-denominator-tp3965365p3965365.html Sent from the R help mailing list archive at Nabble.com. From stat.kk at gmail.com Tue Nov 1 23:15:04 2011 From: stat.kk at gmail.com (stat.kk) Date: Tue, 1 Nov 2011 15:15:04 -0700 (PDT) Subject: [R] Export to .txt Message-ID: <1320185704350-3965699.post@n4.nabble.com> Hi, I would like to export all my workspace (even with the evaluation of commands) to the text file. I know about the sink() function but it doesnt work as I would like. My R-function looks like this: there are instructions for user displayed by cat() command and browser() commands for fulfilling them. While using the sink() command the instructions dont display :( Can anyone help me with a equivalent command to File - Save to file... option? Thank you very much. 
-- View this message in context: http://r.789695.n4.nabble.com/Export-to-txt-tp3965699p3965699.html Sent from the R help mailing list archive at Nabble.com. From nfdisco at gmail.com Tue Nov 1 23:14:42 2011 From: nfdisco at gmail.com (Ernest =?iso-8859-1?Q?Adrogu=E9?=) Date: Tue, 1 Nov 2011 23:14:42 +0100 Subject: [R] building a subscript programatically Message-ID: <20111101221442.GA28257@doriath.local> Hi, On ocasion, you need to subscript an array that has an arbitrary (ie. not known in advance) number of dimensions. How do you deal with these situations? It appears that it is not possible use a list as an index, for instance this fails: > x <- array(NA, c(2,2,2)) > x[list(TRUE,TRUE,2)] Error in x[list(TRUE, TRUE, 2)] : invalid subscript type 'list' The only way I know is using do.call() but it's rather ugly. There must be a better way!! > do.call('[', c(list(x), TRUE, TRUE, 2)) [,1] [,2] [1,] NA NA [2,] NA NA Any idea? Regards, Ernest From assafweinstein at gmail.com Tue Nov 1 23:34:56 2011 From: assafweinstein at gmail.com (asafw) Date: Tue, 1 Nov 2011 15:34:56 -0700 (PDT) Subject: [R] predict for a cv.glmnet returns an error Message-ID: <1320186896921-3965744.post@n4.nabble.com> Hi there, I am trying to use predict() with an object returned by cv.glmnet(), and get the following error: no applicable method for 'predict' applied to an object of class "cv.glmnet" What's wrong? my code: x=matrix(rnorm(100*20),100,20) y=rnorm(100) cv.fit=cv.glmnet(x,y) predict(cv.fit,newx=x[1:5,]) coef(cv.fit) Thanks so much, Asaf -- View this message in context: http://r.789695.n4.nabble.com/predict-for-a-cv-glmnet-returns-an-error-tp3965744p3965744.html Sent from the R help mailing list archive at Nabble.com. 
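On the earlier question in this digest about subscripting an array whose number of dimensions is not known in advance: a small sketch, not a replacement for the do.call() approach shown there but one way to build its argument list programmatically. The names idx, slice and slice3d are made up for illustration. For picking out single elements, a one-row index matrix works without do.call().

```r
x <- array(1:8, c(2, 2, 2))
n <- length(dim(x))

# One index per dimension; TRUE keeps everything along that dimension.
idx <- rep(list(TRUE), n)
idx[[n]] <- 2                        # pick slice 2 of the last dimension

# Same as x[TRUE, TRUE, 2], but the number of indices follows dim(x).
slice <- do.call(`[`, c(list(x), idx))

# Keep the selected dimension instead of dropping it.
slice3d <- do.call(`[`, c(list(x), idx, list(drop = FALSE)))
dim(slice3d)                         # 2 2 1

# Single elements: a matrix with one column per dimension.
x[matrix(c(1, 2, 2), nrow = 1)]      # 7, i.e. x[1, 2, 2]
```

Building idx as a list keeps the call generic: the same lines should work for an array with any number of dimensions, since only length(dim(x)) is consulted.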
From djmuser at gmail.com Wed Nov 2 00:15:00 2011 From: djmuser at gmail.com (Dennis Murphy) Date: Tue, 1 Nov 2011 16:15:00 -0700 Subject: [R] round up a number to 10^4 In-Reply-To: <1320167230550-3964394.post@n4.nabble.com> References: <1320167230550-3964394.post@n4.nabble.com> Message-ID: Works in the newly released 2.14.0: > X = c(60593.23, 71631.17, 75320.1) > round(X, -4) [1] 60000 70000 80000 Dennis On Tue, Nov 1, 2011 at 10:07 AM, Wendy wrote: > Hi all, > > I have a list of numbers, e.g., X = c(60593.23, 71631.17, 75320.1), and want > to round them ?so the output is Y = c(60000, 80000, 80000). I tried > Y<-round(X,-4), but it gives me Y = c(60000, 70000, 80000). Do anybody know > how to round up a number to 10^4? > > Thank you in advance. > > Wendy > > > -- > View this message in context: http://r.789695.n4.nabble.com/round-up-a-number-to-10-4-tp3964394p3964394.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From rshepard at appl-ecosys.com Wed Nov 2 00:22:06 2011 From: rshepard at appl-ecosys.com (Rich Shepard) Date: Tue, 1 Nov 2011 16:22:06 -0700 (PDT) Subject: [R] Export to .txt In-Reply-To: <1320185704350-3965699.post@n4.nabble.com> References: <1320185704350-3965699.post@n4.nabble.com> Message-ID: On Tue, 1 Nov 2011, stat.kk wrote: > I would like to export all my workspace (even with the evaluation of > commands) to the text file. Have you looked at .Rhistory? If you save your workspace when you quit a session with R it's put in that file. You can always read it anywhere you have a text editor. 
When I end work with more processing to be done, but it's repeating what I did in the current session, I open a second buffer in emacs, load .Rhistory, and use that to continue processing. This may do what you want. Rich From djmuser at gmail.com Wed Nov 2 00:32:20 2011 From: djmuser at gmail.com (Dennis Murphy) Date: Tue, 1 Nov 2011 16:32:20 -0700 Subject: [R] Removal/selecting specific rows in a dataframe conditional on 2 columns In-Reply-To: <7F02926F-A8B4-42B8-89F1-864CE20B5648@dal.ca> References: <7F02926F-A8B4-42B8-89F1-864CE20B5648@dal.ca> Message-ID: Does this work? library('plyr') # Function to return a data frame if it has one row, else return NULL: f <- function(d) if(nrow(d) == 1L) d else NULL > ddply(RV09, .(set, month), f) record.t trip set month stratum NAFO unit.area time dur.set distance 1 5 913 1 10 351 3O R31 1044 17 9 2 5 913 3 10 340 3O Q31 1800 18 9 3 5 913 4 10 340 3O Q32 2142 17 9 ddply() is an apply-like function that takes a data frame as input and a data frame as output (hence the dd). The first argument is the data frame name, the second argument the set of grouping variables and the third is the function to be called (in this application). HTH, Dennis On Tue, Nov 1, 2011 at 10:16 AM, Aurelie Cosandey Godin wrote: > Dear list, > > After reading different mails, blogs, and tried a few different codes without any success, I am asking your help! > I have the following data frame where each row represent a survey unit with the following variables: > >> names(RV09) > ?[1] "record.t" ?"trip" ? ? ?"set" ? ? ? "month" ? ? "stratum" ? "NAFO" > ?[7] "unit.area" "time" ? ? ?"dur.set" ? "distance" ?"operation" "mean.d" > [13] "min.d" ? ? "max.d" ? ? "temp.d" ? ?"slat" ? ? ?"slong" ? ? "spp" > [19] "number" ? ?"weight" ? ?"elat" ? ? ?"elong" > > Each survey unit generates one set record, denoted by a 5 in column "record.t". Each species identified in this particular survey unit generates an additional set record, denoted by a 6. 
> >> unique(RV09$record.t) > [1] 5 6 > > Each survey unit are identified by a specific "trip" and "set" number, so if there is a 5 record type with no associated 6 records, it means that no species were observed in that survey unit. I would like to be able to select all and only these survey units, which represent my zeros. > > So as an exemple, in this trip number 913, set 1, 3, and 4 would be part of my "zeros" data.frame as they appear with no record.t 6, such that no species were observed in this survey unit. > >> head(RV09) > ? record.t trip set month stratum NAFO unit.area time dur.set distance > 585 ? ? ? ?5 ?913 ? 1 ? ?10 ? ? 351 ? 3O ? ? ? R31 1044 ? ? ?17 ? ? ? ?9 > 586 ? ? ? ?5 ?913 ? 2 ? ?10 ? ? 351 ? 3O ? ? ? R31 1440 ? ? ?17 ? ? ? ?9 > 587 ? ? ? ?6 ?913 ? 2 ? ?10 ? ? 351 ? 3O ? ? ? R31 1440 ? ? ?17 ? ? ? ?9 > 588 ? ? ? ?5 ?913 ? 3 ? ?10 ? ? 340 ? 3O ? ? ? Q31 1800 ? ? ?18 ? ? ? ?9 > 589 ? ? ? ?5 ?913 ? 4 ? ?10 ? ? 340 ? 3O ? ? ? Q32 2142 ? ? ?17 ? ? ? ?9 > > Any tips on how extract this "zero" data.frame in R? > Thank you very much in advance! > > Best, > ~Aurelie > > > Aurelie Cosandey-Godin > Ph.D. student, Department of Biology > Industrial Graduate Fellow, WWF-Canada > Dalhousie University | Email: godina at dal.ca > > > ? ? ? ?[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
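A base-R sketch of the same selection, assuming (as the question states) that a survey unit is identified by trip and set together. The toy data frame below only mimics the rows shown in the question and carries just the three columns needed; it is not the real RV09.

```r
# Toy data mirroring the rows shown in the thread (only the columns needed here).
RV09 <- data.frame(
  record.t = c(5, 5, 6, 5, 5),
  trip     = c(913, 913, 913, 913, 913),
  set      = c(1, 2, 2, 3, 4)
)

# Number of records belonging to each trip/set combination.
n_records <- ave(seq_len(nrow(RV09)), RV09$trip, RV09$set, FUN = length)

# A survey unit with a single record has only its type-5 row, i.e. no species.
zeros <- RV09[n_records == 1 & RV09$record.t == 5, ]
zeros
```

Grouping on trip and set (rather than set and month) keeps units from different trips apart when set numbers repeat.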
> From peter.langfelder at gmail.com Wed Nov 2 00:32:46 2011 From: peter.langfelder at gmail.com (Peter Langfelder) Date: Tue, 1 Nov 2011 16:32:46 -0700 Subject: [R] Discrepancy with p.value from t.test In-Reply-To: <27092F59-A204-407A-9710-E711B740995D@ilstu.edu> References: <27092F59-A204-407A-9710-E711B740995D@ilstu.edu> Message-ID: On Tue, Nov 1, 2011 at 12:40 PM, Jonathan Thayn wrote: > Sometimes the p.value returned by t.test() is the same that I calculate using pt() and sometimes it's not. I don't understand the difference. I'm sure there is a simple explanation but I haven't been able to find it, even after looking at the code for t.test.default. I apologize if this is a basic and obvious question. For example: > >> data(sleep) >> t.test(extra~group,data=sleep,var.equal=T) > > # the p.value returned is 0.07939 > >> 2*pt(-1.8608,18) ? # using the t.statistic and the df returned above > [1] 0.0791887 > > These p.values are the same. However, they are different when I use a different dataset: > >> data(beavers) >> b1 <- beaver1$temp >> b2 <- beaver2$temp >> t.test(b1,b2,var.equal=T) > > # the p.value returned is 2.2e-16 > >> 2*pt(-15.9366,212) ? # using the t.statistic and the df returned above > [1] 4.10686e-38 > > If you read the output of t.test carefully, you will find something like p-value < 2.2e-16 not p-value = 2.2e-16 so the results are not inconsistent. Not sure why t.test is coded that way, perhaps the p-value calculation is not very reliable below roughly 2e-16. This issue could also come up if the function doesn't use lower/upper tail of the distribution function as needed and then must subtract the calculated results from 1 to obtain the returned value. 
Here's an example:

> x = rnorm(100)
> y = x<0
> t.test(x~y)

        Welch Two Sample t-test

data:  x by y
t = 12.9463, df = 97.424, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 1.397253 1.903200
sample estimates:
mean in group FALSE  mean in group TRUE
          0.7596083          -0.8906181

Now do a naive pt:

> pt(12.9463, df = 97.424)
[1] 1

my desired p-value is 1-pt(12.9463, df = 97.424) but that's zero. Of course, I can get the p-value in a more intelligent way,

> pt(12.9463, df = 97.424, lower.tail = FALSE)
[1] 3.394337e-23

Peter

From djmuser at gmail.com Wed Nov 2 01:09:41 2011
From: djmuser at gmail.com (Dennis Murphy)
Date: Tue, 1 Nov 2011 17:09:41 -0700
Subject: [R] Counting entries to create a new table
In-Reply-To: <1320177911.64198.YahooMailNeo@web125402.mail.ne1.yahoo.com>
References: <1320177707.16311.YahooMailNeo@web125418.mail.ne1.yahoo.com> <1320177911.64198.YahooMailNeo@web125402.mail.ne1.yahoo.com>
Message-ID:

Hi:

After cleaning up your data, here's one way using the plyr and reshape packages:

d <- read.csv(textConnection("
Individual, A, B, C, D
Day1, 1,1,1,1
Day2, 1,3,4,2
Day3, 3,,6,4"), header = TRUE)
closeAllConnections()
d

library('plyr')
library('reshape')

# Stack the variables
dm <- melt(d, id = 'Individual')

# Convert the new value column to factor as follows:
dm$value <- factor(dm$value, levels = c(1:7, NA), exclude = NULL)

# Use ddply() in conjunction with tabulate():
ddply(dm, .(variable), function(d) tabulate(d$value, nbins = 8))
  variable V1 V2 V3 V4 V5 V6 V7 V8
1        A  2  0  1  0  0  0  0  0
2        B  1  0  1  0  0  0  0  1
3        C  1  0  0  1  0  1  0  0
4        D  1  1  0  1  0  0  0  0

This returns a data frame. If you want a matrix instead, use the daply() function rather than ddply() and leave everything else the same.
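A base-R sketch of the same tabulation, without extra packages, assuming the data are already in a data frame with days as rows and individuals as columns and with NA for a missing code. The helper name count_codes is made up for illustration.

```r
# Toy data from the thread: rows are days, columns are individuals; NA marks a missing code.
d <- data.frame(
  A = c(1, 1, 3),
  B = c(1, 3, NA),
  C = c(1, 4, 6),
  D = c(1, 2, 4)
)

# Tabulate one column over the full set of codes 1..7, with a final slot for NA.
count_codes <- function(x) table(factor(x, levels = 1:7), useNA = "always")

# One row per individual, one column per code.
counts <- t(sapply(d, count_codes))
colnames(counts)[8] <- "missing"
counts
```

For the second part of the question, the same call can be applied to a subset of rows, e.g. t(sapply(d[1:2, ], count_codes)) for the first two days only.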
HTH, Dennis On Tue, Nov 1, 2011 at 1:05 PM, Empty Empty wrote: > Hi, > > I am an R novice and I am trying to do something that it seems should be fairly simple, but I can't quite figure it out and I must not be using the right words when I search for answers. > > I have a dataset with a number of individuals and observations for each day (7 possible codes plus missing data) > So it looks something like this > > Individual A, B, C, D > Day1 1,1,1,1 > Day 2 1,3,4,2 > Day3 3,,6,4 > (I've also tried transposing it so that individuals are rows and days are columns) > > I want to summarize the total observation codes by individual so that I end up with a table something like this: > > ,? 1, 2 ,3 ,4, 5, 6,7, missing > A? 2,0,1,0,0,0,0,0 > B? 1,0,1,0,0,0,0,1 > C? 1,0,0,1,0,1,0,0 > D 1,1,0,1,0,0,0,0 > > If I can get that I'll be happy! Part two is > ?subsetting which days I include in the counts so that I could, say look at days 1-30 and 30-60 separately - create two different tables from the same original table. (Or I can do it manually and start with different subsets of the data). > > Thanks so much for any help. > ? ? ? ?[[alternative HTML version deleted]] > > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > From rolf.turner at xtra.co.nz Wed Nov 2 01:10:09 2011 From: rolf.turner at xtra.co.nz (Rolf Turner) Date: Wed, 02 Nov 2011 13:10:09 +1300 Subject: [R] building a subscript programatically In-Reply-To: <20111101221442.GA28257@doriath.local> References: <20111101221442.GA28257@doriath.local> Message-ID: <4EB08A61.405@xtra.co.nz> On 02/11/11 11:14, Ernest Adrogu? wrote: > Hi, > > On ocasion, you need to subscript an array that has an arbitrary > (ie. not known in advance) number of dimensions. How do you deal with > these situations? 
> It appears that it is not possible use a list as an index, for > instance this fails: > >> x<- array(NA, c(2,2,2)) >> x[list(TRUE,TRUE,2)] > Error in x[list(TRUE, TRUE, 2)] : invalid subscript type 'list' > > The only way I know is using do.call() but it's rather ugly. There > must be a better way!! > >> do.call('[', c(list(x), TRUE, TRUE, 2)) > [,1] [,2] > [1,] NA NA > [2,] NA NA > > Any idea? It's possible that matrix subscripting might help you. E.g.: a <- array(1:60,dim=c(3,4,5)) m <- matrix(c(1,1,1,2,2,2,3,4,5,1,2,5),byrow=TRUE,ncol=3) a[m] [1] 1 17 60 52 You can build "m" to have the same number of columns as your array has dimensions. It's not clear to me what result you want in your example. cheers, Rolf Turner From saldanha.plangeo at gmail.com Wed Nov 2 01:09:59 2011 From: saldanha.plangeo at gmail.com (Raphael Saldanha) Date: Tue, 1 Nov 2011 22:09:59 -0200 Subject: [R] How to interpret Spearman Correlation In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dwinsemius at comcast.net Wed Nov 2 01:22:05 2011 From: dwinsemius at comcast.net (Comcast) Date: Tue, 1 Nov 2011 20:22:05 -0400 Subject: [R] building a subscript programatically In-Reply-To: <20111101221442.GA28257@doriath.local> References: <20111101221442.GA28257@doriath.local> Message-ID: Leaving the indices empty should give you what I'm guessing you want/expect. x[,,2] #. TRUE would also work, just not in a list. David. On Nov 1, 2011, at 6:14 PM, Ernest Adrogu? wrote: > Hi, > > On ocasion, you need to subscript an array that has an arbitrary > (ie. not known in advance) number of dimensions. How do you deal with > these situations? > It appears that it is not possible use a list as an index, for > instance this fails: > >> x <- array(NA, c(2,2,2)) >> x[list(TRUE,TRUE,2)] > Error in x[list(TRUE, TRUE, 2)] : invalid subscript type 'list' > > The only way I know is using do.call() but it's rather ugly. 
There > must be a better way!! > >> do.call('[', c(list(x), TRUE, TRUE, 2)) > [,1] [,2] > [1,] NA NA > [2,] NA NA > > Any idea? > > Regards, > Ernest > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From djmuser at gmail.com Wed Nov 2 01:25:53 2011 From: djmuser at gmail.com (Dennis Murphy) Date: Tue, 1 Nov 2011 17:25:53 -0700 Subject: [R] predict for a cv.glmnet returns an error In-Reply-To: <1320186896921-3965744.post@n4.nabble.com> References: <1320186896921-3965744.post@n4.nabble.com> Message-ID: Hi: Here's what I got when I ran your code: library('glmnet') > x=matrix(rnorm(100*20),100,20) > y=rnorm(100) > cv.fit=cv.glmnet(x,y) > predict(cv.fit,newx=x[1:5,]) 1 [1,] 0.1213114 [2,] 0.1213114 [3,] 0.1213114 [4,] 0.1213114 [5,] 0.1213114 > coef(cv.fit) 21 x 1 sparse Matrix of class "dgCMatrix" 1 (Intercept) 0.1213114 V1 0.0000000 V2 0.0000000 V3 0.0000000 V4 0.0000000 V5 0.0000000 V6 0.0000000 V7 0.0000000 V8 0.0000000 V9 0.0000000 V10 0.0000000 V11 0.0000000 V12 0.0000000 V13 0.0000000 V14 0.0000000 V15 0.0000000 V16 0.0000000 V17 0.0000000 V18 0.0000000 V19 0.0000000 V20 0.0000000 ### Check against the versions of the packages listed below: > sessionInfo() R version 2.14.0 (2011-10-31) Platform: x86_64-pc-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] grid stats graphics grDevices utils datasets methods [8] base other attached packages: [1] glmnet_1.7.1 Matrix_1.0-1 lattice_0.20-0 ggplot2_0.8.9 proto_0.3-9.2 [6] reshape_0.8.4 plyr_1.6 loaded via a namespace (and not attached): [1] tools_2.14.0 Dennis On Tue, Nov 1, 2011 at 3:34 PM, asafw wrote: 
> > Hi there, > > I am trying to use predict() with an object returned by cv.glmnet(), and get > the following error: > no applicable method for 'predict' applied to an object of class "cv.glmnet" > > What's wrong? > > my code: > > x=matrix(rnorm(100*20),100,20) > y=rnorm(100) > cv.fit=cv.glmnet(x,y) > predict(cv.fit,newx=x[1:5,]) > coef(cv.fit) > > Thanks so much, > > Asaf > > -- > View this message in context: http://r.789695.n4.nabble.com/predict-for-a-cv-glmnet-returns-an-error-tp3965744p3965744.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From dwinsemius at comcast.net Wed Nov 2 01:34:21 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Tue, 1 Nov 2011 20:34:21 -0400 Subject: [R] How to interpret Spearman Correlation In-Reply-To: References: Message-ID: Shahab; You would be well advised not to seek private tutoring from someone on the Internet who tells you that a p-value of 0.008736 is "not significant". On Nov 1, 2011, at 8:09 PM, Raphael Saldanha wrote: > Hi Shahab, > > This test shows that there is some positive statistical correlation, BUT > the p-value of the test - this is, the level of significance - shows that > the correlation is not statistically significant at 95% confidence level. > So, the correlation may be equal to zero. > > To understand this concepts in a good way, you need to be secure about > variance and hypothesis test. > > I can help you more if you need. Send me a direct mail (this list is for > doubts about R, not conceptual statistics). I will be happy to help you > with Statistics. 
> > My e-mail: saldanha.plangeo at gmail.com > > On Tue, Nov 1, 2011 at 8:58 PM, shahab wrote: > >> Hi, >> >> I am not really familiar with Correlation foundations, although I read >> a lot. So maybe if someone kindly help me to interpret the following >> results. >> I had the following R commands: >> >> correlation <-cor( vector_CitationProximity , vector_Impact, method = >> "spearman", use="na.or.complete") >> cor_test<-cor.test(vector_CitationProximity, vector_Impact, >> method="spearman") >> >> and the results are: >> "correlation" >> Correlation = 0.04715686 >> >> "cor_test" >> Spearman's rank correlation rho >> >> data: vector_CitationProximity and vector_Impact >> S = 5581032104, p-value = 0.008736 >> alternative hypothesis: true rho is not equal to 0 >> sample estimates: >> rho >> 0.04582115 >> >> >> So apparently, there is positive correlation between two given >> variables since Correlation = 0.04715686 > 0 >> However I couldn't interpret the significance ?' what does "rho" say? >> Is there any simple sample that I can read and try to understand? I am >> do confused in understanding how significance can be interpreted. >> >> Thanks, >> >> /Shahab >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > > > -- > Atenciosamente, > > Raphael Saldanha > saldanha.plangeo at gmail.com > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
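To make the distinction between the size of rho and its significance concrete: in the output quoted above, rho = 0.0458 is the estimated rank correlation, and the p-value of 0.008736 is below 0.05, so the null hypothesis rho = 0 is rejected at the 5% level even though the association is very weak. A minimal sketch with simulated data (not the poster's; the sample size and slope are arbitrary choices) showing how a small rho can still be clearly significant when there are many observations:

```r
set.seed(1)  # arbitrary seed, only so the sketch is repeatable

# Simulated data: a weak monotone relationship, many observations.
n <- 10000
x <- rnorm(n)
y <- 0.1 * x + rnorm(n)

res <- cor.test(x, y, method = "spearman")

res$estimate  # rho: the estimated rank correlation, small here
res$p.value   # p-value for H0: rho = 0

# Reject H0 at the 5% level when the p-value is below 0.05.
res$p.value < 0.05
```

The estimate describes how strong the monotone association is; the p-value only says how surprising that estimate would be if the true rho were zero.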
From jennifer.s.lyon at gmail.com Wed Nov 2 01:35:07 2011 From: jennifer.s.lyon at gmail.com (Jennifer Lyon) Date: Tue, 1 Nov 2011 18:35:07 -0600 Subject: [R] Calling str() on mlogit object gives warnings Message-ID: Hi: When I call str() on an mlogit object, I seem to get warnings. This code is from an example provided in the mlogit documentation: library(mlogit) data("Train", package="mlogit") tr<-mlogit.data(Train, shape="wide", choice="choice", varying=4:11, sep="", alt.levels=c(1,2), id="id") ml.train<-mlogit(choice~price+time+change+comfort| -1, tr) str(ml.train) List of 12 $ coefficients : atomic [1:4] -0.00148 -0.02868 -0.32634 -0.94573 ..- attr(*, "fixed")= Named logi [1:4] FALSE FALSE FALSE FALSE .. ..- attr(*, "names")= chr [1:4] "price" "time" "change" "comfort" $ logLik :Class 'logLik' : -1724 (df=4) $ gradient : Named num [1:4] -3.94e-09 -1.40e-11 6.15e-13 -6.11e-13 ..- attr(*, "names")= chr [1:4] "price" "time" "change" "comfort" $ gradi : num [1:2929, 1:4] -136 -281 -309 -139 448 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr [1:2929] "1.1" "2.1" "3.1" "4.1" ... .. ..$ : chr [1:4] "price" "time" "change" "comfort" $ hessian : num [1:4, 1:4] -2.75e+08 2.48e+06 5.10e+04 9.95e+04 2.48e+06 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr [1:4] "price" "time" "change" "comfort" .. ..$ : chr [1:4] "price" "time" "change" "comfort" $ est.stat :List of 5 ..$ elaps.time:Class 'proc_time' Named num [1:5] 0.12 0 0.12 0 0 .. .. ..- attr(*, "names")= chr [1:5] "user.self" "sys.self" "elapsed" "user.child" ... ..$ nb.iter : num 5 ..$ eps : num [1, 1] 0.00014 ..$ method : chr "nr" ..$ code : num 2 ..- attr(*, "class")= chr "est.stat" $ fitted.values: num [1:2929, 1:2] 0.915 0.649 0.807 0.174 0.56 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : NULL .. ..$ : chr [1:2] "X1" "X2" $ residuals : num [1:2929, 1:2] 0.0851 0.3512 0.1932 -0.1737 -0.5602 ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr [1:2929] "1.1" "2.1" "3.1" "4.1" ... .. 
..$ : chr [1:2] "X1" "X2" $ model :Classes ?mlogit.data? and 'data.frame': 5858 obs. of 5 variables: ..$ choice : logi [1:5858] TRUE FALSE TRUE FALSE TRUE FALSE ... ..$ price : int [1:5858] 2400 4000 2400 3200 2400 4000 4000 3200 2400 3200 ... ..$ time : int [1:5858] 150 150 150 130 115 115 130 150 150 150 ... ..$ change : int [1:5858] 0 0 0 0 0 0 0 0 0 0 ... ..$ comfort: int [1:5858] 1 1 1 1 1 0 1 0 1 0 ... ..- attr(*, "terms")=Classes 'terms', 'formula' length 3 choice ~ price + time + change + comfort + -1 .. .. ..- attr(*, "variables")= language list(choice, price, time, change, comfort) .. .. ..- attr(*, "factors")= int [1:5, 1:4] 0 1 0 0 0 0 0 1 0 0 ... .. .. .. ..- attr(*, "dimnames")=List of 2 .. .. .. .. ..$ : chr [1:5] "choice" "price" "time" "change" ... .. .. .. .. ..$ : chr [1:4] "price" "time" "change" "comfort" .. .. ..- attr(*, "term.labels")= chr [1:4] "price" "time" "change" "comfort" .. .. ..- attr(*, "order")= int [1:4] 1 1 1 1 .. .. ..- attr(*, "intercept")= int 0 .. .. ..- attr(*, "response")= int 1 .. .. ..- attr(*, ".Environment")= .. .. ..- attr(*, "predvars")= language list(choice, price, time, change, comfort) .. .. ..- attr(*, "dataClasses")= Named chr [1:5] "logical" "numeric" "numeric" "numeric" ... .. .. .. ..- attr(*, "names")= chr [1:5] "choice" "price" "time" "change" ... ..- attr(*, "index")='data.frame': 5858 obs. of 3 variables: .. ..$ chid: Factor w/ 2929 levels "1","2","3","4",..: 1 1 2 2 3 3 4 4 5 5 ... .. ..$ alt : Factor w/ 2 levels "1","2": 1 2 1 2 1 2 1 2 1 2 ... .. ..$ id : Factor w/ 235 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ... $ freq : 'table' int [1:2(1d)] 1474 1455 ..- attr(*, "dimnames")=List of 1 .. ..$ : chr [1:2] "1" "2" $ formula :Classes 'mFormula', 'Formula', 'formula' length 3 choice ~ price + time + change + comfort | -1 .. ..- attr(*, ".Environment")= .. ..- attr(*, "lhs")=List of 1 .. .. ..$ : symbol choice .. ..- attr(*, "rhs")=List of 2 .. .. ..$ : language price + time + change + comfort .. .. 
..$ : language -1 $ call : language mlogit(formula = choice ~ price + time + change + comfort | -1, data = tr, method = "nr", print.level = 0) - attr(*, "class")= chr "mlogit" Warning messages: 1: In if (is.na(le)) " __no length(.)__ " else if (give.length) { : the condition has length > 1 and only the first element will be used 2: In if (le > 0) P0("[1:", paste(le), "]") else "(0)" : the condition has length > 1 and only the first element will be used I am surprised that I'm getting warnings. Am I doing something wrong here, or is the mlogit object misformed in some way, or is there a problem with str()? I don't think the str() is specific to mlogit, since if I do utils:::str.default(ml.train) I get the same warnings. Thanks. Jen sessionInfo() R version 2.14.0 (2011-10-31) Platform: x86_64-unknown-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=C LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] mlogit_0.2-1 maxLik_1.0-2 miscTools_0.6-10 lmtest_0.9-29 [5] zoo_1.7-5 statmod_1.4.13 Formula_1.0-1 loaded via a namespace (and not attached): [1] grid_2.14.0 lattice_0.20-0 sandwich_2.2-8 tools_2.14.0 From dwinsemius at comcast.net Wed Nov 2 01:44:21 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Tue, 1 Nov 2011 20:44:21 -0400 Subject: [R] building a subscript programatically In-Reply-To: <20111102003050.GA29053@doriath.local> References: <20111101221442.GA28257@doriath.local> <20111102003050.GA29053@doriath.local> Message-ID: <80B018C0-7149-4A7A-BFFF-164FEE705ABE@comcast.net> Yes,Ii did fail to read your post carefully and agree do.call seems roundabout, but alternatives look even more tortured. (You might want to include more context in the future.) 
On Nov 1, 2011, at 8:30 PM, Ernest Adrogu? wrote: > 1/11/11 @ 20:22 (-0400), Comcast escriu: >> Leaving the indices empty should give you what I'm guessing you want/expect. >> >> x[,,2] #. TRUE would also work, just not in a list. > > Exactly, but this only works if x has three dimensions. What I want is > x[,,2] if x has three dimensions, x[,,,2] if it has four, and so > forth. I cannot hard code [,,2] because I do not know how many > dimensions x will have, instead the subscript has to be built "on the > fly". > > Ernest From dwinsemius at comcast.net Wed Nov 2 01:49:37 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Tue, 1 Nov 2011 20:49:37 -0400 Subject: [R] Export to .txt In-Reply-To: References: <1320185704350-3965699.post@n4.nabble.com> Message-ID: <617C0638-18C8-443A-8C5F-A4DC78F2B487@comcast.net> The if function only takes an argument of length 1 (as the warning says): ?"if" Many such confusions are resolved by looking at : ?ifelse -- David On Nov 1, 2011, at 7:22 PM, Rich Shepard wrote: > On Tue, 1 Nov 2011, stat.kk wrote: > >> I would like to export all my workspace (even with the evaluation of >> commands) to the text file. > > Have you looked at .Rhistory? If you save your workspace when you quit a > session with R it's put in that file. You can always read it anywhere you > have a text editor. > > When I end work with more processing to be done, but it's repeating what I > did in the current session, I open a second buffer in emacs, load .Rhistory, > and use that to continue processing. > > This may do what you want. > > Rich > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From michael.weylandt at gmail.com Wed Nov 2 02:27:43 2011 From: michael.weylandt at gmail.com (R. 
Michael Weylandt ) Date: Tue, 1 Nov 2011 21:27:43 -0400 Subject: [R] Superimpose xversus y plot on a histogram of x In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dwinsemius at comcast.net Wed Nov 2 02:29:34 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Tue, 1 Nov 2011 21:29:34 -0400 Subject: [R] condition has length > 1 for LL denominator In-Reply-To: <1320180302428-3965365.post@n4.nabble.com> References: <1320180302428-3965365.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From jones at reed.edu Wed Nov 2 02:41:08 2011 From: jones at reed.edu (Albyn Jones) Date: Tue, 1 Nov 2011 18:41:08 -0700 Subject: [R] Discrepancy with p.value from t.test In-Reply-To: <27092F59-A204-407A-9710-E711B740995D@ilstu.edu> References: <27092F59-A204-407A-9710-E711B740995D@ilstu.edu> Message-ID: <20111102014108.GA15947@reed.edu> The print method is the issue: > t.out <- t.test(b1,b2,var.equal=T) > t.out$p.value [1] 4.108001e-38 > t.out$statistic t -15.93656 albyn On Tue, Nov 01, 2011 at 02:40:15PM -0500, Jonathan Thayn wrote: > Sometimes the p.value returned by t.test() is the same that I calculate using pt() and sometimes it's not. I don't understand the difference. I'm sure there is a simple explanation but I haven't been able to find it, even after looking at the code for t.test.default. I apologize if this is a basic and obvious question. For example: > > > data(sleep) > > t.test(extra~group,data=sleep,var.equal=T) > > # the p.value returned is 0.07939 > > > 2*pt(-1.8608,18) # using the t.statistic and the df returned above > [1] 0.0791887 > > These p.values are the same. 
However, they are different when I use a different dataset: > > > data(beavers) > > b1 <- beaver1$temp > > b2 <- beaver2$temp > > t.test(b1,b2,var.equal=T) > > # the p.value returned is 2.2e-16 > > > 2*pt(-15.9366,212) # using the t.statistic and the df returned above > [1] 4.10686e-38 > > > Jonathan B. Thayn, Ph.D. > Illinois State University > Department of Geography and Geology > 200A Felmley Hall > Normal, Illinois 61790 > > (309) 438-8112 > jthayn at ilstu.edu > my.ilstu.edu/~jthayn > > > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Albyn Jones Reed College jones at reed.edu From jholtman at gmail.com Wed Nov 2 02:41:28 2011 From: jholtman at gmail.com (jim holtman) Date: Tue, 1 Nov 2011 21:41:28 -0400 Subject: [R] condition has length > 1 for LL denominator In-Reply-To: References: <1320180302428-3965365.post@n4.nabble.com> Message-ID: will this do it: > x <- read.table(textConnection("arrive depart intercept + 1 1 1 + 1 2 1 + 1 3 1 + 1 2 2 + 1 3 2 + 1 3 3 + 2 2 2 + 2 3 2 + 3 3 3"), header = TRUE) > closeAllConnections() > denom <- lapply(split(x, x$intercept), function(.int){ + paste( + sprintf("exp(P_%d_%d)", .int$arrive, .int$depart) + , collapse = "+" + ) + }) > > denom $`1` [1] "exp(P_1_1)+exp(P_1_2)+exp(P_1_3)" $`2` [1] "exp(P_1_2)+exp(P_1_3)+exp(P_2_2)+exp(P_2_3)" $`3` [1] "exp(P_1_3)+exp(P_3_3)" On Tue, Nov 1, 2011 at 9:29 PM, David Winsemius wrote: > ?Posted to another thread a response to this posting ( and to all those who wanted R on an iPad, I say "forget it" > --------- > > The if function only takes an argument of length 1 (as the warning says): > > ?"if" > > Many such confusions are resolved by looking at : > > ?ifelse > > -- > David > On Nov 1, 2011, at 4:45 PM, 
"M. Tran" wrote: > >> I have a dataset called "results" that looks like this: >> >> arrive ?depart ?intercept >> ?1 ? ? ? ?1 ? ? ? ? ?1 >> ?1 ? ? ? ?2 ? ? ? ? ?1 >> ?1 ? ? ? ?3 ? ? ? ? ?1 >> ?1 ? ? ? ?2 ? ? ? ? ?2 >> ?1 ? ? ? ?3 ? ? ? ? ?2 >> ?1 ? ? ? ?3 ? ? ? ? ?3 >> ?2 ? ? ? ?2 ? ? ? ? ?2 >> ?2 ? ? ? ?3 ? ? ? ? ?2 >> ?3 ? ? ? ?3 ? ? ? ? ?3 >> >> where arrive is the period of arrival, depart is the period of departure, >> and intercept is the period in which that person was counted. ?I'm trying to >> construct the denominator for a likelihood function using the following >> function. ?For the first row in "results", for example, I want the >> denominator to be the sum of all possible arrive/depart combinations an >> interceptor in period 1 could observe: exp(P_1_1) + exp(P_1_2) + exp(P_1_3) >> (i.e. P_arrive_depart). >> >> get_denominator = function(intercept, periods_per_day) >> ? ?{ >> ? ?denominator ? ?= ? ?array("(", nrow(results)) >> ? ?for (arrival in 1:periods_per_day) >> ? ?{ >> ? ? ? ?for (departure in arrival:periods_per_day) >> ? ? ? ?{ >> ? ? ? ? ? ?while (arrival <= intercept & intercept <= departure) >> ? ? ? ? ? ?{ >> ? ? ? ?addition_to_denom ? ?= ? ?paste("P", arrival, departure, sep = "_") >> ? ? ? ? ? ? ? ?if (nchar(denominator) == 1) >> ? ? ? ? ? ? ? ?{ >> ? ? ? ?denominator ? ? ? ?= ? ?paste(denominator, "exp(", addition_to_denom, ")", sep = >> "") >> ? ? ? ? ? ? ? ?} >> ? ? ? ? ? ? ? ?else >> ? ? ? ? ? ? ? ?{ >> ? ? ? ?denominator ? ? ? ?= ? ?paste(denominator, " + exp(", addition_to_denom, ")", sep = >> "") >> ? ? ? ? ? ? ? ?} >> ? ? ? ? ? ?} >> ? ? ? ?} >> ? ?} >> ? ? ? ?denominator ? ? ? ?= ? ?paste(denominator, ")") >> ? ? ? ?return(denominator) >> ? ?} >> >> >> denominator ? ?= ? ?get_denominator(intercept ? ? ? ?= ? ?results[,"intercept"], >> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?periods_per_day ? ?= ? ?3) >> >> >> I'm getting the following warning message: >> >> In if (arrival <= intercept & intercept <= departure) { ... 
: >> ?the condition has length > 1 and only the first element will be used. >> >> As written, the code gives me the denominator for a period 1 interceptor for >> every single row! >> >> I'm having trouble figuring out how I should re-write this code. ?Any >> suggestions would be greatly appreciated. >> >> >> >> -- >> View this message in context: http://r.789695.n4.nabble.com/condition-has-length-1-for-LL-denominator-tp3965365p3965365.html >> Sent from the R help mailing list archive at Nabble.com. >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > ? ? ? ?[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it. From djmuser at gmail.com Wed Nov 2 02:57:35 2011 From: djmuser at gmail.com (Dennis Murphy) Date: Tue, 1 Nov 2011 18:57:35 -0700 Subject: [R] building a subscript programatically In-Reply-To: <20111101221442.GA28257@doriath.local> References: <20111101221442.GA28257@doriath.local> Message-ID: Here's a hack, but perhaps you might want to rethink what type of output you want. 
# Function:
g <- function(arr, lastSubscript = 1) {
    n <- length(dim(arr))
    commas <- paste(rep(',', n - 1), collapse = '')
    .call <- paste('arr[', commas, lastSubscript, ']', sep = '')
    eval(parse(text = .call))
}

# Examples:
a1 <- array(1:8, c(2, 2, 2))
a2 <- array(1:16, c(2, 2, 2, 2))
a3 <- array(1:32, c(2, 2, 2, 2, 2))
g(a1, 2)
g(a2, 2)
g(a3, 2)

Notice the subscripting in the last two examples - if you only want
one submatrix returned, then try this:

h <- function(arr, lastSubscript = c(1)) {
    n <- length(dim(arr))
    subs <- if(length(lastSubscript) > 1)
                paste(lastSubscript, collapse = ',') else lastSubscript
    .call <- paste('arr[,,', subs, ']', sep = '')
    eval(parse(text = .call))
}

h(a2, c(1, 1))
h(a3, c(2, 1, 1))

These functions have some ugly code, but I think they do what you
were looking for. Hopefully someone can devise a more elegant solution.

HTH,
Dennis

2011/11/1 Ernest Adrogué :
> Hi,
>
> On occasion, you need to subscript an array that has an arbitrary
> (i.e. not known in advance) number of dimensions. How do you deal with
> these situations?
> It appears that it is not possible to use a list as an index, for
> instance this fails:
>
>> x <- array(NA, c(2,2,2))
>> x[list(TRUE,TRUE,2)]
> Error in x[list(TRUE, TRUE, 2)] : invalid subscript type 'list'
>
> The only way I know is using do.call() but it's rather ugly. There
> must be a better way!!
>
>> do.call('[', c(list(x), TRUE, TRUE, 2))
>      [,1] [,2]
> [1,]   NA   NA
> [2,]   NA   NA
>
> Any idea?
>
> Regards,
> Ernest
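[Editorial sketch, not part of the original thread.] The eval(parse()) hack above and the do.call() call from the original question can be combined into a small helper that builds the index list from the number of dimensions. The name last_slice is hypothetical, and the sketch assumes the goal is "everything along the leading dimensions, index i along the last one":

```r
# Extract the slice(s) of `arr` selected by `i` along its last dimension,
# for an array with any number of dimensions.
# `last_slice` is a hypothetical helper name, not from the thread.
last_slice <- function(arr, i, drop = TRUE) {
    n <- length(dim(arr))
    # One TRUE per leading dimension selects everything along it
    idx <- c(rep(list(TRUE), n - 1), list(i))
    do.call(`[`, c(list(arr), idx, list(drop = drop)))
}

a1 <- array(1:8, c(2, 2, 2))
a3 <- array(1:32, c(2, 2, 2, 2, 2))

last_slice(a1, 2)        # same elements as a1[, , 2]
dim(last_slice(a3, 2))   # same as dim(a3[, , , , 2])
```

This avoids pasting code together as text, at the cost of still relying on do.call(), which the original poster found ugly.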
> From rolf.turner at xtra.co.nz Wed Nov 2 03:23:17 2011 From: rolf.turner at xtra.co.nz (Rolf Turner) Date: Wed, 02 Nov 2011 15:23:17 +1300 Subject: [R] building a subscript programatically In-Reply-To: <20111102004304.GB29053@doriath.local> References: <20111101221442.GA28257@doriath.local> <4EB08A61.405@xtra.co.nz> <20111102004304.GB29053@doriath.local> Message-ID: <4EB0A995.6030803@xtra.co.nz> On 02/11/11 13:43, Ernest Adrogu? wrote: > Sorry for not stating my problem in a more clear way. What I want is, > given an array of n dimensions, overwrite it by iteratating over its > "outermost" dimension... OK, in the previous example, I would like > to do > > x<- array(NA, c(2,2,2)) > for (i in 1:2) { > x[,,i]<- 0 > } > > As you can see, the index I used in the loop only works in the case of > three-dimensional arrays, if x was two dimensional I would have had to > write > > for (i in 1:2) { > x[,i]<- 0 > } Uhhhh, how does this differ from just setting *all* entries of x equal to 0, e.g.: x[] <- 0 ??? cheers, Rolf Turner From tonyandersn at gmail.com Wed Nov 2 01:32:40 2011 From: tonyandersn at gmail.com (Tony) Date: Tue, 1 Nov 2011 17:32:40 -0700 (PDT) Subject: [R] I really need help to merge two data frames Message-ID: <5419565e-b889-4f84-b014-ecffb7d62239@r7g2000vbg.googlegroups.com> Hello, I need help getting two data sets to merge. The structure of my two data sets are: > str(bcusip) 'data.frame': 1391 obs. of 3 variables: $ bond_id : Factor w/ 1391 levels "AAGH","AAGI",..: 1 2 3 4 5 6 $ Freq : num 41361 4126 5206 10125 45536 ... $ CUSIP_ID: Factor w/ 1391 levels "00184AAC9","00184AAF2",..: > str(bdescr) 'data.frame': 3674 obs. of 7 variables: $ bond_id : Factor w/ 3674 levels "AAGH ","AAGI ",..: $ Issuer.Name: Factor w/ 635 levels "3M CO ","ABBOTT LABORAT $ Coupon : num 6 6.75 6.5 5.95 5.55 5.9 5.72 5.87 6 6.75 ... 
$ Maturity : Factor w/ 1076 levels "1/1/2015","1/1/2016",..: $ Callable : Factor w/ 2 levels "No ","Yes ": 2 2 2 2 2 $ Moody.s : Factor w/ 20 levels "A1 ","A2 ","A3 ",..: 16 16 16 $ S.P : Factor w/ 22 levels "- ","A- ","A ",..: 15 15 15 15 1 > I am trying to attach the descriptive variables in the first data set to the sample variables in the second data set. My code worked in an example that I re-created from a tutorial, but it will not work on my data Here is my data code: ### bond description bdescr <-read.table(file="index3705.R.csv",header=TRUE,sep=",") bdescr <- bdescr[!duplicated(bdescr$bond_id),] ### bond cusip number bcusip = read.table(file="selected1526.R.csv",header=TRUE,sep=",") bcusip <- bcusip[!duplicated(bcusip$bond_id),] bcusip$Freq = as.numeric(bcusip$Freq) And here is my attempt to merge: (I tried a few) merge (bdescr,bcusip,by="bond_id",all=TRUE) merge (bdescr,bcusip,by="bond_id") merge (bdescr,bcusip) superfile <- merge(bdescr,bcusip,by="bond_id",all=TRUE) Thank you for any help. I am new and going crazy at the moment. Sincerely, Tony From eadrogue at gmx.net Wed Nov 2 01:30:50 2011 From: eadrogue at gmx.net (Ernest =?iso-8859-1?Q?Adrogu=E9?=) Date: Wed, 2 Nov 2011 01:30:50 +0100 Subject: [R] building a subscript programatically In-Reply-To: References: <20111101221442.GA28257@doriath.local> Message-ID: <20111102003050.GA29053@doriath.local> 1/11/11 @ 20:22 (-0400), Comcast escriu: > Leaving the indices empty should give you what I'm guessing you want/expect. > > x[,,2] #. TRUE would also work, just not in a list. Exactly, but this only works if x has three dimensions. What I want is x[,,2] if x has three dimensions, x[,,,2] if it has four, and so forth. I cannot hard code [,,2] because I do not know how many dimensions x will have, instead the subscript has to be built "on the fly". 
Ernest From o.mannion at auckland.ac.nz Wed Nov 2 01:31:14 2011 From: o.mannion at auckland.ac.nz (Oliver Mannion (COMPASS)) Date: Wed, 2 Nov 2011 00:31:14 +0000 Subject: [R] Removing or ignoring package version for generic function in locked environment Message-ID: <5CE629D08E0E904F8645FAEFD2AEB5950107DE@ARTSMAIL8.ARTSNET.AUCKLAND.AC.NZ> Hi, I use the epicalc package which provides the function aggregate.numeric. Unfortunately aggregate.numeric produces warnings when aggregate is used by functions not under my control on a numeric value. If I don't load epicalc, aggregate.default is used instead by these functions and does not produce any warning. However I need epicalc. So to get around this, what I would do is firstly remove aggregate.numeric: rm(aggregate.numeric, pos=which(search() == "package:epicalc")) This worked fine in R 2.13.1. However in R 2.14.0 I am getting the following: Error in rm(aggregate.numeric, pos = which(search() == "package:epicalc")) : cannot remove bindings from a locked environment Is there some way I can remove aggregate.numeric, or otherwise prevent it from being used? Thanks in advance, Oliver From eadrogue at gmx.net Wed Nov 2 01:43:04 2011 From: eadrogue at gmx.net (Ernest =?iso-8859-1?Q?Adrogu=E9?=) Date: Wed, 2 Nov 2011 01:43:04 +0100 Subject: [R] building a subscript programatically In-Reply-To: <4EB08A61.405@xtra.co.nz> References: <20111101221442.GA28257@doriath.local> <4EB08A61.405@xtra.co.nz> Message-ID: <20111102004304.GB29053@doriath.local> 2/11/11 @ 13:10 (+1300), Rolf Turner escriu: > On 02/11/11 11:14, Ernest Adrogu? wrote: > >Hi, > > > >On ocasion, you need to subscript an array that has an arbitrary > >(ie. not known in advance) number of dimensions. How do you deal with > >these situations? 
> >It appears that it is not possible use a list as an index, for > >instance this fails: > > > >>x<- array(NA, c(2,2,2)) > >>x[list(TRUE,TRUE,2)] > >Error in x[list(TRUE, TRUE, 2)] : invalid subscript type 'list' > > > >The only way I know is using do.call() but it's rather ugly. There > >must be a better way!! > > > >>do.call('[', c(list(x), TRUE, TRUE, 2)) > > [,1] [,2] > >[1,] NA NA > >[2,] NA NA > > > >Any idea? > > It's possible that matrix subscripting might help you. E.g.: > > a <- array(1:60,dim=c(3,4,5)) > m <- matrix(c(1,1,1,2,2,2,3,4,5,1,2,5),byrow=TRUE,ncol=3) > a[m] > [1] 1 17 60 52 > > You can build "m" to have the same number of columns as your array > has dimensions. > > It's not clear to me what result you want in your example. Sorry for not stating my problem in a more clear way. What I want is, given an array of n dimensions, overwrite it by iteratating over its "outermost" dimension... OK, in the previous example, I would like to do x <- array(NA, c(2,2,2)) for (i in 1:2) { x[,,i] <- 0 } As you can see, the index I used in the loop only works in the case of three-dimensional arrays, if x was two dimensional I would have had to write for (i in 1:2) { x[,i] <- 0 } So, when the dimensions of x are not known in advance, how would you write such a loop? Your solution of using a matrix might work (I haven't been able to check it yet). Cheers, Ernest From erinm.hodgess at gmail.com Wed Nov 2 03:42:42 2011 From: erinm.hodgess at gmail.com (Erin Hodgess) Date: Tue, 1 Nov 2011 21:42:42 -0500 Subject: [R] using a file name in a system call Message-ID: Dear R People: I have a variable named file1 which contains the name of a file. I would like to copy that file to a different directory. Can I do that via the system command or is there a better way, please? 
Thanks, Erin -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodgess at gmail.com From erinm.hodgess at gmail.com Wed Nov 2 03:45:37 2011 From: erinm.hodgess at gmail.com (Erin Hodgess) Date: Tue, 1 Nov 2011 21:45:37 -0500 Subject: [R] Using a file name in a system call Message-ID: Never mind...I used paste and all is well. sorry for the trouble. -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodgess at gmail.com From hb at biostat.ucsf.edu Wed Nov 2 04:10:47 2011 From: hb at biostat.ucsf.edu (Henrik Bengtsson) Date: Tue, 1 Nov 2011 20:10:47 -0700 Subject: [R] building a subscript programatically In-Reply-To: <20111101221442.GA28257@doriath.local> References: <20111101221442.GA28257@doriath.local> Message-ID: 2011/11/1 Ernest Adrogu? : > Hi, > > On ocasion, you need to subscript an array that has an arbitrary > (ie. not known in advance) number of dimensions. How do you deal with > these situations? > It appears that it is not possible use a list as an index, for > instance this fails: > >> x <- array(NA, c(2,2,2)) >> x[list(TRUE,TRUE,2)] > Error in x[list(TRUE, TRUE, 2)] : invalid subscript type 'list' > > The only way I know is using do.call() but it's rather ugly. There > must be a better way!! > >> do.call('[', c(list(x), TRUE, TRUE, 2)) > ? ? [,1] [,2] > [1,] ? NA ? NA > [2,] ? NA ? NA > > Any idea? > library("R.utils") > x <- array(1:8, dim=c(2,2,2)) > x , , 1 [,1] [,2] [1,] 1 3 [2,] 2 4 , , 2 [,1] [,2] [1,] 5 7 [2,] 6 8 > extract(x, "3"=2) , , 1 [,1] [,2] [1,] 5 7 [2,] 6 8 For more details/examples, see help("extract.array", package="R.utils"). It doesn't to assignments, but could be achieved by: idxs <- array(seq(along=x), dim=dim(x)) idxs <- extract(idxs, "3"=2) x[idxs] <- ... base::arrayInd() may also be useful. 
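[Editorial sketch, not part of the original thread.] For the assignment case, one base-R possibility is slice.index(), which returns an integer array of the same shape as x holding each element's index along a chosen dimension. The helper name fill_by_last_dim is hypothetical, and the sketch assumes one replacement value per slice of the last dimension:

```r
# Overwrite an array slice by slice along its last dimension,
# without hard-coding the number of commas in the subscript.
# `fill_by_last_dim` is a hypothetical helper name.
fill_by_last_dim <- function(x, values) {
    n <- length(dim(x))
    # For every element of x, its index along dimension n
    last <- slice.index(x, n)
    for (i in seq_len(dim(x)[n])) {
        x[last == i] <- values[i]
    }
    x
}

x3 <- fill_by_last_dim(array(NA_real_, c(2, 2, 2)), c(10, 20))
x2 <- fill_by_last_dim(array(NA_real_, c(2, 3)), c(1, 2, 3))
x3[, , 2]   # every entry of the second slice is 20
x2[, 3]     # every entry of the third column is 3
```

The same function body handles the two-dimensional and the three-dimensional case, which is the situation the loop in the question could not cover.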
/Henrik > > Regards, > Ernest > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From rolf.turner at xtra.co.nz Wed Nov 2 04:16:56 2011 From: rolf.turner at xtra.co.nz (Rolf Turner) Date: Wed, 2 Nov 2011 16:16:56 +1300 Subject: [R] Using a file name in a system call In-Reply-To: References: Message-ID: <4EB0B628.10904@xtra.co.nz> On 02/11/11 15:45, Erin Hodgess wrote: > Never mind...I used paste and all is well. > > sorry for the trouble. You *might* want to consider using file.rename() rather than system(). Dunno what the pro's and con's might be. Still have to use paste(), but! :-) cheers, Rolf From michael.weylandt at gmail.com Wed Nov 2 04:38:40 2011 From: michael.weylandt at gmail.com (R. Michael Weylandt) Date: Tue, 1 Nov 2011 23:38:40 -0400 Subject: [R] Export to .txt In-Reply-To: <1320185704350-3965699.post@n4.nabble.com> References: <1320185704350-3965699.post@n4.nabble.com> Message-ID: I'm somewhat confused on how you intend to use browser() after rerouting output to a sink.... Beyond that, how are you running your script? source() has some arguments that encourage it to "say" alot more to the sink() command. Michael On Tue, Nov 1, 2011 at 6:15 PM, stat.kk wrote: > Hi, > > I would like to export all my workspace (even with the evaluation of > commands) to the text file. I know about the sink() function but it doesnt > work as I would like. My R-function looks like this: there are instructions > for user displayed by cat() command and browser() commands for fulfilling > them. While using the sink() command the instructions dont display :( > Can anyone help me with a equivalent command to File - Save to file... > option? > > Thank you very much. 
> > -- > View this message in context: http://r.789695.n4.nabble.com/Export-to-txt-tp3965699p3965699.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From bbolker at gmail.com Wed Nov 2 04:40:30 2011 From: bbolker at gmail.com (Ben Bolker) Date: Wed, 2 Nov 2011 03:40:30 +0000 Subject: [R] predict lmer References: Message-ID: Natalia Vizca?no Palomar gmail.com> writes: > I've been reading for many days trying to predict with lmer but I haven't > managed to do it. > I've fitted an allometric model for trees where I have included climatic > variables and diameter in the fixed part and > in the random part I've included the experimental sites where trees are and > also their provenance region. [snip] > f431<-lmer(log(H05)~log(DN05)*(PwS+ PoS+ PpS+ > TpS)+PoP:log(DN05)+PwP:log(DN05)+(log(DN05)-1|P)+ (log(DN05)-1|S/B)+(log(DN05)|SP), > data=data) [snip snip] Have you looked at the code on http://glmm.wikidot.com/faq about this ... ? If you look at it and still can't work it out I would suggest that requesting help on the r-sig-mixed-models list will be more useful ... Ben Bolker From jwiley.psych at gmail.com Wed Nov 2 05:11:36 2011 From: jwiley.psych at gmail.com (Joshua Wiley) Date: Tue, 1 Nov 2011 21:11:36 -0700 Subject: [R] using a file name in a system call In-Reply-To: References: Message-ID: system is one option, as it turns out, this is common enough that there are special functions for it, see ?file.copy More generally, this is a great use of apropos: apropos("file") Shows a lot of options. An alternative work flow would be a shell script that runs r scripts and also interacts with the system (depending how much system interaction is needed). 
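[Editorial sketch, not part of the original thread.] A minimal illustration of the file.copy() route, using hypothetical file and directory names created under tempdir() so that it can be run anywhere:

```r
# Hypothetical source and destination directories in a temporary location
src_dir  <- file.path(tempdir(), "src")
dest_dir <- file.path(tempdir(), "dest")
dir.create(src_dir, showWarnings = FALSE)
dir.create(dest_dir, showWarnings = FALSE)

# `file1` holds the name of the file, as in the question
file1 <- file.path(src_dir, "example.txt")
writeLines("some content", file1)

# When `to` is an existing directory the file keeps its base name.
# file.copy() returns TRUE for each file copied successfully.
ok <- file.copy(from = file1, to = dest_dir, overwrite = TRUE)
ok
file.exists(file.path(dest_dir, basename(file1)))
```

Compared with pasting together a system() call, this stays portable across operating systems and needs no shell quoting of the file name.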
Cheers, Josh On Nov 1, 2011, at 19:42, Erin Hodgess wrote: > Dear R People: > > I have a variable named file1 which contains the name of a file. I > would like to copy that file to a different directory. Can I do that > via the system command or is there a better way, please? > > Thanks, > Erin > > > -- > Erin Hodgess > Associate Professor > Department of Computer and Mathematical Sciences > University of Houston - Downtown > mailto: erinm.hodgess at gmail.com > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From bps0002 at auburn.edu Wed Nov 2 05:53:07 2011 From: bps0002 at auburn.edu (B77S) Date: Tue, 1 Nov 2011 21:53:07 -0700 (PDT) Subject: [R] Subsampling-oversampling from a data frame In-Reply-To: <1320190655258-3965913.post@n4.nabble.com> References: <1320187546520-3965771.post@n4.nabble.com> <1320188808068-3965827.post@n4.nabble.com> <1320190655258-3965913.post@n4.nabble.com> Message-ID: <1320209587816-3971840.post@n4.nabble.com> # Perhaps I misunderstand your original need, but.... 
## I added a few lines to your data and used dput() to get the below data (I named "df")

df <- structure(list(
    age = c(15L, 20L, 15L, 10L, 10L, 12L, 17L, 17L, 11L, 12L, 16L,
            20L, 23L, 14L, 22L, 16L, 10L, 11L, 21L, 10L, 13L, 17L),
    sex = structure(c(2L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 2L,
                      1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 1L),
                    .Label = c("f", "m"), class = "factor"),
    class = structure(c(2L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 2L,
                        1L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 1L),
                      .Label = c("high", "low"), class = "factor")),
    .Names = c("age", "sex", "class"), class = "data.frame",
    row.names = c(NA, -22L))

## the following line uses which(), sample(), and rbind(), along with some
## indexing to get a new dataframe; see ?which, ?sample, and ?rbind for more info
# For the "indexing", play with it, ... type in df[1:3,1:2] as an example

new_df <- rbind(df[sample(which(df$class=="low"), 4),],
                df[sample(which(df$class=="high"), 4),])

Now replace 4 with the size of each class you want.

hgwelec wrote:
>
> Thank you for your answer.
>
> The problem is that i am learning R now, so i do not know how i could do
> this.
>
>
> I have found the following code but it does not work unfortunately
> (=create distribution 0.1 "low" class - 0.9 high) :
>
>
>
> data[c(rownames(data.df[data.df$class=="high",]),
> sample(rownames(data[data.df$class=="low"]), 0.1)) , ]
>

Dear members,

Consider the following data frame (first 4 rows shown)

age  sex  class
15   m    low
20   f    high
15   f    low
10   m    low

in my original data set i have 1200 rows and a class distribution of low=0.3 and high=0.7

My question : how can i create a new data frame as the one shown above but with the 'high' class subsampled so that in the new data frame the class distribution is low=0.5 and high=0.5?
I tried looking at the sample function and prob option but all examples i seen do not use an imbalanced class problem as the one shown above Thank you in advance Thank you in advance -- View this message in context: http://r.789695.n4.nabble.com/Subsampling-oversampling-from-a-data-frame-tp3965771p3971840.html Sent from the R help mailing list archive at Nabble.com. From stat.kk at gmail.com Wed Nov 2 07:14:01 2011 From: stat.kk at gmail.com (stat.kk) Date: Tue, 1 Nov 2011 23:14:01 -0700 (PDT) Subject: [R] Export to .txt In-Reply-To: References: <1320185704350-3965699.post@n4.nabble.com> Message-ID: <1320214441004-3971924.post@n4.nabble.com> Oh, Im sorry. My file isnt a function but script 'script.R' which looks something like that: cat('Instruction no 1', '\n') browser() # place for fulfilling it cat('Instruction no 2', '\n') browser() # place for fulfilling it etc. I am running it by sink(file='output.txt') source('script.R') sink(NULL) but it doesnt work as I would like. I cant see the output also via saving workaspace into .Rhistory file. The goal I would like to achieve is the same file as via File - Save to file... option - but I work in command line. -- View this message in context: http://r.789695.n4.nabble.com/Export-to-txt-tp3965699p3971924.html Sent from the R help mailing list archive at Nabble.com. 
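[Editorial sketch, not part of the original thread.] One way to read the hint about source() arguments is echo = TRUE, which prints each expression before evaluating it, combined with sink(split = TRUE), which writes to the file while still showing output on the console. The script below is a hypothetical stand-in written to tempdir(); the browser() calls from the original script are left out because they need an interactive session:

```r
# Hypothetical stand-in for 'script.R', without the interactive browser() calls
script <- file.path(tempdir(), "script.R")
writeLines(c("cat('Instruction no 1', '\\n')",
             "x <- 1 + 1",
             "x"),
           script)

out <- file.path(tempdir(), "output.txt")

# split = TRUE sends output to the file and to the console at the same time
sink(out, split = TRUE)
# echo = TRUE prints each command together with its result
source(script, echo = TRUE, max.deparse.length = Inf)
sink()

log_lines <- readLines(out)
log_lines
```

Whether this matches the "File - Save to file..." menu output exactly is not something the sketch guarantees; it only captures the commands and printed results of the sourced script.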
From shirley0818 at gmail.com Tue Nov 1 21:51:08 2011 From: shirley0818 at gmail.com (shirley zhang) Date: Tue, 1 Nov 2011 16:51:08 -0400 Subject: [R] where to get chr_rpts file for dbSNP human 36.3 assembly Message-ID: Dear list, In terms of dbSNP database in NCBI, I can get the chr_rpts files for the most recent 37.3 assembly from the following FTP site, ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606/chr_rpts/ My question is how/where I can get these chr_rpts files based on the 36.3 assembly Thanks, Shirley From zb.picb at gmail.com Wed Nov 2 06:43:46 2011 From: zb.picb at gmail.com (christear) Date: Tue, 1 Nov 2011 22:43:46 -0700 (PDT) Subject: [R] 'tcltk' does not have a name space In-Reply-To: References: Message-ID: <1320212626133-3971898.post@n4.nabble.com> It also have this problem when I install qvalue package ... -- View this message in context: http://r.789695.n4.nabble.com/tcltk-does-not-have-a-name-space-tp3020504p3971898.html Sent from the R help mailing list archive at Nabble.com. From jwiley.psych at gmail.com Wed Nov 2 08:34:10 2011 From: jwiley.psych at gmail.com (Joshua Wiley) Date: Wed, 2 Nov 2011 00:34:10 -0700 Subject: [R] Removing or ignoring package version for generic function in locked environment In-Reply-To: <5CE629D08E0E904F8645FAEFD2AEB5950107DE@ARTSMAIL8.ARTSNET.AUCKLAND.AC.NZ> References: <5CE629D08E0E904F8645FAEFD2AEB5950107DE@ARTSMAIL8.ARTSNET.AUCKLAND.AC.NZ> Message-ID: <97166F0E-2E27-44F3-A79D-E6B5681C7556@gmail.com> Interesting. I have a few (untested) thoughts. Before I get into those though, this seem to me like a case where contacting either the maintainer of epicalc or of the functions not under your control that give warnings. I think either would be appropriate because if the default method works correctly with no error, I really do not think aggregate.numeric should give a warning. The onus seems somewhat on the writer of methods for classes as common as numeric to write something that works. 
That said, without any idea what it is being used on there are endless possibilities for why a warning is being generated. But supposing neither of those are options, here are some ideas. 1) if you happen to be using this in your own package, try just importing aggregate numeric, rather than fully loading the epicalc package. 2) create a method that mimics aggregate but is for numeric, and make sure it is in an environment between the out of control functions and epicalc so it is called rather than epicalcs version. 3) use epicalc and then unload it rather than just removing that function (may not fly) 4) copy the epicalc aggregate numeric and just use that code and never load the package 5) you may be able to unlock() the name space so you can remove the methods (I think the function is unlock but there may be caps somewhere in there) 6) if the offending functions have a class that is not numeric but inherits from numeric, you could write a method for their particular class that would then supersede the inherited numeric method All of these are highly unsatisfactory in one way another. I am not in a position to test anything out at the moment (iPhone, well that's not true, I could ssh to my cluster, start r there and try via the terminal but that is truly painful on a phones keyboard) Good luck, Josh On Nov 1, 2011, at 17:31, "Oliver Mannion (COMPASS)" wrote: > Hi, > > I use the epicalc package which provides the function aggregate.numeric. > > Unfortunately aggregate.numeric produces warnings when aggregate is used by functions not under my control on a numeric value. If I don't load epicalc, aggregate.default is used instead by these functions and does not produce any warning. > > However I need epicalc. So to get around this, what I would do is firstly remove aggregate.numeric: > > rm(aggregate.numeric, pos=which(search() == "package:epicalc")) > > This worked fine in R 2.13.1. 
However in R 2.14.0 I am getting the following: > > Error in rm(aggregate.numeric, pos = which(search() == "package:epicalc")) : > cannot remove bindings from a locked environment > > Is there some way I can remove aggregate.numeric, or otherwise prevent it from being used? > > Thanks in advance, > > Oliver > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From ligges at statistik.tu-dortmund.de Wed Nov 2 09:47:13 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Wed, 02 Nov 2011 09:47:13 +0100 Subject: [R] 'tcltk' does not have a name space In-Reply-To: <1320212626133-3971898.post@n4.nabble.com> References: <1320212626133-3971898.post@n4.nabble.com> Message-ID: <4EB10391.5010901@statistik.tu-dortmund.de> Please quote the original message you are replying to and read the posting guide! Please update your version of R to R-2.14.0. Looks like the package assumes a more recent version of R without declaring the dependency. You may want to inform the package maintainer. Uwe Ligges On 02.11.2011 06:43, christear wrote: > It also have this problem when I install qvalue package ... > > -- > View this message in context: http://r.789695.n4.nabble.com/tcltk-does-not-have-a-name-space-tp3020504p3971898.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
From ligges at statistik.tu-dortmund.de Wed Nov 2 09:50:31 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Wed, 02 Nov 2011 09:50:31 +0100 Subject: [R] I really need help to merge two data frames In-Reply-To: <5419565e-b889-4f84-b014-ecffb7d62239@r7g2000vbg.googlegroups.com> References: <5419565e-b889-4f84-b014-ecffb7d62239@r7g2000vbg.googlegroups.com> Message-ID: <4EB10457.1000008@statistik.tu-dortmund.de> On 02.11.2011 01:32, Tony wrote: > Hello, I need help getting two data sets to merge. The structure of > my > two data sets are: > >> str(bcusip) > 'data.frame': 1391 obs. of 3 variables: > $ bond_id : Factor w/ 1391 levels "AAGH","AAGI",..: 1 2 3 4 5 6 > $ Freq : num 41361 4126 5206 10125 45536 ... > $ CUSIP_ID: Factor w/ 1391 levels "00184AAC9","00184AAF2",..: > >> str(bdescr) > 'data.frame': 3674 obs. of 7 variables: > $ bond_id : Factor w/ 3674 levels "AAGH ","AAGI ",..: > $ Issuer.Name: Factor w/ 635 levels "3M CO ","ABBOTT LABORAT > $ Coupon : num 6 6.75 6.5 5.95 5.55 5.9 5.72 5.87 6 6.75 ... > $ Maturity : Factor w/ 1076 levels "1/1/2015","1/1/2016",..: > $ Callable : Factor w/ 2 levels "No ","Yes ": 2 2 2 2 2 > $ Moody.s : Factor w/ 20 levels "A1 ","A2 ","A3 ",..: 16 16 16 > $ S.P : Factor w/ 22 levels "- ","A- ","A ",..: 15 15 15 15 1 Look at the levels above, in "bcusip" ist is "AAGH", on "bdescr" it is "AAGH " (note the blanks! etc. Uwe Ligges > I am trying to attach the descriptive variables in the first data set > to the > sample variables in the second data set. 
> My code worked in an example that I re-created from a tutorial, but > it > will not work on my data > Here is my data code: > > ### bond description > bdescr<-read.table(file="index3705.R.csv",header=TRUE,sep=",") > bdescr<- bdescr[!duplicated(bdescr$bond_id),] > > ### bond cusip number > bcusip = read.table(file="selected1526.R.csv",header=TRUE,sep=",") > bcusip<- bcusip[!duplicated(bcusip$bond_id),] > bcusip$Freq = as.numeric(bcusip$Freq) > > And here is my attempt to merge: (I tried a few) > > merge (bdescr,bcusip,by="bond_id",all=TRUE) > merge (bdescr,bcusip,by="bond_id") > merge (bdescr,bcusip) > superfile<- merge(bdescr,bcusip,by="bond_id",all=TRUE) > > Thank you for any help. I am new and going crazy at the moment. > Sincerely, > Tony > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From john.maindonald at anu.edu.au Wed Nov 2 10:16:47 2011 From: john.maindonald at anu.edu.au (John Maindonald) Date: Wed, 2 Nov 2011 20:16:47 +1100 Subject: [R] RC33 8th Int Conf on Social Science Methodology -- The R System ... Message-ID: I wish to draw attention to an R-related session that is planned for the RC33 Eighth International Conference on Social Science Methodology, to be held over July 9 - July 13 2012, at the University of Sydney. " The focus of the conference is on innovations and current best practice in all aspects of social science research methodology. It provides an opportunity to reflect on contemporary methods, as applied in a range of settings and disciplinary contexts, to hear about emerging methods, tools, techniques and technologies, and to discover what resources are available to social science researchers and users of research. 
" The title for the planned session is: "The R System as a Platform for Analysis and Development of Analysis Methodology" http://conference.acspri.org.au/index.php/rc33/2012/schedConf/trackPolicies John Maindonald email: john.maindonald at anu.edu.au phone : +61 2 (6125)3473 fax : +61 2(6125)5549 Centre for Mathematics & Its Applications, Room 1194, John Dedman Mathematical Sciences Building (Building 27) Australian National University, Canberra ACT 0200. http://www.maths.anu.edu.au/~johnm From erich.neuwirth at univie.ac.at Wed Nov 2 11:50:38 2011 From: erich.neuwirth at univie.ac.at (Erich Neuwirth) Date: Wed, 02 Nov 2011 11:50:38 +0100 Subject: [R] Size of windows graphics device Message-ID: <4EB1207E.7080305@univie.ac.at> R for Windows 2.14.0 Is there a function reporting the size of the current "windows" device after it has been resized manually? From saldanha.plangeo at gmail.com Wed Nov 2 11:56:29 2011 From: saldanha.plangeo at gmail.com (Raphael Saldanha) Date: Wed, 2 Nov 2011 08:56:29 -0200 Subject: [R] How to interpret Spearman Correlation In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From jim at bitwrit.com.au Wed Nov 2 12:23:39 2011 From: jim at bitwrit.com.au (Jim Lemon) Date: Wed, 02 Nov 2011 22:23:39 +1100 Subject: [R] triangles point left, filled? In-Reply-To: <239734346.246556.1320138339838.JavaMail.apache@mail21.abv.bg> References: <239734346.246556.1320138339838.JavaMail.apache@mail21.abv.bg> Message-ID: <4EB1283B.9080902@bitwrit.com.au> On 11/01/2011 08:05 PM, Martin Ivanov wrote: > Dear R users, > > I want to plot not only triangles point up and triangles point down, > which is easy using the "pch" argument to "points". I want to plot left and right pointing triangles as well. They must be fillable with colour. > > I browsed a little in the documentation, tried rotating the up and down pointing triangles, but of no avail. Any suggestions will be appreciated. 
>
Hi Martin,
Have a look at the "my.symbols" function in the TeachingDemos package.

Jim

From S.Ellison at LGCGroup.com Wed Nov 2 12:33:36 2011
From: S.Ellison at LGCGroup.com (S Ellison)
Date: Wed, 2 Nov 2011 11:33:36 +0000
Subject: [R] why the a[-indx] does not work?
In-Reply-To: <1320008969.30713.YahooMailNeo@web120115.mail.ne1.yahoo.com>
References: <1320000722.64455.YahooMailNeo@web120111.mail.ne1.yahoo.com>
 <1320004207.63930.YahooMailNeo@web120110.mail.ne1.yahoo.com>
 <1320007196.66545.YahooMailNeo@web120114.mail.ne1.yahoo.com>
 <1320008969.30713.YahooMailNeo@web120115.mail.ne1.yahoo.com>
Message-ID:

> -----Original Message-----
> From:Alaios
> Sent: 30 October 2011 21:09
> To: William Dunlap; andrija djurovic
> Cc: R-help at r-project.org
> Subject: Re: [R] why the a[-indx] does not work?
>
> What is the difference between though
>
> !numericVector==0 and
>
> -numericVector==0
>

Er... you need to be (a lot) more careful with operator precedence. See ?Syntax for operator precedence.

-numericVector==0 will usually* give the same answer as numericVector==0 because unary minus has higher precedence than ==, so this is read implicitly as (-numericVector)==0. -1 and 1 are still both nonzero, while -0 and 0 are both still zero.
( *'usually' because you may be comparing a double precision nearly-zero with another double precision nearly-zero, and that is _always_ asking for trouble.)

!numericVector==0 behaves quite differently because unary negation (!, or NOT) has _lower_ precedence than ==, so this one is read as !(numericVector==0)

Operator precedence rules for programmers:
Rule 1: If in doubt about operator precedence, use parentheses
Rule 2: Always have doubts about operator precedence unless you have looked it up for _that_ version of _that_ language _that day_.
Rule 3: Check the operator precedence of parentheses.

S Ellison

*******************************************************************
This email and any attachments are confidential.
Any use...{{dropped:8}} From dejian.zhao at gmail.com Wed Nov 2 13:01:04 2011 From: dejian.zhao at gmail.com (Dejian Zhao) Date: Wed, 02 Nov 2011 20:01:04 +0800 Subject: [R] Size of windows graphics device In-Reply-To: <4EB1207E.7080305@univie.ac.at> References: <4EB1207E.7080305@univie.ac.at> Message-ID: <4EB13100.4010608@gmail.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From S.Ellison at LGCGroup.com Wed Nov 2 13:09:19 2011 From: S.Ellison at LGCGroup.com (S Ellison) Date: Wed, 2 Nov 2011 12:09:19 +0000 Subject: [R] Size of windows graphics device In-Reply-To: <4EB13100.4010608@gmail.com> References: <4EB1207E.7080305@univie.ac.at> <4EB13100.4010608@gmail.com> Message-ID: > -----Original Message----- > From: r-help-bounces at r-project.org > [mailto:r-help-bounces at r-project.org] On Behalf Of Dejian Zhao > Sent: 02 November 2011 12:01 > To: r-help at r-project.org > Subject: Re: [R] Size of windows graphics device > > par("fin") : The figure region dimensions, |(width,height)|, > in inches. > par("din") : the device dimensions, |(width,height)|, in inches. ... except between windows() and plot.new(). But reports accurately after plot.new(), plot() and the like. S Ellison ******************************************************************* This email and any attachments are confidential. Any use...{{dropped:8}} From f.harrell at vanderbilt.edu Wed Nov 2 13:23:19 2011 From: f.harrell at vanderbilt.edu (Frank Harrell) Date: Wed, 2 Nov 2011 05:23:19 -0700 (PDT) Subject: [R] How to interpret Spearman Correlation In-Reply-To: References: Message-ID: <1320236599396-3972797.post@n4.nabble.com> What David was getting at is that you interpreted the P-value as one minus the P-value, not a safe practice. There is also some question about whether it would have been better to recommend a good book or course. 
Frank Raphael Saldanha wrote: > > Hi David, > > This is not private tutoring, just someone trying to help, and I'm sorry > for my distraction. > > On Tue, Nov 1, 2011 at 10:34 PM, David Winsemius <dwinsemius@>wrote: > >> Shahab; >> >> You would be well advised not to seek private tutoring from someone on >> the >> Internet who tells you that a p-value of 0.008736 is "not significant". >> >> >> >> On Nov 1, 2011, at 8:09 PM, Raphael Saldanha <saldanha.plangeo@> >> wrote: >> >> > Hi Shahab, >> > >> > This test shows that there is some positive statistical correlation, >> BUT >> > the p-value of the test - this is, the level of significance - shows >> that >> > the correlation is not statistically significant at 95% confidence >> level. >> > So, the correlation may be equal to zero. >> > >> > To understand this concepts in a good way, you need to be secure about >> > variance and hypothesis test. >> > >> > I can help you more if you need. Send me a direct mail (this list is >> for >> > doubts about R, not conceptual statistics). I will be happy to help you >> > with Statistics. >> > >> > My e-mail: saldanha.plangeo@ >> > >> > On Tue, Nov 1, 2011 at 8:58 PM, shahab <shahab.mokari@> wrote: >> > >> >> Hi, >> >> >> >> I am not really familiar with Correlation foundations, although I read >> >> a lot. So maybe if someone kindly help me to interpret the following >> >> results. 
>> >> I had the following R commands: >> >> >> >> correlation <-cor( vector_CitationProximity , vector_Impact, method = >> >> "spearman", use="na.or.complete") >> >> cor_test<-cor.test(vector_CitationProximity, vector_Impact, >> >> method="spearman") >> >> >> >> and the results are: >> >> "correlation" >> >> Correlation = 0.04715686 >> >> >> >> "cor_test" >> >> Spearman's rank correlation rho >> >> >> >> data: vector_CitationProximity and vector_Impact >> >> S = 5581032104, p-value = 0.008736 >> >> alternative hypothesis: true rho is not equal to 0 >> >> sample estimates: >> >> rho >> >> 0.04582115 >> >> >> >> >> >> So apparently, there is positive correlation between two given >> >> variables since Correlation = 0.04715686 > 0 >> >> However I couldn't interpret the significance ?' what does "rho" say? >> >> Is there any simple sample that I can read and try to understand? I am >> >> do confused in understanding how significance can be interpreted. >> >> >> >> Thanks, >> >> >> >> /Shahab >> >> >> >> ______________________________________________ >> >> R-help@ mailing list >> >> https://stat.ethz.ch/mailman/listinfo/r-help >> >> PLEASE do read the posting guide >> >> http://www.R-project.org/posting-guide.html >> >> and provide commented, minimal, self-contained, reproducible code. >> >> >> > >> > >> > >> > -- >> > Atenciosamente, >> > >> > Raphael Saldanha >> > saldanha.plangeo@ >> > >> > [[alternative HTML version deleted]] >> > >> > ______________________________________________ >> > R-help@ mailing list >> > https://stat.ethz.ch/mailman/listinfo/r-help >> > PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> > and provide commented, minimal, self-contained, reproducible code. 
>> > > > > -- > Atenciosamente, > > Raphael Saldanha > saldanha.plangeo@ > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help@ mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > ----- Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/How-to-interpret-Spearman-Correlation-tp3965809p3972797.html Sent from the R help mailing list archive at Nabble.com. From christian.langkamp at gmxpro.de Wed Nov 2 14:56:05 2011 From: christian.langkamp at gmxpro.de (Christian Langkamp) Date: Wed, 2 Nov 2011 14:56:05 +0100 Subject: [R] Generate a sequence of vectors of different length Message-ID: <000601cc9967$2aceaa10$806bfe30$@langkamp@gmxpro.de> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From d.rizopoulos at erasmusmc.nl Wed Nov 2 15:07:26 2011 From: d.rizopoulos at erasmusmc.nl (Dimitris Rizopoulos) Date: Wed, 02 Nov 2011 15:07:26 +0100 Subject: [R] Generate a sequence of vectors of different length In-Reply-To: <000601cc9967$2aceaa10$806bfe30$@langkamp@gmxpro.de> References: <000601cc9967$2aceaa10$806bfe30$@langkamp@gmxpro.de> Message-ID: <4EB14E9E.7070907@erasmusmc.nl> One approach is: sectors <- 2 namSec <- LETTERS[seq_len(sectors)] nSec <- round(3 / runif(sectors)) mapply(paste, namSec, sapply(nSec, seq_len), MoreArgs = list(sep = "")) I hope it helps. 
Best, Dimitris On 11/2/2011 2:56 PM, Christian Langkamp wrote: > Hi everyone > After the following setup > > sector=2 # Define Number of Sectors > > sectors=LETTERS[seq( from = 1, to = sector )] # Name sectors > > No_ent=round(3/runif(sector)) # Number of entities per sector > > #Tot_No_ent=sum(No_ent) > > > > Goal is to get a List like > > (A1, A2, A3, B1, B2, B3, B4) where A is denoting an industrial sector and > then a numbered sequence of companies within the sector. > > > > The step I am missing is how to generate a sequence of vectors (one for each > sector) with individual length being determined by No_ent. > > The goal is to generate a set of entities from different sectors. One simple > way out of it would be to set the number of entities equal per sector and > have a matrix, but I am quite sure it should also be possible for having a > different number of entities in each sector. > > > > Once this is done, I can bind them together as vector with > "as.vector(rbind(?))" (both as an (A,A,A,B,B,B,B) and (1,2,3,1,2,3,4) and > then concatenate) > > > > Thanks, Christian > > > > > > Trials included the following bits > > A=for (i in 1:sector){ > > rep(i,No_ent[i]) > > } > > paste(LETTERS[i], seq(from =1, to =No_ent[i]), sep = "") > > but I don't get the correct object definition right. > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
> -- Dimitris Rizopoulos Assistant Professor Department of Biostatistics Erasmus University Medical Center Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands Tel: +31/(0)10/7043478 Fax: +31/(0)10/7043014 Web: http://www.erasmusmc.nl/biostatistiek/ From dcarlson at tamu.edu Wed Nov 2 15:18:03 2011 From: dcarlson at tamu.edu (David L Carlson) Date: Wed, 2 Nov 2011 09:18:03 -0500 Subject: [R] Imputing Missing Data: A Good Starting Point? In-Reply-To: <119D0C77-CA40-47BD-BDE1-FF8267C0087E@gmail.com> References: <119D0C77-CA40-47BD-BDE1-FF8267C0087E@gmail.com> Message-ID: <003801cc996a$3bd500e0$b37f02a0$@edu> You might look at Allison's short book for a quick introduction to the issues: Paul D. Allison. 2002. Missing Data. Sage Quantitative Applications in the Social Sciences No. 136. Online there is http://www.multiple-imputation.com/ which provides a bibliography (with links to articles that are available online). Chapter 25 on Missing Data Imputation (Andrew Gelman and Jennifer Hill. 2006. Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press) which is available at http://lane.compbio.cmu.edu/courses/gelmanmissing.pdf provides examples of several approaches and provides R code for them. ---------------------------------------------- David L Carlson Associate Professor of Anthropology Texas A&M University College Station, TX 77843-4352 -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Ken Sent: Tuesday, November 01, 2011 4:48 PM To: Sascha Vieweg Cc: r-help at r-project.org Subject: Re: [R] Imputing Missing Data: A Good Starting Point? Hope this helps: http://rss.acs.unt.edu/Rdoc/library/randomForest/html/rfImpute.html Ken Hutchison On Nov 1, 2554 BE, at 5:29 PM, Sascha Vieweg wrote: > Hello > > I am working on my first attempt to impute missing data of a data set with systematically incomplete answers (school performance tests). 
I was googling around for some information and found Amelia (Honaker et al., 2010) and the mi package (Yu-Sung et al., n.d.). However, since I am new to this field, I was wondering whether some experts could give a good recommendation of a starting point for me, that is a point that combines theory as well as practical examples. Of course, My primary interest is to complete the task in time (1 week), however, I want to acquire skills for a program that provides some future, and of course I want some background on what I am doing (and what not). Could you help with some hints, experiences, and recommendations? > > Thank you. > > Regards > *S* > > -- > Sascha Vieweg, saschaview at gmail.com > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. From michael.weylandt at gmail.com Wed Nov 2 16:40:20 2011 From: michael.weylandt at gmail.com (R. Michael Weylandt) Date: Wed, 2 Nov 2011 11:40:20 -0400 Subject: [R] Export to .txt In-Reply-To: <1320214441004-3971924.post@n4.nabble.com> References: <1320185704350-3965699.post@n4.nabble.com> <1320214441004-3971924.post@n4.nabble.com> Message-ID: So playing around with it quickly, it seems that print() works with sink but cat() doesn't unless you put the sink call in the script at which point it does as you would expect. Does that help? Michael On Wed, Nov 2, 2011 at 2:14 AM, stat.kk wrote: > Oh, Im sorry. My file isnt a function but script 'script.R' which looks > something like that: > > cat('Instruction no 1', '\n') > browser() ? 
# place for fulfilling it > > cat('Instruction no 2', '\n') > browser() ? # place for fulfilling it > > etc. > > I am running it by > sink(file='output.txt') > source('script.R') > sink(NULL) > > but it doesnt work as I would like. I cant see the output also via saving > workaspace into .Rhistory file. The goal I would like to achieve is the same > file as via File - Save to file... option - but I work in command line. > > > > -- > View this message in context: http://r.789695.n4.nabble.com/Export-to-txt-tp3965699p3971924.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From rshepard at appl-ecosys.com Wed Nov 2 17:39:13 2011 From: rshepard at appl-ecosys.com (Rich Shepard) Date: Wed, 2 Nov 2011 09:39:13 -0700 (PDT) Subject: [R] Proper Syntax for Logical Subset in Subset() Message-ID: I have measured values for 47 chemicals in a stream. After processing the original data frame through reshape2, the recast data frame has this structure: 'data.frame': 256 obs. of 47 variables: $ site : Factor w/ 143 levels "BC-0.5","BC-1",..: 1 1 1 2 2 2 2 2 2 2 ... $ sampdate : Date, format: "1996-04-19" "1996-05-21" ... $ Acid : num NA NA NA NA NA NA NA NA NA NA ... $ Ag : num NA NA NA NA NA NA NA NA NA NA ... $ Al : num 0.07 NA NA NA NA NA NA NA NA NA ... $ Alk-HO : num NA NA NA NA NA NA NA NA NA NA ... $ Alk-Tot : num 162 152 212 NA NA NA NA NA NA NA ... $ As : num 0.01 NA NA 0 0 0 0 0.01 0 0.01 ... $ Ba : num 0.18 NA NA NA NA NA NA NA NA NA ... $ Be : num NA NA NA NA NA NA NA NA NA NA ... $ Bo : num NA NA NA NA NA NA NA NA NA NA ... $ CO3 : num NA NA NA NA NA NA NA NA NA NA ... $ Ca : num 76.6 NA NA NA NA ... $ Cd : num NA NA NA NA NA NA NA NA NA NA ... 
$ Cl : num 12 NA NA NA NA NA NA NA NA NA ... $ Cn : num NA NA NA NA NA NA NA NA NA NA ... $ Co : num NA NA NA NA NA NA NA NA NA NA ... $ Cond : num 712 403 731 NA NA NA NA NA NA NA ... $ Cr : num NA NA NA NA NA NA NA NA NA NA ... $ DO : num NA NA NA NA NA NA NA NA NA NA ... $ F : num NA NA NA NA NA NA NA NA NA NA ... $ Fe : num 0.06 NA NA NA NA NA NA NA NA NA ... $ Flow : num NA NA NA NA NA NA NA NA NA NA ... $ HCO3 : num 162 152 212 NA NA NA NA NA NA NA ... $ Hg : num 0 NA NA NA NA NA NA NA NA NA ... $ K : num 1.7 NA NA NA NA NA NA NA NA NA ... $ Mg : num 43.2 NA NA NA NA ... $ Mn : num NA NA NA NA NA NA NA NA NA NA ... $ NO2-N : num NA NA NA NA NA NA NA NA NA NA ... $ NO3-N : num NA 0.47 0.09 NA NA NA NA NA NA NA ... $ NO3-NO2-N: num 1.97 NA NA NA NA NA NA NA NA NA ... $ Na : num NA NA NA NA NA NA NA NA NA NA ... $ Ni : num NA NA NA NA NA NA NA NA NA NA ... $ OH : num NA NA NA NA NA NA NA NA NA NA ... $ P : num 0.03 NA NA NA NA NA NA NA NA NA ... $ Pb : num NA NA NA NA NA NA NA NA NA NA ... $ SO4 : num 175 57 194 NA NA NA NA NA NA NA ... $ Sb : num 0 NA NA NA NA NA NA NA NA NA ... $ Se : num 0.01 NA NA NA NA NA NA NA NA NA ... $ Si : num NA NA NA NA NA NA NA NA NA NA ... $ TDS : num 460 212 530 NA NA NA NA NA NA NA ... $ TSS : num NA 26 NA NA NA NA NA NA NA NA ... $ Temp : num NA NA NA NA NA NA NA NA NA NA ... $ Tl : num NA NA NA NA NA NA NA NA NA NA ... $ Turb : num 2.2 NA NA NA NA NA NA NA NA NA ... $ Zn : num 0.02 NA NA NA NA NA NA NA NA NA ... $ pH : num 8.12 8.19 8.46 NA NA NA NA NA NA NA ... I want a subset of this with only 7 chemicals: Ca, Cl, Cond, Mg, Na, SO4, and TDS. The subset help page tells me that I can use a logical subset to extract these 7 rows while keeping all columns, but I do not know how to write that logical subset. 
I tried emulating the example on the help page of avoiding the subset but R didn't like the '%in%' as I wrote it; putting the desired row names in a subset vector fails: burns.tds <- subset(burns.cast, subset(c('Ca', 'Cl', 'Cond', 'Mg', 'Na', 'SO4', 'TDS'))) Error in subset.default(c("Ca", "Cl", "Cond", "Mg", "Na", "SO4", "TDS")) : argument "subset" is missing, with no default What is the proper syntax to extract only these rows into a new data frame? And, is the recast data frame the appropriate format as the source? Rich From rshepard at appl-ecosys.com Wed Nov 2 17:46:54 2011 From: rshepard at appl-ecosys.com (Rich Shepard) Date: Wed, 2 Nov 2011 09:46:54 -0700 (PDT) Subject: [R] Proper Syntax for Logical Subset in Subset() In-Reply-To: References: Message-ID: On Wed, 2 Nov 2011, Rich Shepard wrote: > I want a subset of this with only 7 chemicals: Ca, Cl, Cond, Mg, Na, SO4, > and TDS. I should have also written that what I ultimately want is to create a box-and-whisker plot for these 7 chemicals in a single panel. If that can be done directly from the source data frame without creating another subset, I want to learn the syntax for that. Rich From Seeliger.Curt at epamail.epa.gov Wed Nov 2 17:57:03 2011 From: Seeliger.Curt at epamail.epa.gov (Seeliger.Curt at epamail.epa.gov) Date: Wed, 2 Nov 2011 09:57:03 -0700 Subject: [R] Proper Syntax for Logical Subset in Subset() In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From gunter.berton at gene.com Wed Nov 2 18:08:45 2011 From: gunter.berton at gene.com (Bert Gunter) Date: Wed, 2 Nov 2011 10:08:45 -0700 Subject: [R] Proper Syntax for Logical Subset in Subset() In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From gunter.berton at gene.com Wed Nov 2 18:12:04 2011 From: gunter.berton at gene.com (Bert Gunter) Date: Wed, 2 Nov 2011 10:12:04 -0700 Subject: [R] Proper Syntax for Logical Subset in Subset() In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From s.karmv at gmail.com Wed Nov 2 17:54:50 2011 From: s.karmv at gmail.com (Sl K) Date: Wed, 2 Nov 2011 12:54:50 -0400 Subject: [R] how to count number of occurrences Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From Stephen.Benigno at bgpa.wa.gov.au Wed Nov 2 10:02:49 2011 From: Stephen.Benigno at bgpa.wa.gov.au (Stephen Benigno) Date: Wed, 2 Nov 2011 17:02:49 +0800 Subject: [R] 2nd parameter Power Curves/ANCOVAs Message-ID: <5287BD27B7A1624C9853B852CB163E11019ED47B@bgpaexch03.bgpa.local> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From elena.guijarro at vi.ieo.es Wed Nov 2 10:30:47 2011 From: elena.guijarro at vi.ieo.es (Elena Guijarro) Date: Wed, 02 Nov 2011 10:30:47 +0100 Subject: [R] mapping bathymetries and species abundances Message-ID: <4EB10DC7.7080407@vi.ieo.es> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From bonda at hsu-hh.de Wed Nov 2 11:10:08 2011 From: bonda at hsu-hh.de (bonda) Date: Wed, 2 Nov 2011 03:10:08 -0700 (PDT) Subject: [R] nproc parameter in efpFunctional Message-ID: <1320228608589-3972419.post@n4.nabble.com> Hello all, could anyone explain the exact meaning of parameter nproc? Why different values of nproc give so different critical values, i.e. meanL2BB$computeCritval(0.05,nproc=3) [1] 0.9984853 meanL2BB$computeCritval(0.05,nproc=1) [1] 0.4594827 The strucchange-package description gives "integer specifying for which number of processes Brownian motions should be simulated" - do I need nproc-dimensional Brownian bridge? Thank you in advance! 
Julia -- View this message in context: http://r.789695.n4.nabble.com/nproc-parameter-in-efpFunctional-tp3972419p3972419.html Sent from the R help mailing list archive at Nabble.com. From bellard.celine at gmail.com Wed Nov 2 13:44:22 2011 From: bellard.celine at gmail.com (Celine) Date: Wed, 2 Nov 2011 05:44:22 -0700 (PDT) Subject: [R] Sum with condition Message-ID: <1320237862644-3972839.post@n4.nabble.com> I guess my problem is simple for most of you but I am new with R and I need some help, I have a dataframe : CELLCD AreaProtected 8928 52.39389 8928 41.91511 8929 21.21975 8929 63.65925 8930 26.08547 8930 14.04602 I wouldlike to sum the AreaProtected if it is the same CELLCD in another column : CELLCD AreaProtected SumAreaProtected 8928 52.39389 94.309 8928 41.91511 8929 21.21975 84,879 8929 63.65925 8930 26.08547 8930 14.04602 I am just started with R and I don't know how I can do that. Do you have any ideas ? Thanks a lot for your help, -- View this message in context: http://r.789695.n4.nabble.com/Sum-with-condition-tp3972839p3972839.html Sent from the R help mailing list archive at Nabble.com. From bellard.celine at gmail.com Wed Nov 2 14:06:33 2011 From: bellard.celine at gmail.com (Celine) Date: Wed, 2 Nov 2011 06:06:33 -0700 (PDT) Subject: [R] does there any function like sumif in excel? In-Reply-To: <972822.20441.qm@web38408.mail.mud.yahoo.com> References: <972822.20441.qm@web38408.mail.mud.yahoo.com> Message-ID: <1320239193254-3972963.post@n4.nabble.com> I have a similar problem I have a dataframe : CELLCD AreaProtected 8928 52.39389 8928 41.91511 8929 21.21975 8929 63.65925 8930 26.08547 8930 14.04602 I wouldlike to sum the AreaProtected if it is the same CELLCD in another column : CELLCD AreaProtected SumAreaProtected 8928 52.39389 94.309 8928 41.91511 8929 21.21975 84,879 8929 63.65925 8930 26.08547 8930 14.04602 I am just started with R and I don't know how I can do that. Do you have any ideas ? 
-- View this message in context: http://r.789695.n4.nabble.com/does-there-any-function-like-sumif-in-excel-tp858444p3972963.html Sent from the R help mailing list archive at Nabble.com. From aajit75 at yahoo.co.in Wed Nov 2 14:19:48 2011 From: aajit75 at yahoo.co.in (aajit75) Date: Wed, 2 Nov 2011 06:19:48 -0700 (PDT) Subject: [R] Creating deciles on data using one variable Message-ID: <1320239988595-3973086.post@n4.nabble.com> I need to deciles data containing more than one variables using any one variable. I am using script below : id <-c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20) tot <-c(1230, 1230, 2345, 3456, 456, 4356, 123, 124, 987, 785, 5646, 345, 2345, 3456, 456, 4356, 123, 124, 987, 785) data <- data.frame ( cbind(id , tot)) data$decile<-cut(data$tot,quantile(data$tot,(0:10)/10),include.lowest=TRUE,lable=TRUE) data$decile New variable "decile" taking values as below where as I need it should take values from 1,2..10, Where I am going wrong? data$decile [1] (987,1.23e+03] (987,1.23e+03] (1.23e+03,2.34e+03] [4] (2.34e+03,3.46e+03] (301,456] (3.46e+03,4.36e+03] [7] [123,124] (124,301] (785,987] [10] (456,785] (4.36e+03,5.65e+03] (301,456] [13] (1.23e+03,2.34e+03] (2.34e+03,3.46e+03] (301,456] [16] (3.46e+03,4.36e+03] [123,124] (124,301] [19] (785,987] (456,785] -Ajit -- View this message in context: http://r.789695.n4.nabble.com/Creating-deciles-on-data-using-one-variable-tp3973086p3973086.html Sent from the R help mailing list archive at Nabble.com. From fomcl at yahoo.com Wed Nov 2 14:52:04 2011 From: fomcl at yahoo.com (Albert-Jan Roskam) Date: Wed, 2 Nov 2011 06:52:04 -0700 (PDT) Subject: [R] overloading + operator for chars Message-ID: <1320241924.8401.YahooMailNeo@web110714.mail.gq1.yahoo.com> An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From dwinsemius at comcast.net Wed Nov 2 15:06:31 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Wed, 2 Nov 2011 07:06:31 -0700 (PDT) Subject: [R] Creating deciles on data using one variable In-Reply-To: <1320239988595-3973086.post@n4.nabble.com> References: <1320239988595-3973086.post@n4.nabble.com> Message-ID: <1320242791779-3973412.post@n4.nabble.com> I need to deciles data containing more than one variables using any one variable. I am using script below : id <-c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20) tot <-c(1230, 1230, 2345, 3456, 456, 4356, 123, 124, 987, 785, 5646, 345, 2345, 3456, 456, 4356, 123, 124, 987, 785) data <- data.frame ( cbind(id , tot)) data$decile<-cut(data$tot,quantile(data$tot,(0:10)/10),include.lowest=TRUE,lable=TRUE) data$decile New variable "decile" taking values as below where as I need it should take values from 1,2..10, Where I am going wrong? ----------------- You have a factor with labels, but if you use as.numeric(data$decile) you will get what you were aiming for. -- david -- View this message in context: http://r.789695.n4.nabble.com/Creating-deciles-on-data-using-one-variable-tp3973086p3973412.html Sent from the R help mailing list archive at Nabble.com. From hillerrv at gmail.com Wed Nov 2 17:12:17 2011 From: hillerrv at gmail.com (Rebecca Hiller) Date: Wed, 2 Nov 2011 17:12:17 +0100 Subject: [R] difference between foo$a[2] <- 1 and foo[2,"a"] <- 1 Message-ID: <6DF57D41-2621-4B37-AC0E-DC1325EAD172@gmail.com> Hallo Can anyone tell me the difference between foo$a[2] <- 1 and foo[2,"a"] <- 1 ? I thought that both expressions are equivalent, but when I run the following example, there is obviously a difference. 
> foo <- data.frame(a=NA,b=NA)
> foo
   a  b
1 NA NA
> foo$a[1] <- 1
> foo$b[1] <- 2
> foo$a[2] <- 1
Error in `$<-.data.frame`(`*tmp*`, "a", value = c(1, 1)) :
  replacement has 2 rows, data has 1
> foo[2,"a"] <- 1
> foo
  a  b
1 1  2
2 1 NA

Thanks, Rebecca Hiller

--
ETH Zürich
Rebecca Hiller
Institute of Agricultural Sciences
LFW A2
Universitätsstrasse 2
8092 Zürich
SWITZERLAND

rebecca.hiller at ipw.agrl.ethz.ch
http://www.gl.ethz.ch/

+41 44 632 31 90 Telefon
+41 44 632 11 53 Fax

From simone.salvadei at gmail.com Wed Nov 2 17:16:17 2011
From: simone.salvadei at gmail.com (Simone Salvadei)
Date: Wed, 2 Nov 2011 17:16:17 +0100
Subject: [R] array manipulation
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From Community at MMHein.at Wed Nov 2 18:14:15 2011
From: Community at MMHein.at (Community)
Date: Wed, 2 Nov 2011 18:14:15 +0100
Subject: [R] Time Series w/ daily or stochastic observation prediction
Message-ID: <94989673-B58A-4337-A55D-0B31CED43857@MMHein.at>

Hi,

I've got two different types of behaviors for which I have to predict the future development. One is a set of historical data with daily observations which follow some kind of seasonal pattern. The second is a set of historical data (measure points) with two observations occurring each week, each consisting of ten values in parallel (unique within one single observation), and which doesn't appear to follow any seasonal pattern.

The former one I've aggregated on a monthly level and calculated a prediction model based on Holt-Winters, which appears to fit pretty nicely when comparing the predicted history with the actual one, and now I have to accomplish the very same on a daily basis (including February 29th in case of a leap year).

For the latter one I have no clue at all how to prepare the data and run it with any potentially qualifying model.

Cheers,
Martin

From separent at yahoo.com Wed Nov 2 18:47:35 2011
From: separent at yahoo.com (Serge-Étienne Parent)
Date: Wed, 2 Nov 2011 13:47:35 -0400
Subject: [R] Multiple comparison test about whole population, not about mean
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From sarah.goslee at gmail.com Wed Nov 2 19:08:42 2011
From: sarah.goslee at gmail.com (Sarah Goslee)
Date: Wed, 2 Nov 2011 14:08:42 -0400
Subject: [R] how to count number of occurrences
In-Reply-To:
References:
Message-ID:

Hi,

On Wed, Nov 2, 2011 at 12:54 PM, Sl K wrote:
> Dear R users,
>
> I have this data frame,
>             y samp
> 8  0.03060419    X
> 18 0.06120838    Y
> 10 0.23588374    X
> 3  0.32809965    X
> 1  0.36007100    X
> 7  0.36730571    X
> 20 0.47176748    Y
> 13 0.65619929    Y
> 11 0.72014201    Y
> 17 0.73461142    Y
> 6  0.76221313    X
> 2  0.77005691    X
> 4  0.92477243    X
> 9  0.93837591    X
> 5  0.98883581    X
> 16 1.52442626    Y
> 12 1.54011381    Y
> 14 1.84954487    Y
> 19 1.87675183    Y
> 15 1.97767162    Y
>
> and I am trying to find the number of X's that occur before ith Y occurs.
> For example, there is 1 X before the first Y, so I get 1. There are 4 X's
> before the second Y, so I get 4, there is no X between second and third Y,
> so I get 0 and so on. Any hint to at least help me to start this will be
> appreciated. Thanks a lot!

Using dput() to provide reproducible data would be nice, but failing that here's a simple example with sample data:

> testdata <- c("x", "y", "x", "x", "x", "y", "x", "x", "x", "x", "x", "y", "y")
> rle(testdata)
Run Length Encoding
  lengths: int [1:6] 1 1 3 1 5 2
  values : chr [1:6] "x" "y" "x" "y" "x" "y"

You can use the values component of the list returned by rle to subset the lengths component of the list to get only the x values if that's what you need to end up with.

> rle(testdata)$lengths[rle(testdata)$values == "x"]
[1] 1 3 5

--
Sarah Goslee
http://www.functionaldiversity.org

From sarah.goslee at gmail.com Wed Nov 2 19:11:34 2011
From: sarah.goslee at gmail.com (Sarah Goslee)
Date: Wed, 2 Nov 2011 14:11:34 -0400
Subject: [R] Sum with condition
In-Reply-To: <1320237862644-3972839.post@n4.nabble.com>
References: <1320237862644-3972839.post@n4.nabble.com>
Message-ID:

Hi,

On Wed, Nov 2, 2011 at 8:44 AM, Celine wrote:
> CELLCD AreaProtected
>   8928      52.39389
>   8928      41.91511
>   8929      21.21975
>   8929      63.65925
>   8930      26.08547
>   8930      14.04602

You'll need to figure out how you want it to be combined with the original data frame, since there can't be empty cells, but:

> dput(testdata)
structure(list(CELLCD = c(8928L, 8928L, 8929L, 8929L, 8930L, 8930L),
    AreaProtected = c(52.39389, 41.91511, 21.21975, 63.65925, 26.08547, 14.04602)),
    .Names = c("CELLCD", "AreaProtected"), class = "data.frame", row.names = c(NA, -6L))
>
> aggregate(testdata$AreaProtected, by=list(CELLCD=testdata$CELLCD), FUN="sum")
  CELLCD        x
1   8928 94.30900
2   8929 84.87900
3   8930 40.13149

--
Sarah Goslee
http://www.functionaldiversity.org

From wdunlap at tibco.com Wed Nov 2 19:13:05 2011
From: wdunlap at tibco.com (William Dunlap)
Date: Wed, 2 Nov 2011 18:13:05 +0000
Subject: [R] how to count number of occurrences
In-Reply-To:
References:
Message-ID:

Is the following what you want? It should give the number of "X"s immediately preceding each "Y".
> samp <- c("X", "Y", "X", "X", "X", "X", "Y", "Y", "Y", "Y", "X", "X", "X", "X", "X", "Y", "Y", "Y", "Y", "Y")
> diff((seq_along(samp) - cumsum(samp=="Y"))[samp=="Y"])
[1] 4 0 0 0 5 0 0 0 0

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Sl K
> Sent: Wednesday, November 02, 2011 9:55 AM
> To: r-help at r-project.org
> Subject: [R] how to count number of occurrences
>
> Dear R users,
>
> I have this data frame,
>             y samp
> 8  0.03060419    X
> 18 0.06120838    Y
> 10 0.23588374    X
> 3  0.32809965    X
> 1  0.36007100    X
> 7  0.36730571    X
> 20 0.47176748    Y
> 13 0.65619929    Y
> 11 0.72014201    Y
> 17 0.73461142    Y
> 6  0.76221313    X
> 2  0.77005691    X
> 4  0.92477243    X
> 9  0.93837591    X
> 5  0.98883581    X
> 16 1.52442626    Y
> 12 1.54011381    Y
> 14 1.84954487    Y
> 19 1.87675183    Y
> 15 1.97767162    Y
>
> and I am trying to find the number of X's that occur before ith Y occurs.
> For example, there is 1 X before the first Y, so I get 1. There are 4 X's
> before the second Y, so I get 4, there is no X between second and third Y,
> so I get 0 and so on. Any hint to at least help me to start this will be
> appreciated. Thanks a lot!
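
A minimal sketch, not taken from any of the replies, that returns the counts in the form the original post describes (1, 4, 0, ...), that is, including the X's before the first Y. It assumes samp is the character vector used above; the names is_y, x_before_y and x_between are illustrative only.

```r
# The samp column from the original post, in row order
samp <- c("X", "Y", "X", "X", "X", "X", "Y", "Y", "Y", "Y",
          "X", "X", "X", "X", "X", "Y", "Y", "Y", "Y", "Y")

is_y <- samp == "Y"

# Running total of X's, read off at each position that holds a Y
x_before_y <- cumsum(!is_y)[is_y]

# X's between consecutive Y's; prepending 0 keeps the count
# of X's that come before the first Y
x_between <- diff(c(0, x_before_y))
x_between
```

Dropping the prepended 0 gives the same nine values as the diff() call above, which starts counting from the second Y.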
From rshepard at appl-ecosys.com Wed Nov 2 19:14:12 2011
From: rshepard at appl-ecosys.com (Rich Shepard)
Date: Wed, 2 Nov 2011 11:14:12 -0700 (PDT)
Subject: [R] Proper Syntax for Logical Subset in Subset()
In-Reply-To:
References:
Message-ID:

On Wed, 2 Nov 2011, Bert Gunter wrote:

> To make such a plot, I would have thought you want your data structure to
> be:
> Column A: Date
> Column B: Chemical
> Column C: Result

Thanks, Bert. I have a data frame in that format.

> After subsetting this to the chemicals you want or doing the subsetting in
> your plot command, something like (base R)
>
> boxplot(Result ~ Chemical, subset=yourdat$Chemical %in% c("Ca","Cl",
> "Cond","Mg","Na","SO2","TDS"))

Great! I'll work with this. I tend to use lattice so I can learn its
capabilities better.

> Whether I'm correct or not in my understanding of your situation, an
> important message is: you should choose your data structure to facilitate
> the analysis that you have in mind. IMHO, this is one of R's great
> strengths: it provides rich facilities for manipulating data tightly
> integrated with plotting and analytical capabilities.

I recognize this and continue to learn how best to represent the same
data for different analyses and plots.
Much appreciated,

Rich

From dwinsemius at comcast.net Wed Nov 2 19:15:44 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Wed, 2 Nov 2011 14:15:44 -0400
Subject: [R] how to count number of occurrences
In-Reply-To:
References:
Message-ID: <19A31890-BE8B-4943-8A16-8C24FC342BD9@comcast.net>

On Nov 2, 2011, at 12:54 PM, Sl K wrote:

> Dear R users,
>
> I have this data frame,
>             y samp
> 8  0.03060419    X
> 18 0.06120838    Y
> 10 0.23588374    X
> 3  0.32809965    X
> 1  0.36007100    X
> 7  0.36730571    X
> 20 0.47176748    Y
> 13 0.65619929    Y
> 11 0.72014201    Y
> 17 0.73461142    Y
> 6  0.76221313    X
> 2  0.77005691    X
> 4  0.92477243    X
> 9  0.93837591    X
> 5  0.98883581    X
> 16 1.52442626    Y
> 12 1.54011381    Y
> 14 1.84954487    Y
> 19 1.87675183    Y
> 15 1.97767162    Y

dat$nXs <- cumsum(dat$samp=="X")
dat$nYs <- cumsum(dat$samp=="Y")
dat
#           y samp nXs nYs
8  0.03060419    X   1   0
18 0.06120838    Y   1   1
10 0.23588374    X   2   1
3  0.32809965    X   3   1
1  0.36007100    X   4   1
7  0.36730571    X   5   1
20 0.47176748    Y   5   2
13 0.65619929    Y   5   3
11 0.72014201    Y   5   4
17 0.73461142    Y   5   5
6  0.76221313    X   6   5
2  0.77005691    X   7   5
4  0.92477243    X   8   5
9  0.93837591    X   9   5
5  0.98883581    X  10   5
16 1.52442626    Y  10   6
12 1.54011381    Y  10   7
14 1.84954487    Y  10   8
19 1.87675183    Y  10   9
15 1.97767162    Y  10  10

I find that there are 5 X's before the second Y.

> nXbefore_mthY <- function(m) dat[which(dat$nYs==m), "nXs"]
> nXbefore_mthY(2)
[1] 5

>
> and I am trying to find the number of X's that occur before ith Y occurs.
> For example, there is 1 X before the first Y, so I get 1. There are 4 X's
> before the second Y, so I get 4, there is no X between second and third Y,
> so I get 0 and so on. Any hint to at least help me to start this will be
> appreciated. Thanks a lot!
> > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT From Achim.Zeileis at uibk.ac.at Wed Nov 2 19:15:50 2011 From: Achim.Zeileis at uibk.ac.at (Achim Zeileis) Date: Wed, 2 Nov 2011 19:15:50 +0100 (CET) Subject: [R] nproc parameter in efpFunctional In-Reply-To: <1320228608589-3972419.post@n4.nabble.com> References: <1320228608589-3972419.post@n4.nabble.com> Message-ID: On Wed, 2 Nov 2011, bonda wrote: > Hello all, > could anyone explain the exact meaning of parameter nproc? Why different > values of nproc give so different critical values, i.e. > > meanL2BB$computeCritval(0.05,nproc=3) > [1] 0.9984853 > meanL2BB$computeCritval(0.05,nproc=1) > [1] 0.4594827 > > The strucchange-package description gives "integer specifying for which > number of processes Brownian motions should be simulated" - do I need > nproc-dimensional Brownian bridge? Yes, see the 2006 CSDA paper, especially pages 2998/9. > Thank you in advance! > Julia > > -- > View this message in context: http://r.789695.n4.nabble.com/nproc-parameter-in-efpFunctional-tp3972419p3972419.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
> From gunter.berton at gene.com Wed Nov 2 19:29:47 2011 From: gunter.berton at gene.com (Bert Gunter) Date: Wed, 2 Nov 2011 11:29:47 -0700 Subject: [R] Multiple comparison test about whole population, not about mean In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dwinsemius at comcast.net Wed Nov 2 20:10:40 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Wed, 2 Nov 2011 15:10:40 -0400 Subject: [R] does there any function like sumif in excel? In-Reply-To: <1320239193254-3972963.post@n4.nabble.com> References: <972822.20441.qm@web38408.mail.mud.yahoo.com> <1320239193254-3972963.post@n4.nabble.com> Message-ID: On Nov 2, 2011, at 9:06 AM, Celine wrote: > I have a similar problem > I have a dataframe : > > CELLCD AreaProtected > 8928 52.39389 > 8928 41.91511 > 8929 21.21975 > 8929 63.65925 > 8930 26.08547 > 8930 14.04602 > > I wouldlike to sum the AreaProtected if it is the same CELLCD in > another > column : > > CELLCD AreaProtected SumAreaProtected > 8928 52.39389 94.309 > 8928 41.91511 > 8929 21.21975 84,879 > 8929 63.65925 > 8930 26.08547 > 8930 14.04602 > > I am just started with R and I don't know how I can do that. > Do you have any ideas ? You could get sums within groups using the ave() function. It would put values on all rows. Was there a reason you only wanted values on the first? Then if you wanted to make the non-first elements NA that could also be done. > > > -- > View this message in context: http://r.789695.n4.nabble.com/does-there-any-function-like-sumif-in-excel-tp858444p3972963.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
David Winsemius, MD
Heritage Laboratories
West Hartford, CT

From dwinsemius at comcast.net Wed Nov 2 20:13:55 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Wed, 2 Nov 2011 15:13:55 -0400
Subject: [R] Sum with condition
In-Reply-To:
References: <1320237862644-3972839.post@n4.nabble.com>
Message-ID:

On Nov 2, 2011, at 2:11 PM, Sarah Goslee wrote:

> Hi,
>
> On Wed, Nov 2, 2011 at 8:44 AM, Celine wrote:

Celine. Please stop posting duplicates.

>> CELLCD AreaProtected
>>   8928      52.39389
>>   8928      41.91511
>>   8929      21.21975
>>   8929      63.65925
>>   8930      26.08547
>>   8930      14.04602
>
> You'll need to figure out how you want it to be combined with the
> original data frame, since there can't be empty cells, but:
>
>> dput(testdata)
> structure(list(CELLCD = c(8928L, 8928L, 8929L, 8929L, 8930L,
> 8930L), AreaProtected = c(52.39389, 41.91511, 21.21975, 63.65925,
> 26.08547, 14.04602)), .Names = c("CELLCD", "AreaProtected"), class =
> "data.frame", row.names = c(NA, -6L))
>>
>> aggregate(testdata$AreaProtected, by=list(CELLCD=testdata$CELLCD),
>> FUN="sum")
>   CELLCD        x
> 1   8928 94.30900
> 2   8929 84.87900
> 3   8930 40.13149

Or:

> testdata$SumArea <- with(testdata, ave(AreaProtected, CELLCD, FUN=sum))
> testdata
  CELLCD AreaProtected  SumArea
1   8928      52.39389 94.30900
2   8928      41.91511 94.30900
3   8929      21.21975 84.87900
4   8929      63.65925 84.87900
5   8930      26.08547 40.13149
6   8930      14.04602 40.13149

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT

From dwinsemius at comcast.net Wed Nov 2 20:15:53 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Wed, 2 Nov 2011 15:15:53 -0400
Subject: [R] array manipulation
In-Reply-To:
References:
Message-ID:

On Nov 2, 2011, at 12:16 PM, Simone Salvadei wrote:

> Hello,
> I'm at the very beginning of the learning process of this language.
> Sorry in advance for the (possible but plausible) stupidity of my question.
>
> I would like to find a way to permute the DIMENSIONS of an array.
> Something that sounds like the function "permute()" in matlab.
>
> Given an array C of dimensions c x d x T , for instance, the command
>
> permute(C, [2 1 3])

?aperm

> would provide (in Matlab) an array very similar to C, but this time each
> one of the T matrices c x d has changed into its transposed.
> Any alternatives to the following (and primitive) 'for' cycle?
>
> # (previously defined) phi=array with dimensions c(c,d,T)
>
> temp=array(0,dim=c(c,d,T))
> for(i in 1:T)
> {
> temp[,,i]=t(phi[,,i])
> }
> phi=temp
>

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

From S.Ellison at LGCGroup.com Wed Nov 2 19:48:14 2011
From: S.Ellison at LGCGroup.com (S Ellison)
Date: Wed, 2 Nov 2011 18:48:14 +0000
Subject: [R] Sum with condition
In-Reply-To: <1320237862644-3972839.post@n4.nabble.com>
References: <1320237862644-3972839.post@n4.nabble.com>
Message-ID:

If you used aggregate() on the data frame you would have a new data frame
containing the sum of all AreaProtected for each CELLCD. For your
mini-example, using d as your data frame,

d2<-aggregate(d[,2], by=list(CELLCD=d$CELLCD),sum)
d2
#   CELLCD        x
# 1   8928 94.30900
# 2   8929 84.87900
# 3   8930 40.13149

If you then use merge() you get

merge(d,d2)
#  CELLCD AreaProtected        x
#1   8928      52.39389 94.30900
#2   8928      41.91511 94.30900
#3   8929      21.21975 84.87900
#4   8929      63.65925 84.87900
#5   8930      26.08547 40.13149
#6   8930      14.04602 40.13149

Maybe one of those is what you want?
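
A minimal sketch of the layout asked for in the original post, with the group sum shown only on the first row of each CELLCD and NA elsewhere. It assumes the data frame is called d, as above, and combines ave() with duplicated(); this pairing is one possible approach, not something any of the replies spelled out. The column name SumAreaProtected is taken from the post.

```r
# Data from the original post
d <- data.frame(
  CELLCD = c(8928L, 8928L, 8929L, 8929L, 8930L, 8930L),
  AreaProtected = c(52.39389, 41.91511, 21.21975, 63.65925, 26.08547, 14.04602)
)

# Group sum repeated on every row of its CELLCD group
d$SumAreaProtected <- ave(d$AreaProtected, d$CELLCD, FUN = sum)

# Keep the sum on the first occurrence of each CELLCD only;
# a numeric column cannot hold empty cells, so the rest become NA
d$SumAreaProtected[duplicated(d$CELLCD)] <- NA
d
```

duplicated() marks every occurrence after the first, so this does not require the rows to be sorted by CELLCD.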
> -----Original Message----- > From: r-help-bounces at r-project.org > [mailto:r-help-bounces at r-project.org] On Behalf Of Celine > Sent: 02 November 2011 12:44 > To: r-help at r-project.org > Subject: [R] Sum with condition > > I guess my problem is simple for most of you but I am new > with R and I need some help, I have a dataframe : > > CELLCD AreaProtected > 8928 52.39389 > 8928 41.91511 > 8929 21.21975 > 8929 63.65925 > 8930 26.08547 > 8930 14.04602 > > I wouldlike to sum the AreaProtected if it is the same CELLCD > in another column : > > CELLCD AreaProtected SumAreaProtected > 8928 52.39389 94.309 > 8928 41.91511 > 8929 21.21975 84,879 > 8929 63.65925 > 8930 26.08547 > 8930 14.04602 > > I am just started with R and I don't know how I can do that. > Do you have any ideas ? > > > Thanks a lot for your help, > > -- > View this message in context: > http://r.789695.n4.nabble.com/Sum-with-condition-tp3972839p397 > 2839.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > ******************************************************************* This email and any attachments are confidential. Any use...{{dropped:8}} From torgrimsby at gmail.com Wed Nov 2 20:15:04 2011 From: torgrimsby at gmail.com (prinzOfNorway) Date: Wed, 2 Nov 2011 12:15:04 -0700 (PDT) Subject: [R] HOW TO REMOVE MTEXT FROM PLOT, plotting changing populations with titles in loop Message-ID: <1320261304220-3981757.post@n4.nabble.com> is there a way to hide/undraw mtext (or lines etc.) 
in a loop like plot(runif(10)) iterCol <- rainbowPalette(10) for(i in 1:10){ mtext(paste("this is iteration ", i, sep="")) points(runif(10),col=iterCol[i]) Sys.sleep(1) ## UNDRAW/HIDE the text so that it does not mess up the plot in the next iteration } -- View this message in context: http://r.789695.n4.nabble.com/HOW-TO-REMOVE-MTEXT-FROM-PLOT-plotting-changing-populations-with-titles-in-loop-tp3981757p3981757.html Sent from the R help mailing list archive at Nabble.com. From Werner.Poschenrieder at lrz.tum.de Wed Nov 2 20:16:56 2011 From: Werner.Poschenrieder at lrz.tum.de (Vern) Date: Wed, 2 Nov 2011 12:16:56 -0700 (PDT) Subject: [R] heteroscedastic bivariate distribution with linear regression - prediction interval Message-ID: <1320261416193-3981793.post@n4.nabble.com> Dear forum, which is the most suitable method to get the prediction interval of a bivariate normal distribution which is consistent with a linear model y = ax + b? I assume it is gls + predict. Am I correct? I'm rather new to R. Is there some reliable sample code for that problem? Thank you best regards -- View this message in context: http://r.789695.n4.nabble.com/heteroscedastic-bivariate-distribution-with-linear-regression-prediction-interval-tp3981793p3981793.html Sent from the R help mailing list archive at Nabble.com. From sarah.goslee at gmail.com Wed Nov 2 20:29:43 2011 From: sarah.goslee at gmail.com (Sarah Goslee) Date: Wed, 2 Nov 2011 15:29:43 -0400 Subject: [R] HOW TO REMOVE MTEXT FROM PLOT, plotting changing populations with titles in loop In-Reply-To: <1320261304220-3981757.post@n4.nabble.com> References: <1320261304220-3981757.post@n4.nabble.com> Message-ID: It's not perfect, but you could use: mtext(paste("this is iteration ", i, sep=""), col="white") to overwrite it, or polygon() to draw a white rectangle over the text each time. Sarah On Wed, Nov 2, 2011 at 3:15 PM, prinzOfNorway wrote: > is there a way to hide/undraw mtext (or lines etc.) 
in a loop like
>
> plot(runif(10))
> iterCol <- rainbowPalette(10)
>
> for(i in 1:10){
>
> mtext(paste("this is iteration ", i, sep=""))
> points(runif(10),col=iterCol[i])
> Sys.sleep(1)
>
> ## UNDRAW/HIDE the text so that it does not mess up the plot in the next
> iteration
>
> }
>

--
Sarah Goslee
http://www.functionaldiversity.org

From mtmorgan at fhcrc.org Wed Nov 2 21:24:49 2011
From: mtmorgan at fhcrc.org (Martin Morgan)
Date: Wed, 02 Nov 2011 13:24:49 -0700
Subject: [R] overloading + operator for chars
In-Reply-To: <1320241924.8401.YahooMailNeo@web110714.mail.gq1.yahoo.com>
References: <1320241924.8401.YahooMailNeo@web110714.mail.gq1.yahoo.com>
Message-ID: <4EB1A711.5070809@fhcrc.org>

On 11/02/2011 06:52 AM, Albert-Jan Roskam wrote:
> Hello,
>
> I would like to overload the "+" operator so that it can be used to concatenate two strings, e.g "John" + "Doe" = "JohnDoe".
> How can I 'unseal' the "+" method?
>> setMethod("+", signature(e1="character", e2="character"), function(e1, e2) paste(e1, e2, sep="") )
> Error in setMethod("+", signature(e1 = "character", e2 = "character"), :
> the method for function "+" and signature e1="character", e2="character" is sealed and cannot be re-defined
>>
>

Hi -- I think the two issues are that "+" is part of the "Arith" group
generic (?Methods, ?Arith) and that `+` (actually, members of the Ops
group) for primitive types dispatches directly without doing method
look-up. Personally I might

setClass("Character", contains="character")
Character <- function(...) new("Character", ...)
setMethod("Arith", c("Character", "Character"), function(e1, e2) {
    switch(.Generic,
           "+"=Character(paste(e1, e2, sep="")),
           stop("unhandled 'Arith' operator '", .Generic, "'"))
})

and then

> Character(c("foo", "bar")) + Character("baz")
[1] "foobaz" "barbaz"

Some might point to

> `%+%` <- function(e1, e2) paste(e1, e2, sep="")
> "foo" %+% "bar"
[1] "foobar"

Martin

> Cheers!!
> Albert-Jan > > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us? > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > [[alternative HTML version deleted]] > > > > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -- Computational Biology Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: M1-B861 Telephone: 206 667-2793 From skhanvil at qualcomm.com Wed Nov 2 22:14:58 2011 From: skhanvil at qualcomm.com (Khanvilkar, Shashank) Date: Wed, 2 Nov 2011 21:14:58 +0000 Subject: [R] Help with curr directory in R Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From mxkuhn at gmail.com Wed Nov 2 22:26:49 2011 From: mxkuhn at gmail.com (Max Kuhn) Date: Wed, 2 Nov 2011 17:26:49 -0400 Subject: [R] palettes for the color-blind Message-ID: Everyone, I'm working with scatter plots with different colored symbols (via lattice). I'm currently using these colors for points and lines: col1 <- c(rgb(1, 0, 0), rgb(0, 0, 1), rgb(0, 1, 0), rgb(0.55482458, 0.40350876, 0.04166666), rgb(0, 0, 0)) plot(seq(along = col1), pch = 16, col = col1, cex = 1.5) I'm also using these with transparency (alpha between .5-.8 depending on the number of points). I'd like to make sure that these colors are interpretable by the color bind. 
Doing a little looking around, this might be a good palette: col2 <- c(rgb(0, 0.4470588, 0.6980392), rgb(0.8352941, 0.3686275, 0, ), rgb(0.8000000, 0.4745098, 0.6549020), rgb(0.1686275, 0.6235294, 0.4705882), rgb(0.9019608, 0.6235294, 0.0000000)) plot(seq(along = col2), pch = 16, col = col2, cex = 1.5) but to be honest, I'd like to use something a little more vibrant. First, can anyone verify that these the colors in col2 are differentiable to someone who is color blind? Second, are there any other specific palettes that can be recommended? How do the RColorBrewer palettes rate in this respect? Thanks, Max From baptiste.auguie at googlemail.com Wed Nov 2 22:40:12 2011 From: baptiste.auguie at googlemail.com (baptiste auguie) Date: Thu, 3 Nov 2011 10:40:12 +1300 Subject: [R] palettes for the color-blind In-Reply-To: References: Message-ID: Hi, Try the dichromat package (also dichromat_pal in the scales package). HTH, baptiste On 3 November 2011 10:26, Max Kuhn wrote: > Everyone, > > I'm working with scatter plots with different colored symbols (via > lattice). I'm currently using these colors for points and lines: > > col1 <- c(rgb(1, 0, 0), rgb(0, 0, 1), > ? ? ? ? rgb(0, 1, 0), > ? ? ? ? rgb(0.55482458, 0.40350876, 0.04166666), > ? ? ? ? rgb(0, 0, 0)) > plot(seq(along = col1), pch = 16, col = col1, cex = 1.5) > > I'm also using these with transparency (alpha between .5-.8 depending > on the number of points). > > I'd like to make sure that these colors are interpretable by the color > bind. Doing a little looking around, this might be a good palette: > > col2 <- c(rgb(0, ? ? ? ? 0.4470588, 0.6980392), > ? ? ? ? ?rgb(0.8352941, 0.3686275, 0, ? ? ? ), > ? ? ? ? ?rgb(0.8000000, 0.4745098, 0.6549020), > ? ? ? ? ?rgb(0.1686275, 0.6235294, 0.4705882), > ? ? ? ? ?rgb(0.9019608, 0.6235294, 0.0000000)) > > plot(seq(along = col2), pch = 16, col = col2, cex = 1.5) > > but to be honest, I'd like to use something a little more vibrant. 
> > First, can anyone verify that these the colors in col2 are > differentiable to someone who is color blind? > > Second, are there any other specific palettes that can be recommended? > How do the RColorBrewer palettes rate in this respect? > > Thanks, > > Max > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From tlumley at uw.edu Wed Nov 2 23:03:59 2011 From: tlumley at uw.edu (Thomas Lumley) Date: Thu, 3 Nov 2011 11:03:59 +1300 Subject: [R] palettes for the color-blind In-Reply-To: References: Message-ID: On Thu, Nov 3, 2011 at 10:26 AM, Max Kuhn wrote: > First, can anyone verify that these the colors in col2 are > differentiable to someone who is color blind? > > Second, are there any other specific palettes that can be recommended? > How do the RColorBrewer palettes rate in this respect? If you go to www.colorbrewer.org, the ColorBrewer site, it has ratings of the palettes for visibility under a variety of conditions, including red-green color blindness. Some of them are good, but not all of them. The dichromat package attempts to show the impact of both sorts of red:green anomalous vision on color visibility. It isn't quite right because of gamma correction, but people have told me that it is a fairly good representation, and it does have the right impact on clustering of pixels in some of the Ishihara color vision tests. It suggests that your colors 1 and 3 will be too similar and 2 and 4 will also be too similar for someone with protanopia. You aren't going to be able to get five colors that are equal luminance, equal chroma, and distinguishable to dichromats: you're putting three constraints on a three-dimensional space and you will end up with just two points. For three colors I would suggest orange, blue, gray. 
More than three will be hard.

-thomas

--
Thomas Lumley
Professor of Biostatistics
University of Auckland

From carl at witthoft.com Wed Nov 2 23:04:25 2011
From: carl at witthoft.com (Carl Witthoft)
Date: Wed, 02 Nov 2011 18:04:25 -0400
Subject: [R] palettes for the color-blind
Message-ID: <4EB1BE69.7030700@witthoft.com>

Before you pick out a palette: you are aware that there are several
different types of color-blindness, aren't you?

http://en.wikipedia.org/wiki/Color_blind

--
Sent from my Cray XK6
"Pendeo-navem mei anguillae plena est."

From jdnewmil at dcn.davis.ca.us Wed Nov 2 23:05:01 2011
From: jdnewmil at dcn.davis.ca.us (Jeff Newmiller)
Date: Wed, 02 Nov 2011 15:05:01 -0700
Subject: [R] difference between foo$a[2] <- 1 and foo[2,"a"] <- 1
In-Reply-To: <6DF57D41-2621-4B37-AC0E-DC1325EAD172@gmail.com>
References: <6DF57D41-2621-4B37-AC0E-DC1325EAD172@gmail.com>
Message-ID: <5ac3ea16-fc63-45e3-95bf-94b0e12bce42@email.android.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From skhanvil at qualcomm.com Wed Nov 2 23:12:14 2011
From: skhanvil at qualcomm.com (Khanvilkar, Shashank)
Date: Wed, 2 Nov 2011 22:12:14 +0000
Subject: [R] Help with curr directory in R
In-Reply-To:
References:
Message-ID:

I think I have solved this problem.

The issue was: some user had changed the registry entry for
HKEY_CURRENT_USER\Software\Microsoft\Command Processor
(added Autorun and set it to C:\),
so the current dir was always pointing to C:\

I deleted that and everything works.
Thanks Shank -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Khanvilkar, Shashank Sent: Wednesday, November 02, 2011 2:15 PM To: R Mailing List Subject: [R] Help with curr directory in R Hello All Thanks for all responses in advance, I am invoking R from command line from C:\TEMP as C:\Program Files\R\R-2.9.1\bin\R.exe --vanilla -f C:\temp\test.R On two different machines The test.R looks like: --SNIP- print(c("CurrDir=", getwd())) proc.time() warnings() --SNIP- On one machine I get the currDir Correctly printed as C:/TEMP But on another it gets printed as C:/ Does anyone know why this can be.. Is there some env variable that R is giving priority over current dir? Thanks Shank [[alternative HTML version deleted]] ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. From alaios at yahoo.com Wed Nov 2 23:37:07 2011 From: alaios at yahoo.com (Alaios) Date: Wed, 2 Nov 2011 15:37:07 -0700 (PDT) Subject: [R] Error: serialization is too large to store in a raw vector Message-ID: <1320273427.56875.YahooMailNeo@web120104.mail.ne1.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From tlumley at uw.edu Wed Nov 2 23:38:49 2011 From: tlumley at uw.edu (Thomas Lumley) Date: Thu, 3 Nov 2011 11:38:49 +1300 Subject: [R] palettes for the color-blind In-Reply-To: <4EB1BE69.7030700@witthoft.com> References: <4EB1BE69.7030700@witthoft.com> Message-ID: On Thu, Nov 3, 2011 at 11:04 AM, Carl Witthoft wrote: > > Before you pick out a palette: ?you are aware that their are several > different types of color-blindness, aren't you? 
Yes, but to first approximation there are only two, and they have broadly similar, though not identical impact on choice of color palettes. The dichromat package knows about them, and so does Professor Brewer. More people will be unable to read your graphs due to some kind of gross visual impairment (cataracts, uncorrected focusing problems, macular degeneration, etc) than will have tritanopia or monochromacy. -thomas -- Thomas Lumley Professor of Biostatistics University of Auckland From gunter.berton at gene.com Wed Nov 2 23:45:14 2011 From: gunter.berton at gene.com (Bert Gunter) Date: Wed, 2 Nov 2011 15:45:14 -0700 Subject: [R] Error: serialization is too large to store in a raw vector In-Reply-To: <1320273427.56875.YahooMailNeo@web120104.mail.ne1.yahoo.com> References: <1320273427.56875.YahooMailNeo@web120104.mail.ne1.yahoo.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From jtor14 at gmail.com Wed Nov 2 23:51:39 2011 From: jtor14 at gmail.com (Justin Haynes) Date: Wed, 2 Nov 2011 15:51:39 -0700 Subject: [R] mysterious warning message regarding bytecode... Message-ID: While running a long script which source()s other scripts I get the following warning: Warning message: In t(object$S[[1]]) : bytecode version mismatch; using eval I cannot replicate it if I run the sourced files line by line though... What is that error? And do I care about it? It doesn't seem to affect my output as far as I can tell. Thanks! 
Justin > sessionInfo() R version 2.13.2 (2011-09-30) Platform: x86_64-pc-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] grid stats graphics grDevices utils datasets methods base other attached packages: [1] mgcv_1.7-9 stringr_0.5 RPostgreSQL_0.2-0 biglm_0.8 DBI_0.2-5 doMC_1.2.3 multicore_0.1-7 [8] foreach_1.3.2 codetools_0.2-8 iterators_1.0.5 cairoDevice_2.19 pixmap_0.4-11 gridExtra_0.8.5 splancs_2.01-29 [15] sp_0.9-91 ellipse_0.3-5 ggplot2_0.8.9 proto_0.3-9.2 reshape_0.8.4 plyr_1.6 MASS_7.3-14 loaded via a namespace (and not attached): [1] compiler_2.13.2 digest_0.5.1 lattice_0.19-33 Matrix_1.0-1 nlme_3.1-102 From mentor_ at gmx.net Thu Nov 3 00:15:21 2011 From: mentor_ at gmx.net (syrvn) Date: Wed, 2 Nov 2011 16:15:21 -0700 (PDT) Subject: [R] StatET: Commands are not submitted to the console Message-ID: <1320275721382-3983662.post@n4.nabble.com> Hello! I am working for quite a while now with the Eclipse/StatET approach and it always worked very well until I updated to the 2.0 version of StatET. After the official release the RJ console did not start. After they released another update a couple of days later it worked fine again. I did not do any programming for a week now and today I realised that when I execute the selected lines in an R file they are not submitted to the console. I also tried RTerm instead of RJ but it's the same problem. Commands are not submitted to the console. Did anyone come across the same problem and knows how to fix that. Googling does not result in any useful pages I am running Mac OS X Lion + Newest version of Eclipse and StatET. Everything is up to date. 
Cheers, syrvn -- View this message in context: http://r.789695.n4.nabble.com/StatET-Commands-are-not-submitted-to-the-console-tp3983662p3983662.html Sent from the R help mailing list archive at Nabble.com. From david.evans at alaska.gov Thu Nov 3 00:24:29 2011 From: david.evans at alaska.gov (Evans, David G (DFG)) Date: Wed, 02 Nov 2011 15:24:29 -0800 Subject: [R] Lattice plots and missing x-axis labels on second page Message-ID: <00A769DD3A6D594C9902BC12EF1CAF8E064CE6EC@SOAANCMSG01.soa.alaska.gov> Hello, I'm trying to make a lattice plot (using xyplot()). I have included a "layout=c(3,4)" statement, giving me 12 plots per page and an "as.table=TRUE" statement, directing the way the plots are laid out. I have 18 plots altogether and so 6 of them end up on the second page. Everything looks fine for the first page, but the x-axis labels (e.g. 1993, 1994...) are all missing on the second page. The x-axis variable name ("Year") is there at the bottom, however. Any help is appreciated. Thanks. David G. Evans Biometrician Division of Sport Fish Alaska Dept . of Fish and Game Anchorage, Ak 99518 From peter.langfelder at gmail.com Thu Nov 3 01:00:32 2011 From: peter.langfelder at gmail.com (Peter Langfelder) Date: Wed, 2 Nov 2011 17:00:32 -0700 Subject: [R] R 2.14.0 installation problem (?) 
Message-ID: Hi all, I downloaded R-2.14.0.tar.gz, unpacked, ran configure with --with-blas="-lgoto2", built, then issued make install rhome=/usr/local/lib/R-2.14.0-goto This produced the following error: [lots of output deleted] make[3]: Entering directory `/home/plangfelder/Download/R-2.14.0/src/modules/lapack' cp: cannot create regular file `/usr/local/lib/R-2.14.0-goto/lib/libRlapack.so': No such file or directory make[3]: *** [install] Error 1 make[3]: Leaving directory `/home/plangfelder/Download/R-2.14.0/src/modules/lapack' make[2]: *** [install] Error 1 make[2]: Leaving directory `/home/plangfelder/Download/R-2.14.0/src/modules' make[1]: *** [install] Error 1 make[1]: Leaving directory `/home/plangfelder/Download/R-2.14.0/src' make: *** [install] Error 1 Apparently the error was caused by the fact that the script neglected to create the directory /usr/local/lib/R-2.14.0-goto/lib/ before attempting to copy the libRlapack.so file into it. Creating the directory manually and re-running make install rhome=/usr/local/lib/R-2.14.0-goto solved the problem. The installed R seems to work fine. Is it a bug in the appropriate Makefile, or did I do something wrong? I'm on Fedora 9 (i686, i386, kernel 2.6.27.25-78.2.56.fc9.i686 if that makes any difference)... Thanks, Peter From info at jochen-bauer.net Thu Nov 3 01:08:19 2011 From: info at jochen-bauer.net (Jochen1980) Date: Wed, 2 Nov 2011 17:08:19 -0700 (PDT) Subject: [R] Kolmogorov-Smirnov-Test on binned data, I guess gumbel-distributed data Message-ID: <1320278899657-3983781.post@n4.nabble.com> Hi R-Users, I read some texts related to KS-tests. Most of those authors stated, that KS-Tests are not suitable for binned data, but some of them refer to 'other' authors who are claiming that KS-Tests are okay for binned data. I searched for sources and can't find examples which approve that it is okay to use KS-Tests for binned data - do you have any links to articles or tutorials? 
Anyway, I am looking for a test that backs up my assumption that my data is Gumbel-distributed. I estimated the Gumbel parameters mu and beta, and after looking at the resulting plots, in my opinion the fit looks quite good! You can find the plot, the related data, and the R script here: www.jochen-bauer.net/downloads/kstest/Rplots-1000.pdf http://www.jochen-bauer.net/downloads/kstest/rm2700-1000.txt http://www.jochen-bauer.net/downloads/kstest/rcalc.R The story about the data: I am wondering what test I should choose if the KS test is not appropriate. I get really high p-values when comparing the histogram heights of data row 1 with the fitted Gumbel distribution function evaluated at the bin midpoints. Most of the time, the KS test results in distances of 0.01 and p-values of 0.99 or 1. This sounds strange to me, too high. Otherwise my plots look good and, as you can see, in my first experiment I sampled 1000 values. In a second experiment I created only 50 random values for the Gumbel parameter estimation. I am trying to reduce the number of permutations so that I can produce results faster, but I have to find out at what point the data stops fitting a Gumbel distribution. The results surprised me: I expected my tests and plots to get worse, but I still got high p-values for the KS test and still a nice-looking plot. www.jochen-bauer.net/downloads/kstest/Rplots-0050.pdf http://www.jochen-bauer.net/downloads/kstest/rm2700-0050.txt Moreover, besides the shuffled data of my randomisation test there are real data values. I calculated the p-value that my real data point occurs under the estimated Gumbel distribution. The p-values from the 1000-permutation experiment and the 50-permutation experiment are very strongly correlated ... around 0.98. Pearson and Spearman correlation coefficients told me this. I guess that backs up the fact that neither my plots nor the KS tests are getting worse.
I hope I was able to state my current situation and you are able to give me some hints, for some literature or other tests or backen me up in my guess that my data is gumbel-distributed. Thanks in advance. Jochen I hope I was able to tell -- View this message in context: http://r.789695.n4.nabble.com/Kolmogorov-Smirnov-Test-on-binned-data-I-guess-gumbel-distributed-data-tp3983781p3983781.html Sent from the R help mailing list archive at Nabble.com. From david.evans at alaska.gov Thu Nov 3 01:23:28 2011 From: david.evans at alaska.gov (Evans, David G (DFG)) Date: Wed, 02 Nov 2011 16:23:28 -0800 Subject: [R] Lattice plots and missing x-axis labels on second page In-Reply-To: <00A769DD3A6D594C9902BC12EF1CAF8E064CE6EC@SOAANCMSG01.soa.alaska.gov> References: <00A769DD3A6D594C9902BC12EF1CAF8E064CE6EC@SOAANCMSG01.soa.alaska.gov> Message-ID: <00A769DD3A6D594C9902BC12EF1CAF8E064CE722@SOAANCMSG01.soa.alaska.gov> I should say I'm using Windows-7, R version 2.13.0 and lattice version 0.19-33. I've pared down my code to this : pdat = read.table("RGRAPHSDGE.csv",header=T,sep=",",fill=T) print(xyplot(pdat$NITRATE~pdat$DATEYR|pdat$WELL, as.table=TRUE, layout=c(3,4), xlab="Year", ylab="Nitrate mg / litre", strip=FALSE )) First 3 lines of pdat looks like this: WELL DATEYR NITRATE 1 ALASKA CHILDRENS SERVICES 1993.836 0.81 2 ALASKA CHILDRENS SERVICES 1994.850 0.91 3 ALASKA CHILDRENS SERVICES 1995.803 0.94 .... Thanks again. -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Evans, David G (DFG) Sent: Wednesday, November 02, 2011 3:24 PM To: r-help at r-project.org Subject: [R] Lattice plots and missing x-axis labels on second page Hello, I'm trying to make a lattice plot (using xyplot()). I have included a "layout=c(3,4)" statement, giving me 12 plots per page and an "as.table=TRUE" statement, directing the way the plots are laid out. I have 18 plots altogether and so 6 of them end up on the second page. 
Everything looks fine for the first page, but the x-axis labels (e.g. 1993, 1994...) are all missing on the second page. The x-axis variable name ("Year") is there at the bottom, however. Any help is appreciated. Thanks. David G. Evans Biometrician Division of Sport Fish Alaska Dept . of Fish and Game Anchorage, Ak 99518 ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. From chris at trickysolutions.com.au Thu Nov 3 01:46:26 2011 From: chris at trickysolutions.com.au (Chris Howden) Date: Thu, 3 Nov 2011 11:46:26 +1100 Subject: [R] anova or liklihood ratio test from biglm output In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dwinsemius at comcast.net Thu Nov 3 01:56:54 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Wed, 2 Nov 2011 20:56:54 -0400 Subject: [R] Error: serialization is too large to store in a raw vector In-Reply-To: <1320273427.56875.YahooMailNeo@web120104.mail.ne1.yahoo.com> References: <1320273427.56875.YahooMailNeo@web120104.mail.ne1.yahoo.com> Message-ID: <583178AE-A23C-477D-B73F-69BF70C728C7@comcast.net> On Nov 2, 2011, at 6:37 PM, Alaios wrote: > Dear all, > I have quite large code (with lapply and mclapply) > and I am getting the following error. > > Error: serialization is too large to store in a raw vector > > Is it possible to ask from R to extend the Error messages with more details? > I would like to see where this problem exists. 
> ?traceback > B.R > Alex > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From mxkuhn at gmail.com Thu Nov 3 03:08:03 2011 From: mxkuhn at gmail.com (Max Kuhn) Date: Wed, 2 Nov 2011 22:08:03 -0400 Subject: [R] palettes for the color-blind In-Reply-To: References: <4EB1BE69.7030700@witthoft.com> Message-ID: Yes, I was aware of the different types and their respective prevalences. The dichromat package helped me find what I needed. Thanks, Max On Wed, Nov 2, 2011 at 6:38 PM, Thomas Lumley wrote: > On Thu, Nov 3, 2011 at 11:04 AM, Carl Witthoft wrote: >> >> Before you pick out a palette: you are aware that there are several >> different types of color-blindness, aren't you? > > Yes, but to first approximation there are only two, and they have > broadly similar, though not identical impact on choice of color > palettes. The dichromat package knows about them, and so does > Professor Brewer. > > More people will be unable to read your graphs due to some kind of > gross visual impairment (cataracts, uncorrected focusing problems, > macular degeneration, etc) than will have tritanopia or monochromacy. > > -thomas > > -- > Thomas Lumley > Professor of Biostatistics > University of Auckland > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.
> -- Max From akpbond007 at gmail.com Thu Nov 3 07:41:46 2011 From: akpbond007 at gmail.com (arunkumar1111) Date: Wed, 2 Nov 2011 23:41:46 -0700 (PDT) Subject: [R] Help in ranef Function Message-ID: <1320302506607-3984436.post@n4.nabble.com> Hi I'm getting the intercepts of the Random effects as 0. Please help me to understand why this is coming Zero This is my R code Data<- read.csv("C:/FE and RE.csv") Formula="Y~X2+X3+X4 + (1|State) + (0+X5|State)" fit=lmer(formula=Formula,data=Data) ranef(fit). My sample Data State Year Y X2 X3 X4 X5 X6 S2 1960 27.8 397.5 42.2 50.7 78.3 65.8 S1 1960 29.9 413.3 38.1 52 79.2 66.9 S2 1961 29.8 439.2 40.3 54 79.2 67.8 S1 1961 30.8 459.7 39.5 55.3 79.2 69.6 S2 1962 31.2 492.9 37.3 54.7 77.4 68.7 S1 1962 33.3 528.6 38.1 63.7 80.2 73.6 S2 1963 35.6 560.3 39.3 69.8 80.4 76.3 S1 1963 36.4 624.6 37.8 65.9 83.9 77.2 S2 1964 36.7 666.4 38.4 64.5 85.5 78.1 S1 1964 38.4 717.8 40.1 70 93.7 84.7 S2 1965 40.4 768.2 38.6 73.2 106.1 93.3 S1 1965 40.3 843.3 39.8 67.8 104.8 89.7 S2 1966 41.8 911.6 39.7 79.1 114 100.7 S1 1966 40.4 931.1 52.1 95.4 124.1 113.5 S2 1967 40.7 1021.5 48.9 94.2 127.6 115.3 S1 1967 40.1 1165.9 58.3 123.5 142.9 136.7 S2 1968 42.7 1349.6 57.9 129.9 143.6 139.2 S1 1968 44.1 1449.4 56.5 117.6 139.2 132 S2 1969 46.7 1575.5 63.7 130.9 165.5 132.1 S1 1969 50.6 1759.1 61.6 129.8 203.3 154.4 S2 1970 50.1 1994.2 58.9 128 219.6 174.9 S1 1970 51.7 2258.1 66.4 141 221.6 180.8 S2 1971 52.9 2478.7 70.4 168.2 232.6 189.4 -- View this message in context: http://r.789695.n4.nabble.com/Help-in-ranef-Function-tp3984436p3984436.html Sent from the R help mailing list archive at Nabble.com. From rolf.turner at xtra.co.nz Thu Nov 3 08:37:34 2011 From: rolf.turner at xtra.co.nz (Rolf Turner) Date: Thu, 03 Nov 2011 20:37:34 +1300 Subject: [R] Problem with R CMD check and the inconsolata font business. 
Message-ID: <4EB244BE.7010905@xtra.co.nz> I have just installed R version 2.14.0 and tried to re-build and re-check some of the packages that I maintain. I'm getting a warning (in the process of running R CMD check on my "deldir" package): > * checking PDF version of manual ... WARNING > LaTeX errors when creating PDF version. > This typically indicates Rd problems. > LaTeX errors found: > ! Font T1/fi4/m/n/10=ec-inconsolata at 10.0pt not loadable: Metric > (TFM) file n > ot found. > > relax > l.19 ...lf Turner }\email{r.turner at auckland.ac.nz} > ! \textfont 0 is undefined (character h). > \Url at FormatString ...\Url at String \UrlRight \m at th $ > > l.26 ...\AsIs{}\url{http://www.math.unb.ca/~rolf/} > \AsIs{} > ! \textfont 0 is undefined (character t). > \Url at FormatString ...\Url at String \UrlRight \m at th $ > > l.26 ...\AsIs{}\url{http://www.math.unb.ca/~rolf/} > \AsIs{} > ! \textfont 0 is undefined (character t). ...... etc., etc., etc., ad (almost) infinitum. So there's some problem with a font file not being "loadable". Can anyone tell me what the I should ***do*** about this? I managed to install the "inconsolata" package from CTAN. At least I think I managed; I downloaded the *.zip file and then unzipped it in /usr/share/texmf/tex/latex. And ran "texhash". This stopped R CMD check from complaining that the "inconsolata" package could not be found, but then led to the further complaint described above. So how do I make the required font "loadable"? What files do I need? Where do I get them? And where should I put them once I've got them? I would be grateful for any assistance that can be rendered. (I know it's "just a warning", but I *hate* to ignore warnings!) cheers, Rolf Turner P. S. 
I'm running Ubuntu; session info, in case it's of any relevance is: sessionInfo() R version 2.14.0 (2011-10-31) Platform: i686-pc-linux-gnu (32-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=C LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] misc_0.0-15 From amtv.statpresi at gmail.com Thu Nov 3 07:09:19 2011 From: amtv.statpresi at gmail.com (amitava) Date: Wed, 2 Nov 2011 23:09:19 -0700 (PDT) Subject: [R] Problem with svyvar in survey package In-Reply-To: References: <1319455315982-3932818.post@n4.nabble.com> <1319537033336-3936205.post@n4.nabble.com> <1319792966972-3947306.post@n4.nabble.com> Message-ID: <1320300559994-3984390.post@n4.nabble.com> Thank you Sir. Now it is giving results correctly. --Amitava -- View this message in context: http://r.789695.n4.nabble.com/Problem-with-svyvar-in-survey-package-tp3932818p3984390.html Sent from the R help mailing list archive at Nabble.com. From debmidya at yahoo.com Thu Nov 3 05:02:46 2011 From: debmidya at yahoo.com (Deb Midya) Date: Wed, 2 Nov 2011 21:02:46 -0700 (PDT) Subject: [R] Extract Data from Yahoo Finance Message-ID: <1320292966.73541.YahooMailNeo@web161421.mail.bf1.yahoo.com> Hi R users, I am using R-2.14.0 on Windows XP. May I request you to assist me with the following, please. I would like to extract all the fields (example: a : Ask, b : Bid, ..., w : 52-week Range, x: Stock Exchange) for a certain period of time, say, 1 October 2011 to 31 October 2011. Are there any R packages or R scripts for this, please? Once again, thank you very much for the time you have given. Regards,
Deb From stefan.bienert at unibas.ch Thu Nov 3 00:56:22 2011 From: stefan.bienert at unibas.ch (Stefan Bienert) Date: Thu, 3 Nov 2011 00:56:22 +0100 Subject: [R] MD5 checksum, mirror Zuerich Message-ID: <51F2FA51-912F-494C-A2DE-75BE8083735D@unibas.ch> Hi there, I just downloaded the newest version of R for Mac from the mirror in Zuerich? checksums do not match. bye, stefan From flokke at live.de Wed Nov 2 21:46:10 2011 From: flokke at live.de (flokke) Date: Wed, 2 Nov 2011 13:46:10 -0700 (PDT) Subject: [R] problem with merging two matrices Message-ID: <1320266770417-3983136.post@n4.nabble.com> Dear all, I hope you can forgive me my stupid questions, but I am a very new R user (; So, this is my question: I have two matrices, those are: matrix1 <- matrix(cbind(vector1, vector2), 1,2, dimnames = list(c("values"), c("T value", "p value"))) matrix2 <- matrix(dcbind,2,6,dimnames = list(c("x", "y"), c("Min", "1st qu.", "Median", "Mean", "3rd qu.", "Max"))) Now, I would like to merge them, but I want to receive the following result: Min 1st qu. Median Mean 3rd qu. Max x 3 3 4 4 4 4 y 3 3 3 3 3 3 t value p value value 3 3 so both vectors should stand above each other... when I use merge() I dont get this result, also not with cbind or rbind. I neither can make a a data frame of the two matrices. I think that I should use the function array with dim(6,2,2), but I dont know how that is exactly working (I couldn make it working) I would be very glad if you could let me know how to solve this problem. Cheers, maria -- View this message in context: http://r.789695.n4.nabble.com/problem-with-merging-two-matrices-tp3983136p3983136.html Sent from the R help mailing list archive at Nabble.com. From GormleyA at landcareresearch.co.nz Thu Nov 3 04:47:04 2011 From: GormleyA at landcareresearch.co.nz (Andrew Gormley) Date: Thu, 3 Nov 2011 16:47:04 +1300 Subject: [R] text orientation in persp Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From ntcaumeran at yahoo.com Thu Nov 3 03:59:43 2011 From: ntcaumeran at yahoo.com (Nicholay Anne Caumeran) Date: Wed, 2 Nov 2011 19:59:43 -0700 Subject: [R] How much data can R process? Message-ID: <1320289183.50370.YahooMailNeo@web161018.mail.bf1.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From pierre-etienne.lessard.1 at ulaval.ca Wed Nov 2 20:24:29 2011 From: pierre-etienne.lessard.1 at ulaval.ca (PEL) Date: Wed, 2 Nov 2011 12:24:29 -0700 (PDT) Subject: [R] Creating barplot using time as X Message-ID: <1320261869546-3981961.post@n4.nabble.com> Hello all, my data looks like this: "phaseno" / "activity" / "beg" / "end" / "phasetime" 1 / "L" / 2010-06-03 19:15:24 / 2010-06-03 21:18:14 / 7370 2 / "D" / 2010-06-03 21:18:15 / 2010-06-03 21:19:55 / 100 3 / "W" / 2010-06-03 21:19:56 / 2010-06-03 21:22:47 / 171 4 / "D" / 2010-06-03 21:22:48 / 2010-06-03 21:23:47 / 59 5 / "W" / 2010-06-03 21:23:48 / 2010-06-03 21:23:53 / 5 6 / "D" / 2010-06-03 21:23:54 / 2010-06-03 21:26:18 / 144 7 / "W" / 2010-06-03 21:26:19 / 2010-06-03 21:32:10 / 351 8 / "D" / 2010-06-03 21:32:11 / 2010-06-03 21:32:11 / 0 9 / "W" / 2010-06-03 21:32:12 / 2010-06-03 21:32:12 / 0 10 / "D" / 2010-06-03 21:32:13 / 2010-06-03 21:32:29 / 16 Please note that phasetime is in seconds and is only a difftime() of "beg" and "end". I want to create a stacked bar chart that gives me the percentage of time spent doing every activity (L,D or W) for every 24h period. Example: Day 1: 20% =L; 40% = W ; 40% = D. In a graph obviously. What I was thinking was dividing my dataframe into several smaller dataframes spanning 24h. I tried with by() but I doubt this is the correct function. If possible, I would also like to separate a row the phasetime of a row if it overlaps two 24 periods. 
I joined part of my data where *header=TRUE and sep="\t"* http://r.789695.n4.nabble.com/file/n3981961/example.txt example.txt Thank you very much for your time PEL -- View this message in context: http://r.789695.n4.nabble.com/Creating-barplot-using-time-as-X-tp3981961p3981961.html Sent from the R help mailing list archive at Nabble.com. From rz1991 at foxmail.com Thu Nov 3 06:17:53 2011 From: rz1991 at foxmail.com (=?gbk?B?yO7vow==?=) Date: Thu, 3 Nov 2011 13:17:53 +0800 Subject: [R] Question about Calculation of Cross Product Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From bonda at hsu-hh.de Thu Nov 3 08:56:10 2011 From: bonda at hsu-hh.de (bonda) Date: Thu, 3 Nov 2011 00:56:10 -0700 (PDT) Subject: [R] nproc parameter in efpFunctional In-Reply-To: References: <1320228608589-3972419.post@n4.nabble.com> Message-ID: <1320306970603-3984605.post@n4.nabble.com> Thank you. I've understood, that it should be k (number of parameters) separate Brownian bridges. Is it possible, to get such separated/disaggregated processes also in function efp()? (one can take gefp(..., family=gaussian), or construct by myself residuals(lm.model)*X, but still interesting). And on the contrary, how can I get an aggregated Brownian bridge path for all parameters together, similar to efp()$process? It is made in plot.gefp, but only for graphical visualization... Thank you in advance! Julia -- View this message in context: http://r.789695.n4.nabble.com/nproc-parameter-in-efpFunctional-tp3972419p3984605.html Sent from the R help mailing list archive at Nabble.com. 
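For readers following this thread, a minimal base-R sketch of the hand-built construction Julia mentions, residuals(lm.model)*X, may help. It assumes simulated data, a plain lm() fit, and made-up names (d, fit, psi, proc, agg). It only illustrates the idea of k separate score-based paths plus one aggregated path; it is not strucchange's own implementation and may differ from it in scaling details.

```r
# Hand-rolled score-based fluctuation process for a linear model (base R only).
# All object names and the simulated data are illustrative assumptions.
set.seed(1)
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- 1 + 0.5 * d$x1 - 0.3 * d$x2 + rnorm(n)

fit <- lm(y ~ x1 + x2, data = d)
X   <- model.matrix(fit)      # n x k regressor matrix
psi <- residuals(fit) * X     # n x k matrix of OLS scores, one column per coefficient

# Decorrelate the scores with the inverse square root of their covariance estimate
J   <- crossprod(psi) / n
eig <- eigen(J, symmetric = TRUE)
J_inv_sqrt <- eig$vectors %*% diag(1 / sqrt(eig$values)) %*% t(eig$vectors)

# k separate paths: scaled cumulative sums, one column per coefficient
proc <- apply(psi %*% J_inv_sqrt, 2, cumsum) / sqrt(n)
colnames(proc) <- colnames(X)

# One aggregated path: largest absolute component at each time point
agg <- apply(abs(proc), 1, max)

dim(proc)
max(agg)
```

Each column of proc is one path per coefficient and returns to zero at the end, because the OLS scores sum to zero. Taking the largest absolute component at each time point is just one possible way to collapse the k paths into a single one.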
From pdalgd at gmail.com Thu Nov 3 09:31:53 2011 From: pdalgd at gmail.com (peter dalgaard) Date: Thu, 3 Nov 2011 09:31:53 +0100 Subject: [R] MD5 checksum, mirror Zuerich In-Reply-To: <51F2FA51-912F-494C-A2DE-75BE8083735D@unibas.ch> References: <51F2FA51-912F-494C-A2DE-75BE8083735D@unibas.ch> Message-ID: <8ADEE248-EA8F-4F02-B749-31A228AD9D0E@gmail.com> On Nov 3, 2011, at 00:56 , Stefan Bienert wrote: > Hi there, > > I just downloaded the newest version of R for Mac from the mirror in Zuerich? checksums do not match. They do for me. Perhaps you caught it in the middle of an update. Please retry. > > bye, > > stefan > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -- Peter Dalgaard, Professor Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com From dasolexa at hotmail.com Thu Nov 3 09:47:52 2011 From: dasolexa at hotmail.com (David A.) Date: Thu, 3 Nov 2011 09:47:52 +0100 Subject: [R] non-parametric sample size calculation Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From ken.knoblauch at inserm.fr Thu Nov 3 10:09:35 2011 From: ken.knoblauch at inserm.fr (Ken Knoblauch) Date: Thu, 3 Nov 2011 09:09:35 +0000 Subject: [R] palettes for the color-blind References: <4EB1BE69.7030700@witthoft.com> Message-ID: Max Kuhn gmail.com> writes: > > Yes, I was aware of the different type and their respective prevalences. > > The dichromat package helped me find what I needed. 
> > Thanks, > > Max > > On Wed, Nov 2, 2011 at 6:38 PM, Thomas Lumley uw.edu> wrote: > > On Thu, Nov 3, 2011 at 11:04 AM, Carl Witthoft witthoft.com> wrote: > >> > >> Before you pick out a palette: you are aware that there are several > >> different types of color-blindness, aren't you? > > > > Yes, but to first approximation there are only two, and they have > > broadly similar, though not identical impact on choice of color > > palettes. The dichromat package knows about them, and so does > > Professor Brewer. > > > > More people will be unable to read your graphs due to some kind of > > gross visual impairment (cataracts, uncorrected focusing problems, > > macular degeneration, etc) than will have tritanopia or monochromacy. > > > > -thomas > > Sorry to come into this late, but I was travelling. As indicated, the dichromat package will give you an excellent first order approximation as to what works or doesn't for the 3 types of congenital dichromacies. As indicated by Thomas, the two most prevalent varieties, protanopia and deuteranopia, result in similar confusion axes and the third, tritanopia, is relatively rare, except in eye disease. That said, the most prevalent color deficiencies are not the dichromacies but the anomalous trichromacies. These will not necessarily lead to losses in chromatic discrimination but just shifts (i.e., one might see as orange or green what a normal trichromat sees as yellow). About 20 years ago, I was involved in an attempt to develop guidelines (or rules of thumb, at least) for display design for color deficient observers, that did not require any deep understanding of colorimetry.
The distillation of this effort can be found here, for what it is worth: http://www.lighthouse.org/accessibility/design/accessible-print-design/effective-color-contrast The most important point, I think, was to make sure that there was a sufficient luminance contrast difference, so that in the absence of the capacity to make a chromatic discrimination, the differences would still be detectable. The principles necessary for optimizing color choices in a scatterplot will certainly be more complex, however. In this light (no pun intended), I would draw your attention to the seminal work of Bernice Rogowitz at IBM: http://www.research.ibm.com/people/l/lloydt/color/color.HTM http://www.research.ibm.com/dx/proceedings/pravda/truevis.htm who was (is) quite concerned with this issue, as well as to the excellent article by Zeileis, Hornik and Murrell http://statmath.wu.ac.at/~zeileis/papers/Zeileis+Hornik+Murrell-2009.pdf HTH, Ken -- Ken Knoblauch Inserm U846 Stem-cell and Brain Research Institute Department of Integrative Neurosciences 18 avenue du Doyen Lépine 69500 Bron France tel: +33 (0)4 72 91 34 77 fax: +33 (0)4 72 91 34 61 portable: +33 (0)6 84 10 64 10 http://www.sbri.fr/members/kenneth-knoblauch.html From rolf.turner at xtra.co.nz Thu Nov 3 10:20:17 2011 From: rolf.turner at xtra.co.nz (Rolf Turner) Date: Thu, 3 Nov 2011 22:20:17 +1300 Subject: [R] How much data can R process? In-Reply-To: <1320289183.50370.YahooMailNeo@web161018.mail.bf1.yahoo.com> References: <1320289183.50370.YahooMailNeo@web161018.mail.bf1.yahoo.com> Message-ID: <4EB25CD1.1070401@xtra.co.nz> On 03/11/11 15:59, Nicholay Anne Caumeran wrote: > Would like to know how much data can R process - number of rows and columns? How long is a piece of string?
cheers, Rolf Turner From Achim.Zeileis at uibk.ac.at Thu Nov 3 10:22:51 2011 From: Achim.Zeileis at uibk.ac.at (Achim Zeileis) Date: Thu, 3 Nov 2011 10:22:51 +0100 (CET) Subject: [R] nproc parameter in efpFunctional In-Reply-To: <1320306970603-3984605.post@n4.nabble.com> References: <1320228608589-3972419.post@n4.nabble.com> <1320306970603-3984605.post@n4.nabble.com> Message-ID: On Thu, 3 Nov 2011, bonda wrote: > Thank you. I've understood, that it should be k (number of parameters) > separate Brownian bridges. Well, if you use a process based on OLS residuals, you have always a one-dimensional process even though your model has k parameters. Hence, the two parameters are really conceptually different.. > Is it possible, to get such separated/disaggregated processes also in > function efp()? (one can take gefp(..., family=gaussian), or construct by > myself residuals(lm.model)*X, but still interesting). Some processes that efp() computes are always 1-dimensional (namely those based on residuals) while some are k-dimensional (namely the estimates-based processes) and some are (k+1)-dimensional (the score-based processes). gefp() generalizes this concept and lets you construct the fluctuation processes fairly flexibly. > And on the contrary, how can I get an aggregated Brownian bridge path > for all parameters together, similar to efp()$process? It is made in > plot.gefp, but only for graphical visualization... For "gefp" objects all aggregation is done by the efpFunctional employed. But this is really described in a fair amount of detail in the accompanying papers. Specifically, for gefp/efpFunctional in the 2006 CSDA paper. > Thank you in advance! > Julia > > -- > View this message in context: http://r.789695.n4.nabble.com/nproc-parameter-in-efpFunctional-tp3972419p3984605.html > Sent from the R help mailing list archive at Nabble.com. 
> > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From pburns at pburns.seanet.com Thu Nov 3 10:35:32 2011 From: pburns at pburns.seanet.com (Patrick Burns) Date: Thu, 03 Nov 2011 09:35:32 +0000 Subject: [R] array manipulation In-Reply-To: References: Message-ID: <4EB26064.3040106@pburns.seanet.com> You seem to be looking for 'aperm'. There is a chapter in 'S Poetry' (available on http://www.burns-stat.com) that talks about working with higher dimensional arrays. I don't think any changes need to be made for R. On 02/11/2011 16:16, Simone Salvadei wrote: > Hello, > I'm at the very beginning of the learning process of this language. > Sorry in advance for the (possible but plausible) stupidity of my question. > > I would like to find a way to permute the DIMENSIONS of an array. > Something that sounds like the function "permute()" in matlab. > > Given an array C of dimensions c x d x T , for instance, the command > > permute(C, [2 1 3]) > > would provide (in Matlab) an array very similar to C, but this time each > one of the T matrices c x d has changed into its transposed. > Any alternatives to the following (and primitive) 'for' cycle? > > *# (previously defined) phi=array with dimensions c(c,d,T)* > * > * > *temp=array(0,dim=c(c,d,T))* > * for(i in 1:T)* > * {* > * temp[,,i]=t(phi[,,i])* > * }* > * phi=temp* > * > * > > Thank you very much! > S > -- Patrick Burns pburns at pburns.seanet.com twitter: @portfolioprobe http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of 'Some hints for the R beginner' and 'The R Inferno') From jim at bitwrit.com.au Thu Nov 3 11:32:17 2011 From: jim at bitwrit.com.au (Jim Lemon) Date: Thu, 3 Nov 2011 21:32:17 +1100 Subject: [R] How much data can R process? 
In-Reply-To: <4EB25CD1.1070401@xtra.co.nz> References: <1320289183.50370.YahooMailNeo@web161018.mail.bf1.yahoo.com> <4EB25CD1.1070401@xtra.co.nz> Message-ID: <4EB26DB1.8070002@bitwrit.com.au> On 11/03/2011 08:20 PM, Rolf Turner wrote: > On 03/11/11 15:59, Nicholay Anne Caumeran wrote: >> Would like to know how much data can R process - number of rows and >> columns? > > How long is a piece of string? > Around 10e-35 meters. Jim From tal.galili at gmail.com Thu Nov 3 11:55:17 2011 From: tal.galili at gmail.com (Tal Galili) Date: Thu, 3 Nov 2011 12:55:17 +0200 Subject: [R] RGoogleTrends error in "getGTrends" Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From ligges at statistik.tu-dortmund.de Thu Nov 3 12:24:40 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Thu, 03 Nov 2011 12:24:40 +0100 Subject: [R] mysterious warning message regarding bytecode... In-Reply-To: References: Message-ID: <4EB279F8.3030103@statistik.tu-dortmund.de> On 02.11.2011 23:51, Justin Haynes wrote: > While running a long script which source()s other scripts I get the > following warning: > > Warning message: > In t(object$S[[1]]) : bytecode version mismatch; using eval Are you using byte compiled code from some package or is the source code byte compiled at runtime? If the former, I guess the packages was installed under another version of R than the version of R you are using. Anyway, nothing to reproduce here nor a suficient description what you really did. Best, Uwe Ligges > > I cannot replicate it if I run the sourced files line by line though... > > What is that error? And do I care about it? It doesn't seem to > affect my output as far as I can tell. > > > Thanks! 
> Justin > > >> sessionInfo() > R version 2.13.2 (2011-09-30) > Platform: x86_64-pc-linux-gnu (64-bit) > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 > [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 > LC_PAPER=en_US.UTF-8 LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C > LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > attached base packages: > [1] grid stats graphics grDevices utils datasets > methods base > > other attached packages: > [1] mgcv_1.7-9 stringr_0.5 RPostgreSQL_0.2-0 biglm_0.8 > DBI_0.2-5 doMC_1.2.3 multicore_0.1-7 > [8] foreach_1.3.2 codetools_0.2-8 iterators_1.0.5 > cairoDevice_2.19 pixmap_0.4-11 gridExtra_0.8.5 splancs_2.01-29 > [15] sp_0.9-91 ellipse_0.3-5 ggplot2_0.8.9 > proto_0.3-9.2 reshape_0.8.4 plyr_1.6 MASS_7.3-14 > > loaded via a namespace (and not attached): > [1] compiler_2.13.2 digest_0.5.1 lattice_0.19-33 Matrix_1.0-1 nlme_3.1-102 > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From sarah.goslee at gmail.com Thu Nov 3 12:21:54 2011 From: sarah.goslee at gmail.com (Sarah Goslee) Date: Thu, 3 Nov 2011 07:21:54 -0400 Subject: [R] problem with merging two matrices In-Reply-To: <1320266770417-3983136.post@n4.nabble.com> References: <1320266770417-3983136.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From mtmorgan at fhcrc.org Thu Nov 3 13:24:26 2011 From: mtmorgan at fhcrc.org (Martin Morgan) Date: Thu, 03 Nov 2011 05:24:26 -0700 Subject: [R] Error: serialization is too large to store in a raw vector In-Reply-To: <1320273427.56875.YahooMailNeo@web120104.mail.ne1.yahoo.com> References: <1320273427.56875.YahooMailNeo@web120104.mail.ne1.yahoo.com> Message-ID: <4EB287FA.9060305@fhcrc.org> On 11/02/2011 03:37 PM, Alaios wrote: > Dear all, > I have quite large code (with lapply and mclapply) > and I am getting the following error. > > Error: serialization is too large to store in a raw vector > > Is it possible to ask from R to extend the Error messages with more details? > I would like to see where this problem exists. This is likely from the return value of mclapply's FUN: parallel:::sendMaster tries to serialize it, and fails. serialize(integer(.Machine$integer.max / 4), NULL, TRUE) do further data reduction before trying to return the results (probably a parallel 'best practices' anyway). Neither traceback() nor options(error=recover) deal gracefully with mclapply errors like this. Hope that helps, Martin > > B.R > Alex > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -- Computational Biology Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: M1-B861 Telephone: 206 667-2793 From sas0025 at auburn.edu Thu Nov 3 13:54:18 2011 From: sas0025 at auburn.edu (Stephen Sefick) Date: Thu, 03 Nov 2011 07:54:18 -0500 Subject: [R] grep fixed (?) 
in 2.14 Message-ID: <4EB28EFA.9030209@auburn.edu> #This is probably due to my incomplete understanding of grep, but the below code has been working for some time to #search for .R with anything in front of it and return a list of scripts to source. Likely, the syntax for the #grep statement has been wrong all along. scripts2source <- (c("/home/ssefick/R_scripts/Convert_package.R", "/home/ssefick/R_scripts/Convert_R_CODE", "/home/ssefick/R_scripts/CV.R", "/home/ssefick/R_scripts/cvs.out.R", "/home/ssefick/R_scripts/database_connect", "/home/ssefick/R_scripts/database_connect_package.R", "/home/ssefick/R_scripts/exit_db.R", "/home/ssefick/R_scripts/exit.R", "/home/ssefick/R_scripts/hourly_zoo.R", "/home/ssefick/R_scripts/model_diag.R", "/home/ssefick/R_scripts/not_numeric.R", "/home/ssefick/R_scripts/num_ecol_package.R", "/home/ssefick/R_scripts/NumEcolR_scripts", "/home/ssefick/R_scripts/old_scripts_DELETE_AFTER_DECEMBER", "/home/ssefick/R_scripts/only_numeric_dataframe.R", "/home/ssefick/R_scripts/only_numeric.R", "/home/ssefick/R_scripts/PCA.ve.R", "/home/ssefick/R_scripts/poster_ggplot2_theme.R", "/home/ssefick/R_scripts/pressure_transducer_package.R", "/home/ssefick/R_scripts/Pressure_Transducer_R_CODE", "/home/ssefick/R_scripts/publication_ggplot2_theme.R", "/home/ssefick/R_scripts/r2test.R", "/home/ssefick/R_scripts/recession_constant_package.R", "/home/ssefick/R_scripts/recession_constant_R_CODE", "/home/ssefick/R_scripts/serdp_name_split.R", "/home/ssefick/R_scripts/setup_R.R", "/home/ssefick/R_scripts/USGS.R")) scripts2source[grep("*.R", scripts2source)] #here is my problem; I would like these to be removed. scripts2source[c(2,5,13,14,20,24)] #Thanks for all of your help in advance #kindest regards, #Stephen Sefick From info at aghmed.fsnet.co.uk Thu Nov 3 14:02:45 2011 From: info at aghmed.fsnet.co.uk (Michael Dewey) Date: Thu, 3 Nov 2011 13:02:45 +0000 Subject: [R] How much data can R process? 
In-Reply-To: <1320289183.50370.YahooMailNeo@web161018.mail.bf1.yahoo.com>
References: <1320289183.50370.YahooMailNeo@web161018.mail.bf1.yahoo.com>
Message-ID: 

At 02:59 03/11/2011, Nicholay Anne Caumeran wrote:
>Would like to know how much data can R process - number of rows and columns?

Nicholay,
You are going to have to give us a lot more information before anyone
can even attempt to answer this. Assuming you have tried to analyse
your data and failed for some reason you might want to look at the
High Performance Computing task view on CRAN and examine the section
entitled large memory and out of memory data.

> [[alternative HTML version deleted]]

Michael Dewey
info at aghmed.fsnet.co.uk
http://www.aghmed.fsnet.co.uk/home.html

From sarah.goslee at gmail.com Thu Nov 3 14:01:32 2011
From: sarah.goslee at gmail.com (Sarah Goslee)
Date: Thu, 3 Nov 2011 09:01:32 -0400
Subject: [R] grep fixed (?) in 2.14
In-Reply-To: <4EB28EFA.9030209@auburn.edu>
References: <4EB28EFA.9030209@auburn.edu>
Message-ID: 

Hi,

. and * both mean something different in regular expressions than in
command-line wildcards.

You need:
grep(".*\\.R$", scripts2source)

which parses to
.   - any character
*   - any number of times
\\. - an actual . escaped as R requires
R   - the R denoting a script
$   - at the end of the string

> grep(".*\\.R$", scripts2source)
 [1]  1  3  4  6  7  8  9 10 11 12 15 16 17 18 19 21 22 23 25 26 27

Sarah

On Thu, Nov 3, 2011 at 8:54 AM, Stephen Sefick wrote:
> #This is probably due to my incomplete understanding of grep, but the below
> code has been working for some time to
> #search for .R with anything in front of it and return a list of scripts to
> source.  Likely, the syntax for the
> #grep statement has been wrong all along.
> > scripts2source <- (c("/home/ssefick/R_scripts/Convert_package.R", > "/home/ssefick/R_scripts/Convert_R_CODE", > "/home/ssefick/R_scripts/CV.R", "/home/ssefick/R_scripts/cvs.out.R", > "/home/ssefick/R_scripts/database_connect", > "/home/ssefick/R_scripts/database_connect_package.R", > "/home/ssefick/R_scripts/exit_db.R", "/home/ssefick/R_scripts/exit.R", > "/home/ssefick/R_scripts/hourly_zoo.R", > "/home/ssefick/R_scripts/model_diag.R", > "/home/ssefick/R_scripts/not_numeric.R", > "/home/ssefick/R_scripts/num_ecol_package.R", > "/home/ssefick/R_scripts/NumEcolR_scripts", > "/home/ssefick/R_scripts/old_scripts_DELETE_AFTER_DECEMBER", > "/home/ssefick/R_scripts/only_numeric_dataframe.R", > "/home/ssefick/R_scripts/only_numeric.R", > "/home/ssefick/R_scripts/PCA.ve.R", > "/home/ssefick/R_scripts/poster_ggplot2_theme.R", > "/home/ssefick/R_scripts/pressure_transducer_package.R", > "/home/ssefick/R_scripts/Pressure_Transducer_R_CODE", > "/home/ssefick/R_scripts/publication_ggplot2_theme.R", > "/home/ssefick/R_scripts/r2test.R", > "/home/ssefick/R_scripts/recession_constant_package.R", > "/home/ssefick/R_scripts/recession_constant_R_CODE", > "/home/ssefick/R_scripts/serdp_name_split.R", > "/home/ssefick/R_scripts/setup_R.R", > "/home/ssefick/R_scripts/USGS.R")) > > scripts2source[grep("*.R", scripts2source)] > > #here is my problem; ?I would like these to be removed. > scripts2source[c(2,5,13,14,20,24)] > > #Thanks for all of your help in advance > #kindest regards, > > #Stephen Sefick > -- Sarah Goslee http://www.functionaldiversity.org From jholtman at gmail.com Thu Nov 3 14:03:51 2011 From: jholtman at gmail.com (jim holtman) Date: Thu, 3 Nov 2011 09:03:51 -0400 Subject: [R] grep fixed (?) in 2.14 In-Reply-To: <4EB28EFA.9030209@auburn.edu> References: <4EB28EFA.9030209@auburn.edu> Message-ID: your syntax is wrong, you need: scripts2source[grep("*\\.R$", scripts2source)] Notice the '\\.' 
to escape the special meaning of '.', and the "$" to anchor to the end of the line. On Thu, Nov 3, 2011 at 8:54 AM, Stephen Sefick wrote: > #This is probably due to my incomplete understanding of grep, but the below > code has been working for some time to > #search for .R with anything in front of it and return a list of scripts to > source. ?Likely, the syntax for the > #grep statement has been wrong all along. > > scripts2source <- (c("/home/ssefick/R_scripts/Convert_package.R", > "/home/ssefick/R_scripts/Convert_R_CODE", > "/home/ssefick/R_scripts/CV.R", "/home/ssefick/R_scripts/cvs.out.R", > "/home/ssefick/R_scripts/database_connect", > "/home/ssefick/R_scripts/database_connect_package.R", > "/home/ssefick/R_scripts/exit_db.R", "/home/ssefick/R_scripts/exit.R", > "/home/ssefick/R_scripts/hourly_zoo.R", > "/home/ssefick/R_scripts/model_diag.R", > "/home/ssefick/R_scripts/not_numeric.R", > "/home/ssefick/R_scripts/num_ecol_package.R", > "/home/ssefick/R_scripts/NumEcolR_scripts", > "/home/ssefick/R_scripts/old_scripts_DELETE_AFTER_DECEMBER", > "/home/ssefick/R_scripts/only_numeric_dataframe.R", > "/home/ssefick/R_scripts/only_numeric.R", > "/home/ssefick/R_scripts/PCA.ve.R", > "/home/ssefick/R_scripts/poster_ggplot2_theme.R", > "/home/ssefick/R_scripts/pressure_transducer_package.R", > "/home/ssefick/R_scripts/Pressure_Transducer_R_CODE", > "/home/ssefick/R_scripts/publication_ggplot2_theme.R", > "/home/ssefick/R_scripts/r2test.R", > "/home/ssefick/R_scripts/recession_constant_package.R", > "/home/ssefick/R_scripts/recession_constant_R_CODE", > "/home/ssefick/R_scripts/serdp_name_split.R", > "/home/ssefick/R_scripts/setup_R.R", > "/home/ssefick/R_scripts/USGS.R")) > > scripts2source[grep("*.R", scripts2source)] > > #here is my problem; ?I would like these to be removed. 
> scripts2source[c(2,5,13,14,20,24)]
>
> #Thanks for all of your help in advance
> #kindest regards,
>
> #Stephen Sefick
>

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

From michael.weylandt at gmail.com Thu Nov 3 14:12:22 2011
From: michael.weylandt at gmail.com (R. Michael Weylandt)
Date: Thu, 3 Nov 2011 09:12:22 -0400
Subject: [R] Question about Calculation of Cross Product
In-Reply-To: 
References: 
Message-ID: 

Try

library(pracma)
?cross

Michael Weylandt

On Thu, Nov 3, 2011 at 1:17 AM, ?? wrote:
> The crossprod function in R puzzles me.
> I would like to calculate the cross product of two vectors. My textbook defines it like this:
> a = (ax, ay, az)
> b = (bx, by, bz)
> then, the cross product of a and b is:
> a X b = (ay*bz-az*by, az*bx-ax*bz, ax*by-ay*bx)
> It can also be written in determinant form.
>
>
> But the crossprod and tcrossprod functions in R do not appear to calculate the cross product. Suppose I enter this:
>> a = c(1, 2, 3)
>> b = c(2, 3, 4)
>> ab = crossprod(a, b)
>> ab
>      [,1]
> [1,]   20
>> ab = tcrossprod(a, b)
>> ab
>      [,1] [,2] [,3]
> [1,]    2    3    4
> [2,]    4    6    8
> [3,]    6    9   12
>
>
> Does anyone know how to calculate the cross product in R? Or do I have to write the function myself?
>        [[alternative HTML version deleted]]
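
For readers who prefer not to install pracma, the textbook formula quoted above can be sketched in base R. The helper name cross3 is hypothetical and not part of any package. crossprod(a, b) computes t(a) %*% b (the dot product, hence the 20) and tcrossprod(a, b) computes a %*% t(b) (the outer product, hence the 3 x 3 matrix), so neither gives the vector product.

```r
# Hypothetical helper, not from base R or pracma: cross product of two
# length-3 numeric vectors, written out component by component.
cross3 <- function(a, b) {
  stopifnot(is.numeric(a), is.numeric(b), length(a) == 3, length(b) == 3)
  c(a[2] * b[3] - a[3] * b[2],
    a[3] * b[1] - a[1] * b[3],
    a[1] * b[2] - a[2] * b[1])
}

a <- c(1, 2, 3)
b <- c(2, 3, 4)
ab <- cross3(a, b)
ab                      # -1  2 -1

# Sanity check: the result is orthogonal to both inputs.
sum(ab * a)             # 0
sum(ab * b)             # 0
```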
> From michael.weylandt at gmail.com Thu Nov 3 14:13:24 2011 From: michael.weylandt at gmail.com (R. Michael Weylandt) Date: Thu, 3 Nov 2011 09:13:24 -0400 Subject: [R] Extract Data from Yahoo Finance In-Reply-To: <1320292966.73541.YahooMailNeo@web161421.mail.bf1.yahoo.com> References: <1320292966.73541.YahooMailNeo@web161421.mail.bf1.yahoo.com> Message-ID: The quantmod package can probably do what you are asking, but it's a little hard to be certain since you provide neither a list of all the fields you are actually talking about nor a link to the page with the fields in question. Michael On Thu, Nov 3, 2011 at 12:02 AM, Deb Midya wrote: > Hi R ?users, > > I am using R-2.14.0 on Windows XP. > > May I request you to assist me for the following please. > > I like to extract all the fields (example: a : Ask, b : Bid, ??, w : 52-week Range, x: Stock Exchange) ?for certain period of time, say, 1 October 2011 to 31 October 2011. > > Is there any R-Package(s) & any R- script please? > > Once again, thank you very much for the time you have given. > > Regards, > > Deb > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From gunter.berton at gene.com Thu Nov 3 14:40:55 2011 From: gunter.berton at gene.com (Bert Gunter) Date: Thu, 3 Nov 2011 06:40:55 -0700 Subject: [R] Lattice plots and missing x-axis labels on second page In-Reply-To: <00A769DD3A6D594C9902BC12EF1CAF8E064CE722@SOAANCMSG01.soa.alaska.gov> References: <00A769DD3A6D594C9902BC12EF1CAF8E064CE6EC@SOAANCMSG01.soa.alaska.gov> <00A769DD3A6D594C9902BC12EF1CAF8E064CE722@SOAANCMSG01.soa.alaska.gov> Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available
URL: 

From Thorn.Thaler at rdls.nestle.com Thu Nov 3 14:48:26 2011
From: Thorn.Thaler at rdls.nestle.com (Thaler, Thorn, LAUSANNE, Applied Mathematics)
Date: Thu, 3 Nov 2011 14:48:26 +0100
Subject: [R] Select columns of a data.frame by name OR index in a function
Message-ID: 

Dear all,

Sometimes I have the situation where a function takes a data.frame and
an additional argument describing some columns. For greater flexibility
I want to allow for either column names or column indices. What I
usually do then is something like the following:

-------------8<-------------
f <- function(datf, cols) {
  nc <- seq_along(datf)
  cn <- colnames(datf)
  colOK <- (cols %in% nc) | (cols %in% cn)
  if (!all(colOK)) {
    badc <- paste(sQuote(cols[!colOK]), collapse = ", ")
    msg <- sprintf(ngettext(sum(!colOK),
                            "%s is not a valid column selector",
                            "%s are not valid column selectors"),
                   badc)
    stop(msg)
  }
  which((nc %in% cols) | (cn %in% cols)) # with this set of indices I would work in the rest of the code
}

dd <- data.frame(a=1, b=1, c=1)
f(dd, 2:3)
# [1] 2 3
f(dd, 1:4)
# Error in f(dd, 1:4) : '4' is not a valid column selector
f(dd, "a")
# [1] 1
f(dd, c("a", "d", "e"))
# Error in f(dd, c("a", "d", "e")) : 'd', 'e' are not valid column selectors
------------->8-------------

So my question is whether there are smarter/better/easier/more R-like
ways of doing that? Any input appreciated.

KR,

-Thorn

From sas0025 at auburn.edu Thu Nov 3 15:01:36 2011
From: sas0025 at auburn.edu (Stephen Sefick)
Date: Thu, 03 Nov 2011 09:01:36 -0500
Subject: [R] grep fixed (?) in 2.14
In-Reply-To: 
References: <4EB28EFA.9030209@auburn.edu>
Message-ID: <4EB29EC0.7060207@auburn.edu>

That did the trick. I have read about regular expressions often, and
sometimes I get them right and sometimes I don't. Is there a good
reference resource that anyone could suggest?
Thanks for all of the help.
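
As a starting point for the reference question, R's own help page ?regex documents the syntax. The difference between the original pattern and the corrected one can also be seen on a shortened copy of the list; the five paths below are taken from the thread, and the variable names are only illustrative.

```r
# Shortened, illustrative subset of the scripts2source vector from the thread.
x <- c("/home/ssefick/R_scripts/CV.R",
       "/home/ssefick/R_scripts/Convert_R_CODE",
       "/home/ssefick/R_scripts/database_connect",
       "/home/ssefick/R_scripts/r2test.R",
       "/home/ssefick/R_scripts/NumEcolR_scripts")

# An unescaped "." matches any character, so ".R" already matches the
# "/R" in "/R_scripts" and nothing is filtered out.
loose <- grepl(".R", x)

# "\\.R$": a literal dot, then R, anchored at the end of the string.
keep <- grep("\\.R$", x)
scripts <- x[keep]
scripts
```

Only the two entries that really end in .R survive the anchored pattern, while the unanchored one accepts every path.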
Stephen Sefick On 11/03/2011 08:03 AM, jim holtman wrote: > your syntax is wrong, you need: > > scripts2source[grep("*\\.R$", scripts2source)] > > Notice the '\\.' to escape the special meaning of '.', and the "$" to > anchor to the end of the line. > > > On Thu, Nov 3, 2011 at 8:54 AM, Stephen Sefick wrote: >> #This is probably due to my incomplete understanding of grep, but the below >> code has been working for some time to >> #search for .R with anything in front of it and return a list of scripts to >> source. Likely, the syntax for the >> #grep statement has been wrong all along. >> >> scripts2source<- (c("/home/ssefick/R_scripts/Convert_package.R", >> "/home/ssefick/R_scripts/Convert_R_CODE", >> "/home/ssefick/R_scripts/CV.R", "/home/ssefick/R_scripts/cvs.out.R", >> "/home/ssefick/R_scripts/database_connect", >> "/home/ssefick/R_scripts/database_connect_package.R", >> "/home/ssefick/R_scripts/exit_db.R", "/home/ssefick/R_scripts/exit.R", >> "/home/ssefick/R_scripts/hourly_zoo.R", >> "/home/ssefick/R_scripts/model_diag.R", >> "/home/ssefick/R_scripts/not_numeric.R", >> "/home/ssefick/R_scripts/num_ecol_package.R", >> "/home/ssefick/R_scripts/NumEcolR_scripts", >> "/home/ssefick/R_scripts/old_scripts_DELETE_AFTER_DECEMBER", >> "/home/ssefick/R_scripts/only_numeric_dataframe.R", >> "/home/ssefick/R_scripts/only_numeric.R", >> "/home/ssefick/R_scripts/PCA.ve.R", >> "/home/ssefick/R_scripts/poster_ggplot2_theme.R", >> "/home/ssefick/R_scripts/pressure_transducer_package.R", >> "/home/ssefick/R_scripts/Pressure_Transducer_R_CODE", >> "/home/ssefick/R_scripts/publication_ggplot2_theme.R", >> "/home/ssefick/R_scripts/r2test.R", >> "/home/ssefick/R_scripts/recession_constant_package.R", >> "/home/ssefick/R_scripts/recession_constant_R_CODE", >> "/home/ssefick/R_scripts/serdp_name_split.R", >> "/home/ssefick/R_scripts/setup_R.R", >> "/home/ssefick/R_scripts/USGS.R")) >> >> scripts2source[grep("*.R", scripts2source)] >> >> #here is my problem; I would like these to be 
removed. >> scripts2source[c(2,5,13,14,20,24)] >> >> #Thanks for all of your help in advance >> #kindest regards, >> >> #Stephen Sefick >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > From marc_schwartz at me.com Thu Nov 3 15:03:56 2011 From: marc_schwartz at me.com (Marc Schwartz) Date: Thu, 03 Nov 2011 09:03:56 -0500 Subject: [R] non-parametric sample size calculation In-Reply-To: References: Message-ID: On Nov 3, 2011, at 3:47 AM, David A. wrote: > > Hi, > > I am trying to estimate the sample size needed for the comparison of two groups on a certain measurement, given some previous data at hand. I find that the data collected does not follow a normal distribution, so I would like to use a non-parametric option for sample size calculation. > > I found the pwr package but I don't think it has this option and on the internet found that http://www.epibiostat.ucsf.edu/biostat/sampsize.html says only PASS allows non-parametric sample size calculations (although the webpage is not updated). > > Any help would be greatly appreciated > > Thanks, > > Dave The first question is how "non normal" are your data? If you used some formal test for normality and the p value was <=0.05, I would suggest that you search the R-Help archives for a plethora of discussions on testing for normality. You will find that such tests should largely not be used in deference to the question "Are the data normal enough?". If they are or can be transformed reasonably, use standard functions for calculating power and sample size, such as power.t.test(). 
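
A minimal sketch of that parametric route, under assumed planning values: a difference of 0.5 with a common standard deviation of 1 is chosen purely for illustration and does not come from Dave's data. The last step inflates the result by an asymptotic relative efficiency, in the spirit of the adjustment described in the next paragraph; the value 0.864 is an assumption here (it is the known worst-case efficiency of the Wilcoxon-Mann-Whitney test relative to the t test), not something taken from the thread.

```r
# Assumed planning values, for illustration only.
delta <- 0.5   # difference in means to detect
sigma <- 1     # common standard deviation

# Two-sample, two-sided t test; leaving n unspecified makes power.t.test
# solve for the per-group sample size.
pt <- power.t.test(delta = delta, sd = sigma, sig.level = 0.05, power = 0.80)
n_t <- pt$n

# Assumed relative efficiency of the rank test versus the t test.
are <- 0.864
n_np <- ceiling(n_t / are)

c(t_test = ceiling(n_t), rank_test = n_np)
```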
If you need to use a non-parametric test, you might want to review this page by Jerry Dallal:

  http://www.jerrydallal.com/LHSP/npar.htm

which has some general guidelines for calculating sample size predicated upon using standard parametric tests and then adjusting the sample size using the ARE (asymptotic relative efficiency) of the non-parametric test intended.

HTH,

Marc Schwartz

From ligges at statistik.tu-dortmund.de Thu Nov 3 15:08:39 2011
From: ligges at statistik.tu-dortmund.de (Uwe Ligges)
Date: Thu, 03 Nov 2011 15:08:39 +0100
Subject: [R] HOW TO REMOVE MTEXT FROM PLOT, plotting changing populations with titles in loop
In-Reply-To: 
References: <1320261304220-3981757.post@n4.nabble.com>
Message-ID: <4EB2A067.7010504@statistik.tu-dortmund.de>

On 02.11.2011 20:29, Sarah Goslee wrote:
> It's not perfect, but you could use:
>
> mtext(paste("this is iteration ", i, sep=""), col="white")
>
> to overwrite it, or polygon() to draw a white rectangle over the text each time.

The question is whether it would be better to redo the whole plot and
just add the one text at the end, at least if you want to plot to a
non-screen device: you end up with all those layers of text in the
output, which makes it larger and additionally slows down the rendering
of the whole plot.

Uwe Ligges

> Sarah
>
> On Wed, Nov 2, 2011 at 3:15 PM, prinzOfNorway wrote:
>> is there a way to hide/undraw mtext (or lines etc.) in a loop like
>>
>> plot(runif(10))
>> iterCol<- rainbowPalette(10)
>>
>> for(i in 1:10){
>>
>> mtext(paste("this is iteration ", i, sep=""))
>> points(runif(10),col=iterCol[i])
>> Sys.sleep(1)
>>
>> ## UNDRAW/HIDE the text so that it does not mess up the plot in the next
>> iteration
>>
>> }
>>
>
From rmh at temple.edu Thu Nov 3 15:58:28 2011
From: rmh at temple.edu (Richard M.
Heiberger)
Date: Thu, 3 Nov 2011 10:58:28 -0400
Subject: [R] Creating barplot using time as X
In-Reply-To: <1320261869546-3981961.post@n4.nabble.com>
References: <1320261869546-3981961.post@n4.nabble.com>
Message-ID: 

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: 

From delre at wisc.edu Thu Nov 3 16:54:52 2011
From: delre at wisc.edu (AC Del Re)
Date: Thu, 3 Nov 2011 08:54:52 -0700
Subject: [R] Take variables in data.frame and create list of matrices
Message-ID: 

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: 

From JRadinger at gmx.at Thu Nov 3 16:55:08 2011
From: JRadinger at gmx.at (Johannes Radinger)
Date: Thu, 03 Nov 2011 16:55:08 +0100
Subject: [R] variable transformation for lm
Message-ID: <20111103155508.249620@gmx.net>

Hello,

I am doing a simple regression using lm(Y~X).
My response and my predictor seem to be skewed
and I can't meet the model assumptions. Therefore
I need to transform my variables.

I wanted to ask what is the preferred way to find out
if predictor and/or response needs to be transformed
and if yes how (log-transform?).

I found a procedure in "A Modern Approach to Regression
with R" (Sheather, 2009): There they suggest an approach
with the function bctrans from alr3...but it seems that it
is deprecated. So what is the best way (box-cox test) to find the best
transformation for predictor and response simultaneously?
AFAIK boxcox from MASS is only used for transformation
of the predictor?

Thank you very much
Johannes

--

From marc_schwartz at me.com Thu Nov 3 17:12:58 2011
From: marc_schwartz at me.com (Marc Schwartz)
Date: Thu, 03 Nov 2011 11:12:58 -0500
Subject: [R] Sample size calculations for one sided binomial exact test
Message-ID: 

From: https://stat.ethz.ch/pipermail/r-help/2011-November/294329.html

> I'm trying to compute sample size requirements for a binomial exact test.
> we want to show that the proportion is at least 90% assuming that it is
> 95%, with 80% power so any asymptotic approximations are out of the
> question. I was planning on using binom.test to perform the simple test
> against a prespecified value, but cannot find any functions for computing
> sample size. do any exist?
>
> Thanks,
> Andrew

Hi,

I don't have the original e-mail, so this reply will be out of the thread in the archive.

I am not aware of anything pre-existing in R for this application, but stand to be corrected on that point.

There are at least two "non-R" related options:

1. The G*Power program which is available from:

  http://www.psycho.uni-duesseldorf.de/abteilungen/aap/gpower3/

2. There is a paper by A'Hern which contains sample size tables here:

  Sample size tables for exact single-stage phase II designs
  R.P. A'Hern
  STATISTICS IN MEDICINE
  Statist. Med. 2001; 20:859-866
  http://stat.ethz.ch/education/semesters/as2011/bio/ahernSampleSize.pdf

The author used the BINOMDIST function in Excel to derive the tables. Notwithstanding criticisms of Excel, these tables, based upon a small test sample, agree with the G*Power program, as well as my own computations using R code below. He also uses normal approximations for sample sizes >300, given the limitations found in the BINOMDIST function.

Here is my R code for deriving the critical value and sample size for a one sided exact binomial test, given an alpha, a null proportion, an alternate proportion and the desired power:

# The possible sample size vector N needs to be selected in such a fashion
# that it covers the possible range of values that include the true
# minima. My example here does so with a finite range and makes the
# plot easier to visualize.
N <- 100:200

Alpha <- 0.05
Pow <- 0.8
p0 <- 0.90
p1 <- 0.95

# Required number of events, given a vector of sample sizes (N)
# to be considered at the null proportion, for the given Alpha
CritVal <- qbinom(p = 1 - Alpha, size = N, prob = p0)

# Get Beta (Type II error) for each N at the alternate hypothesis
# proportion
Beta <- pbinom(CritVal, N, p1)

# Get the Power
Power <- 1 - Beta

# Find the smallest sample size yielding at least the required power
SampSize <- min(which(Power > Pow))

# Get and print the required number of events to reject the null
# given the sample size required
(Res <- paste(CritVal[SampSize] + 1, "out of", N[SampSize]))

# Plot it all
plot(N, Power, type = "b", las = 1)
title(paste("One Sided Sample Size and Critical Value for H0 =", p0,
            "versus HA = ", p1, "\n",
            "For Power = ", Pow),
      cex.main = 0.95)
points(N[SampSize], Power[SampSize], col = "red", pch = 19)
text(N[SampSize], Power[SampSize], col = "red",
     label = Res, pos = 3)
abline(h = Pow, lty = "dashed")

One thing to note here (see the plot) is the non-monotonic function describing the power at each of the values of the sample size. This is due to the discrete nature of the binomial distribution. It also generally means that you are powering the sample size calculation for an alpha at something lower than the value indicated. The G*Power program provides both the actual alpha and power, given the input values.

So there is a need to search the vector of sample sizes where the power is greater than that desired, to obtain the smallest sample size required to satisfy the power desired.

The above could of course be encapsulated in a function to make use easier, but the code yields values that agree with both the G*Power application and A'Hern's tables.

Hope that this is helpful.
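
The encapsulation mentioned above might look like the following sketch. The function name exact_binom_n and its return structure are hypothetical; the calculation itself repeats the steps of the script (qbinom for the critical value, pbinom for the Type II error, and a strict "greater than" comparison for the power), with an added check for the case where no candidate N reaches the requested power.

```r
# Hypothetical wrapper around the script above: one-sided exact binomial
# test of H0: p = p0 against HA: p = p1 > p0.
exact_binom_n <- function(p0, p1, alpha = 0.05, power = 0.8, N = 100:200) {
  stopifnot(p1 > p0, length(N) >= 1)

  # Largest count that is still compatible with H0 at level alpha, per N
  crit <- qbinom(p = 1 - alpha, size = N, prob = p0)

  # Power at the alternative: probability of exceeding the critical value
  pw <- 1 - pbinom(crit, N, p1)

  ok <- which(pw > power)
  if (length(ok) == 0) {
    stop("no sample size in N reaches the requested power; widen the range")
  }
  i <- min(ok)

  list(n      = N[i],                      # smallest qualifying sample size
       events = crit[i] + 1,               # events needed to reject H0
       power  = pw[i],                     # attained power at p1
       alpha  = 1 - pbinom(crit[i], N[i], p0))  # attained one-sided alpha
}

res <- exact_binom_n(p0 = 0.90, p1 = 0.95)
paste(res$events, "out of", res$n)
```

Returning the attained alpha alongside the attained power makes the conservativeness noted above visible, since the discrete distribution rarely hits the nominal level exactly.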
Regards,

Marc Schwartz

From surfprjab at hotmail.com Thu Nov 3 17:20:53 2011
From: surfprjab at hotmail.com (jose Bartolomei)
Date: Thu, 3 Nov 2011 16:20:53 +0000
Subject: [R] Take variables in data.frame and create list of matrices
In-Reply-To: 
References: 
Message-ID: 

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: 

From surfprjab at hotmail.com Thu Nov 3 17:25:07 2011
From: surfprjab at hotmail.com (jose Bartolomei)
Date: Thu, 3 Nov 2011 16:25:07 +0000
Subject: [R] Take variables in data.frame and create list of matrices
In-Reply-To: 
References: 
Message-ID: 

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: 

From stefano.sofia at regione.marche.it Thu Nov 3 17:28:08 2011
From: stefano.sofia at regione.marche.it (Stefano Sofia)
Date: Thu, 3 Nov 2011 17:28:08 +0100
Subject: [R] query about counting rows of a dataframe
Message-ID: <631F8C7792124941838E6850D3A7802B032455D1DA6A@ERMES.regionemarche.intra>

Dear R users,
I have got the following data frame, called my_df:

   gender day_birth month_birth year_birth labour
1       F        22          10       2001      1
2       M        29          10       2001      2
3       M         1          11       2001      1
4       F         3          11       2001      1
5       M         3          11       2001      2
6       F         4          11       2001      1
7       F         4          11       2001      2
8       F         5          12       2001      2
9       M        22          14       2001      2
10      F        29          13       2001      2
...

I need to count data in different ways:

1. count the births for each day (having 0 when necessary) independently of the value of the "labour" column
2. count the births for each day (having 0 when necessary), divided by the value of "labour" (which can have two values, 1 or 2)
3. count the births for each day of all the years (i.e. the 22nd of October of all the years present in the data frame) independently of the value of "labour"
4. count the births for each day of all the years (i.e. the 22nd of October of all the years present in the data frame), divided by the value of "labour"

I tried with the command

table(my_df$year_birth, my_df$month_birth, my_df$day_birth)

which satisfies (partially) question number 1 (I am not able to have 0 for the days that are not present).

Is there a smart way to do that without invoking too many loops?

thank you for your help
Stefano Sofia

IMPORTANT NOTICE: This e-mail message is intended to be received only by persons entitled to receive the confidential information it may contain. E-mail messages to clients of Regione Marche may contain information that is confidential and legally privileged. Please do not read, copy, forward, or store this message unless you are an intended recipient of it. If you have received this message in error, please forward it to the sender and delete it completely from your computer system. Pursuant to art. 6 of DGR no. 1394/2008, please note that, in cases of necessity and urgency, the reply to this e-mail message may be viewed by persons other than the addressee.
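
One loop-free way to get the zero counts asked about in this message is to turn the three columns into a Date and tabulate a factor whose levels cover every day of interest; table() reports a 0 for every level that does not occur. This is only a sketch: it uses the first eight rows of the posted data (rows 9 and 10 have month values 14 and 13, which cannot be converted to dates), and the helper variable names are made up for the illustration.

```r
# First eight rows of the posted data frame (rows with valid months only).
my_df <- data.frame(
  gender      = c("F", "M", "M", "F", "M", "F", "F", "F"),
  day_birth   = c(22, 29, 1, 3, 3, 4, 4, 5),
  month_birth = c(10, 10, 11, 11, 11, 11, 11, 12),
  year_birth  = 2001,
  labour      = c(1, 2, 1, 1, 2, 1, 2, 2)
)

# Build a proper Date from the three columns.
my_df$date <- as.Date(ISOdate(my_df$year_birth, my_df$month_birth,
                              my_df$day_birth))

# 1. and 2.: every calendar day in the observed range becomes a level,
#    so days without births are counted as 0.
all_days <- seq(min(my_df$date), max(my_df$date), by = "day")
day_f <- factor(as.character(my_df$date), levels = as.character(all_days))
by_day        <- table(day = day_f)
by_day_labour <- table(day = day_f, labour = my_df$labour)

# 3. and 4.: pool the years by keeping only month and day; a leap year
#    supplies the full set of 366 possible month-day levels.
md_levels <- format(seq(as.Date("2000-01-01"), as.Date("2000-12-31"),
                        by = "day"), "%m-%d")
md_f <- factor(format(my_df$date, "%m-%d"), levels = md_levels)
by_md        <- table(month_day = md_f)
by_md_labour <- table(month_day = md_f, labour = my_df$labour)

by_day[c("2001-11-02", "2001-11-03", "2001-11-04")]
```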
From gunter.berton at gene.com Thu Nov 3 17:52:23 2011 From: gunter.berton at gene.com (Bert Gunter) Date: Thu, 3 Nov 2011 09:52:23 -0700 Subject: [R] Lattice plots and missing x-axis labels on second page In-Reply-To: References: <00A769DD3A6D594C9902BC12EF1CAF8E064CE6EC@SOAANCMSG01.soa.alaska.gov> <00A769DD3A6D594C9902BC12EF1CAF8E064CE722@SOAANCMSG01.soa.alaska.gov> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From gunter.berton at gene.com Thu Nov 3 18:01:01 2011 From: gunter.berton at gene.com (Bert Gunter) Date: Thu, 3 Nov 2011 10:01:01 -0700 Subject: [R] variable transformation for lm In-Reply-To: <20111103155508.249620@gmx.net> References: <20111103155508.249620@gmx.net> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From delre at wisc.edu Thu Nov 3 18:08:46 2011 From: delre at wisc.edu (AC Del Re) Date: Thu, 3 Nov 2011 10:08:46 -0700 Subject: [R] Take variables in data.frame and create list of matrices In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dwinsemius at comcast.net Thu Nov 3 18:21:19 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Thu, 3 Nov 2011 13:21:19 -0400 Subject: [R] variable transformation for lm In-Reply-To: <20111103155508.249620@gmx.net> References: <20111103155508.249620@gmx.net> Message-ID: On Nov 3, 2011, at 11:55 AM, "Johannes Radinger" wrote: > Hello, > > I am doing a simple regression using lm(Y~X). > As my response and my predictor seemed to be skewed > and I can't meet the model assumptions. Therefore > I need to transform my variables. The presence of skewness in either or both the response or predictors does NOT imply failure to meet model assumptions. The assumptions of linear regression regarding normality only apply to the residuals after the estimation of the model. -- David. 
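
The point about residuals can be checked on simulated data. In the sketch below the predictor is strongly right-skewed, yet the errors are normal by construction, so the fitted model's residuals behave well. All values are simulated and only illustrative; nothing here comes from Johannes's data.

```r
set.seed(1)
n <- 200
x <- rexp(n)                  # strongly right-skewed predictor
y <- 1 + 2 * x + rnorm(n)     # linear signal plus normal errors

fit <- lm(y ~ x)
res <- resid(fit)

# Quantile-quantile comparison of the residuals with a normal distribution;
# plot.it = FALSE returns the coordinates instead of drawing the plot.
qq <- qqnorm(res, plot.it = FALSE)
qq_cor <- cor(qq$x, qq$y)     # close to 1 when the residuals look normal

c(skewed_x = mean(x) > median(x), qq_correlation = round(qq_cor, 3))
coef(fit)
```

In practice plot(fit) gives the usual residual diagnostics, and it is those plots, rather than histograms of Y or X, that indicate whether a transformation is worth considering.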
>
> I wanted to ask what is the preferred way to find out
> if predictor and/or response needs to be transformed
> and if yes how (log-transform?).
>
> I found a procedure in "A Modern Approach to Regression
> with R" (Sheather, 2009): There they suggest an approach
> with the function bctrans from alr3...but it seems that it
> is deprecated. So what is the best way (box-cox test) to find the best
> transformation for predictor and response simultaneously?
> AFAIK boxcox from MASS is only used for transformation
> of the predictor?
>
> Thank you very much
> Johannes
>
> --
>

From Bastien.Ferland-Raymond at mrnf.gouv.qc.ca Thu Nov 3 18:25:18 2011
From: Bastien.Ferland-Raymond at mrnf.gouv.qc.ca (Bastien.Ferland-Raymond at mrnf.gouv.qc.ca)
Date: Thu, 3 Nov 2011 13:25:18 -0400
Subject: [R] optimising a loop
Message-ID: <161DC602615F6943A19BAEA3DBF2E4B90177CC8DD55A@HARFANG.intranet.mrn.gouv>

Dear R community,

I'm trying to remove a loop from my code but I'm stuck and I can't find a good way to do it. Hopefully one of you will have something clever to propose.
Here is a simplified example:

I have a square matrix:

> nom.plac2 <- c("102", "103", "301", "303","304", "403")
> poids2 <- matrix(NA, 6,6, dimnames=list(nom.plac2,nom.plac2))
> poids2
    102 103 301 303 304 403
102  NA  NA  NA  NA  NA  NA
103  NA  NA  NA  NA  NA  NA
301  NA  NA  NA  NA  NA  NA
303  NA  NA  NA  NA  NA  NA
304  NA  NA  NA  NA  NA  NA
403  NA  NA  NA  NA  NA  NA

I want to replace some of the NAs following specific criteria contained in two other matrices:

> wei2 <- matrix(c(.6,.4,.5,.5,.9,.1,.8,.2,.7,.3,.6,.4),6,2,dimnames=list(nom.plac2, c("p1","p2")),byrow=T)
> wei2
     p1  p2
102 0.6 0.4
103 0.5 0.5
301 0.9 0.1
303 0.8 0.2
304 0.7 0.3
403 0.6 0.4

> voisin <- matrix(c("103","304", "303", "102", "103" ,"303","403","304","303","102","103" ,"303"), 6,2,dimnames=list(nom.plac2, c("v1","v2")),byrow=T)
> voisin
    v1    v2
102 "103" "304"
103 "303" "102"
301 "103" "303"
303 "403" "304"
304 "303" "102"
403 "103" "303"

So my final result is:

    102 103 301 303 304 403
102  NA 0.6  NA  NA 0.4  NA
103 0.5  NA  NA 0.5  NA  NA
301  NA 0.9  NA 0.1  NA  NA
303  NA  NA  NA  NA 0.2 0.8
304 0.3  NA  NA 0.7  NA  NA
403  NA 0.6  NA 0.4  NA  NA

So, globally, I want to fill each row of "poids2" with the data from "wei2", placed in the columns given by the matching identifiers found in "voisin". This can easily be done by a loop:

> loop <- poids2
> for(i in 1:6){
+ loop[i,voisin[i,]] <- wei2[i,]
+ }

But I expect it to be quite slow with my larger dataset.

Does any of you have an idea how I could remove the loop and speed up the operation?

Best regards,

Bastien Ferland-Raymond, M.Sc. Stat., M.Sc. Biol.
Division des orientations et projets spéciaux
Direction des inventaires forestiers
Ministère des Ressources naturelles et de la Faune du Québec

From ccquant at gmail.com Thu Nov 3 18:29:00 2011
From: ccquant at gmail.com (Ben quant)
Date: Thu, 3 Nov 2011 11:29:00 -0600
Subject: [R] RpgSQL vs RPostgreSQL
Message-ID: 

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From bellard.celine at gmail.com Thu Nov 3 11:07:21 2011
From: bellard.celine at gmail.com (Celine)
Date: Thu, 3 Nov 2011 03:07:21 -0700 (PDT)
Subject: [R] does there any function like sumif in excel?
In-Reply-To:
References: <972822.20441.qm@web38408.mail.mud.yahoo.com> <1320239193254-3972963.post@n4.nabble.com>
Message-ID: <1320314841347-3984899.post@n4.nabble.com>

Sorry for the duplicate messages and thanks for your help, it works well now.

Céline

--
View this message in context: http://r.789695.n4.nabble.com/does-there-any-function-like-sumif-in-excel-tp858444p3984899.html
Sent from the R help mailing list archive at Nabble.com.

From bellard.celine at gmail.com Thu Nov 3 11:09:51 2011
From: bellard.celine at gmail.com (Celine)
Date: Thu, 3 Nov 2011 03:09:51 -0700 (PDT)
Subject: [R] Sum with condition
In-Reply-To:
References: <1320237862644-3972839.post@n4.nabble.com>
Message-ID: <1320314991124-3984909.post@n4.nabble.com>

Problem solved, thanks everyone for your help.

Céline

--
View this message in context: http://r.789695.n4.nabble.com/Sum-with-condition-tp3972839p3984909.html
Sent from the R help mailing list archive at Nabble.com.

From debmidya at yahoo.com Thu Nov 3 11:48:42 2011
From: debmidya at yahoo.com (Deb Midya)
Date: Thu, 3 Nov 2011 03:48:42 -0700 (PDT)
Subject: [R] Extract Data from Yahoo Finance
In-Reply-To:
References: <1320292966.73541.YahooMailNeo@web161421.mail.bf1.yahoo.com>
Message-ID: <1320317322.77723.YahooMailNeo@web161402.mail.bf1.yahoo.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available URL: From sarah_kej at yahoo.it Thu Nov 3 12:13:27 2011 From: sarah_kej at yahoo.it (Sara) Date: Thu, 3 Nov 2011 11:13:27 +0000 (GMT) Subject: [R] EMD arguments default values Message-ID: <1320318807.16766.YahooMailNeo@web26907.mail.ukl.yahoo.com> Hi, i'm using this command line with EMD package filename<-emd(y,x,boundary="wave",plot.imf=F) i'd like to know if this EMD uses default values for non-specified arguments. i'm especially interested in "stoprule", "max.imf", "tol" behavior in such cases. Thanks From davidgrimsey at hotmail.com Thu Nov 3 12:50:17 2011 From: davidgrimsey at hotmail.com (Davg) Date: Thu, 3 Nov 2011 04:50:17 -0700 (PDT) Subject: [R] Comparing negative binomial models Message-ID: <1320321017321-3985353.post@n4.nabble.com> Hi, I am trying to compare negative binomial models for the prediction of sports games (I know that Poisson models would be better but I'm just trying Negative Binomial at the moment). But, to compare the models I need them to have the same theta value. How can I change the explanatory variables while maintaining the theta value? Thanks for the help. David -- View this message in context: http://r.789695.n4.nabble.com/Comparing-negative-binomial-models-tp3985353p3985353.html Sent from the R help mailing list archive at Nabble.com. From michelemazzucco at gmail.com Thu Nov 3 12:54:48 2011 From: michelemazzucco at gmail.com (Michele Mazzucco) Date: Thu, 3 Nov 2011 13:54:48 +0200 Subject: [R] Fit continuous distribution to truncated empirical values Message-ID: Hi all, I am trying to fit a distribution to some data about survival times. I am interested only in a specific interval, e.g., while the data lies in the interval (0,...., 600), I want the best for the interval (0,..., 24). I have tried both fitdistr (MASS package) and fitdist (from the fitdistrplus package), but I could not get them working, e.g. 
fitdistr(left, "weibull", upper=24) Error in optim(x = c(529L, 528L, 527L, 526L, 525L, 524L, 523L, 522L, 521L, : L-BFGS-B needs finite values of 'fn' In addition: Warning message: In dweibull(x, shape, scale, log) : NaNs produced Am I doing something wrong? Thanks, Michele p.s. I have seen similar posts, e.g., http://tolstoy.newcastle.edu.au/R/help/05/02/11558.html, but I am not sure whether I can apply the same approach here. From kerry1912 at hotmail.com Thu Nov 3 13:03:16 2011 From: kerry1912 at hotmail.com (kerry1912) Date: Thu, 3 Nov 2011 05:03:16 -0700 (PDT) Subject: [R] Histograms in R Message-ID: <1320321796661-3985397.post@n4.nabble.com> We have a histogram of our observed response and we want to overlay the corresponding poisson distribution with respect to our poisson model. -- View this message in context: http://r.789695.n4.nabble.com/Histograms-in-R-tp3985397p3985397.html Sent from the R help mailing list archive at Nabble.com. From kindlychung at gmail.com Thu Nov 3 13:08:57 2011 From: kindlychung at gmail.com (Kaiyin Zhong) Date: Thu, 3 Nov 2011 20:08:57 +0800 Subject: [R] Why can't this function be used with the 'by' command? Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From reema28sep at gmail.com Thu Nov 3 14:41:38 2011 From: reema28sep at gmail.com (Reema Singh) Date: Thu, 3 Nov 2011 19:11:38 +0530 Subject: [R] Uploding package help In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From cjose at mail.usp.edu Thu Nov 3 16:46:48 2011 From: cjose at mail.usp.edu (playballa23) Date: Thu, 3 Nov 2011 08:46:48 -0700 (PDT) Subject: [R] data frame to workspace Message-ID: <1320335208079-3986415.post@n4.nabble.com> Is there a way to import a data frame into a workspace? I created a data frame and from my understanding, a data frame is a type of object, and that the workspace stores the current session's objects. 
Wondering why my data frame is not showing up... Any thoughts/suggestions? -- View this message in context: http://r.789695.n4.nabble.com/data-frame-to-workspace-tp3986415p3986415.html Sent from the R help mailing list archive at Nabble.com. From dcelta at gmail.com Thu Nov 3 16:59:55 2011 From: dcelta at gmail.com (dcelta) Date: Thu, 3 Nov 2011 08:59:55 -0700 (PDT) Subject: [R] XLConnect Error In-Reply-To: <4E08ECF2.6020307@statistik.tu-dortmund.de> References: <1309200534372-3628528.post@n4.nabble.com> <4E08ECF2.6020307@statistik.tu-dortmund.de> Message-ID: <1320335995789-3986491.post@n4.nabble.com> Hi, I am having the same issue as described by the original subscriber.... can you offer any guidance on how to install Java ??? Do I need to do this from the R command line ??? or do I need to do this outside of the application?? Thanks Daniel -- View this message in context: http://r.789695.n4.nabble.com/XLConnect-Error-tp3628528p3986491.html Sent from the R help mailing list archive at Nabble.com. From agao at umich.edu Thu Nov 3 17:08:04 2011 From: agao at umich.edu (Alan Gao) Date: Thu, 3 Nov 2011 12:08:04 -0400 Subject: [R] take me off the list Message-ID: thank you. Alan Gao University of Michigan '12 Statistics, B.S. 239.682.3509 From a.khaleghei at gmail.com Thu Nov 3 17:14:55 2011 From: a.khaleghei at gmail.com (Akram Khaleghei Ghosheh balagh) Date: Thu, 3 Nov 2011 12:14:55 -0400 Subject: [R] How to interpret vglm output!! Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From olivier.collignon at live.fr Thu Nov 3 17:17:46 2011 From: olivier.collignon at live.fr (blackscorpio) Date: Thu, 3 Nov 2011 09:17:46 -0700 (PDT) Subject: [R] L1 penalization for proportional odds logistic regression Message-ID: <1320337066623-3986573.post@n4.nabble.com> Dear community, I am currently attempting to perform a (L1) penalized ordinal logistic regression with proportional odds. 
For the moment I only found R packages allowing to perform forward or backward continuation ratio model with several penalizations. Does anyone have a clue of what R package I could use ? I am not even quite sure that penalized logistic regression with proportional odds has already been developed theoretically... Thanks a lot for your help ! -- View this message in context: http://r.789695.n4.nabble.com/L1-penalization-for-proportional-odds-logistic-regression-tp3986573p3986573.html Sent from the R help mailing list archive at Nabble.com. From flokke at live.de Thu Nov 3 17:31:55 2011 From: flokke at live.de (flokke) Date: Thu, 3 Nov 2011 09:31:55 -0700 (PDT) Subject: [R] problem with merging two matrices In-Reply-To: References: <1320266770417-3983136.post@n4.nabble.com> Message-ID: <1320337915950-3986638.post@n4.nabble.com> Dear Sarah, THanks for your answer! Sorry that my thread is somehow not clear, that's because I am not really experienced with R and dont know yet how to put thinings in words.. I am not trying to work with them differently, I am just trying to print them as the result of a function. But as the result() function does allow only one argument I have to somehow merge these two matrcices. I know that I could use the list() function as well but I wanted to have some nicer output than that, that's why I created these two matrices. Want I want to get in the end is that can call my function(x,y) and then you get the print of the two matrices, above each other. -- View this message in context: http://r.789695.n4.nabble.com/problem-with-merging-two-matrices-tp3983136p3986638.html Sent from the R help mailing list archive at Nabble.com. From acarson at unbc.ca Thu Nov 3 17:48:32 2011 From: acarson at unbc.ca (Allan Carson) Date: Thu, 3 Nov 2011 16:48:32 +0000 Subject: [R] Back-transforming in lme Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available
URL:

From pierre-etienne.lessard.1 at ulaval.ca Thu Nov 3 18:17:46 2011
From: pierre-etienne.lessard.1 at ulaval.ca (PEL)
Date: Thu, 3 Nov 2011 10:17:46 -0700 (PDT)
Subject: [R] Creating barplot using time as X
In-Reply-To:
References: <1320261869546-3981961.post@n4.nabble.com>
Message-ID: <1320340666822-3986882.post@n4.nabble.com>

Thank you very much. This is exactly what I needed. Problem solved!

Thanks again

PEL

--
View this message in context: http://r.789695.n4.nabble.com/Creating-barplot-using-time-as-X-tp3981961p3986882.html
Sent from the R help mailing list archive at Nabble.com.

From jholtman at gmail.com Thu Nov 3 18:44:17 2011
From: jholtman at gmail.com (jim holtman)
Date: Thu, 3 Nov 2011 13:44:17 -0400
Subject: [R] optimising a loop
In-Reply-To: <161DC602615F6943A19BAEA3DBF2E4B90177CC8DD55A@HARFANG.intranet.mrn.gouv>
References: <161DC602615F6943A19BAEA3DBF2E4B90177CC8DD55A@HARFANG.intranet.mrn.gouv>
Message-ID:

try this:

> nom.plac2 <- c("102", "103", "301", "303","304", "403")
> poids2 <- matrix(NA, 6,6, dimnames=list(nom.plac2,nom.plac2))
> poids2
    102 103 301 303 304 403
102  NA  NA  NA  NA  NA  NA
103  NA  NA  NA  NA  NA  NA
301  NA  NA  NA  NA  NA  NA
303  NA  NA  NA  NA  NA  NA
304  NA  NA  NA  NA  NA  NA
403  NA  NA  NA  NA  NA  NA
> wei2 <- matrix(c(.6,.4,.5,.5,.9,.1,.8,.2,.7,.3,.6,.4),6,2,dimnames=list(nom.plac2, c("p1","p2")),byrow=T)
> voisin <- matrix(c("103","304", "303", "102", "103" ,"303","403","304","303","102","103" ,"303"),
+ 6,2,dimnames=list(nom.plac2, c("v1","v2")),byrow=T)
>
> # do matrix addressing by converting names to numbers
> indx <- rbind(
+     cbind(match(rownames(voisin), rownames(poids2))
+         , match(voisin[, 1], colnames(poids2))
+         )
+   , cbind(match(rownames(voisin), rownames(poids2))
+         , match(voisin[, 2], colnames(poids2))
+         )
+   )
> indx
      [,1] [,2]
 [1,]    1    2
 [2,]    2    4
 [3,]    3    2
 [4,]    4    6
 [5,]    5    4
 [6,]    6    2
 [7,]    1    5
 [8,]    2    1
 [9,]    3    4
[10,]    4    5
[11,]    5    1
[12,]    6    4
>
> # change the data
> poids2[indx] <- c(wei2[,1], wei2[,2])
> poids2
    102 103 301 303 304 403
102  NA 0.6  NA  NA 0.4  NA
103 0.5  NA  NA 0.5  NA  NA
301  NA 0.9  NA 0.1  NA  NA
303  NA  NA  NA  NA 0.2 0.8
304 0.3  NA  NA 0.7  NA  NA
403  NA 0.6  NA 0.4  NA  NA
>
>
>

On Thu, Nov 3, 2011 at 1:25 PM, wrote:
> Dear R community,
>
> I'm trying to remove a loop from my code but I'm stuck and I can't find a good way to do it. Hopefully one of you will have something clever to propose.
>
> Here is a simplified example:
>
> I have a square matrix:
>
>> nom.plac2 <- c("102", "103", "301", "303","304", "403")
>> poids2 <- matrix(NA, 6,6, dimnames=list(nom.plac2,nom.plac2))
>> poids2
>     102 103 301 303 304 403
> 102  NA  NA  NA  NA  NA  NA
> 103  NA  NA  NA  NA  NA  NA
> 301  NA  NA  NA  NA  NA  NA
> 303  NA  NA  NA  NA  NA  NA
> 304  NA  NA  NA  NA  NA  NA
> 403  NA  NA  NA  NA  NA  NA
>
> I want to replace some of the NAs following specific criteria contained in 2 other matrices:
>
>> wei2 <- matrix(c(.6,.4,.5,.5,.9,.1,.8,.2,.7,.3,.6,.4),6,2,dimnames=list(nom.plac2, c("p1","p2")),byrow=T)
>> wei2
>      p1  p2
> 102 0.6 0.4
> 103 0.5 0.5
> 301 0.9 0.1
> 303 0.8 0.2
> 304 0.7 0.3
> 403 0.6 0.4
>> voisin <- matrix(c("103","304", "303", "102", "103" ,"303","403","304","303","102","103" ,"303"),
>          6,2,dimnames=list(nom.plac2, c("v1","v2")),byrow=T)
>> voisin
>     v1    v2
> 102 "103" "304"
> 103 "303" "102"
> 301 "103" "303"
> 303 "403" "304"
> 304 "303" "102"
> 403 "103" "303"
>
> So my final result is:
>
>     102 103 301 303 304 403
> 102  NA 0.6  NA  NA 0.4  NA
> 103 0.5  NA  NA 0.5  NA  NA
> 301  NA 0.9  NA 0.1  NA  NA
> 303  NA  NA  NA  NA 0.2 0.8
> 304 0.3  NA  NA 0.7  NA  NA
> 403  NA 0.6  NA 0.4  NA  NA
>
>
> So, globally, for each row of "poids2" I want to fill in the data from "wei2" at the columns given by the matching identifiers found in "voisin".
>
> This can easily be done by a loop:
>
>> loop <- poids2
>> for(i in 1:6){
> + loop[i,voisin[i,]] <- wei2[i,]
> + }
>
> But I expect it to be quite slow with my larger dataset.
>
> Does any of you have an idea how I could remove the loop and speed up the operation?
>
> Best regards,
>
>
> Bastien Ferland-Raymond, M.Sc. Stat., M.Sc. Biol.
> Division des orientations et projets spéciaux
> Direction des inventaires forestiers
> Ministère des Ressources naturelles et de la Faune du Québec
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

From saldanha.plangeo at gmail.com Thu Nov 3 18:46:28 2011
From: saldanha.plangeo at gmail.com (Raphael Saldanha)
Date: Thu, 3 Nov 2011 15:46:28 -0200
Subject: [R] EMD arguments default values
In-Reply-To: <1320318807.16766.YahooMailNeo@web26907.mail.ukl.yahoo.com>
References: <1320318807.16766.YahooMailNeo@web26907.mail.ukl.yahoo.com>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From saldanha.plangeo at gmail.com Thu Nov 3 18:49:21 2011
From: saldanha.plangeo at gmail.com (Raphael Saldanha)
Date: Thu, 3 Nov 2011 15:49:21 -0200
Subject: [R] data frame to workspace
In-Reply-To: <1320335208079-3986415.post@n4.nabble.com>
References: <1320335208079-3986415.post@n4.nabble.com>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available URL: From wdunlap at tibco.com Thu Nov 3 18:54:41 2011 From: wdunlap at tibco.com (William Dunlap) Date: Thu, 3 Nov 2011 17:54:41 +0000 Subject: [R] optimising a loop In-Reply-To: <161DC602615F6943A19BAEA3DBF2E4B90177CC8DD55A@HARFANG.intranet.mrn.gouv> References: <161DC602615F6943A19BAEA3DBF2E4B90177CC8DD55A@HARFANG.intranet.mrn.gouv> Message-ID: Try replacing your for loop with the line loop[cbind(as.vector(row(voisin)), match(voisin, nom.plac2))] <- as.vector(wei2) Look help(Subscript) to see how subscripting an n-way array by an n-column integer matrix works. Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Bastien.Ferland- > Raymond at mrnf.gouv.qc.ca > Sent: Thursday, November 03, 2011 10:25 AM > To: r-help at r-project.org > Subject: [R] optimising a loop > > Dear R community, > > I'm trying to remove a loop from my code but I'm stock and I can't find a good way to do it. > Hopefully one of you will have something clever to propose. 
> > Here is a simplified example: > > I have a squared matrix: > > > nom.plac2 <- c("102", "103", "301", "303","304", "403") > > poids2 <- matrix(NA, 6,6, dimnames=list(nom.plac2,nom.plac2)) > > poids2 > 102 103 301 303 304 403 > 102 NA NA NA NA NA NA > 103 NA NA NA NA NA NA > 301 NA NA NA NA NA NA > 303 NA NA NA NA NA NA > 304 NA NA NA NA NA NA > 403 NA NA NA NA NA NA > > I want to replace some of the NAs following specific criterion included in 2 others matrix: > > > wei2 <- matrix(c(.6,.4,.5,.5,.9,.1,.8,.2,.7,.3,.6,.4),6,2,dimnames=list(nom.plac2, > c("p1","p2")),byrow=T) > > wei2 > p1 p2 > 102 0.6 0.4 > 103 0.5 0.5 > 301 0.9 0.1 > 303 0.8 0.2 > 304 0.7 0.3 > 403 0.6 0.4 > > voisin <- matrix(c("103","304", "303", "102", "103" ,"303","403","304","303","102","103" ,"303"), > 6,2,dimnames=list(nom.plac2, c("v1","v2")),byrow=T) > > voisin > v1 v2 > 102 "103" "304" > 103 "303" "102" > 301 "103" "303" > 303 "403" "304" > 304 "303" "102" > 403 "103" "303" > > So my final result is: > > 102 103 301 303 304 403 > 102 NA 0.6 NA NA 0.4 NA > 103 0.5 NA NA 0.5 NA NA > 301 NA 0.9 NA 0.1 NA NA > 303 NA NA NA NA 0.2 0.8 > 304 0.3 NA NA 0.7 NA NA > 403 NA 0.6 NA 0.4 NA NA > > > So, globally I want to fill for each line of "poids2" data from "wei2" associated with the good the > good identifier found in "voisin". > > This can easily be done by a loop: > > > loop <- poids2 > > for(i in 1:6){ > + loop[i,voisin[i,]] <- wei2[i,] > + } > > But I expect it to be quite slow with my larger dataset. > > Does any of you has an idea how I could remove the loop and speed up the operation? > > Best regards, > > > Bastien Ferland-Raymond, M.Sc. Stat., M.Sc. Biol. 
> Division des orientations et projets spéciaux
> Direction des inventaires forestiers
> Ministère des Ressources naturelles et de la Faune du Québec
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

From Stephen.Bond at cibc.com Thu Nov 3 18:56:06 2011
From: Stephen.Bond at cibc.com (Bond, Stephen)
Date: Thu, 3 Nov 2011 13:56:06 -0400
Subject: [R] Create design matrix
Message-ID:

Greetings useRs,

What is the easiest way to create a design matrix of several factor variables? Function gendata in Design seems to do that for a fitted model, but how to do that only on several factor vectors??

The result should be a df with one row for each distinct combination of levels of factors eg for (M,F) (Y,O)
We get
M Y
M O
F Y
F O

In reality I will have more than 1000 rows so doing by hand not good.
Maybe there is a way with "outer", but I couldn't see it.
All the best to everybody.

Stephen

From ripley at stats.ox.ac.uk Thu Nov 3 18:58:22 2011
From: ripley at stats.ox.ac.uk (Prof Brian Ripley)
Date: Thu, 3 Nov 2011 17:58:22 +0000 (GMT)
Subject: [R] Uploding package help
In-Reply-To:
References:
Message-ID:

See 'Writing R Extensions',
http://cran.r-project.org/doc/manuals/R-exts.html#Submitting-a-package-to-CRAN

However, you can only submit a package: other people actually upload it to CRAN, if they accept it.

On Thu, 3 Nov 2011, Reema Singh wrote:

> Hello All
>
> I want to upload a R package in CRAN.
Kindly tell me how to upload a new > package in CRAN > > REGARDS~ > Reema Singh > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 From rshepard at appl-ecosys.com Thu Nov 3 19:02:26 2011 From: rshepard at appl-ecosys.com (Rich Shepard) Date: Thu, 3 Nov 2011 11:02:26 -0700 (PDT) Subject: [R] take me off the list In-Reply-To: References: Message-ID: On Thu, 3 Nov 2011, Alan Gao wrote: Alan, This is a self-service mail list; no maids to do the work for you. Go to the Web site , navigate to the mail list page, and unsubscribe yourself. Rich From this.is.mvw at gmail.com Thu Nov 3 19:08:33 2011 From: this.is.mvw at gmail.com (Mike Williamson) Date: Thu, 3 Nov 2011 11:08:33 -0700 Subject: [R] any updates w.r.t. lapply, sapply, apply retaining classes Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From sarah.goslee at gmail.com Thu Nov 3 19:21:07 2011 From: sarah.goslee at gmail.com (Sarah Goslee) Date: Thu, 3 Nov 2011 14:21:07 -0400 Subject: [R] problem with merging two matrices In-Reply-To: <1320337915950-3986638.post@n4.nabble.com> References: <1320266770417-3983136.post@n4.nabble.com> <1320337915950-3986638.post@n4.nabble.com> Message-ID: Hi, On Thu, Nov 3, 2011 at 12:31 PM, flokke wrote: > Dear Sarah, > THanks for your answer! > Sorry that my thread is somehow not clear, that's because I am not really > experienced with R and > dont know yet how to put thinings in words.. 
> > I am not trying to work with them differently, I am just trying to print > them as the result of a > function. But as the result() function does allow only one argument I have > to somehow merge these > two matrcices. I know that I could use the list() function as well but I > wanted to have some nicer output than that, that's why I created these two > matrices. > > Want I want to get in the end is that can call my function(x,y) and then you > get the print > of the two matrices, above each other. See, that would have been useful to know. If you wish to return two such matrices, you need to use a list. If you are determined that they appear in a particular way, you could also print them from within the function, so that particular things appear onscreen regardless of the way in which the results are returned. But, in that case, you may be better off reconsidering what your actual objectives are, and whether separating form and function might not be a more effective course. Sarah -- Sarah Goslee http://www.functionaldiversity.org From jtor14 at gmail.com Thu Nov 3 19:23:18 2011 From: jtor14 at gmail.com (Justin Haynes) Date: Thu, 3 Nov 2011 11:23:18 -0700 Subject: [R] Create design matrix In-Reply-To: References: Message-ID: ?expand.grid > expand.grid(c("M","F"),c("Y","O")) Var1 Var2 1 M Y 2 F Y 3 M O 4 F O > Justin On Thu, Nov 3, 2011 at 10:56 AM, Bond, Stephen wrote: > Greetings useRs, > > What is the easiest way to create a design matrix of several factor variables? Function gendata in Design seems to do that for a fitted model, but how to do that only on several factor vectors?? > > The result should be a df with one row for each distinct combination of levels of factors eg for (M,F) (Y,O) > We get > M Y > M O > F Y > F O > > In reality I will have more than 1000 rows so doing by hand not good. > Maybe there is a way with "outer", but I couldn't see it. > All the best to everybody. 
> > Stephen > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From gunter.berton at gene.com Thu Nov 3 19:25:54 2011 From: gunter.berton at gene.com (Bert Gunter) Date: Thu, 3 Nov 2011 11:25:54 -0700 Subject: [R] Create design matrix In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From adele_thompson at cargill.com Thu Nov 3 19:41:46 2011 From: adele_thompson at cargill.com (Schatzi) Date: Thu, 3 Nov 2011 11:41:46 -0700 (PDT) Subject: [R] For loop to cycle through datasets of differing lengths Message-ID: <1320345706666-3987308.post@n4.nabble.com> I have encountered this problem on several occasions and am not sure how to handle it. I use for-loops to cycle through datasets. When each dataset is of equal length, it works fine as I can combine the datasets and have each loop pick up a different column, but when the datasets are differing lengths, I am struggling. Here is an example: A<-1:10 B<-1:15 C<-1:18 Set1<-data.frame(A,runif(10)) Set2<-data.frame(B,runif(15)) Set3<-data.frame(C,runif(18)) for (i in 1:3){ if (i==1) Data<-Set1 else if (i==2) Data<-Set2 else Data<-Set3 dev.new() plot(Data[,1],Data[,2]) } I don't always want to plot them and instead do other things, such as fit a non-linear equation to the dataset, etc. I end up using that "if" statement to cycle through the datasets and was hoping there is an easier method. Maybe one would be to add extra zeros until they are the same length and then take out the extra zeros in the first step. Any help would be appreciated. ----- In theory, practice and theory are the same. 
In practice, they are not - Albert Einstein -- View this message in context: http://r.789695.n4.nabble.com/For-loop-to-cycle-through-datasets-of-differing-lengths-tp3987308p3987308.html Sent from the R help mailing list archive at Nabble.com. From peter.langfelder at gmail.com Thu Nov 3 19:59:34 2011 From: peter.langfelder at gmail.com (Peter Langfelder) Date: Thu, 3 Nov 2011 11:59:34 -0700 Subject: [R] For loop to cycle through datasets of differing lengths In-Reply-To: <1320345706666-3987308.post@n4.nabble.com> References: <1320345706666-3987308.post@n4.nabble.com> Message-ID: On Thu, Nov 3, 2011 at 11:41 AM, Schatzi wrote: > I have encountered this problem on several occasions and am not sure how to > handle it. I use for-loops to cycle through datasets. When each dataset is > of equal length, it works fine as I can combine the datasets and have each > loop pick up a different column, but when the datasets are differing > lengths, I am struggling. Here is an example: > A<-1:10 > B<-1:15 > C<-1:18 > > Set1<-data.frame(A,runif(10)) > Set2<-data.frame(B,runif(15)) > Set3<-data.frame(C,runif(18)) > > for (i in 1:3){ > if (i==1) Data<-Set1 else if (i==2) Data<-Set2 else Data<-Set3 > dev.new() > plot(Data[,1],Data[,2]) > } > > > I don't always want to plot them and instead do other things, such as fit a > non-linear equation to the dataset, etc. I end up using that "if" statement > to cycle through the datasets and was hoping there is an easier method. > Maybe one would be to add extra zeros until they are the same length and > then take out the extra zeros in the first step. Any help would be > appreciated. Well, you can start by putting your data sets in a list: allSets = list(Set1, Set2, Set3) then in your loop say Data = allSets[[i]] and you're done. If you have an apriori unknown number of data sets, you can use get. 
Before the loop you define the names, setNames = c("Set1", "Set2", "Set3") Then in the loop, you can use Data = get(setNames[i]) But I'm not sure why this would be more useful than the list example unless perhaps if you `load()' previously `save()'d data sets (the load() function can give you the names of loaded variables). In any case, if you repeat the same analysis on a number of data sets with different dimensions, list() is IMHO the way to go. HTH, Peter From zev at zevross.com Thu Nov 3 19:59:33 2011 From: zev at zevross.com (Zev Ross) Date: Thu, 03 Nov 2011 14:59:33 -0400 Subject: [R] Reclassify string values Message-ID: <4EB2E495.8070602@zevross.com> Hi All, Is there a simple way to convert a string such as c("A", "B" ,"C", "D") to a string of c("Group1", "Group1", "Group2", "Group2"). Naturally I could use the factor function as below but I don't like seeing that warning message (and I don't want to turn off warning messages). Perhaps a function called "reclassify" or "recategorize"? Zev x<-LETTERS[1:4] x2<-as.character(factor(x, levels=LETTERS[1:4], labels=rep(c("Group1", "Group2"), each=2))) Warning message: In `levels<-`(`*tmp*`, value = c("Group1", "Group1", "Group2", "Group2" : duplicated levels will not be allowed in factors anymore -- Zev Ross ZevRoss Spatial Analysis 120 N Aurora, Suite 3A Ithaca, NY 14850 607-277-0004 (phone) 866-877-3690 (fax, toll-free) zev at zevross.com From Steve_Friedman at nps.gov Thu Nov 3 19:54:13 2011 From: Steve_Friedman at nps.gov (Steve_Friedman at nps.gov) Date: Thu, 3 Nov 2011 14:54:13 -0400 Subject: [R] Plotting skewed normal distribution with a bar plot Message-ID: Hi, I need to create a plot (type = "h") and then overlay a skewed-normal curve on this distribution, but I'm not finding a procedure to accomplish this. I want to use the plot function here in order to control the bin distributions. I have explored the sn library and found the dsn function. 
dsn uses known location, scaling and shape parameters associated with a given input vector of probabilities. However, how can I calculate the skewed-normal curve if I don't know these parameters in advance? Is there another function to calculate the skew-normal, perhaps in a different package? I'm working with R 2.13.2 on a windows based machine. Steve Friedman Ph. D. Ecologist / Spatial Statistical Analyst Everglades and Dry Tortugas National Park 950 N Krome Ave (3rd Floor) Homestead, Florida 33034 Steve_Friedman at nps.gov Office (305) 224 - 4282 Fax (305) 224 - 4147 From michael.weylandt at gmail.com Thu Nov 3 20:04:12 2011 From: michael.weylandt at gmail.com (R. Michael Weylandt ) Date: Thu, 3 Nov 2011 15:04:12 -0400 Subject: [R] data frame to workspace In-Reply-To: <1320335208079-3986415.post@n4.nabble.com> References: <1320335208079-3986415.post@n4.nabble.com> Message-ID: <3FBC389D-CDE5-4F94-AB73-89AF4C8B6C15@gmail.com> What did you do to create/save/load your data frame? There are n -> infty ways to do all three of those steps and it's hard to give meaningful help without knowing what you tried. M On Nov 3, 2011, at 11:46 AM, playballa23 wrote: > Is there a way to import a data frame into a workspace? I created a data > frame and from my understanding, a data frame is a type of object, and that > the workspace stores the current session's objects. Wondering why my data > frame is not showing up... > > Any thoughts/suggestions? > > -- > View this message in context: http://r.789695.n4.nabble.com/data-frame-to-workspace-tp3986415p3986415.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
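
For reference on the question quoted above, here is a minimal sketch of one way a data frame can be created, written to disk and restored so that it is listed in the workspace again. The object name df, its columns and the file name are made up for illustration; whether this matches what the original poster actually did is an assumption, since the thread does not say.

```r
# Hypothetical data frame and file name, for illustration only.
df <- data.frame(id = 1:3, value = c(2.5, 3.1, 4.7))

# Objects created in the current environment are listed by ls().
was_listed <- "df" %in% ls()

# Write the object to a temporary .RData file, then remove it from the session.
path <- file.path(tempdir(), "df_example.RData")
save(df, file = path)
rm(df)
gone <- !exists("df", inherits = FALSE)

# load() restores the object under its original name and
# returns the names of everything it restored.
restored <- load(path)
back <- "df" %in% ls()
```

If the data frame was only ever created inside a function, or was written out with write.table() rather than save(), it would not reappear this way, which is why the details of what was tried matter.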
From v.p.mail at freemail.hu Thu Nov 3 20:10:52 2011 From: v.p.mail at freemail.hu (hihi) Date: Thu, 3 Nov 2011 20:10:52 +0100 (CET) Subject: [R] Is it possible to vectorize/accelerate this? Message-ID: Dear Members, I work on a simulaton experiment but it has an bottleneck. It's quite fast because of R and vectorizing, but it has a very slow for loop. The adjacent element of a vector (in terms of index number) depends conditionally on the former value of itself. Like a simple cumulating function (eg. cumsum) but with condition. Let's show me an example: a_vec = rnorm(100) b_vec = rep(0, 100) b_vec[1]=a_vec[1] for (i in 2:100){b_vec[i]=ifelse(abs(b_vec[i-1]+a_vec[i])>1, a_vec[i], b_vec[i-1]+a_vec[i])} print(b_vec) (The behaviour is like cumsum's, but when the value would excess 1.0 then it has another value from a_vec.) Is it possible to make this faster? I experienced that my way is even slower than in Excel! Programming in C would my last try... Any suggestions? Than you, Peter From peter.langfelder at gmail.com Thu Nov 3 20:13:32 2011 From: peter.langfelder at gmail.com (Peter Langfelder) Date: Thu, 3 Nov 2011 12:13:32 -0700 Subject: [R] Reclassify string values In-Reply-To: <4EB2E495.8070602@zevross.com> References: <4EB2E495.8070602@zevross.com> Message-ID: On Thu, Nov 3, 2011 at 11:59 AM, Zev Ross wrote: > Hi All, > > Is there a simple way to convert a string such as c("A", "B" ,"C", "D") to a > string of c("Group1", "Group1", "Group2", "Group2"). Naturally I could use > the factor function as below but I don't like seeing that warning message > (and I don't want to turn off warning messages). Perhaps a function called > "reclassify" or "recategorize"? 
>
> Zev
>
> x<-LETTERS[1:4]
> x2<-as.character(factor(x, levels=LETTERS[1:4], labels=rep(c("Group1",
> "Group2"), each=2)))
>
> Warning message:
> In `levels<-`(`*tmp*`, value = c("Group1", "Group1", "Group2", "Group2" :
>   duplicated levels will not be allowed in factors anymore

If you want to "translate", why not first build a translation table

tt = cbind(LETTERS[1:4], c("group1", "group1", "group2", "group2"))

then apply it on an example:

xx = sample(LETTERS[1:4], 20, replace = TRUE)

translation = tt[ match(xx, tt[, 1]), 2]

> translation
 [1] "group2" "group2" "group2" "group2" "group2" "group1" "group2" "group1"
 [9] "group2" "group1" "group1" "group2" "group2" "group2" "group1" "group2"
[17] "group2" "group1" "group1" "group2"

Or did I misunderstand your intent?

Peter

From Virginie.Rondeau at isped.u-bordeaux2.fr Wed Nov 2 10:39:14 2011
From: Virginie.Rondeau at isped.u-bordeaux2.fr (Rondeau Virginie)
Date: Wed, 2 Nov 2011 10:39:14 +0100
Subject: [R] [R-pkgs] new version of FRAILTYPACK: general frailty models
Message-ID: <4EB10FC2.2010505@isped.u-bordeaux2.fr>

Dear R users,

We are pleased to tell you that "FRAILTYPACK" has been updated. "FRAILTYPACK" now covers general frailty models estimated with a semiparametric penalized likelihood, as well as with a parametric approach.

In case of comments/corrections/remarks/suggestions -- which are very welcome -- please contact the maintainer directly.

Kind regards,
The FRAILTYPACK team.

_______________________________________________
R-packages mailing list
R-packages at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages

From mstepanov at abo.fi Thu Nov 3 18:54:56 2011
From: mstepanov at abo.fi (mstepano)
Date: Thu, 3 Nov 2011 10:54:56 -0700 (PDT)
Subject: [R] Searching elements in list
Message-ID: <1320342896253-3987066.post@n4.nabble.com>

Hi all,

I have a list of vectors of unequal length and need to check whether a given vector is in the list.
> v1<-c(1,2)
> v2<-c(1,2,3)
> v3<-c(1,3)
> allv<-list(v1,v2,v3)
>
> somev<-c(1,2)
> somev%in%allv
[1] FALSE FALSE

Hence, %in% checks whether the elements of the vector somev are in the list. How is it possible to check whether somev itself is an element of the list?

--
View this message in context: http://r.789695.n4.nabble.com/Searching-elements-in-list-tp3987066p3987066.html
Sent from the R help mailing list archive at Nabble.com.

From mstepanov at abo.fi Thu Nov 3 20:21:03 2011
From: mstepanov at abo.fi (mstepano)
Date: Thu, 3 Nov 2011 12:21:03 -0700 (PDT)
Subject: [R] Searching elements in list
In-Reply-To: <1320342896253-3987066.post@n4.nabble.com>
References: <1320342896253-3987066.post@n4.nabble.com>
Message-ID: <1320348063752-3987487.post@n4.nabble.com>

I forgot to mention that the vectors are ordered. I think that one of the possible solutions is to use lapply, i.e.,

< T %in% lapply(allv, function(x,y) all.equal(x,y),y=somev)
[1] TRUE

--
View this message in context: http://r.789695.n4.nabble.com/Searching-elements-in-list-tp3987066p3987487.html
Sent from the R help mailing list archive at Nabble.com.

From michael.weylandt at gmail.com Thu Nov 3 20:44:56 2011
From: michael.weylandt at gmail.com (R. Michael Weylandt )
Date: Thu, 3 Nov 2011 15:44:56 -0400
Subject: [R] Is it possible to vectorize/accelerate this?
In-Reply-To:
References:
Message-ID: <95680F94-F030-4DFF-8D51-1141144C1F99@gmail.com>

I don't immediately see a good trick for vectorization so this seems to me to be a good candidate for work in a lower-level language. Staying within R, I'd suggest you use if and else rather than ifelse() since your computation isn't vectorized: this will eliminate a small amount of overhead. Since you also always add a_vec, you could also define b_vec as a copy of a_vec to avoid all those calls to subset a_vec, but I don't think the effects will be large and the code might not be as clear.
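A minimal sketch of the two suggestions in the paragraph above (scalar if instead of ifelse(), and starting b_vec as a copy of a_vec). The function name bounded_cumsum and its limit argument are hypothetical names chosen here for illustration; they do not come from the thread.

```r
# Running sum that restarts from a[i] whenever the absolute running total
# would exceed `limit`. Hypothetical helper, named for illustration only.
bounded_cumsum <- function(a, limit = 1) {
  b <- a                           # copy: b[1] is a[1], and every b[i] starts as a[i]
  for (i in seq_along(a)[-1]) {    # indices 2..n (empty when length(a) < 2)
    s <- b[i - 1] + a[i]
    if (abs(s) <= limit) b[i] <- s # otherwise b[i] keeps the value a[i]
  }
  b
}

set.seed(1)
a_vec <- rnorm(100)
b_vec <- bounded_cumsum(a_vec)
print(head(b_vec))
```

This is still an explicit loop, so it only trims per-iteration overhead; it is not a vectorized solution.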
You indicated that you may be comfortable with writing C, but I'd suggest you look into the Rcpp/inline package pair, which makes the whole process much easier than it would otherwise be.

I'm not at a computer right now or I'd write a fuller example, but the documentation for those packages is uncommonly good and you should be able to easily get it down into C++. If you aren't able to get it by tomorrow, let me know and I can help troubleshoot. The only things I foresee that you'll need to change are zero-basing, C's loop syntax, and (I think) the call to abs(). (I always forget where abs() lives in C++ ....)

The only possible hold-up is that you need to be at a computer with a C compiler.

Hope this helps,

Michael

On Nov 3, 2011, at 3:10 PM, hihi wrote:

> Dear Members,
>
> I work on a simulaton experiment but it has an bottleneck. It's quite fast because of R and vectorizing, but it has a very slow for loop. The adjacent element of a vector (in terms of index number) depends conditionally on the former value of itself. Like a simple cumulating function (eg. cumsum) but with condition. Let's show me an example:
> a_vec = rnorm(100)
> b_vec = rep(0, 100)
> b_vec[1]=a_vec[1]
> for (i in 2:100){b_vec[i]=ifelse(abs(b_vec[i-1]+a_vec[i])>1, a_vec[i], b_vec[i-1]+a_vec[i])}
> print(b_vec)
>
> (The behaviour is like cumsum's, but when the value would excess 1.0 then it has another value from a_vec.)
> Is it possible to make this faster? I experienced that my way is even slower than in Excel! Programming in C would my last try...
> Any suggestions?
>
> Than you,
> Peter
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

From michael.weylandt at gmail.com Thu Nov 3 20:54:56 2011
From: michael.weylandt at gmail.com (R.
Michael Weylandt )
Date: Thu, 3 Nov 2011 15:54:56 -0400
Subject: [R] Histograms in R
In-Reply-To: <1320321796661-3985397.post@n4.nabble.com>
References: <1320321796661-3985397.post@n4.nabble.com>
Message-ID: <51D703BF-FB00-434F-A2A3-E4ABAA7BCB10@gmail.com>

Try something like this

Lam <- 3
X <- rpois(500, Lam)
hist(X, freq = F)
x <- seq(min(X), max(X), length = 500)
lines(x, dpois(x, Lam), col=2)

Adapt as necessary

Michael

On Nov 3, 2011, at 8:03 AM, kerry1912 wrote:

> We have a histogram of our observed response and we want to overlay the
> corresponding poisson distribution with respect to our poisson model.
>
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Histograms-in-R-tp3985397p3985397.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

From bbolker at gmail.com Thu Nov 3 20:55:17 2011
From: bbolker at gmail.com (Ben Bolker)
Date: Thu, 3 Nov 2011 19:55:17 +0000
Subject: [R] Comparing negative binomial models
References: <1320321017321-3985353.post@n4.nabble.com>
Message-ID:

Davg hotmail.com> writes:

> I am trying to compare negative binomial models for the prediction of sports
> games (I know that Poisson models would be better but I'm just trying
> Negative Binomial at the moment).
>
> But, to compare the models I need them to have the same theta value. How
> can I change the explanatory variables while maintaining the theta value?

library(MASS)
modelfit <- glm(..., family=negative.binomial(theta=theta_value))

I believe that drop1() applied to a glm.nb() fit does this correctly (i.e. holds the theta parameter fixed at the estimate from the most complex model).
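A minimal sketch of the fixed-theta comparison described above. It assumes the quine data set that ships with MASS and an illustrative pair of nested formulas; neither comes from the original question, and the object names are hypothetical.

```r
library(MASS)

# Fit the most complex model first; glm.nb() estimates theta with the coefficients.
full <- glm.nb(Days ~ Eth + Sex + Age, data = quine)
theta_hat <- full$theta

# Refit the nested models as ordinary GLMs with theta held at that estimate.
full_fixed <- glm(Days ~ Eth + Sex + Age, data = quine,
                  family = negative.binomial(theta = theta_hat))
reduced_fixed <- glm(Days ~ Eth + Sex, data = quine,
                     family = negative.binomial(theta = theta_hat))

# Both fits now share one theta, so their deviances are on the same footing.
print(c(theta = theta_hat,
        deviance_full = deviance(full_fixed),
        deviance_reduced = deviance(reduced_fixed)))
```

Whether a formal test on the deviance difference is appropriate depends on how the uncertainty in theta should be treated, which this sketch does not address.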
From v.p.mail at freemail.hu Thu Nov 3 21:22:20 2011
From: v.p.mail at freemail.hu (hihi)
Date: Thu, 3 Nov 2011 21:22:20 +0100
Subject: [R] Is it possible to vectorize/accelerate this?
In-Reply-To:
References:
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From djmuser at gmail.com Thu Nov 3 21:22:50 2011
From: djmuser at gmail.com (Dennis Murphy)
Date: Thu, 3 Nov 2011 13:22:50 -0700
Subject: [R] Is it possible to vectorize/accelerate this?
In-Reply-To:
References:
Message-ID:

Hi:

You're doing the right thing in R by pre-allocating memory for the result, but ifelse() is a vectorized function and your loop is operating elementwise, so if-else is more appropriate. Try

for (i in 2:100){
    b_vec[i] <- if(abs(b_vec[i-1] + a_vec[i]) > 1) a_vec[i] else b_vec[i-1] + a_vec[i]
}

If speed is an issue, then I echo Michael's suggestion to write a C(++) function and call it within R. The inline package is good for this kind of thing.

HTH,
Dennis

On Thu, Nov 3, 2011 at 12:10 PM, hihi wrote:
> Dear Members,
>
> I work on a simulaton experiment but it has an bottleneck. It's quite fast because of R and vectorizing, but it has a very slow for loop. The adjacent element of a vector (in terms of index number) depends conditionally on the former value of itself. Like a simple cumulating function (eg. cumsum) but with condition. Let's show me an example:
> a_vec = rnorm(100)
> b_vec = rep(0, 100)
> b_vec[1]=a_vec[1]
> for (i in 2:100){b_vec[i]=ifelse(abs(b_vec[i-1]+a_vec[i])>1, a_vec[i], b_vec[i-1]+a_vec[i])}
> print(b_vec)
>
> (The behaviour is like cumsum's, but when the value would excess 1.0 then it has another value from a_vec.)
> Is it possible to make this faster? I experienced that my way is even slower than in Excel! Programming in C would my last try...
> Any suggestions?
> > Than you, > Peter > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From zev at zevross.com Thu Nov 3 21:31:26 2011 From: zev at zevross.com (Zev Ross) Date: Thu, 03 Nov 2011 16:31:26 -0400 Subject: [R] Reclassify string values In-Reply-To: References: <4EB2E495.8070602@zevross.com> Message-ID: <4EB2FA1E.4060706@zevross.com> Hi Peter, Thanks for the response. What you've suggested works fine but I'm looking for something that is simpler than my solution and avoids the pesky warning message. Your response avoids the warning message but just as complex (if not more). I just assumed there would be a function along the lines of: > mydata <- c("A", "C", "A", "D", "B", "B") > reclassify(mydata, inCategories=c("A", "B" ,"C", "D"), outCategories=c("Group1", "Group1", "Group2", "Group2")) [1] "Group1" "Group2" "Group1" "Group2" "Group1" "Group1" Zev On 11/3/2011 3:13 PM, Peter Langfelder wrote: > On Thu, Nov 3, 2011 at 11:59 AM, Zev Ross wrote: >> Hi All, >> >> Is there a simple way to convert a string such as c("A", "B" ,"C", "D") to a >> string of c("Group1", "Group1", "Group2", "Group2"). Naturally I could use >> the factor function as below but I don't like seeing that warning message >> (and I don't want to turn off warning messages). Perhaps a function called >> "reclassify" or "recategorize"? 
>> >> Zev >> >> x<-LETTERS[1:4] >> x2<-as.character(factor(x, levels=LETTERS[1:4], labels=rep(c("Group1", >> "Group2"), each=2))) >> >> Warning message: >> In `levels<-`(`*tmp*`, value = c("Group1", "Group1", "Group2", "Group2" : >> duplicated levels will not be allowed in factors anymore > If you want to "translate", why not first build a translation table > > tt = cbind(LETTERS[1:4], c("group1", "group1", "group2", "group2")) > > then apply it on an example: > > xx = sample(LETTERS[1:4], 20, replace = TRUE) > > translation = tt[ match(xx, tt[, 1]), 2] > >> translation > [1] "group2" "group2" "group2" "group2" "group2" "group1" "group2" "group1" > [9] "group2" "group1" "group1" "group2" "group2" "group2" "group1" "group2" > [17] "group2" "group1" "group1" "group2" > > Or did I misunderstand your intent? > > Peter > -- Zev Ross ZevRoss Spatial Analysis 120 N Aurora, Suite 3A Ithaca, NY 14850 607-277-0004 (phone) 866-877-3690 (fax, toll-free) zev at zevross.com From diggsb at ohsu.edu Thu Nov 3 21:36:22 2011 From: diggsb at ohsu.edu (Brian Diggs) Date: Thu, 3 Nov 2011 13:36:22 -0700 Subject: [R] Problem with R CMD check and the inconsolata font business. In-Reply-To: <4EB244BE.7010905@xtra.co.nz> References: <4EB244BE.7010905@xtra.co.nz> Message-ID: <4EB2FB46.6030701@ohsu.edu> On 11/3/2011 12:37 AM, Rolf Turner wrote: > > I have just installed R version 2.14.0 and tried to re-build and > re-check some > of the packages that I maintain. > > I'm getting a warning (in the process of running R CMD check on my "deldir" > package): > >> * checking PDF version of manual ... WARNING >> LaTeX errors when creating PDF version. >> This typically indicates Rd problems. >> LaTeX errors found: >> ! Font T1/fi4/m/n/10=ec-inconsolata at 10.0pt not loadable: Metric >> (TFM) file n >> ot found. >> >> relax >> l.19 ...lf Turner }\email{r.turner at auckland.ac.nz} >> ! \textfont 0 is undefined (character h). 
>> \Url at FormatString ...\Url at String \UrlRight \m at th $ >> >> l.26 ...\AsIs{}\url{http://www.math.unb.ca/~rolf/} >> \AsIs{} >> ! \textfont 0 is undefined (character t). >> \Url at FormatString ...\Url at String \UrlRight \m at th $ >> >> l.26 ...\AsIs{}\url{http://www.math.unb.ca/~rolf/} >> \AsIs{} >> ! \textfont 0 is undefined (character t). > ...... etc., etc., etc., ad (almost) infinitum. > > So there's some problem with a font file not being "loadable". > > Can anyone tell me what the I should ***do*** > about this? I managed to install the "inconsolata" package from CTAN. > At least I think I managed; I downloaded the *.zip file and then unzipped > it in /usr/share/texmf/tex/latex. And ran "texhash". This stopped R CMD > check from complaining that the "inconsolata" package could not be found, > but then led to the further complaint described above. > > So how do I make the required font "loadable"? What files do I need? > Where do I get them? And where should I put them once I've got them? > > I would be grateful for any assistance that can be rendered. > > (I know it's "just a warning", but I *hate* to ignore warnings!) > > cheers, > > Rolf Turner > > P. S. I'm running Ubuntu; session info, in case it's of any relevance is: > > sessionInfo() > R version 2.14.0 (2011-10-31) > Platform: i686-pc-linux-gnu (32-bit) > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 > [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 > [7] LC_PAPER=C LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] misc_0.0-15 I'm having a similar, though different, problem with inconsolata. It may be the same root problem. The error on R CMD check I get is: * checking PDF version of manual ... WARNING LaTeX errors when creating PDF version. 
This typically indicates Rd problems. LaTeX errors found: !pdfTeX error: pdflatex.EXE (file ec-inconsolata): Font ec-inconsolata at 540 n ot found ==> Fatal error occurred, no output PDF file produced! Now, oddly, there _is_ a -manual.pdf file in the .Rcheck directory (despite the message saying no PDf was produced). If I just try to run the pdflatex on the -manual.tex file in that directory, I get the same error and no PDF (?!) I am on Windows 7 64 bit using MiKTeX 2.9; I get the same error whether I use the 32 or 64 bit version of R for R CMD check. > sessionInfo() R version 2.14.0 (2011-10-31) Platform: x86_64-pc-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base Some info from my tex installation (showing I have inconsolata installed and that I am pulling the correct Rd.sty): C:\>kpsewhich Rd.sty C:/Program Files/R/R-2.14.0/share/texmf/tex/latex/Rd.sty C:\>kpsewhich inconsolata.sty C:/Program Files (x86)/MiKTeX 2.9/tex/latex/inconsolata/inconsolata.sty C:\>kpsewhich ec-inconsolata.tfm C:/Program Files (x86)/MiKTeX 2.9/fonts/tfm/public/inconsolata/ec-inconsolata.tfm -- Brian S. Diggs, PhD Senior Research Associate, Department of Surgery Oregon Health & Science University From peter.langfelder at gmail.com Thu Nov 3 21:40:45 2011 From: peter.langfelder at gmail.com (Peter Langfelder) Date: Thu, 3 Nov 2011 13:40:45 -0700 Subject: [R] Reclassify string values In-Reply-To: <4EB2FA1E.4060706@zevross.com> References: <4EB2E495.8070602@zevross.com> <4EB2FA1E.4060706@zevross.com> Message-ID: On Thu, Nov 3, 2011 at 1:31 PM, Zev Ross wrote: > Hi Peter, > > Thanks for the response. What you've suggested works fine but I'm looking > for something that is simpler than my solution and avoids the pesky warning > message. 
Your response avoids the warning message but just as complex (if
> not more). I just assumed there would be a function along the lines of:
>
>> mydata <- c("A", "C", "A", "D", "B", "B")
>> reclassify(mydata, inCategories=c("A", "B" ,"C", "D"),
>>  outCategories=c("Group1", "Group1", "Group2", "Group2"))
>
> [1] "Group1" "Group2" "Group1" "Group2" "Group1" "Group1"
>

But of course, except sometimes you have to write the function yourself.

reclassify = function(data, inCategories, outCategories)
{
  outCategories[ match(data, inCategories)]
}

Sorry I can't make it any simpler than a 1-line solution :)

Feel free to add some checking of input validity, if you need that.

Peter

From dwinsemius at comcast.net Thu Nov 3 21:47:46 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Thu, 3 Nov 2011 16:47:46 -0400
Subject: [R] Searching elements in list
In-Reply-To: <1320348063752-3987487.post@n4.nabble.com>
References: <1320342896253-3987066.post@n4.nabble.com> <1320348063752-3987487.post@n4.nabble.com>
Message-ID: <02639746-C9D3-44A1-B0EA-BB411A8BF55E@comcast.net>

On Nov 3, 2011, at 3:21 PM, mstepano wrote:

> I forget to mention that the vectors are ordered. I think that one
> of the
> possible solutions is to use lapply, i.e.,
>
> < T %in% lapply(allv, function(x,y) all.equal(x,y),y=somev)
> [1] TRUE

all.equal worked fine when the answer was 'true' but not so well in my hands when it was 'false'. Also not a good idea to replace the base definition of %in%.
> any( sapply( allv, identical, v1) )
[1] TRUE
> any( sapply( allv, identical, c(3,3,4)) )
[1] FALSE

`%l.in%` <- function(vec,lis) any( sapply( lis, identical, vec) )

Use:

> v1 %l.in% allv
[1] TRUE

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT

From dwinsemius at comcast.net Thu Nov 3 21:57:11 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Thu, 3 Nov 2011 16:57:11 -0400
Subject: [R] Reclassify string values
In-Reply-To:
References: <4EB2E495.8070602@zevross.com> <4EB2FA1E.4060706@zevross.com>
Message-ID: <8F56F393-DA7F-4186-8684-33B23A559AEC@comcast.net>

On Nov 3, 2011, at 4:40 PM, Peter Langfelder wrote:

> On Thu, Nov 3, 2011 at 1:31 PM, Zev Ross wrote:
>> Hi Peter,
>>
>> Thanks for the response. What you've suggested works fine but I'm
>> looking
>> for something that is simpler than my solution and avoids the pesky
>> warning
>> message. Your response avoids the warning message but just as
>> complex (if
>> not more). I just assumed there would be a function along the lines
>> of:
>>
>>> mydata <- c("A", "C", "A", "D", "B", "B")
>>> reclassify(mydata, inCategories=c("A", "B" ,"C", "D"),
>>> outCategories=c("Group1", "Group1", "Group2", "Group2"))
>>
>> [1] "Group1" "Group2" "Group1" "Group2" "Group1" "Group1"
>>
>
> But of course, except sometimes you have to write the function
> yourself.
>
> reclassify = function(data, inCategories, outCategories)
> {
> outCategories[ match(data, inCategories)]
> }
>
> Sorry I can't make it any simpler than a 1-line solution :)

It will be difficult to beat a one-liner like that. If Zev is still holding out for a canned solution he might look in the 'car' package where there is at least one function that does releveling and grouping. I forget its name at the moment but it wouldn't hurt a new learneR to scroll through the entire 'car' suite of functions.
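For the regrouping question discussed above, two base R idioms may also be worth a look; this is a sketch, not a claim about what any package provides. The object names lookup, grouped, f and grouped2 are illustrative; the input vector is the one from Zev's example.

```r
mydata <- c("A", "C", "A", "D", "B", "B")

# Idiom 1: a named character vector used as a lookup table.
lookup <- c(A = "Group1", B = "Group1", C = "Group2", D = "Group2")
grouped <- unname(lookup[mydata])

# Idiom 2: merge factor levels by assigning a named list to levels(),
# which avoids passing duplicated labels to factor().
f <- factor(mydata, levels = c("A", "B", "C", "D"))
levels(f) <- list(Group1 = c("A", "B"), Group2 = c("C", "D"))
grouped2 <- as.character(f)

print(grouped)
print(grouped2)
```

Both give the same grouping as the match()-based one-liner; the lookup vector keeps the mapping readable when there are many categories.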
-- David Winsemius, MD Heritage Laboratories West Hartford, CT From calum.polwart at nhs.net Thu Nov 3 22:21:07 2011 From: calum.polwart at nhs.net (Polwart Calum (COUNTY DURHAM AND DARLINGTON NHS FOUNDATION TRUST)) Date: Thu, 3 Nov 2011 21:21:07 +0000 Subject: [R] Kaplan Meier - not for dates In-Reply-To: <20111103195527.732454491DA@nhs-pd1e-esg010.ad1.nhs.net> References: <20111031183318.CA721449A20@nhs-pd1e-esg105.ad1.nhs.net>, <20111103195527.732454491DA@nhs-pd1e-esg010.ad1.nhs.net> Message-ID: <20111103212108.4B36F448751@nhs-pd1e-esg108.ad1.nhs.net> Thanks for the reply. The treatment is effectively for a chronic condition - so you stay on the treatment till it stops working. We know from trials how long that should be and we know the theoretical cost of that treatment but that's based on the text book dose (patients dose reduce and delay treatment and its based on weight so variable). We've been asked to provide our national planning team with an "average" cost based on our early experiences. So we have suggested to them we might be able to get a median cost. Some patients will stay on treatment several years so it will be impossible to get an average for years. So the censored patients will be those still on treatment (the event being stopping treatment) I'll give what you've suggested a go. Thanks Calum Polwart BSc(Hons) MSc MRPharmS SPres IPres Network Pharmacist - NECN and Pharmacy Clinical Team Manager (Cancer & Aseptic Services) - CDDFT Our website has now been unlocked and updated. Should you require contacts, meeting details, publications etc, please visit us on www.cancernorth.nhs.uk ________________________________________ From: Lancaster, Robert (Orbitz) [ROBERT.LANCASTER at orbitz.com] Sent: 03 November 2011 19:55 To: Polwart Calum (COUNTY DURHAM AND DARLINGTON NHS FOUNDATION TRUST); r-help at r-project.org Subject: RE: Kaplan Meier - not for dates I think it really depends on what your event of interest is. 
If your event is that the patient got better and "left treatment" then I think this could work. You would have to mark as censored any patient still in treatment or any patient that stopped treatment w/o getting better (e.g. in the case of death). You would then be predicting the cost required to make the patient well enough to leave treatment. It is a little non-standard to use $ instead of time, but time is money after all.

You could set up your data frame with two columns: 1) cost 2) event/censored. Then create your survival object:

mySurv = Surv(my_data$cost,my_data$event)

And then use survfit to create your KM curves:

myFit = survfit(mySurv~NULL)

If you have other explanatory variables that you think may influence the cost, you can of course add them to your data frame and change the formula you use in survfit. For instance, you could have some severity measure, e.g. High, Medium, Low. You could then do:

myFit = survfit(mySurv~my_data$severity)

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Polwart Calum (COUNTY DURHAM AND DARLINGTON NHS FOUNDATION TRUST)
Sent: Monday, October 31, 2011 1:29 PM
To: r-help at r-project.org
Subject: [R] Kaplan Meier - not for dates

I have some data which is censored and I want to determine the median. Its actually cost data for a cohort of patients, many of whom are still on treatment and so are censored.

I can do the same sort of analysis for a survival curve and get the median survival...

...but can I just use the survival curve functions to plot an X axis that is $ rather than date?

If not is there some other way to achieve this?

Thanks

Calum

********************************************************************************************************************
This message may contain confidential information.
If yo...{{dropped:21}} ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. From rolf.turner at xtra.co.nz Thu Nov 3 22:29:36 2011 From: rolf.turner at xtra.co.nz (Rolf Turner) Date: Fri, 04 Nov 2011 10:29:36 +1300 Subject: [R] Problem with R CMD check and the inconsolata font business. In-Reply-To: <4EB244BE.7010905@xtra.co.nz> References: <4EB244BE.7010905@xtra.co.nz> Message-ID: <4EB307C0.9080602@xtra.co.nz> On 03/11/11 20:37, Rolf Turner wrote: > > I have just installed R version 2.14.0 and tried to re-build and > re-check some > of the packages that I maintain. > > I'm getting a warning (in the process of running R CMD check on my > "deldir" > package): > >> * checking PDF version of manual ... WARNING >> LaTeX errors when creating PDF version. >> This typically indicates Rd problems. >> LaTeX errors found: >> !
Font T1/fi4/m/n/10=ec-inconsolata at 10.0pt not loadable: Metric >> (TFM) file n >> ot found. >> I received two replies off-line and one that was sent to the list. The gist of the off-line replies was ``you need to install some fonts'', and (unusually, and very kindly! :-) ) the responders told me *how* to do so. The first message that I read was from John Nash who wrote: > Have you tried > > sudo apt-get install texlive-fonts-extra > > to get the font in? I had to do this just yesterday myself. > > If that works, maybe put a note on R-help. I've not sent to R-help in case it doesn't > work, and we get a bunch of unnecessary noise. Well it *did* work; thank you very much John! A similar message from G. Jay Kerns said: > It looks like you are missing some font files. Did you do: > > apt-get install texlive-fonts-recommended ? The third message (which *was* posted to the list) was from Brian Diggs, who said that he was having a related problem: > The error on R CMD check I get is: > > * checking PDF version of manual ... WARNING > LaTeX errors when creating PDF version. > This typically indicates Rd problems. > LaTeX errors found: > !pdfTeX error: pdflatex.EXE (file ec-inconsolata): Font ec-inconsolata > at 540 n > ot found > ==> Fatal error occurred, no output PDF file produced! I think this is the same as/similar to the problem I had before I installed the inconsolata (LaTeX) package from CTAN. Since Brian does Windoze I cannot advise as to how to install the package on his system, but I presume it is not too difficult --- once you know how! :-) I'm posting this so that John Nash's and Jay Kerns' solutions to my problem are there in the archives ``for the record''. Thanks to all who replied. 
cheers, Rolf Turner From dcarlson at tamu.edu Thu Nov 3 22:39:03 2011 From: dcarlson at tamu.edu (David L Carlson) Date: Thu, 3 Nov 2011 16:39:03 -0500 Subject: [R] Histograms in R In-Reply-To: <51D703BF-FB00-434F-A2A3-E4ABAA7BCB10@gmail.com> References: <1320321796661-3985397.post@n4.nabble.com> <51D703BF-FB00-434F-A2A3-E4ABAA7BCB10@gmail.com> Message-ID: <007001cc9a71$01e8d5c0$05ba8140$@edu> The lines() command doesn't work and histogram combines categories unless you specify the number. How about a barplot Lam <- 3 X <- table(rpois(500, Lam)) Max <- length(X)-1 barplot(rbind(X, 500*dpois(0:Max, Lam)), beside=TRUE, legend.text=c("Observed", "Expected")) or a rootogram library(vcd) rootogram(X, dpois(0:Max, Lam)*500) -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of R. Michael Weylandt Sent: Thursday, November 03, 2011 2:55 PM To: kerry1912 Cc: r-help at r-project.org Subject: Re: [R] Histograms in R Try something like this Lam <- 3 X <- rpois(500, Lam) hist(X, freq = F) x <- seq(min(X), max(X), length = 500) lines(x, dpois(x, Lam), col=2) Adapt as necessary Michael On Nov 3, 2011, at 8:03 AM, kerry1912 wrote: > We have a histogram of our observed response and we want to overlay the > corresponding poisson distribution with respect to our poisson model. > > > -- > View this message in context: http://r.789695.n4.nabble.com/Histograms-in-R-tp3985397p3985397.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
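A minimal sketch of a third option that keeps the histogram but addresses both points raised above: it uses one bin per integer count, and it draws the Poisson probabilities only at the integers. The seed, the value of Lam, the plot labels and the object names k and h are illustrative assumptions.

```r
set.seed(42)
Lam <- 3
X <- rpois(500, Lam)

# One bin per integer count, centred on the integer, so no categories are merged.
k <- 0:max(X)
h <- hist(X, breaks = seq(-0.5, max(X) + 0.5, by = 1), freq = FALSE,
          main = "Observed counts with Poisson pmf", xlab = "count")

# The Poisson pmf is defined only at integers, so plot it at k rather than
# along a fine grid of non-integer x values.
points(k, dpois(k, Lam), type = "b", col = 2, pch = 16)
```

Because each bin has width 1, the bar heights are relative frequencies and are directly comparable with dpois().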
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. From michael.weylandt at gmail.com Thu Nov 3 22:51:02 2011 From: michael.weylandt at gmail.com (R. Michael Weylandt) Date: Thu, 3 Nov 2011 17:51:02 -0400 Subject: [R] Is it possible to vectorize/accelerate this? In-Reply-To: References: <95680F94-F030-4DFF-8D51-1141144C1F99@gmail.com> Message-ID: Yes -- if & else is much faster than ifelse() because if is a primitive while ifelse() is a whole function call (in fact, you can see the code by typing ifelse into the prompt and see that it has two if calls within it. Michael On Thu, Nov 3, 2011 at 4:38 PM, hihi wrote: > Hi, > thank you for your very immediate response. :-) Is if than and else faster > than ifelse? I'm wondering (or not knowing something) > Best regards, > Peter > 2011/11/3 R. Michael Weylandt > >> >> I don't immediately see a good trick for vectorization so this seems to me >> to be a good candidate for work in a lower-level language. Staying within R, >> I'd suggest you use if and else rather than ifelse() since your computation >> isn't vectorized: this will eliminate a small amount over overhead. Since >> you also always add a_vec, you could also define b_vec as a copy of a to >> avoid all those calls to subset a, but I don't think the effects will be >> large and the code might not be as clear. >> >> You indicated that you may be comfortable with writing C, but I'd suggest >> you look into the Rcpp/Inline package pair which make the whole process much >> easier than it would otherwise be. >> >> ?I'm not at a computer write now or I'd write a fuller example, but the >> documentation for those packages is uncommonly good an you should be able to >> easily get it down into C++. 
If you aren't able to get it by tomorrow, let >> me know and I can help troubleshoot. The only things I foresee that you'll >> need to change are zero-basing, C's loop syntax, and (I think) the call to >> abs(). (I always forget where abs() lives in c++ ....) >> >> The only possible hold up is that you need to be at a computer with a C >> compiler >> >> Hope this helps, >> >> Michael >> >> On Nov 3, 2011, at 3:10 PM, hihi wrote: >> >> > Dear Members, >> > >> > I work on a simulaton experiment but it has an bottleneck. It's quite >> > fast because of R and vectorizing, but it has a very slow for loop. The >> > adjacent element of a vector (in terms of index number) depends >> > conditionally on the former value of itself. Like a simple cumulating >> > function (eg. cumsum) but with condition. Let's show me an example: >> > a_vec = rnorm(100) >> > b_vec = rep(0, 100) >> > b_vec[1]=a_vec[1] >> > for (i in 2:100){b_vec[i]=ifelse(abs(b_vec[i-1]+a_vec[i])>1, a_vec[i], >> > b_vec[i-1]+a_vec[i])} >> > print(b_vec) >> > >> > (The behaviour is like cumsum's, but when the value would excess 1.0 >> > then it has another value from a_vec.) >> > Is it possible to make this faster? I experienced that my way is even >> > slower than in Excel! Programming in C would my last try... >> > Any suggestions? >> > >> > Than you, >> > Peter >> > >> > ______________________________________________ >> > R-help at r-project.org mailing list >> > https://stat.ethz.ch/mailman/listinfo/r-help >> > PLEASE do read the posting guide >> > http://www.R-project.org/posting-guide.html >> > and provide commented, minimal, self-contained, reproducible code. > > From jwiley.psych at gmail.com Thu Nov 3 22:51:43 2011 From: jwiley.psych at gmail.com (Joshua Wiley) Date: Thu, 3 Nov 2011 14:51:43 -0700 Subject: [R] any updates w.r.t. lapply, sapply, apply retaining classes In-Reply-To: References: Message-ID: Hi Mike, This isn't really an answer to your question, but perhaps will serve to continue discussion. 
I think that there are some fundamental issues when working with special classes. As a thought example, suppose I wrote a class, "posreal", which inherits from the numeric class. It is only valid for positive, real numbers. I use it in a package, but do not develop methods for it. A user comes along and creates a vector, x, that is a posreal, then tries: mean(x * -3). Since I never bothered to write a special method for mean for my class, R falls back to the inherited numeric, but gives a value that is clearly not valid for posreal. What should happen? S3 classes do not really have validation, so in principle, one could write a function like:

    f <- function(x) {
      vclass <- class(x)
      res <- mean(x)
      class(res) <- vclass
      return(res)
    }

which "retains" the appropriate class, but in name only. R core cannot possibly know or imagine all classes that may be written that inherit from more basic types but with possible special aspects and requirements. I think the inherited class is considered to be more generic and that is what is returned. It is usually up to the user to ensure that the function (whose methods were not specific to that special class but to the inherited one) is valid for that class, and the user can manually convert the result back:

    res <- as.posreal(res)

What about lapply and sapply? Neither is generic or has a method for difftime, and so they do some unexpected/undesirable things. Again, without methods defined for a particular class, they cannot know what is special about it or what the appropriate way to handle it is. They use defaults, which sometimes work but may give unexpected or undesirable results, but what else can be done? (Okay, they could just throw an error.) If a function is naive about a class, it does not seem right to operate on it using unknown methods and then pretend to be returning the same type of data. As it stands, they convert to a data type they know and return that.

Now, you mention that for loops are slow in R, and this is true to a degree.
However, the *apply functions are basically just internal loops, so they do not really save you (they are certainly not vectorized!), though they are more elegant than explicit loops IMO. One way to use them while retaining class would be like:

    sapply(seq_along(test), function(i) class(test[i]))

This is less efficient than sapply(test, class), but the overhead drops considerably as the function does nontrivial calculations. Finally, I find the (relatively) new compiler package really shines at making functions that are just wrappers for for loops more efficient. Take a look at the examples from:

    require(compiler)
    ?cmpfun

I am not familiar with NumPy so I do not know how it handles new classes, but with some tweaks to my workflow, I do not find myself running into problems with how R handles them. I definitely appreciate your position because I have been there. As I became more familiar with R, classes, and methods, I found I work in a way that avoids passing objects to functions that do not know how to handle them properly.

Cheers,

Josh

On Thu, Nov 3, 2011 at 11:08 AM, Mike Williamson wrote:
> Hi All,
>
>    I don't have a "I need help" question, so much as a query into any
> update whether 'R' has made any progress with some of the core functions
> retaining classes.  As an example, because it's one of the cases that most
> egregiously impacts me & my work and keeps pushing me away from 'R' and
> into other numerical languages (such as NumPy in python), I will use sapply
> / lapply to demonstrate, but this behavior is ubiquitous throughout 'R'.
>
>    Let's say I have a class which is theoretically supported, but not one
> of the core "numeric" or "character" classes (and, to some degree, "factor"
> classes).  Many of the basic functions will convert my desired class into
> either numeric or character, so that my returned answer is gibberish.
>
> E.g.:
>
> test= as.difftime(c(1, 1, 8, 0.25, 8, 1.25), units= "days")  ## create a small array of time differences
> class(test)  ## this will return the proper class, "difftime"
> class(test[1] ) ## this will also return the proper class, "difftime"
> sapply(test, class)  ## this will return *numerics* for all of the classes.  Ack!!
>
>    In the example I give above, the impact might seem small, but the
> implications are *huge*.  This means that I am, in effect, not allowed to
> use *any* of the vectoring functions in 'R', which avoid performing loops
> thereby speeding up process time extraordinarily.  Many can sympathize that
> 'R' is ridiculously slow with "for" loops, compared to other languages.
> But that's theoretically OK, a good statistician or data analyst should be
> able to work comfortably with matrices and vectors.  However, *'R' cannot
> work comfortably* with matrices or vectors, *unless* they are using the
> numeric or character classes.  Many of the classes suffer the problem I
> just described, although I only used "difftime" in the example.  Factors
> seem a bit more "comfortable", and can be handled most of the time, but not
> as well as numerics, and at times functions working on factors can return
> the numerical representation of the factor instead of the original factor.
>
>    Is there any progress in guaranteeing that all core functions either
> (a) ideally return exactly the classes, and hierarchy of classes, that they
> received (e.g., a list of data frames with difftimes & dates & characters
> would return a list of data frames with difftimes & dates & characters), or
> (b) barring that, the function should at least error out with a clear error
> explaining that sapply, for example, cannot vectorize on the class being
> used?  Returning incorrect answers is far worse than returning an error,
> from a perspective of stability.
>
>    This is, by far, the largest Achilles' heel to 'R'.
> Personally, as my
> career advances and I work on more technical things, I am finding that I
> have to leave 'R' by the wayside and use other languages for robust
> numerical calculations and programming.  This saddens me, because there are
> so many wonderful packages developed by the community.  The example above
> came up because I am using the "forecast" library to great effect in
> predicting how long our product cycle time will be.  However, I spend much
> of my time fighting all these class & typing bugs in 'R' (and we have to
> start recognizing that they are bugs, otherwise they may never get
> resolved), such that many of the improvements in my productivity due to all
> the wonderful computational packages are entirely offset by the time
> I spend fighting this issue of poor classes.
>
>                     Thanks & Regards!
>                              Mike
>
> ---
> XKCD

--
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, ATS Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/

From wdunlap at tibco.com Thu Nov 3 23:22:55 2011
From: wdunlap at tibco.com (William Dunlap)
Date: Thu, 3 Nov 2011 22:22:55 +0000
Subject: [R] Is it possible to vectorize/accelerate this?
In-Reply-To:
References: <95680F94-F030-4DFF-8D51-1141144C1F99@gmail.com>
Message-ID:

You should get familiar with some basic timing tools and techniques so you can investigate things like this yourself.

system.time is the most basic timing tool. E.g.,
  > system.time(for(i in 1:1000)f0(a))
     user  system elapsed
   22.920   0.000  22.932
means it took c.
23 seconds of real time to run f0(a) 1000 times.

When comparing timing, it makes things easier to define a series of functions that implement the various algorithms but have the same inputs and outputs. E.g., for your problem

  f0 <- function(a_vec) {
      b_vec <- a_vec
      for (i in 2:length(b_vec)){
          b_vec[i] <- ifelse(abs(b_vec[i-1] + a_vec[i]) > 1, a_vec[i], b_vec[i-1] + a_vec[i])
      }
      b_vec
  }

  f1 <- function(a_vec) {
      b_vec <- a_vec
      for (i in 2:length(b_vec)){
          b_vec[i] <- if(abs(b_vec[i-1] + a_vec[i]) > 1) a_vec[i] else b_vec[i-1] + a_vec[i]
      }
      b_vec
  }

  f2 <- function(a_vec) {
      b_vec <- a_vec
      for (i in 2:length(b_vec)){
          if(abs(s <- b_vec[i-1] + a_vec[i]) <= 1) b_vec[i] <- s
      }
      b_vec
  }

Then run them with the same dataset:

  > a <- runif(1000, 0, .3)
  > system.time(for(i in 1:1000)f0(a))
     user  system elapsed
   22.920   0.000  22.932
  > system.time(for(i in 1:1000)f1(a))
     user  system elapsed
    5.510   0.000   5.514
  > system.time(for(i in 1:1000)f2(a))
     user  system elapsed
    4.210   0.000   4.217

(The rbenchmark package's benchmark function encapsulates this idiom.)

It pays to use a dataset similar to the one you will ultimately be using, where "similar" depends on the context. E.g., the algorithm in f2 is relatively faster when the cumsum exceeds 1 most of the time:

  > a <- runif(1000, 0, 10)
  > system.time(for(i in 1:1000)f0(a))
     user  system elapsed
   21.900   0.000  21.912
  > system.time(for(i in 1:1000)f1(a))
     user  system elapsed
    4.610   0.000   4.609
  > system.time(for(i in 1:1000)f2(a))
     user  system elapsed
    2.490   0.000   2.494

If you will be working with large datasets, you should look at how the time grows as the size of the dataset grows. If the time looks quadratic between, say, length 100 and length 200, don't waste your time testing it for length 1000000. For algorithms that work on data.frames (or matrices), the relative speed often depends on the ratio of the number of rows and the number of columns of data. Check that out.
For these sorts of tests it is worthwhile to make a function to generate "typical" looking data of any desired size. It doesn't take too long to do this once you have the right mindset. Once you do you don't have to rely on folklore like "never use loops" and instead do evidence-based computing. Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of R. Michael > Weylandt > Sent: Thursday, November 03, 2011 2:51 PM > To: hihi; r-help > Subject: Re: [R] Is it possible to vectorize/accelerate this? > > Yes -- if & else is much faster than ifelse() because if is a > primitive while ifelse() is a whole function call (in fact, you can > see the code by typing ifelse into the prompt and see that it has two > if calls within it. > > Michael > > On Thu, Nov 3, 2011 at 4:38 PM, hihi wrote: > > Hi, > > thank you for your very immediate response. :-) Is if than and else faster > > than ifelse? I'm wondering (or not knowing something) > > Best regards, > > Peter > > 2011/11/3 R. Michael Weylandt > > > >> > >> I don't immediately see a good trick for vectorization so this seems to me > >> to be a good candidate for work in a lower-level language. Staying within R, > >> I'd suggest you use if and else rather than ifelse() since your computation > >> isn't vectorized: this will eliminate a small amount over overhead. Since > >> you also always add a_vec, you could also define b_vec as a copy of a to > >> avoid all those calls to subset a, but I don't think the effects will be > >> large and the code might not be as clear. > >> > >> You indicated that you may be comfortable with writing C, but I'd suggest > >> you look into the Rcpp/Inline package pair which make the whole process much > >> easier than it would otherwise be. 
> >> > >> ?I'm not at a computer write now or I'd write a fuller example, but the > >> documentation for those packages is uncommonly good an you should be able to > >> easily get it down into C++. If you aren't able to get it by tomorrow, let > >> me know and I can help troubleshoot. The only things I foresee that you'll > >> need to change are zero-basing, C's loop syntax, and (I think) the call to > >> abs(). (I always forget where abs() lives in c++ ....) > >> > >> The only possible hold up is that you need to be at a computer with a C > >> compiler > >> > >> Hope this helps, > >> > >> Michael > >> > >> On Nov 3, 2011, at 3:10 PM, hihi wrote: > >> > >> > Dear Members, > >> > > >> > I work on a simulaton experiment but it has an bottleneck. It's quite > >> > fast because of R and vectorizing, but it has a very slow for loop. The > >> > adjacent element of a vector (in terms of index number) depends > >> > conditionally on the former value of itself. Like a simple cumulating > >> > function (eg. cumsum) but with condition. Let's show me an example: > >> > a_vec = rnorm(100) > >> > b_vec = rep(0, 100) > >> > b_vec[1]=a_vec[1] > >> > for (i in 2:100){b_vec[i]=ifelse(abs(b_vec[i-1]+a_vec[i])>1, a_vec[i], > >> > b_vec[i-1]+a_vec[i])} > >> > print(b_vec) > >> > > >> > (The behaviour is like cumsum's, but when the value would excess 1.0 > >> > then it has another value from a_vec.) > >> > Is it possible to make this faster? I experienced that my way is even > >> > slower than in Excel! Programming in C would my last try... > >> > Any suggestions? > >> > > >> > Than you, > >> > Peter > >> > > >> > ______________________________________________ > >> > R-help at r-project.org mailing list > >> > https://stat.ethz.ch/mailman/listinfo/r-help > >> > PLEASE do read the posting guide > >> > http://www.R-project.org/posting-guide.html > >> > and provide commented, minimal, self-contained, reproducible code. 
From carl at witthoft.com Thu Nov 3 23:24:52 2011
From: carl at witthoft.com (Carl Witthoft)
Date: Thu, 03 Nov 2011 18:24:52 -0400
Subject: [R] How much data can R process?
Message-ID: <4EB314B4.1070904@witthoft.com>

My answer is similar: how much data? All of it. And shame on the OP for ever using Excel.

From: Rolf Turner
Date: Thu, 03 Nov 2011 22:20:17 +1300

On 03/11/11 15:59, Nicholay Anne Caumeran wrote:
> Would like to know how much data can R process - number of rows and columns?

How long is a piece of string?

cheers,

Rolf Turner

--
Sent from my Cray XK6
"Pendeo-navem mei anguillae plena est."

From carl at witthoft.com Thu Nov 3 23:29:22 2011
From: carl at witthoft.com (Carl Witthoft)
Date: Thu, 03 Nov 2011 18:29:22 -0400
Subject: [R] Is it possible to vectorize/accelerate this?
Message-ID: <4EB315C2.20500@witthoft.com>

I have to admit to not doing careful timing tests, but I often eliminate if() lines as follows (bad/good is just my preference)

BAD:
  b[i] <- if(a[i]>1) a[i] else a[i-1]

GOOD:
  b[i] <- a[i]* (a[i]>1) + a[i-1] * (a[i]<=1)

On Thu, Nov 3, 2011 at 12:10 PM, hihi wrote:
> Dear Members,
>
> I work on a simulaton experiment but it has an bottleneck. It's quite fast because of R and vectorizing, but it has a very slow for loop. The adjacent element of a vector (in terms of index number) depends conditionally on the former value of itself. Like a simple cumulating function (eg. cumsum) but with condition. Let's show me an example:
> a_vec = rnorm(100)
> b_vec = rep(0, 100)
> b_vec[1]=a_vec[1]
> for (i in 2:100){b_vec[i]=ifelse(abs(b_vec[i-1]+a_vec[i])>1, a_vec[i], b_vec[i-1]+a_vec[i])}
> print(b_vec)

--
Sent from my Cray XK6
"Pendeo-navem mei anguillae plena est."
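A minimal sketch of the masking idiom from the message above, assuming finite numeric input. The input vector `a`, the seed, and the choice to start with `b[1] = a[1]` are illustrative assumptions, not part of the original post. Note that this toy rule depends only on `a`, so it can also be written as one vectorized expression; the recursion in the quoted question, where `b_vec[i]` depends on `b_vec[i-1]`, cannot be rewritten this way.

```r
set.seed(1)                    # illustrative seed
a <- rnorm(20, mean = 1)       # hypothetical input vector
n <- length(a)

# Loop using if/else (the "BAD" style in the message)
b_if <- a
for (i in 2:n) {
  b_if[i] <- if (a[i] > 1) a[i] else a[i - 1]
}

# Loop using logical masks (the "GOOD" style): TRUE/FALSE act as 1/0
b_mask <- a
for (i in 2:n) {
  b_mask[i] <- a[i] * (a[i] > 1) + a[i - 1] * (a[i] <= 1)
}

# Fully vectorized form of the same rule, no loop needed
cur  <- a[-1]                  # a[2], ..., a[n]
prev <- a[-n]                  # a[1], ..., a[n-1]
b_vec <- c(a[1], ifelse(cur > 1, cur, prev))

# All three forms agree on this input
stopifnot(isTRUE(all.equal(b_if, b_mask)),
          isTRUE(all.equal(b_if, b_vec)))
```

The masking form evaluates both branches, so it can produce NaN when one operand is infinite or NA; timing it against the if/else form with system.time on realistic data is the only way to know whether it helps.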
From jdnewmil at dcn.davis.ca.us Thu Nov 3 23:30:16 2011 From: jdnewmil at dcn.davis.ca.us (Jeff Newmiller) Date: Thu, 03 Nov 2011 15:30:16 -0700 Subject: [R] Question about Calculation of Cross Product In-Reply-To: References: Message-ID: <7f48a0f8-64f6-4021-90f1-b3447c6780fe@email.android.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From diggsb at ohsu.edu Thu Nov 3 23:30:30 2011 From: diggsb at ohsu.edu (Brian Diggs) Date: Thu, 3 Nov 2011 15:30:30 -0700 Subject: [R] Problem with R CMD check and the inconsolata font business. In-Reply-To: <4EB307C0.9080602@xtra.co.nz> References: <4EB244BE.7010905@xtra.co.nz> <4EB307C0.9080602@xtra.co.nz> Message-ID: <4EB31606.7000702@ohsu.edu> On 11/3/2011 2:29 PM, Rolf Turner wrote: > On 03/11/11 20:37, Rolf Turner wrote: >> >> I have just installed R version 2.14.0 and tried to re-build and >> re-check some >> of the packages that I maintain. >> >> I'm getting a warning (in the process of running R CMD check on my >> "deldir" >> package): >> >>> * checking PDF version of manual ... WARNING >>> LaTeX errors when creating PDF version. >>> This typically indicates Rd problems. >>> LaTeX errors found: >>> ! Font T1/fi4/m/n/10=ec-inconsolata at 10.0pt not loadable: Metric >>> (TFM) file n >>> ot found. >>> > > > > I received two replies off-line and one that was sent to the list. The > gist of the > off-line replies was ``you need to install some fonts'', and (unusually, > and very > kindly! :-) ) the responders told me *how* to do so. > > The third message (which *was* posted to the list) was from Brian Diggs, > who said that he was having a related problem: > >> The error on R CMD check I get is: >> >> * checking PDF version of manual ... WARNING >> LaTeX errors when creating PDF version. >> This typically indicates Rd problems. 
>> LaTeX errors found: >> !pdfTeX error: pdflatex.EXE (file ec-inconsolata): Font ec-inconsolata >> at 540 n >> ot found >> ==> Fatal error occurred, no output PDF file produced! > > I think this is the same as/similar to the problem I had before I installed > the inconsolata (LaTeX) package from CTAN. Since Brian does Windoze I > cannot advise as to how to install the package on his system, but I presume > it is not too difficult --- once you know how! :-) I thought that making sure the proper latex packages were installed would be sufficient, but apparently not. I found the actual font at http://www.levien.com/type/myfonts/Inconsolata.otf, downloaded it, and installed it. It shows up in my list of fonts available (as Inconsolata Medium) via the control panel. However, I still get the same error from R CMD check. I tried rebuilding the TeX filename database, but that didn't help. I've searched my drive for any mention of inconsolata, and the hits are: ./Program Files (x86)/MiKTeX 2.9/doc/fonts/inconsolata ./Program Files (x86)/MiKTeX 2.9/doc/fonts/inconsolata/inconsolata.pdf ./Program Files (x86)/MiKTeX 2.9/fonts/enc/dvips/inconsolata ./Program Files (x86)/MiKTeX 2.9/fonts/map/dvips/inconsolata ./Program Files (x86)/MiKTeX 2.9/fonts/opentype/public/inconsolata ./Program Files (x86)/MiKTeX 2.9/fonts/sfd/inconsolata ./Program Files (x86)/MiKTeX 2.9/fonts/tfm/public/inconsolata ./Program Files (x86)/MiKTeX 2.9/fonts/tfm/public/inconsolata/ec-inconsolata.tfm ./Program Files (x86)/MiKTeX 2.9/fonts/tfm/public/inconsolata/ei1-inconsolata.tfm ./Program Files (x86)/MiKTeX 2.9/fonts/tfm/public/inconsolata/qx-inconsolata.tfm ./Program Files (x86)/MiKTeX 2.9/fonts/tfm/public/inconsolata/rm-inconsolata.tfm ./Program Files (x86)/MiKTeX 2.9/fonts/tfm/public/inconsolata/texnansi-inconsolata.tfm ./Program Files (x86)/MiKTeX 2.9/fonts/tfm/public/inconsolata/ts1-inconsolata.tfm ./Program Files (x86)/MiKTeX 2.9/fonts/type1/public/inconsolata ./Program Files (x86)/MiKTeX 
2.9/source/inconsolata-src.tar.bz2
./Program Files (x86)/MiKTeX 2.9/tex/latex/inconsolata
./Program Files (x86)/MiKTeX 2.9/tex/latex/inconsolata/inconsolata.sty
./Program Files (x86)/MiKTeX 2.9/tpm/packages/inconsolata.tpm

Which seems to confirm that TeX knows and has the font.

I would appreciate any further assistance.

> I'm posting this so that John Nash's and Jay Kerns' solutions to my problem
> are there in the archives ``for the record''.
>
> Thanks to all who replied.
>
> cheers,
>
> Rolf Turner
>

--
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health & Science University

From wdunlap at tibco.com Thu Nov 3 23:53:31 2011
From: wdunlap at tibco.com (William Dunlap)
Date: Thu, 3 Nov 2011 22:53:31 +0000
Subject: [R] Is it possible to vectorize/accelerate this?
References: <95680F94-F030-4DFF-8D51-1141144C1F99@gmail.com>
Message-ID:

I neglected to give another benefit of putting your algorithms into functions: you can use the compiler package to compile them, which can give a big boost in speed. E.g., I compiled the functions f0, f1, and f2 that I defined earlier to make new functions f0_c, f1_c, and f2_c:

  > library(compiler)
  > f0_c <- cmpfun(f0)
  > f1_c <- cmpfun(f1)
  > f2_c <- cmpfun(f2)
  > system.time(for(i in 1:1000)f0_c(a)) # a is runif(1000, 0, 10)
     user  system elapsed
   18.620   0.000  18.649
  > system.time(for(i in 1:1000)f1_c(a))
     user  system elapsed
    1.290   0.000   1.288
  > system.time(for(i in 1:1000)f2_c(a))
     user  system elapsed
    0.790   0.000   0.791

Compare those times with the 23, 5.5, and 4.2 seconds for the non-compiled version. I haven't used the compiler package enough to generate any folklore on it, but it certainly helps in this simple example.

(identical() shows that the output of all these functions are the same.)
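The compile-and-compare step described above can be sketched as follows. The function body follows f2 from earlier in the thread (with the assignment to `s` pulled out of the condition for readability); the seed and the reduced repetition count are illustrative assumptions. Timings are machine dependent, and since R 3.4.0 closures are byte-compiled automatically by the JIT, so on a current installation the gap between the two versions may be far smaller than the figures reported in 2011.

```r
library(compiler)

# Conditional cumulative sum: keep the running sum while |sum| <= 1,
# otherwise restart from the current element.
f2 <- function(a_vec) {
  b_vec <- a_vec
  for (i in 2:length(b_vec)) {
    s <- b_vec[i - 1] + a_vec[i]
    if (abs(s) <= 1) b_vec[i] <- s
  }
  b_vec
}

f2_c <- cmpfun(f2)             # byte-compiled copy of f2

set.seed(42)                   # illustrative seed
a <- runif(1000, 0, 10)

# Compiling must not change the result
stopifnot(identical(f2(a), f2_c(a)))

# Same workload for both versions; numbers vary by machine and R version
print(system.time(for (i in 1:100) f2(a)))
print(system.time(for (i in 1:100) f2_c(a)))
```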
Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com
From rolf.turner at xtra.co.nz Thu Nov 3 23:53:46 2011 From: rolf.turner at xtra.co.nz (Rolf Turner) Date: Fri, 04 Nov 2011 11:53:46 +1300 Subject: [R] Question about Calculation of Cross Product In-Reply-To: <7f48a0f8-64f6-4021-90f1-b3447c6780fe@email.android.com> References: <7f48a0f8-64f6-4021-90f1-b3447c6780fe@email.android.com> Message-ID: <4EB31B7A.9020706@xtra.co.nz> On 04/11/11 11:30, Jeff Newmiller wrote: > The crossprod function in base R implements the MATRIX cross product, more familiarly known as "matrix multiplication". The term "crossprod" is thereby rather misleading, nicht wahr?
Wouldn't it be *nice* to put into the help file a wee warning telling the young naive user (explicitly) that this is *not* the "usual" 3-D vector cross product? :-) cheers, Rolf Turner From Ray.Brownrigg at ecs.vuw.ac.nz Thu Nov 3 23:58:22 2011 From: Ray.Brownrigg at ecs.vuw.ac.nz (Ray Brownrigg) Date: Fri, 4 Nov 2011 11:58:22 +1300 Subject: [R] Problem with R CMD check and the inconsolata font business. In-Reply-To: <4EB31606.7000702@ohsu.edu> References: <4EB244BE.7010905@xtra.co.nz> <4EB307C0.9080602@xtra.co.nz> <4EB31606.7000702@ohsu.edu> Message-ID: <201111041158.22405.Ray.Brownrigg@ecs.vuw.ac.nz> On Fri, 04 Nov 2011, Brian Diggs wrote: > On 11/3/2011 2:29 PM, Rolf Turner wrote: > > On 03/11/11 20:37, Rolf Turner wrote: > >> I have just installed R version 2.14.0 and tried to re-build and > >> re-check some > >> of the packages that I maintain. > >> > >> I'm getting a warning (in the process of running R CMD check on my > >> "deldir" > >> > >> package): SNIP > > I received two replies off-line and one that was sent to the list. The > > gist of the > > off-line replies was ``you need to install some fonts'', and (unusually, > > and very > > kindly! :-) ) the responders told me *how* to do so. > > > > > The third message (which *was* posted to the list) was from Brian Diggs, > > > > who said that he was having a related problem: > >> The error on R CMD check I get is: > >> > >> * checking PDF version of manual ... WARNING > >> LaTeX errors when creating PDF version. > >> This typically indicates Rd problems. > >> LaTeX errors found: > >> !pdfTeX error: pdflatex.EXE (file ec-inconsolata): Font ec-inconsolata > >> at 540 n > >> ot found > >> ==> Fatal error occurred, no output PDF file produced! SNIP > I thought that making sure the proper latex packages were installed > would be sufficient, but apparently not. I found the actual font at > http://www.levien.com/type/myfonts/Inconsolata.otf, downloaded it, and > installed it. 
It shows up in my list of fonts available (as Inconsolata > Medium) via the control panel. However, I still get the same error from > R CMD check. I tried rebuilding the TeX filename database, but that > didn't help. I've searched my drive for any mention of inconsolata, and > the hits are: > > ./Program Files (x86)/MiKTeX 2.9/doc/fonts/inconsolata > ./Program Files (x86)/MiKTeX 2.9/doc/fonts/inconsolata/inconsolata.pdf > ./Program Files (x86)/MiKTeX 2.9/fonts/enc/dvips/inconsolata > ./Program Files (x86)/MiKTeX 2.9/fonts/map/dvips/inconsolata > ./Program Files (x86)/MiKTeX 2.9/fonts/opentype/public/inconsolata > ./Program Files (x86)/MiKTeX 2.9/fonts/sfd/inconsolata > ./Program Files (x86)/MiKTeX 2.9/fonts/tfm/public/inconsolata > ./Program Files (x86)/MiKTeX > 2.9/fonts/tfm/public/inconsolata/ec-inconsolata.tfm > ./Program Files (x86)/MiKTeX > 2.9/fonts/tfm/public/inconsolata/ei1-inconsolata.tfm > ./Program Files (x86)/MiKTeX > 2.9/fonts/tfm/public/inconsolata/qx-inconsolata.tfm > ./Program Files (x86)/MiKTeX > 2.9/fonts/tfm/public/inconsolata/rm-inconsolata.tfm > ./Program Files (x86)/MiKTeX > 2.9/fonts/tfm/public/inconsolata/texnansi-inconsolata.tfm > ./Program Files (x86)/MiKTeX > 2.9/fonts/tfm/public/inconsolata/ts1-inconsolata.tfm > ./Program Files (x86)/MiKTeX 2.9/fonts/type1/public/inconsolata > ./Program Files (x86)/MiKTeX 2.9/source/inconsolata-src.tar.bz2 > ./Program Files (x86)/MiKTeX 2.9/tex/latex/inconsolata > ./Program Files (x86)/MiKTeX 2.9/tex/latex/inconsolata/inconsolata.sty > ./Program Files (x86)/MiKTeX 2.9/tpm/packages/inconsolata.tpm > > Which seems to confirm that TeX knows and has the font. > > I would appreciate any further assistance. > SNIP Brian: When I first tried an R CMD check (or it was probably an R CMD INSTALL --build) using R-2.14.0 on a Windows system with MiKTeX 2.8, MiKTeX went ahead and "automagically" installed the required components from a CTAN repository (which I had to select on the way). 
Do you mind sending me (off-list) your package so I can check it with my (very old Pentium III, Win XP) system? [I see you are running a 64-bit system.] Hope this helps, Ray Brownrigg From diggsb at ohsu.edu Fri Nov 4 00:30:39 2011 From: diggsb at ohsu.edu (Brian Diggs) Date: Thu, 3 Nov 2011 16:30:39 -0700 Subject: [R] Problem with R CMD check and the inconsolata font business. In-Reply-To: <4EB31606.7000702@ohsu.edu> References: <4EB244BE.7010905@xtra.co.nz> <4EB307C0.9080602@xtra.co.nz> <4EB31606.7000702@ohsu.edu> Message-ID: <4EB3241F.2060608@ohsu.edu> On 11/3/2011 3:30 PM, Brian Diggs wrote: >>> The error on R CMD check I get is: >>> >>> * checking PDF version of manual ... WARNING >>> LaTeX errors when creating PDF version. >>> This typically indicates Rd problems. >>> LaTeX errors found: >>> !pdfTeX error: pdflatex.EXE (file ec-inconsolata): Font ec-inconsolata >>> at 540 n >>> ot found >>> ==> Fatal error occurred, no output PDF file produced! >> >> I think this is the same as/similar to the problem I had before I >> installed >> the inconsolata (LaTeX) package from CTAN. Since Brian does Windoze I >> cannot advise as to how to install the package on his system, but I >> presume >> it is not too difficult --- once you know how! :-) > > I thought that making sure the proper latex packages were installed > would be sufficient, but apparently not. I found the actual font at > http://www.levien.com/type/myfonts/Inconsolata.otf, downloaded it, and > installed it. It shows up in my list of fonts available (as Inconsolata > Medium) via the control panel. However, I still get the same error from > R CMD check. I tried rebuilding the TeX filename database, but that > didn't help. I've searched my drive for any mention of inconsolata, and > the hits are: > Well, I figured it out. Or at least got it working. I had to run initexmf --mkmaps because apparently there was something wrong with my font mappings. I don't know why; I don't know how. But it works now. 
I think installing the font into the Windows Font directory was not necessary. I'm including the solution in case anyone else has this problem. -- Brian S. Diggs, PhD Senior Research Associate, Department of Surgery Oregon Health & Science University From this.is.mvw at gmail.com Fri Nov 4 00:49:06 2011 From: this.is.mvw at gmail.com (Mike Williamson) Date: Thu, 3 Nov 2011 16:49:06 -0700 Subject: [R] any updates w.r.t. lapply, sapply, apply retaining classes In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From rmh at temple.edu Fri Nov 4 01:00:43 2011 From: rmh at temple.edu (Richard M. Heiberger) Date: Thu, 3 Nov 2011 20:00:43 -0400 Subject: [R] any updates w.r.t. lapply, sapply, apply retaining classes In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From hadley at rice.edu Fri Nov 4 01:33:53 2011 From: hadley at rice.edu (Hadley Wickham) Date: Fri, 4 Nov 2011 13:33:53 +1300 Subject: [R] any updates w.r.t. lapply, sapply, apply retaining classes In-Reply-To: References: Message-ID: > In the example I give above, the impact might seem small, but the > implications are *huge*. This means that I am, in effect, not allowed to > use *any* of the vectoring functions in 'R', which avoid performing loops > thereby speeding up process time extraordinarily. Many can sympathize that > 'R' is ridiculously slow with "for" loops, compared to other languages. > But that's theoretically OK, a good statistician or data analyst should be > able to work comfortably with matrices and vectors. Two comments: * sapply is generally only _slightly_ faster than a for loop * it's almost always better to use vapply instead of sapply. But I agree that simplify2array should be a generic so that you can write custom methods to support new classes.
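A minimal sketch of the vapply point, assuming the difftime vector `test` from the original post in this thread. Indexing with test[i] inside the function keeps the class of each element, and vapply checks the type and length of every result. The simplified return value is still a plain vector, so the class has to be restored by hand when it is needed. The doubling step is only an illustration, not something proposed in the thread.

```r
# difftime vector from the original post
test <- as.difftime(c(1, 1, 8, 0.25, 8, 1.25), units = "days")

# Iterate over indices so each element is still a difftime inside the
# function; vapply() errors if a result is not a single character string.
elem_class <- vapply(seq_along(test), function(i) class(test[i]), character(1))

# Simplification returns a plain vector, so the class is lost here.
plain <- sapply(test, identity)

# Restore the class explicitly after a numeric computation
# (the doubling is purely illustrative).
doubled <- as.difftime(
  vapply(seq_along(test), function(i) 2 * as.numeric(test[i]), numeric(1)),
  units = units(test)
)
```

Restoring the class by hand is the same manual conversion step described elsewhere in the thread; vapply only adds a check that each result has the declared type and length.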
Hadley -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/ From hadley at rice.edu Fri Nov 4 01:36:07 2011 From: hadley at rice.edu (Hadley Wickham) Date: Fri, 4 Nov 2011 13:36:07 +1300 Subject: [R] any updates w.r.t. lapply, sapply, apply retaining classes In-Reply-To: References: Message-ID: > I agree that it is non-trivial to solve the cases you & I have posed. > However, I would wholeheartedly support having an error spit back for any > function that does not explicitly support a class. In this case, if I > attempt to do sapply(x, class), and 'x' is of class "difftime", then I > should receive an error "sapply cannot function upon class 'difftime' ". > Why do I take this stance? There are at least 2 strong reasons: I don't see why that command should be a problem because class() returns a string. A better example might be sapply(x, identity) which in general you would hope to be identical to x: x <- structure(1:10, class = "blah") identical(x, sapply(x, identity)) # [1] FALSE Hadley -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/ From jwiley.psych at gmail.com Fri Nov 4 02:20:19 2011 From: jwiley.psych at gmail.com (Joshua Wiley) Date: Thu, 3 Nov 2011 18:20:19 -0700 Subject: [R] any updates w.r.t. lapply, sapply, apply retaining classes In-Reply-To: References: Message-ID: Hi Mike, I definitely understand your point. I don't have any particularly good ideas, though I think you might like S4, which is the newer formal class/methods system. As a note, I misspoke (or miswrote) that difftime inherits from numeric---the mode is numeric, but does not inherit. Cheers, Josh On Thu, Nov 3, 2011 at 4:49 PM, Mike Williamson wrote: > Hi Joshua, > Thank you for the input! > I agree that it is non-trivial to solve the cases you & I have posed.
> However, I would wholeheartedly support having an error spit back for any > function that does not explicitly support a class. In this case, if I > attempt to do sapply(x, class), and 'x' is of class "difftime", then I > should receive an error "sapply cannot function upon class 'difftime' ". > Why do I take this stance? There are at least 2 strong reasons: > > Most importantly, an incorrect answer is far more dangerous than no answer. > E.g., if I ask "what is 3 + 3?", I would far prefer to receive "I don't > know" than "5". The former lets me know I need to choose another path, the > latter mistakenly makes me think I have an answer, when I do not, and I > continue with analyses on the assumption that answer is correct. In the > case of dates, this happens often. E.g., is the numeric that is returned > from sapply, for instance, the # of seconds since 1970-01-01, or the number > of days since 1970-01-01. This depends upon how 'R' internally attempts to > fix any incongruities. > But also very significantly, an error will get me in the habit of avoiding > any marginalized class types. I keep thinking, for instance, that I can use > the "Dates" class, since 'R' says that it supports them. But if I got into > the habit of converting all dates into numerics myself beforehand (maybe > counting the number of seconds from 1970-01-01, since that seems a magic > date), then I would be guaranteed that a function will either (a) cause an > error (e.g., if I try a character function on it), or (b) function properly. > However, since I don't overtly receive errors when attempting to use dates > (or difftimes, or factors, or whatever), I keep using them, instead of > relying solely upon the true & trusted classes. > > the trickiest here is really factors. Factors are, by most accounts, > considered a core class. In some cases, you can only use factors. E.g., > when you want some sort of ordinal categorical variable.
Therefore, the > fact that factors also barf similarly to other classes like difftime (albeit > much more rarely), is especially dangerous. > > There are, of course, habits that we can create to make ourselves better > programmers, and I will recognize that I can improve. However, this issue > of functions generating "wrong" answers is such a huge problem with 'R', and > other languages are catching up to 'R' so quickly, as far as their > capability to handle higher level math, that this issue is making 'R' a less > desirable language to use, as time progresses. I don't mean to claim that > my opinion is the end-all-be-all, but I would like to hear others chime in, > whether this is a large concern, or whether there is a very small minority > of folks impacted by it. > Regards, > Mike > > --- > XKCD > > > > On Thu, Nov 3, 2011 at 2:51 PM, Joshua Wiley wrote: >> >> Hi Mike, >> >> This isn't really an answer to your question, but perhaps will serve >> to continue discussion. I think that there are some fundamental >> issues when working special classes. As a thought example, suppose I >> wrote a class, "posreal", which inherits from the numeric class. It >> is only valid for positive, real numbers. I use it in a package, but >> do not develop methods for it. A user comes along and creates a >> vector, x that is a posreal. Then tries: mean(x * -3). Since I never >> bothered to write a special method for mean for my class, R falls back >> to the inherited numeric, but gives a value that is clearly not valid >> for posreal. What should happen? S3 methods do not really have >> validation, so in principle, one could write a function like: >> >> f <- function(x) { >> vclass <- class(x) >> res <- mean(x) >> class(res) <- vclass >> return(res) >> } >> >> which "retains" the appropriate class, but in name only.
R core >> cannot possibly know or imagine all classes that may be written that >> inherit from more basic types but with possible special aspects and >> requirements. I think the inherited is considered to be more generic >> and that is returned. It is usually up to the user to ensure that the >> function (whose methods were not specific to that special class but >> the inherited) is valid for that class and can manually convert it >> back: >> >> res <- as.posreal(res) >> >> What about lapply and sapply? Neither are generic or have methods for >> difftime, and so do some unexpected/desirable things. Again, without >> methods defined for a particular class, they cannot know what is >> special or appropriate way to handle it, they use defaults which >> sometimes work but may give unexpected or undesirable results, but >> what else can be done? (okay, they could just throw an error) If a >> function is naive about a class, it does not seem right to operate on >> it using unknown methods and then pretend to be returning the same >> type of data. As it stands, they convert to a data type they know and >> return that. >> >> Now, you mention that for loops are slow in R, and this is true to a >> degree. However, the *apply functions are basically just internal >> loops, so they do not really save you (they are certainly not >> vectorized!), though they are more elegant than explicit loops IMO. >> One way to use them while retaining class would be like: >> >> sapply(seq_along(test), function(i) class(test[i])) >> >> this is less efficient then sapply(test, class), but the overhead >> drops considerably as the function does nontrivial calculations. >> Finally, I find the (relatively) new compiler package really shines at >> making functions that are just wrappers for for loops more efficient.
>> Take a look at the examples from: >> >> require(compiler) >> ?cmpfun >> >> I am not familiar with numPy so I do not know how it handles new >> classes, but with some tweaks to my workflow, I do not find myself >> running into problems with how R handles them. I definitely >> appreciate your position because I have been there...as I became more >> familiar with R, classes, and methods, I find I work in a way that >> avoids passing objects to functions that do not know how to handle >> them properly. >> >> Cheers, >> >> Josh >> >> >> On Thu, Nov 3, 2011 at 11:08 AM, Mike Williamson >> wrote: >> > Hi All, >> > >> > I don't have a "I need help" question, so much as a query into any >> > update whether 'R' has made any progress with some of the core functions >> > retaining classes. As an example, because it's one of the cases that >> > most >> > egregiously impacts me & my work and keeps pushing me away from 'R' and >> > into other numerical languages (such as NumPy in python), I will use >> > sapply >> > / lapply to demonstrate, but this behavior is ubiquitous throughout 'R'. >> > >> > Let's say I have a class which is theoretically supported, but not >> > one >> > of the core "numeric" or "character" classes (and, to some degree, >> > "factor" >> > classes). Many of the basic functions will convert my desired class >> > into >> > either numeric or character, so that my returned answer is gibberish. >> > >> > E.g.: >> > >> > test= as.difftime(c(1, 1, 8, 0.25, 8, 1.25), units= "days") ## create a >> > small array of time differences >> > class(test) ## this will return the proper class, "difftime" >> > class(test[1] ) ## this will also return the proper class, "difftime" >> > sapply(test, class) ## this will return *numerics* for all of the >> > classes. >> > Ack!! >> > >> > In the example I give above, the impact might seem small, but the >> > implications are *huge*.
This means that I am, in effect, not allowed >> > to >> > use *any* of the vectoring functions in 'R', which avoid performing >> > loops >> > thereby speeding up process time extraordinarily. Many can sympathize >> > that >> > 'R' is ridiculously slow with "for" loops, compared to other languages. >> > But that's theoretically OK, a good statistician or data analyst should >> > be >> > able to work comfortably with matrices and vectors. However, *'R' >> > cannot >> > work comfortably* with matrices or vectors, *unless* they are using the >> > numeric or character classes. Many of the classes suffer the problem I >> > just described, although I only used "difftime" in the example. Factors >> > seem a bit more "comfortable", and can be handled most of the time, but >> > not >> > as well as numerics, and at times functions working on factors can >> > return >> > the numerical representation of the factor instead of the original >> > factor. >> > >> > Is there any progress in guaranteeing that all core functions either >> > (a) ideally return exactly the classes, and hierarchy of classes, that >> > they >> > received (e.g., a list of data frames with difftimes & dates & >> > characters >> > would return a list of data frames with difftimes & dates & characters), >> > or >> > (b) barring that, the function should at least error out with a clear >> > error >> > explaining that sapply, for example, cannot vectorize on the class being >> > used? Returning incorrect answers is far worse than returning an error, >> > from a perspective of stability. >> > >> > This is, by far, the largest Achilles' heel to 'R'. Personally, as >> > my >> > career advances and I work on more technical things, I am finding that I >> > have to leave 'R' by the wayside and use other languages for robust >> > numerical calculations and programming. This saddens me, because there >> > are >> > so many wonderful packages developed by the community.
The example >> > above >> > came up because I am using the "forecast" library to great effect in >> > predicting how long our product cycle time will be. However, I spend >> > much >> > of my time fighting all these class & typing bugs in 'R' (and we have to >> > start recognizing that they are bugs, otherwise they may never get >> > resolved), such that many of the improvements in my productivity due to >> > all >> > the wonderful computational packages are entirely offset by the time >> > I spend fighting this issue of poor classes. >> > >> > Thanks & Regards! >> > Mike >> > >> > --- >> > XKCD >> > >> > [[alternative HTML version deleted]] >> > >> > ______________________________________________ >> > R-help at r-project.org mailing list >> > https://stat.ethz.ch/mailman/listinfo/r-help >> > PLEASE do read the posting guide >> > http://www.R-project.org/posting-guide.html >> > and provide commented, minimal, self-contained, reproducible code. >> > >> >> >> >> -- >> Joshua Wiley >> Ph.D. Student, Health Psychology >> Programmer Analyst II, ATS Statistical Consulting Group >> University of California, Los Angeles >> https://joshuawiley.com/ > > -- Joshua Wiley Ph.D. Student, Health Psychology Programmer Analyst II, ATS Statistical Consulting Group University of California, Los Angeles https://joshuawiley.com/ From yanchangzhao at gmail.com Fri Nov 4 02:28:10 2011 From: yanchangzhao at gmail.com (Yanchang Zhao) Date: Fri, 4 Nov 2011 12:28:10 +1100 Subject: [R] Help: stemming and stem completion with package tm in R Message-ID: <69BB5AF3-4131-4792-92C6-17FFEE0A7E60@gmail.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From michael.weylandt at gmail.com Fri Nov 4 03:32:52 2011 From: michael.weylandt at gmail.com (R. Michael Weylandt) Date: Thu, 3 Nov 2011 22:32:52 -0400 Subject: [R] Why can't this function be used with the 'by' command?
In-Reply-To: References: Message-ID: I believe it has to do with passing multiple columns to the shapiro.test. Note that by(x[,1:2], x$group, function(x) shapiro.test(x)[[2]]) doesn't work but by(x[,1:2], x$group, function(x) shapiro.test(x[,1])[[2]]) does. Michael On Thu, Nov 3, 2011 at 8:08 AM, Kaiyin Zhong wrote: > Why can't this function be used with the 'by' command? > >> x = array(runif(16), dim=c(8,2)) >> x = data.frame(x) >> x$group = rep(c('wt', 'app'), each=4) > >> shapiro.p = function(x) shapiro.test(x)[[2]] >> apply(x[,1:2], 2, shapiro.p) > X1 X2 > 0.4126345 0.2208781 > >> by(x[,1:2], x$group, shapiro.p) > Error in `[.data.frame`(x, complete.cases(x)) : > undefined columns selected > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From michael.weylandt at gmail.com Fri Nov 4 03:36:17 2011 From: michael.weylandt at gmail.com (R. Michael Weylandt) Date: Thu, 3 Nov 2011 22:36:17 -0400 Subject: [R] Histograms in R In-Reply-To: <007001cc9a71$01e8d5c0$05ba8140$@edu> References: <1320321796661-3985397.post@n4.nabble.com> <51D703BF-FB00-434F-A2A3-E4ABAA7BCB10@gmail.com> <007001cc9a71$01e8d5c0$05ba8140$@edu> Message-ID: You're absolutely right. Was thinking exponential for some reason.... The rootogram is quite nice; I've never seen one before. Thanks! Michael On Thu, Nov 3, 2011 at 5:39 PM, David L Carlson wrote: > The lines() command doesn't work and histogram combines categories unless > you specify the number.
How about a barplot > > Lam <- 3 > X <- table(rpois(500, Lam)) > Max <- length(X)-1 > barplot(rbind(X, 500*dpois(0:Max, Lam)), beside=TRUE, > legend.text=c("Observed", "Expected")) > > or a rootogram > library(vcd) > rootogram(X, dpois(0:Max, Lam)*500) > > > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On > Behalf Of R. Michael Weylandt > Sent: Thursday, November 03, 2011 2:55 PM > To: kerry1912 > Cc: r-help at r-project.org > Subject: Re: [R] Histograms in R > > Try something like this > > Lam <- 3 > X <- rpois(500, Lam) > hist(X, freq = F) > x <- seq(min(X), max(X), length = 500) > lines(x, dpois(x, Lam), col=2) > > Adapt as necessary > > Michael > > On Nov 3, 2011, at 8:03 AM, kerry1912 wrote: > >> We have a histogram of our observed response and we want to overlay the >> corresponding poisson distribution with respect to our poisson model. >> >> >> -- >> View this message in context: > http://r.789695.n4.nabble.com/Histograms-in-R-tp3985397p3985397.html >> Sent from the R help mailing list archive at Nabble.com. >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > From michael.weylandt at gmail.com Fri Nov 4 03:39:21 2011 From: michael.weylandt at gmail.com (R. 
Michael Weylandt) Date: Thu, 3 Nov 2011 22:39:21 -0400 Subject: [R] Plotting skewed normal distribution with a bar plot In-Reply-To: References: Message-ID: It seems like you'll need to apply some sort of MLE to estimate the parameters directly from the data before using dsn() to get the density. This might help with some of it: http://help.rmetrics.org/fGarch/html/snorm.html Michael On Thu, Nov 3, 2011 at 2:54 PM, wrote: > > Hi, > > I need to create a plot (type = "h") and then overlay a skewed-normal > curve on this distribution, but I'm not finding a procedure to accomplish > this. I want to use the plot function here in order to control the bin > distributions. > > I have explored the sn library and found the dsn function. dsn uses known > location, scaling and shape parameters associated with a given input vector > of probabilities. However, how can I calculate the skewed-normal curve if > I don't know these parameters in advance? > > Is there another function to calculate the skew-normal, perhaps in a > different package? > > > I'm working with R 2.13.2 on a windows based machine. > > Steve Friedman Ph. D. > Ecologist / Spatial Statistical Analyst > Everglades and Dry Tortugas National Park > 950 N Krome Ave (3rd Floor) > Homestead, Florida 33034 > > Steve_Friedman at nps.gov > Office (305) 224 - 4282 > Fax (305) 224 - 4147 > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From jvadams at usgs.gov Thu Nov 3 22:42:28 2011 From: jvadams at usgs.gov (Jean V Adams) Date: Thu, 3 Nov 2011 16:42:28 -0500 Subject: [R] Select columns of a data.frame by name OR index in a function In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed...
Name: not available URL: From ROBERT.LANCASTER at orbitz.com Thu Nov 3 20:55:14 2011 From: ROBERT.LANCASTER at orbitz.com (Lancaster, Robert (Orbitz)) Date: Thu, 3 Nov 2011 14:55:14 -0500 Subject: [R] Kaplan Meier - not for dates In-Reply-To: <20111031183318.CA721449A20@nhs-pd1e-esg105.ad1.nhs.net> References: <20111031183318.CA721449A20@nhs-pd1e-esg105.ad1.nhs.net> Message-ID: I think it really depends on what your event of interest is. If your event is that the patient got better and "left treatment" then I think this could work. You would have to mark as censored any patient still in treatment or any patient that stopped treatment w/o getting better (e.g. in the case of death). You would then be predicting the cost required to make the patient well enough to leave treatment. It is a little non-standard to use $ instead of time, but time is money after all. You could set up your data frame with two columns: 1) cost 2) event/censored. Then create your survival object: mySurv = Surv(my_data$cost,my_data$event) And then use survfit to create your KM curves: myFit = survfit(mySurv~NULL) If you have other explanatory variables that you think may influence the cost, you can of course add them to your data frame and change the formula you use in survfit. For instance, you could have some severity measure, e.g. High, Medium, Low. You could then do: myFit = survfit(mySurv~my_data$severity) -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Polwart Calum (COUNTY DURHAM AND DARLINGTON NHS FOUNDATION TRUST) Sent: Monday, October 31, 2011 1:29 PM To: r-help at r-project.org Subject: [R] Kaplan Meier - not for dates I have some data which is censored and I want to determine the median. Its actually cost data for a cohort of patients, many of whom are still on treatment and so are censored. I can do the same sort of analysis for a survival curve and get the median survival... 
...but can I just use the survival curve functions to plot an X axis that is $ rather than date? If not is there some other way to achieve this? Thanks Calum ******************************************************************************************************************** This message may contain confidential information. If yo...{{dropped:9}} From jvadams at usgs.gov Thu Nov 3 22:54:26 2011 From: jvadams at usgs.gov (Jean V Adams) Date: Thu, 3 Nov 2011 16:54:26 -0500 Subject: [R] query about counting rows of a dataframe In-Reply-To: <631F8C7792124941838E6850D3A7802B032455D1DA6A@ERMES.regionemarche.intra> References: <631F8C7792124941838E6850D3A7802B032455D1DA6A@ERMES.regionemarche.intra> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From flokke at live.de Thu Nov 3 23:00:25 2011 From: flokke at live.de (flokke) Date: Thu, 3 Nov 2011 15:00:25 -0700 (PDT) Subject: [R] problem with merging two matrices In-Reply-To: References: <1320266770417-3983136.post@n4.nabble.com> <1320337915950-3986638.post@n4.nabble.com> Message-ID: <1320357625463-3988319.post@n4.nabble.com> Thanks, that helps a lot! -- View this message in context: http://r.789695.n4.nabble.com/problem-with-merging-two-matrices-tp3983136p3988319.html Sent from the R help mailing list archive at Nabble.com. From debmidya at yahoo.com Thu Nov 3 23:05:23 2011 From: debmidya at yahoo.com (Deb Midya) Date: Thu, 3 Nov 2011 15:05:23 -0700 (PDT) Subject: [R] Extract Data from Yahoo Finance In-Reply-To: References: <1320292966.73541.YahooMailNeo@web161421.mail.bf1.yahoo.com> Message-ID: <1320357923.20842.YahooMailNeo@web161406.mail.bf1.yahoo.com> An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From peter.langfelder at gmail.com Fri Nov 4 00:08:31 2011 From: peter.langfelder at gmail.com (plangfelder) Date: Thu, 3 Nov 2011 16:08:31 -0700 (PDT) Subject: [R] Grouping clusters from dendrograms In-Reply-To: <1281112509810-2316521.post@n4.nabble.com> References: <1281112509810-2316521.post@n4.nabble.com> Message-ID: <1320361711268-3988526.post@n4.nabble.com> Hi Julia, sorry for the very late reply, your original email was posted while I was on hiatus from R-help. I'm the author of the dynamicTreeCut package. I recommend that you try using the "hybrid" method using the cutreeDynamic function. What you observed is a known problem of the tree method (which, by the way, was the reason I developed the Hybrid method). Using the hybrid method is simple, for example as cut2<-cutreeDynamic(dendro,distM = combo2, maxTreeHeight=1,deepSplit=2,minModuleSize=1) You can play with the argument deepSplit to obtain finer or coarser modules. HTH, Peter -- View this message in context: http://r.789695.n4.nabble.com/Grouping-clusters-from-dendrograms-tp2316521p3988526.html Sent from the R help mailing list archive at Nabble.com. From s.karmv at gmail.com Fri Nov 4 00:09:36 2011 From: s.karmv at gmail.com (uka) Date: Thu, 3 Nov 2011 16:09:36 -0700 (PDT) Subject: [R] how to count number of occurrences In-Reply-To: References: Message-ID: <1320361776236-3988529.post@n4.nabble.com> This was very helpful. Thank you very much. Just one question, I notice that it does not count the number of X's before the first Y. I want the result be 1 4 0 0 0 5 0 0 0 0. I tried combining this output with the first value of rle output, but realized that rle doesn't give me the 0s. So, if my first observation was Y, then I want it to show that there are 0 Xs before that. Thank you again. -- View this message in context: http://r.789695.n4.nabble.com/how-to-count-number-of-occurrences-tp3979546p3988529.html Sent from the R help mailing list archive at Nabble.com. 
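A minimal sketch of one way to get the leading count and the zeros asked about in the last message above, assuming a hypothetical character vector `v` of "X" and "Y" values (the poster's data is not shown, so `v` is made up to reproduce the result they describe). The idea is to take the positions of the Y's and difference them, with a 0 prepended so that the X's before the first Y are counted as well.

```r
# Hypothetical input: the original data is not shown in the thread.
v <- c("X", "Y", rep("X", 4), rep("Y", 4), rep("X", 5), rep("Y", 5))

# Positions of the Y's in the vector.
y_pos <- which(v == "Y")

# Gap between consecutive Y positions, minus one, is the number of X's
# immediately before each Y; prepending 0 covers the first Y.
x_before <- diff(c(0, y_pos)) - 1
x_before
```

If the first observation is a Y, its position is 1 and the first count comes out as 0, which is the behaviour requested in the message.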
From jfrabetti at sdsc.edu Fri Nov 4 01:06:27 2011 From: jfrabetti at sdsc.edu (Jo Frabetti) Date: Fri, 4 Nov 2011 00:06:27 +0000 Subject: [R] How to use 'prcomp' with CLUSPLOT? Message-ID: <47CA6EA672AF4A4DA62A7DCCFEDC9ABB0BB34CF0@XMAIL-MBX-AH1.AD.UCSD.EDU> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From jfrabetti at sdsc.edu Fri Nov 4 01:10:19 2011 From: jfrabetti at sdsc.edu (Jo Frabetti) Date: Fri, 4 Nov 2011 00:10:19 +0000 Subject: [R] How to use 'prcomp' with CLUSPLOT? Message-ID: <47CA6EA672AF4A4DA62A7DCCFEDC9ABB0BB34D18@XMAIL-MBX-AH1.AD.UCSD.EDU> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From josh.m.ulrich at gmail.com Fri Nov 4 06:12:55 2011 From: josh.m.ulrich at gmail.com (Joshua Ulrich) Date: Fri, 4 Nov 2011 00:12:55 -0500 Subject: [R] Extract Data from Yahoo Finance In-Reply-To: <1320357923.20842.YahooMailNeo@web161406.mail.bf1.yahoo.com> References: <1320292966.73541.YahooMailNeo@web161421.mail.bf1.yahoo.com> <1320357923.20842.YahooMailNeo@web161406.mail.bf1.yahoo.com> Message-ID: Deb, See getQuote in the quantmod package. For example: getQuote("SPY") Be sure to read ?getQuote. Best, -- Joshua Ulrich | FOSS Trading: www.fosstrading.com On Thu, Nov 3, 2011 at 5:05 PM, Deb Midya wrote: > Michael, > > Thanks for your response. > > The link to the page is: http://www.gummy-stuff.org/Yahoo-data.htm > > I like to download the fields (mentioned under special tags) for a period of time and for a particular stock (or for a list of stocks). > > Once again, thank you very much for the time you have given. > > Regards, > > Deb > > From: R.
Michael Weylandt > To: Deb Midya > Cc: "r-help at r-project.org" > Sent: Friday, 4 November 2011 12:13 AM > Subject: Re: [R] Extract Data from Yahoo Finance > > The quantmod package can probably do what you are asking, but it's a > little hard to be certain since you provide neither a list of all the > fields you are actually talking about nor a link to the page with the > fields in question. > > Michael > > On Thu, Nov 3, 2011 at 12:02 AM, Deb Midya wrote: >> Hi R users, >> >> I am using R-2.14.0 on Windows XP. >> >> May I request you to assist me for the following please. >> >> I like to extract all the fields (example: a : Ask, b : Bid, ..., w : 52-week Range, x: Stock Exchange) for certain period of time, say, 1 October 2011 to 31 October 2011. >> >> Is there any R-Package(s) & any R- script please? >> >> Once again, thank you very much for the time you have given. >> >> Regards, >> >> Deb >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > [[alternative HTML version deleted]] > > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > From ajit.aher at cedar-consulting.com Fri Nov 4 07:24:47 2011 From: ajit.aher at cedar-consulting.com (Aher) Date: Thu, 3 Nov 2011 23:24:47 -0700 (PDT) Subject: [R] Reading parameters from dataframe and loading as objects Message-ID: <1320387887417-3989150.post@n4.nabble.com> Hi List, I want to read several parameters from data frame and load them as object into R session, Is there any package or function in R for this??
Here is an example:

param <- c("clust_num", "minsamp_size", "maxsamp_size", "min_pct", "max_pct")
value <- c(15, 20000, 200000, 0.001, .999)
data <- data.frame(cbind(param, value))
data
         param value
1    clust_num    15
2 minsamp_size 20000
3 maxsamp_size 2e+05
4      min_pct 0.001
5      max_pct 0.999

My data contains many such parameters. I need to read each parameter and its value from the data and load it as an object in the R session, as below:

clust_num <- 15
minsamp_size <- 20000
maxsamp_size <- 2e+05
min_pct <- 0.001
max_pct <- 0.999

Right now I am doing this by creating as many variables as there are parameters in the data frame, with one observation holding the value of each parameter. Example:

clust_num minsamp_size maxsamp_size min_pct max_pct
       15        20000       200000   0.001   0.999

data$clust_num, data$minsamp_size, .....

Is there any better way of doing this?

--
View this message in context: http://r.789695.n4.nabble.com/Reading-parameters-from-dataframe-and-loading-as-objects-tp3989150p3989150.html
Sent from the R help mailing list archive at Nabble.com.

From aajit75 at yahoo.co.in Fri Nov 4 07:36:33 2011
From: aajit75 at yahoo.co.in (aajit75)
Date: Thu, 3 Nov 2011 23:36:33 -0700 (PDT)
Subject: [R] Decision tree model using rpart ( classification
Message-ID: <1320388593513-3989162.post@n4.nabble.com>

Hi Experts,

I am new to R and am using a decision tree model to get segmentation rules.

A) Using behavioural data (attributes defining customer behaviour, for example balances, number of accounts etc.)
1. Clustering: Cluster behavioural data to a suitable number of clusters
2. Decision Tree: Using rpart classification tree for generating rules for segmentation using cluster number (cluster id) as target variable and variables from behavioural data as input variables.

B) Using profile data (customers' demographic data)
1. Clustering: Cluster profile data to a suitable number of clusters
2.
Decision Tree: Using rpart classification tree for generating rules for segmentation using cluster number(cluster id) as target variable and variables from profile data as input variables. C) Using profile data (customers demographic data ) and deciles created based on behaviour 1. Deciles: Deciles customers to 10 groups based on some behavioural data 2. Decision Tree: Using rpart classification for generating rules for segmentation using Deciles as target variable and variables from profile data as input variables. In first two cases A and B decision tree model using rpart finish the execution in a minute or two, But in third case (C) it continues to run for infinite amount of time( monitored and running even after 14 hours). fit <- rpart(decile ~., method="class", data=dtm_ip) Is there anything wrong with my approach? Thanks for the help in advance. -Ajit -- View this message in context: http://r.789695.n4.nabble.com/Decision-tree-model-using-rpart-classification-tp3989162p3989162.html Sent from the R help mailing list archive at Nabble.com. From tal.galili at gmail.com Fri Nov 4 08:37:29 2011 From: tal.galili at gmail.com (Tal Galili) Date: Fri, 4 Nov 2011 09:37:29 +0200 Subject: [R] Decision tree model using rpart ( classification In-Reply-To: <1320388593513-3989162.post@n4.nabble.com> References: <1320388593513-3989162.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From ericlecoutre at gmail.com Fri Nov 4 08:52:58 2011 From: ericlecoutre at gmail.com (Eric Lecoutre) Date: Fri, 4 Nov 2011 08:52:58 +0100 Subject: [R] Reading parameters from dataframe and loading as objects In-Reply-To: <1320387887417-3989150.post@n4.nabble.com> References: <1320387887417-3989150.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From NordlDJ at dshs.wa.gov Fri Nov 4 08:52:58 2011 From: NordlDJ at dshs.wa.gov (Nordlund, Dan (DSHS/RDA)) Date: Fri, 4 Nov 2011 00:52:58 -0700 Subject: [R] how to count number of occurrences In-Reply-To: <1320361776236-3988529.post@n4.nabble.com> References: <1320361776236-3988529.post@n4.nabble.com> Message-ID: <941871A13165C2418EC144ACB212BDB0021B2065@dshsmxoly1504g.dshs.wa.lcl> > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r- > project.org] On Behalf Of uka > Sent: Thursday, November 03, 2011 4:10 PM > To: r-help at r-project.org > Subject: Re: [R] how to count number of occurrences > > This was very helpful. Thank you very much. Just one question, I notice > that > it does not count the number of X's before the first Y. I want the > result be > 1 4 0 0 0 5 0 0 0 0. I tried combining this output with the first value > of > rle output, but realized that rle doesn't give me the 0s. So, if my > first > observation was Y, then I want it to show that there are 0 Xs before > that. > Thank you again. > You should really provide the relevant context from previous posts so that potential helpers don't need to go looking for it. That being said, you could try something like samp <- c("X", "Y", "X", "X", "X", "X", "Y", "Y", "Y", "Y", "X", "X", "X", "X", "X", "Y", "Y", "Y", "Y", "Y") diff(which(c('Y', samp)=='Y'))-1 Hope this is helpful, Dan Daniel J. Nordlund Washington State Department of Social and Health Services Planning, Performance, and Accountability Research and Data Analysis Division Olympia, WA 98504-5204 From simone.salvadei at gmail.com Fri Nov 4 09:18:59 2011 From: simone.salvadei at gmail.com (Simone Salvadei) Date: Fri, 4 Nov 2011 09:18:59 +0100 Subject: [R] array manipulation Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available
URL:

From Kay.Cichini at uibk.ac.at Fri Nov 4 10:35:35 2011
From: Kay.Cichini at uibk.ac.at (Kay Cichini)
Date: Fri, 4 Nov 2011 02:35:35 -0700 (PDT)
Subject: [R] replace double backslash with singel backslash
Message-ID: <1320399335867-3989434.post@n4.nabble.com>

I want to replace \\ with \ in:

str <- "C:\\DOKUME~1\\u0327336\\LOKALE~1\\Temp\\RtmpQ5NJ8X\\TIRIS_PICS\\1_Img.jpg"

and tried:

gsub("\\\\", "\\", str)

but this removes the \\ without replacing them by \

Any help much appreciated,
Kay

-----
------------------------
Kay Cichini
Postgraduate student
Institute of Botany
Univ. of Innsbruck
------------------------

--
View this message in context: http://r.789695.n4.nabble.com/replace-double-backslash-with-singel-backslash-tp3989434p3989434.html
Sent from the R help mailing list archive at Nabble.com.

From aajit75 at yahoo.co.in Fri Nov 4 09:32:07 2011
From: aajit75 at yahoo.co.in (aajit75)
Date: Fri, 4 Nov 2011 01:32:07 -0700 (PDT)
Subject: [R] Decision tree model using rpart ( classification
In-Reply-To:
References: <1320388593513-3989162.post@n4.nabble.com>
Message-ID: <1320395527772-3989320.post@n4.nabble.com>

Hi,

Thanks for the response. The code for each case is:

c_c_factor <- 0.001
min_obs_split <- 80

A)
fit <- rpart(segment ~., method="class", control=rpart.control(minsplit=min_obs_split, cp=c_c_factor), data=Beh_cluster_out)

B)
fit <- rpart(segment ~., method="class", control=rpart.control(minsplit=min_obs_split, cp=c_c_factor), data=profile_cluster_out)

C)
fit <- rpart(decile ~., method="class", control=rpart.control(minsplit=min_obs_split, cp=c_c_factor), data=dtm_ip)

In A and B the target variable 'segment' comes from the clustering data, using the same set of input variables, while in C the target variable 'decile' is derived from behavioural variables and the input variables are from profile data. The number of rows in the input table is the same in all three cases.
Regards,
-Ajit

--
View this message in context: http://r.789695.n4.nabble.com/Decision-tree-model-using-rpart-classification-tp3989162p3989320.html
Sent from the R help mailing list archive at Nabble.com.

From simone.salvadei at gmail.com Fri Nov 4 09:30:09 2011
From: simone.salvadei at gmail.com (Simone Salvadei)
Date: Fri, 4 Nov 2011 09:30:09 +0100
Subject: [R] array manipulation
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From jholtman at gmail.com Fri Nov 4 11:44:04 2011
From: jholtman at gmail.com (Jim Holtman)
Date: Fri, 4 Nov 2011 06:44:04 -0400
Subject: [R] replace double backslash with singel backslash
In-Reply-To: <1320399335867-3989434.post@n4.nabble.com>
References: <1320399335867-3989434.post@n4.nabble.com>
Message-ID: <8D492BEC-A472-4C0D-B387-D9A7B7D0183E@gmail.com>

What is the problem that you are trying to solve? You need the double \\ because the backslash has a special meaning in quoted strings; in this case each \\ represents a single backslash. If you really had a single one, then something like '\n' would be a newline.

Sent from my iPad

On Nov 4, 2011, at 5:35, Kay Cichini wrote:

> I want to replace \\ with \ in:
> str <-
> "C:\\DOKUME~1\\u0327336\\LOKALE~1\\Temp\\RtmpQ5NJ8X\\TIRIS_PICS\\1_Img.jpg"
>
> and tried:
> gsub("\\\\", "\\", str)
>
> but this removes the \\ without replacing them by \
>
> Any help much appreciated,
> Kay
>
> -----
> ------------------------
> Kay Cichini
> Postgraduate student
> Institute of Botany
> Univ. of Innsbruck
> ------------------------
>
> --
> View this message in context: http://r.789695.n4.nabble.com/replace-double-backslash-with-singel-backslash-tp3989434p3989434.html
> Sent from the R help mailing list archive at Nabble.com.
> > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From josh.m.ulrich at gmail.com Fri Nov 4 12:34:28 2011 From: josh.m.ulrich at gmail.com (Joshua Ulrich) Date: Fri, 4 Nov 2011 06:34:28 -0500 Subject: [R] Extract Data from Yahoo Finance In-Reply-To: <1320405008.21134.YahooMailNeo@web161406.mail.bf1.yahoo.com> References: <1320292966.73541.YahooMailNeo@web161421.mail.bf1.yahoo.com> <1320357923.20842.YahooMailNeo@web161406.mail.bf1.yahoo.com> <1320405008.21134.YahooMailNeo@web161406.mail.bf1.yahoo.com> Message-ID: Deb, Sorry, you can't do that with getQuote because Yahoo does not make those data available historically. Generally, you will need to pay for historical bid/ask (tick) data. Best, -- Joshua Ulrich ?| ?FOSS Trading: www.fosstrading.com On Fri, Nov 4, 2011 at 6:10 AM, Deb Midya wrote: > Joshua, > > Thank you very much for your response. > > How can I use getQuote to download data for a stock for a certain period of > time, say, 1 October 2011 to 31 October 2011. > > Once again, thank very much for the time you have given. > > Regards, > > Deb > From: Joshua Ulrich > To: Deb Midya > Cc: R. Michael Weylandt ; "r-help at r-project.org" > > Sent: Friday, 4 November 2011 4:12 PM > Subject: Re: [R] Extract Data from Yahoo Finance > > Deb, > > See getQuote in the quantmod package.? For example: > getQuote("SPY") > > Be sure to read ?getQuote. > > Best, > -- > Joshua Ulrich ?| ?FOSS Trading: www.fosstrading.com > > > > On Thu, Nov 3, 2011 at 5:05 PM, Deb Midya wrote: >> Michael, >> >> Thanks for your response. >> >> The link to the page is: http://www.gummy-stuff.org/Yahoo-data.htm >> >> I like to download the fields (mentioned under special tags) for a period >> of time and for a particular stock (or for a list of stocks). 
>> >> Once again, thank you very much for the time you have given. >> >> Regards, >> >> Deb >> >> From: R. Michael Weylandt >> To: Deb Midya >> Cc: "r-help at r-project.org" >> Sent: Friday, 4 November 2011 12:13 AM >> Subject: Re: [R] Extract Data from Yahoo Finance >> >> The quantmod package can probably do what you are asking, but it's a >> little hard to be certain since you provide neither a list of all the >> fields you are actually talking about nor a link to the page with the >> fields in question. >> >> Michael >> >> On Thu, Nov 3, 2011 at 12:02 AM, Deb Midya wrote: >>> Hi R ?users, >>> >>> I am using R-2.14.0 on Windows XP. >>> >>> May I request you to assist me for the following please. >>> >>> I like to extract all the fields (example: a : Ask, b : Bid, ??, w : >>> 52-week Range, x: Stock Exchange) ?for certain period of time, say, 1 >>> October 2011 to 31 October 2011. >>> >>> Is there any R-Package(s) & any R- script please? >>> >>> Once again, thank you very much for the time you have given. >>> >>> Regards, >>> >>> Deb >>> >>> ______________________________________________ >>> R-help at r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide >>> http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >>> >> ? ? ? ?[[alternative HTML version deleted]] >> >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. 
>>
>>
>
>
>

From Pietro.Parodi at willis.com Fri Nov 4 13:51:58 2011
From: Pietro.Parodi at willis.com (Parodi, Pietro)
Date: Fri, 4 Nov 2011 07:51:58 -0500
Subject: [R] Counting number of common elements between the rows of two different matrices
In-Reply-To:
References:
Message-ID: <0A760D5925AC5E4893BDDDAABC44E29801EF6ED7@USNSH-I-EC80.int.dir.willis.com>

Hello

I'm trying to solve this problem without using a for loop but I have so
far failed to find a solution.

I have two matrices of K columns each, e.g. (K=5), and with numbers of
row N_A and N_B respectively

A =     (1 5 3 8 15;
         2 7 20 11 13;
         12 19 20 21 43)

B =     (2 6 30 8 16;
         3 8 19 11 13)

(the actual matrices have hundreds of thousands of entry, that's why I'm
keen to avoid "for" loops)

And what I need to do is to apply a function which counts the number of
common elements between ANY row of A and ANY row of B, giving a result
like this:

A1 vs B1:  1  # (8 is a common element)
A1 vs B2:  1  # (8 is a common element)
A2 vs B1:  1  # (2 is a common element)
A2 vs B2:  1  # 11, 13 are common elements
Etc.

I've built a function that counts the number of common elements between
two vectors, based on the intersect function in the R manual

common_elements <- function(x,y) length(y[match(x,y,nomatch=0)])

And a double loop who solves my problem would be something like
(pseudo-code)

For(i in 1:N_A){
        for(j in 1:N_B){
                ce(i,j)=common_elements(a(i),b(j))
                }
        }

Is there an efficient, clean way to do the same job and give as an
output a matrix N_A x N_B such as that above?

Thanks a lot for your help

Regards

Pietro

______________________________________________________________________

For information pertaining to Willis' email confidentiality and monitoring policy, usage restrictions, or for specific company registration and regulatory status information, please visit http://www.willis.com/email_trailer.aspx

We are now able to offer our clients an encrypted email capability for secure communication purposes.
If you wish to take advantage of this service or learn more about it, please let me know or contact your Client Advocate for full details. ~W67897 From AEasom at SportingIndex.com Fri Nov 4 14:06:09 2011 From: AEasom at SportingIndex.com (Andre Easom) Date: Fri, 4 Nov 2011 13:06:09 +0000 Subject: [R] Select some, but not all, variables stepwise Message-ID: <5648FEE24754234D80F8E5BBE1A22BE51219BC9C01@GBGH-EXCH-CMS.sig.ads> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From f.harrell at vanderbilt.edu Fri Nov 4 14:18:07 2011 From: f.harrell at vanderbilt.edu (Frank Harrell) Date: Fri, 4 Nov 2011 06:18:07 -0700 (PDT) Subject: [R] Select some, but not all, variables stepwise In-Reply-To: <5648FEE24754234D80F8E5BBE1A22BE51219BC9C01@GBGH-EXCH-CMS.sig.ads> References: <5648FEE24754234D80F8E5BBE1A22BE51219BC9C01@GBGH-EXCH-CMS.sig.ads> Message-ID: <1320412687559-3990026.post@n4.nabble.com> Resist the temptation. Stepwise analysis without shrinkage will ruin model inferences without helping with predictive accuracy. Frank AndreE wrote: > > Hi, > > I would like to fit a linear model where some but not all explanators are > chosen stepwise - ie I definitely want to include some terms, but others > only if they are deemed significant (by AIC or whatever other approach is > available). For example if I wanted to definitely include x1 and x2, but > only include z1 and z2 if they are significant, something like this: > > df <- data.frame(y=c(4,2,6,7,3,9,5,7,6,2), x1=c(2,3,4,0,5,8,8,1,1,2), > x2=c(0,0,0,0,1,1,0,0,0,1), z1=c(0,1,0,0,0,1,1,0,1,1), > z2=c(1,1,1,0,0,1,1,1,1,0)) > model <- lm(y ~ x1 + x2 + stepwise(z1 + z2)) > > Any help would be appreciated. 
> > Cheers, > Andre > ********************************************************************** > This email and any attachments are confidential, protect...{{dropped:22}} > > ______________________________________________ > R-help@ mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > ----- Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/Select-some-but-not-all-variables-stepwise-tp3990002p3990026.html Sent from the R help mailing list archive at Nabble.com. From skfglades at gmail.com Fri Nov 4 14:36:07 2011 From: skfglades at gmail.com (Steve Friedman) Date: Fri, 4 Nov 2011 09:36:07 -0400 Subject: [R] Plotting skewed normal distribution with a bar plot In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From friendly at yorku.ca Fri Nov 4 14:36:39 2011 From: friendly at yorku.ca (Michael Friendly) Date: Fri, 04 Nov 2011 09:36:39 -0400 Subject: [R] replace double backslash with singel backslash In-Reply-To: <1320399335867-3989434.post@n4.nabble.com> References: <1320399335867-3989434.post@n4.nabble.com> Message-ID: <4EB3EA67.50707@yorku.ca> On 11/4/2011 5:35 AM, Kay Cichini wrote: > I want to replace \\ with \ in: > str<- > "C:\\DOKUME~1\\u0327336\\LOKALE~1\\Temp\\RtmpQ5NJ8X\\TIRIS_PICS\\1_Img.jpg" > > and tried: > gsub("\\\\", "\\", str) > > but this removes the \\ without replacing them by \ > You may be able to avoid this simply by using forward slashes in path names. Otherwise, ?normalizePath may help. -- Michael Friendly Email: friendly AT yorku DOT ca Professor, Psychology Dept. 
York University      Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Street    Web:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA

From jholtman at gmail.com Fri Nov 4 14:37:50 2011
From: jholtman at gmail.com (jim holtman)
Date: Fri, 4 Nov 2011 09:37:50 -0400
Subject: [R] Counting number of common elements between the rows of two different matrices
In-Reply-To: <0A760D5925AC5E4893BDDDAABC44E29801EF6ED7@USNSH-I-EC80.int.dir.willis.com>
References: <0A760D5925AC5E4893BDDDAABC44E29801EF6ED7@USNSH-I-EC80.int.dir.willis.com>
Message-ID:

Try this:

# create dummy data
a <- matrix(sample(20, 50, TRUE), ncol = 5)
b <- matrix(sample(20, 50, TRUE), ncol = 5)

# create combinations to test
x <- expand.grid(seq(nrow(a)), seq(nrow(b)))

# test
result <- mapply(function(m1, m2) any(a[m1, ] %in% b[m2, ])
    , x[, 1]
    , x[, 2]
    )

# create the output matrix
result.m <- matrix(result, nrow = nrow(a), ncol = nrow(b))

On Fri, Nov 4, 2011 at 8:51 AM, Parodi, Pietro wrote:
>
> Hello
>
> I'm trying to solve this problem without using a for loop but I have so
> far failed to find a solution.
>
> I have two matrices of K columns each, e.g. (K=5), and with numbers of
> row N_A and N_B respectively
>
> A =     (1 5 3 8 15;
>          2 7 20 11 13;
>          12 19 20 21 43)
>
> B =     (2 6 30 8 16;
>          3 8 19 11 13)
>
> (the actual matrices have hundreds of thousands of entry, that's why I'm
> keen to avoid "for" loops)
>
> And what I need to do is to apply a function which counts the number of
> common elements between ANY row of A and ANY row of B, giving a result
> like this:
>
>
> A1 vs B1:  1  # (8 is a common element)
> A1 vs B2:  1  # (8 is a common element)
> A2 vs B1:  1  # (2 is a common element)
> A2 vs B2:  1  # 11, 13 are common elements
> Etc.
>
> I've built a function that counts the number of common elements between
> two vectors, based on the intersect function in the R manual
>
> common_elements <- function(x,y) length(y[match(x,y,nomatch=0)])
>
> And a double loop who solves my problem would be something like
> (pseudo-code)
>
> For(i in 1:N_A){
>        for(j in 1:N_B){
>                ce(i,j)=common_elements(a(i),b(j))
>                }
>        }
>
> Is there an efficient, clean way to do the same job and give as an
> output a matrix N_A x N_B such as that above?
>
> Thanks a lot for your help
>
> Regards
>
> Pietro
>
> ______________________________________________________________________
>
> For information pertaining to Willis' email confidentiality and monitoring policy, usage restrictions, or for specific company registration and regulatory status information, please visit http://www.willis.com/email_trailer.aspx
>
> We are now able to offer our clients an encrypted email capability for secure communication purposes. If you wish to take advantage of this service or learn more about it, please let me know or contact your Client Advocate for full details. ~W67897
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
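
The reply above tests each row pair with any(), which yields TRUE/FALSE rather than the counts the question asks for. The following is only a sketch of a counting variant, under stated assumptions: "common elements" is taken to mean distinct shared values (via intersect()), and the names A, B, idx and counts are illustrative, not from the thread. It uses the small example matrices from the question.

```r
# Example matrices from the question (3 x 5 and 2 x 5)
A <- matrix(c( 1,  5,  3,  8, 15,
               2,  7, 20, 11, 13,
              12, 19, 20, 21, 43), ncol = 5, byrow = TRUE)
B <- matrix(c(2, 6, 30,  8, 16,
              3, 8, 19, 11, 13), ncol = 5, byrow = TRUE)

# Number of distinct values shared by two vectors (assumed definition)
common_elements <- function(x, y) length(intersect(x, y))

# Every (row of A, row of B) pair; i varies fastest, matching
# the column-major fill order of matrix()
idx <- expand.grid(i = seq_len(nrow(A)), j = seq_len(nrow(B)))

counts <- matrix(
  mapply(function(i, j) common_elements(A[i, ], B[j, ]), idx$i, idx$j),
  nrow = nrow(A), ncol = nrow(B),
  dimnames = list(paste0("A", seq_len(nrow(A))),
                  paste0("B", seq_len(nrow(B))))
)
counts
#    B1 B2
# A1  1  2
# A2  1  2
# A3  0  1
```

With these inputs, rows A1 and B2 share both 3 and 8, and rows A2 and B2 share 11 and 13, so both pairs count 2. mapply() still visits every pair once, so this hides the loop rather than removing the work, and for inputs with hundreds of thousands of rows the N_A x N_B result may itself be too large to hold in memory.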
From jdnewmil at dcn.davis.ca.us Fri Nov 4 14:45:07 2011
From: jdnewmil at dcn.davis.ca.us (Jeff Newmiller)
Date: Fri, 04 Nov 2011 06:45:07 -0700
Subject: [R] replace double backslash with singel backslash
In-Reply-To: <1320399335867-3989434.post@n4.nabble.com>
References: <1320399335867-3989434.post@n4.nabble.com>
Message-ID: <223da9a6-4eb1-4840-9c5a-5464cf4fd30a@email.android.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From gleynes+r at gmail.com Fri Nov 4 15:23:04 2011
From: gleynes+r at gmail.com (Gene Leynes)
Date: Fri, 4 Nov 2011 09:23:04 -0500
Subject: [R] replace double backslash with singel backslash
In-Reply-To: <223da9a6-4eb1-4840-9c5a-5464cf4fd30a@email.android.com>
References: <1320399335867-3989434.post@n4.nabble.com> <223da9a6-4eb1-4840-9c5a-5464cf4fd30a@email.android.com>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From pisicandru at hotmail.com Fri Nov 4 15:51:41 2011
From: pisicandru at hotmail.com (Monica Pisica)
Date: Fri, 4 Nov 2011 14:51:41 +0000
Subject: [R] How to write a shapefile with projection
Message-ID:

Hi,

I am trying to write a shapefile with projection. I have my data in a data.frame called try and consists in xy coordinates and a numerical attribute value z1.

Libraries loaded are: sp, rgdal, raster, maptools

head(try)
         x       y    z1
1 610237.1 3375751 8.221
2 610236.1 3375750 8.153
3 610236.1 3375749 8.275
4 610236.1 3375748 8.251
5 610236.1 3375747 8.217
6 610236.1 3375746 8.196

#Get the projection from a raster named llev I have loaded before:
crs <- projection(llev)

# get a spatial point data frame from my data
crest.sp <- SpatialPointsDataFrame(try[,1:2], try, proj4string=CRS(crs))

summary(crest.sp)
Object of class SpatialPointsDataFrame
Coordinates:
        min       max
x  610235.1  610354.1
y 3374862.4 3375751.4
Is projected: TRUE
proj4string :
[+proj=utm +zone=15 +ellps=GRS80 +datum=NAD83 +units=m +no_defs +towgs84=0,0,0]
Number of points: 890
Data attributes:
       x                y                 z1
 Min.   :610235   Min.   :3374862   Min.   :6.966
 1st Qu.:610269   1st Qu.:3375085   1st Qu.:7.570
 Median :610298   Median :3375307   Median :7.901
 Mean   :610300   Mean   :3375307   Mean   :7.882
 3rd Qu.:610334   3rd Qu.:3375529   3rd Qu.:8.180
 Max.   :610354   Max.   :3375751   Max.   :8.756

#write the shapefile
writePointsShape(crest.sp, "I:/LA_levee/Shape/llev_crest_pts6")

If I load this shapefile in ArcGIS it has no projection. I also looked at the write.shapefile command from shapefiles library and again I get a file without projection. Is there any way to write the projection for the shapefile in R? Probably I can do something in ArcGIS, but I would like to have my shapefile from R complete.

Thanks,

Monica

From michael.weylandt at gmail.com Fri Nov 4 15:55:02 2011
From: michael.weylandt at gmail.com (R. Michael Weylandt)
Date: Fri, 4 Nov 2011 10:55:02 -0400
Subject: [R] How to use 'prcomp' with CLUSPLOT?
In-Reply-To: <47CA6EA672AF4A4DA62A7DCCFEDC9ABB0BB34D18@XMAIL-MBX-AH1.AD.UCSD.EDU>
References: <47CA6EA672AF4A4DA62A7DCCFEDC9ABB0BB34D18@XMAIL-MBX-AH1.AD.UCSD.EDU>
Message-ID:

Hello Jo,

Full disclosure: I don't know much about clustering/partition cluster analysis/etc so I've only attacked this as an R problem.
However, this might get you going in the right direction:

df <- read.table(textConnection("PRVID,VAR1,VAR2,VAR3,VAR4,VAR5,VAR6,VAR7,VAR8,VAR9,VAR10,VAR11
PRV1,0,54463,53049,62847,75060,184925,0,0,0,0,0
PRV2,0,2100,76,131274,0,0,0,0,0,0,18
PRV3,967,0,0,0,0,0,0,0,0,3634,0
PRV4,817,18344,3274,9264,1862,0,0,141,0,0,0
PRV5,0,0,0,0,0,0,29044,0,0,0,0
PRV6,59,6924,825,3008,377,926,0,0,10156,0,5555
PRV7,11,24902,36040,47223,20086,0,0,749,415,0,0"), header = T, sep = ",", stringsAsFactors = T)
closeAllConnections()

library(cluster)
mat <- as.matrix(df[,-1])
newtble <- prop.table(mat, 1) * 100
num.clust <- 3
fitnw <- kmeans(newtble, num.clust) # k-means fit, as in the original post quoted below

clusplotMW <- cluster:::clusplot.default
# Create a copy of the two necessary functions for clusplot that route to princomp
mkCheckMW <- cluster:::mkCheckX
body(mkCheckMW) <- parse(text=gsub("princomp", "prcomp",deparse(body(mkCheckMW))))
# replace princomp with prcomp in our copy
body(clusplotMW) <- parse(text=gsub("mkCheckX", "mkCheckMW",deparse(body(clusplotMW))))
# route our clusplot to our mkCheckX

clusplotMW(newtble, fitnw$cluster, color = T, shade = T, lines = 0)

Since you didn't provide a working example, I can't verify this, but let me know if it works for you.

Michael

On Thu, Nov 3, 2011 at 8:10 PM, Jo Frabetti wrote:
> Hello,
>
> I have a large data set that has more columns than rows (sample data below). I am trying to perform a partitioning cluster analysis and then plot that using pca. I have tried using CLUSPLOT(), but that only allows for 'princomp' where I need 'prcomp' as I do not want to reduce my columns. Is there a way to edit the CLUSPLOT() code to use 'prcomp', please?
>
> # sample of my data
> PRVID,VAR1,VAR2,VAR3,VAR4,VAR5,VAR6,VAR7,VAR8,VAR9,VAR10,VAR11
> PRV1,0,54463,53049,62847,75060,184925,0,0,0,0,0
> PRV2,0,2100,76,131274,0,0,0,0,0,0,18
> PRV3,967,0,0,0,0,0,0,0,0,3634,0
> PRV4,817,18344,3274,9264,1862,0,0,141,0,0,0
> PRV5,0,0,0,0,0,0,29044,0,0,0,0
> PRV6,59,6924,825,3008,377,926,0,0,10156,0,5555
> PRV7,11,24902,36040,47223,20086,0,0,749,415,0,0
>
> library(cluster)
> fn   = "big.csv";
> tbl  = read.table(fn, header=TRUE, sep=",", row.names=1);
> mat <- as.matrix(tbl);
> newtbl <- prop.table(mat,1)*100;
>
> num.clust <- 3;
> fitnw <- kmeans(newtbl, num.clust);
> clusplot(newtbl, fitnw$cluster, color=TRUE, shade=TRUE, lines=0, main= paste('Principal Components plot - Kmeans ', clust.level, ' Clusters') )
>
> Error in princomp.default(x, scores = TRUE, cor = ncol(x) != 2) :
>  'princomp' can only be used with more units than variables
>
> Thank you for R and any assistance you may offer!
>
> Jo
>
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

From ata.sonu at gmail.com Fri Nov 4 11:46:27 2011
From: ata.sonu at gmail.com (ATANU)
Date: Fri, 4 Nov 2011 03:46:27 -0700 (PDT)
Subject: [R] formatting a dataframe
Message-ID: <1320403587114-3989638.post@n4.nabble.com>

http://r.789695.n4.nabble.com/file/n3989638/tab3.gif

i want to create a dataframe like the one given in the pictures having names of sub categories of columns. the table may be presented either in the console or in an excel sheet. anyone please suggest me a technique. thanks in advance.

-atanu

--
View this message in context: http://r.789695.n4.nabble.com/formatting-a-dataframe-tp3989638p3989638.html
Sent from the R help mailing list archive at Nabble.com.
From bonda at hsu-hh.de Fri Nov 4 11:27:12 2011
From: bonda at hsu-hh.de (bonda)
Date: Fri, 4 Nov 2011 03:27:12 -0700 (PDT)
Subject: [R] nproc parameter in efpFunctional
In-Reply-To:
References: <1320228608589-3972419.post@n4.nabble.com> <1320306970603-3984605.post@n4.nabble.com>
Message-ID: <1320402432656-3989598.post@n4.nabble.com>

The 2006 CSDA paper is really very informative; I'm trying to understand what lies beyond it. If we have e.g. k=3, then taking nproc=3 for the functional maxBB we get a critical value (boundary)

maxBB$computeCritval(0.05,nproc=3)
[1] 1.544421,

and for nproc=NULL (Bonferroni approximation) this will be

maxBB$computeCritval(0.05)
[1] 1.358099.

Aggregating the 3 Brownian bridges first over components, we obtain a time series process. Now, we wonder whether the maximum value of the process (aggregation over time) lies above the boundary. Which boundary, 1.544421 or 1.358099, should one take? They look too different and, for instance, lead to "unfair computing" of empirical size (as rejection rate of the null hypothesis) or empirical power (as acceptance rate of the alternative).

--
View this message in context: http://r.789695.n4.nabble.com/nproc-parameter-in-efpFunctional-tp3972419p3989598.html
Sent from the R help mailing list archive at Nabble.com.

From boylangc at earthlink.net Fri Nov 4 13:41:36 2011
From: boylangc at earthlink.net (boylangc at earthlink.net)
Date: Fri, 4 Nov 2011 08:41:36 -0400 (GMT-04:00)
Subject: [R] Plotting skewed normal distribution with a bar plot
Message-ID: <19590811.1320410497026.JavaMail.root@elwamui-norfolk.atl.sa.earthlink.net>

Steve,

I don't profess to be an expert in R (by ANY means), but I've used the sn.mle function in package sn to do something similar. So, for example, if you have data (y), you can do the following:

y = c(5,4,6,4,5,6,7,7,55,64,23,13,1,4,1,14,1,15,11,12)
sn.mle(y=y)

I just made up this stream of data, so please forgive if it doesn't make sense.
The function will produce a number of things, including the MLEs for the mean, sd, and skew, as well as a histogram with the density overlaid onto it.

Maybe this helps? Hope so.

Greg

-----Original Message-----
>From: "R. Michael Weylandt"
>Sent: Nov 3, 2011 10:39 PM
>To: Steve_Friedman at nps.gov
>Cc: r-help at r-project.org
>Subject: Re: [R] Plotting skewed normal distribution with a bar plot
>
>It seems like you'll need to apply some sort of MLE to estimate the
>parameters directly from the data before using dsn() to get the
>density. This might help with some of it:
>http://help.rmetrics.org/fGarch/html/snorm.html
>
>Michael
>
>On Thu, Nov 3, 2011 at 2:54 PM, wrote:
>>
>> Hi,
>>
>> I need to create a plot (type = "h") and then overlay a skewed-normal
>> curve on this distribution, but I'm not finding a procedure to accomplish
>> this. I want to use the plot function here in order to control the bin
>> distributions.
>>
>> I have explored the sn library and found the dsn function. dsn uses known
>> location, scaling and shape parameters associated with a given input vector
>> of probabilities. However, how can I calculate the skewed-normal curve if
>> I don't know these parameters in advance?
>>
>> Is there another function to calculate the skew-normal, perhaps in a
>> different package?
>>
>>
>> I'm working with R 2.13.2 on a windows based machine.
>>
>> Steve Friedman Ph. D.
>> Ecologist / Spatial Statistical Analyst
>> Everglades and Dry Tortugas National Park
>> 950 N Krome Ave (3rd Floor)
>> Homestead, Florida 33034
>>
>> Steve_Friedman at nps.gov
>> Office (305) 224 - 4282
>> Fax    (305) 224 - 4147
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>______________________________________________
>R-help at r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

From christian.langkamp at gmxpro.de Fri Nov 4 11:34:08 2011
From: christian.langkamp at gmxpro.de (clangkamp)
Date: Fri, 4 Nov 2011 03:34:08 -0700 (PDT)
Subject: [R] 12th Root of a Square (Transition) Matrix
In-Reply-To: <4C1B46C3.7010505@gmail.com>
References: <3080FA352049DF49B5FCD17896109AD576CBA2C3@BL2PRD0102MB008.prod.exchangelabs.com> <4C1B46C3.7010505@gmail.com>
Message-ID: <1320402848675-3989618.post@n4.nabble.com>

I have tried this method, but the result is not working, at least not as I expect. I used the CreditMetrics package transition matrix:

rc <- c("AAA", "AA", "A", "BBB", "BB", "B", "CCC", "D")
M <- matrix(c(90.81,  8.33,  0.68,  0.06,  0.08,  0.02,  0.01,  0.01,
               0.70, 90.65,  7.79,  0.64,  0.06,  0.13,  0.02,  0.01,
               0.09,  2.27, 91.05,  5.52,  0.74,  0.26,  0.01,  0.06,
               0.02,  0.33,  5.95, 85.93,  5.30,  1.17,  1.12,  0.18,
               0.03,  0.14,  0.67,  7.73, 80.53,  8.84,  1.00,  1.06,
               0.01,  0.11,  0.24,  0.43,  6.48, 83.46,  4.07,  5.20,
               0.21,  0,     0.22,  1.30,  2.38, 11.24, 64.86, 19.79,
               0,     0,     0,     0,     0,     0,     0,    100
              )/100, 8, 8, dimnames = list(rc, rc), byrow = TRUE)

then followed through with the steps:

nth_root <- X %*% L_star %*% X_inv

But the check (raising the result to the 12th power again) doesn't yield the original matrix. Some rounding error can be expected, but I didn't expect a perfectly diagonal matrix when the initial matrix isn't diagonal at all.
> round(nth_root^12,4)
       [,1]   [,2]   [,3]   [,4]   [,5]   [,6]   [,7] [,8]
[1,] 0.9078 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000    0
[2,] 0.0000 0.9053 0.0000 0.0000 0.0000 0.0000 0.0000    0
[3,] 0.0000 0.0000 0.9079 0.0000 0.0000 0.0000 0.0000    0
[4,] 0.0000 0.0000 0.0000 0.8553 0.0000 0.0000 0.0000    0
[5,] 0.0000 0.0000 0.0000 0.0000 0.7998 0.0000 0.0000    0
[6,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.8285 0.0000    0
[7,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.6457    0
[8,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000    1

Any takers?

-----
Christian Langkamp
christian.langkamp-at-gmxpro.de

--
View this message in context: http://r.789695.n4.nabble.com/12th-Root-of-a-Square-Transition-Matrix-tp2259736p3989618.html
Sent from the R help mailing list archive at Nabble.com.

From cindy_zhou_08 at hotmail.com Fri Nov 4 16:00:04 2011
From: cindy_zhou_08 at hotmail.com (Ying Zhou)
Date: Fri, 4 Nov 2011 11:00:04 -0400
Subject: [R] survfit function?
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From fmora at oikos.unam.mx Fri Nov 4 14:20:25 2011
From: fmora at oikos.unam.mx (Francisco Mora Ardila)
Date: Fri, 4 Nov 2011 07:20:25 -0600
Subject: [R] Select some, but not all, variables stepwise
In-Reply-To: <5648FEE24754234D80F8E5BBE1A22BE51219BC9C01@GBGH-EXCH-CMS.sig.ads>
References: <5648FEE24754234D80F8E5BBE1A22BE51219BC9C01@GBGH-EXCH-CMS.sig.ads>
Message-ID: <20111104131437.M324@oikos.unam.mx>

Hi Andre

I don't know if it will work, but I've tried the MuMIn package, where you can evaluate all possible models (using, for example, AIC) at one time. Maybe you can focus on comparing those models which retain the explanators you want.

Best wishes

Francisco

On Fri, 4 Nov 2011 13:06:09 +0000, Andre Easom wrote
> Hi,
>
> I would like to fit a linear model where some but not all explanators are
> chosen stepwise - ie I definitely want to include some terms, but others only
> if they are deemed significant (by AIC or whatever other approach is available)
> .
For example if I wanted to definitely include x1 and x2, but only include
> z1 and z2 if they are significant, something like this:
>
> df <- data.frame(y=c(4,2,6,7,3,9,5,7,6,2), x1=c(2,3,4,0,5,8,8,1,1,2),
>                  x2=c(0,0,0,0,1,1,0,0,0,1), z1=c(0,1,0,0,0,1,1,0,1,1),
>                  z2=c(1,1,1,0,0,1,1,1,1,0))
> model <- lm(y ~ x1 + x2 + stepwise(z1 + z2))
>
> Any help would be appreciated.
>
> Cheers,
> Andre
> **********************************************************************
> This email and any attachments are confidential, protect...{{dropped:22}}
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Francisco Mora Ardila
Laboratorio de Biodiversidad y Funcionamiento del Ecosistema
Centro de Investigaciones en Ecosistemas
UNAM-Campus Morelia
Tel 3222777 ext. 42621
Morelia, Michoacán, México.

--
Open WebMail Project (http://openwebmail.org)

From Jose.Iparraguirre at ageuk.org.uk Fri Nov 4 14:27:48 2011
From: Jose.Iparraguirre at ageuk.org.uk (Jose Iparraguirre)
Date: Fri, 4 Nov 2011 13:27:48 +0000
Subject: [R] How to delete only those rows in a dataframe in which all records are missing
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From pisicandru at hotmail.com Fri Nov 4 16:08:05 2011
From: pisicandru at hotmail.com (Monica Pisica)
Date: Fri, 4 Nov 2011 15:08:05 +0000
Subject: [R] How to write a shapefile with projection - problem solved
Message-ID:

Hi,

Sorry I put such a detailed question to the list about writing a shapefile with projection. I realized that if I use writeOGR from rgdal rather than the other shapefile-writing functions, I get a shapefile whose projection is recognized by ArcGIS.
The command is (in case anybody wonders):

writeOGR(crest.sp, "I:\\LA_levee\\Shape", "llev_crest_pts6", driver = "ESRI Shapefile")

where crest.sp is a spatial points data frame with a projection.

Thanks,

Monica

From Pietro.Parodi at willis.com Fri Nov 4 16:11:32 2011
From: Pietro.Parodi at willis.com (Parodi, Pietro)
Date: Fri, 4 Nov 2011 10:11:32 -0500
Subject: [R] Counting number of common elements between the rows of two different matrices
In-Reply-To:
References: <0A760D5925AC5E4893BDDDAABC44E29801EF6ED7@USNSH-I-EC80.int.dir.willis.com>
Message-ID: <0A760D5925AC5E4893BDDDAABC44E29801EF6F65@USNSH-I-EC80.int.dir.willis.com>

Jim

I tried that and it works. Thank you very much for your help!

Regards

Pietro

-----Original Message-----
From: jim holtman [mailto:jholtman at gmail.com]
Sent: 04 November 2011 13:38
To: Parodi, Pietro
Cc: r-help at r-project.org
Subject: Re: [R] Counting number of common elements between the rows of two different matrices

Try this:

# create dummy data
a <- matrix(sample(20, 50, TRUE), ncol = 5)
b <- matrix(sample(20, 50, TRUE), ncol = 5)

# create combinations to test
x <- expand.grid(seq(nrow(a)), seq(nrow(b)))

# test: count the elements of row m1 of a that also occur in row m2 of b
# (sum() gives the count; any() would only say whether there is a match)
result <- mapply(function(m1, m2) sum(a[m1, ] %in% b[m2, ])
    , x[, 1]
    , x[, 2]
    )

# create the output matrix (rows of a by rows of b)
result.m <- matrix(result, nrow = nrow(a), ncol = nrow(b))

On Fri, Nov 4, 2011 at 8:51 AM, Parodi, Pietro wrote:
>
> Hello
>
> I'm trying to solve this problem without using a for loop but I have so
> far failed to find a solution.
>
> I have two matrices of K columns each, e.g. (K=5), and with numbers of
> row N_A and N_B respectively
>
> A =     (1 5 3 8 15;
>         2 7 20 11 13;
>         12 19 20 21 43)
>
> B =     (2 6 30 8 16;
>
3 8 19 11 13) > > (the actual matrices have hundreds of thousands of entry, that's why I'm > keen to avoid "for" loops) > > And what I need to do is to apply a function which counts the number of > common elements between ANY row of A and ANY row of B, giving a result > like this: > > > A1 vs B1: ?1 ?# (8 is a common element) > A1 vs B2: ?1 ?# (8 is a common element) > A2 vs B1: ?1 ?# (2 is a common element) > A2 vs B2: ?1 ?# 11, 13 are common elements > Etc. > > I've built a function that counts the number of common elements between > two vectors, based on the intersect function in the R manual > > common_elements <- function(x,y) length(y[match(x,y,nomatch=0)]) > > And a double loop who solves my problem would be something like > (pseudo-code) > > For(i in 1:N_A){ > ? ? ? ?for(j in 1:N_B){ > ? ? ? ? ? ? ? ?ce(i,j)=common_elements(a(i),b(j)) > ? ? ? ? ? ? ? ?} > ? ? ? ?} > > Is there an efficient, clean way to do the same job and give as an > output a matrix N_A x N_B such as that above? > > Thanks a lot for your help > > Regards > > Pietro > > ______________________________________________________________________ > > For information pertaining to Willis' email confidentiality and monitoring policy, usage restrictions, or for specific company registration and regulatory status information, please visit http://www.willis.com/email_trailer.aspx > > We are now able to offer our clients an encrypted email capability for secure communication purposes. If you wish to take advantage of this service or learn more about it, please let me know or contact your Client Advocate for full details. ~W67897 > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? 
Tell me what you want to do, not how you want to do it.

From michael.weylandt at gmail.com Fri Nov 4 16:17:56 2011
From: michael.weylandt at gmail.com (R. Michael Weylandt)
Date: Fri, 4 Nov 2011 11:17:56 -0400
Subject: [R] How to delete only those rows in a dataframe in which all records are missing
In-Reply-To:
References:
Message-ID:

Perhaps something like this will work.

df[!(rowSums(is.na(df))==NCOL(df)),]

Michael

On Fri, Nov 4, 2011 at 9:27 AM, Jose Iparraguirre wrote:
> Hi,
>
> Imagine I have the following data frame:
>
>> a <- c(1,NA,3)
>> b <- c(2,NA,NA)
>> c <- data.frame(cbind(a,b))
>> c
>    a  b
> 1  1  2
> 2 NA NA
> 3  3 NA
>
> I want to delete the second row. If I use na.omit, that would also affect the third row. I tried to use a loop and an ifelse clause with is.na to get R identify that row in which all records are missing, as opposed to the first row in which no records are missing or the third one, in which only one record is missing. How can I get R identify the row in which all records are missing? Or, how can I get R delete/omit only this row?
> Thanks in advance,
>
> José
>
>
> José Iparraguirre
> Chief Economist
> Age UK
>
> T 020 303 31482
> E Jose.Iparraguirre at ageuk.org.uk
>
> Tavis House, 1- 6 Tavistock Square
> London, WC1H 9NB
> www.ageuk.org.uk | ageukblog.org.uk | @AgeUKPA
>
>
> Age UK  Improving later life
>
> www.ageuk.org.uk
>
> -------------------------------
>
> Age UK is a registered charity and company limited by guarantee, (registered charity number 1128267, registered company number 6825798). Registered office: Tavis House, 1-6 Tavistock Square, London WC1H 9NA.
>
> For the purposes of promoting Age UK Insurance, Age UK is an Appointed Representative of Age UK Enterprises Limited, Age UK is an Introducer Appointed Representative of JLT Benefit Solutions Limited and Simplyhealth Access for the purposes of introducing potential annuity and health cash plans customers respectively.
> > From AEasom at SportingIndex.com Fri Nov 4 16:36:56 2011 From: AEasom at SportingIndex.com (AndreE) Date: Fri, 4 Nov 2011 08:36:56 -0700 (PDT) Subject: [R] Select some, but not all, variables stepwise In-Reply-To: <20111104131437.M324@oikos.unam.mx> References: <5648FEE24754234D80F8E5BBE1A22BE51219BC9C01@GBGH-EXCH-CMS.sig.ads> <20111104131437.M324@oikos.unam.mx> Message-ID: <1320421016070-3990516.post@n4.nabble.com> Thanks Francisco - I've actually realized that ?step can do pretty much exactly what I want. Andre -- View this message in context: http://r.789695.n4.nabble.com/Select-some-but-not-all-variables-stepwise-tp3990002p3990516.html Sent from the R help mailing list archive at Nabble.com. From jdnewmil at dcn.davis.ca.us Fri Nov 4 16:54:14 2011 From: jdnewmil at dcn.davis.ca.us (Jeff Newmiller) Date: Fri, 04 Nov 2011 08:54:14 -0700 Subject: [R] replace double backslash with singel backslash In-Reply-To: References: <1320399335867-3989434.post@n4.nabble.com> <223da9a6-4eb1-4840-9c5a-5464cf4fd30a@email.android.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From f.harrell at vanderbilt.edu Fri Nov 4 17:01:08 2011 From: f.harrell at vanderbilt.edu (Frank Harrell) Date: Fri, 4 Nov 2011 09:01:08 -0700 (PDT) Subject: [R] Select some, but not all, variables stepwise In-Reply-To: <1320421016070-3990516.post@n4.nabble.com> References: <5648FEE24754234D80F8E5BBE1A22BE51219BC9C01@GBGH-EXCH-CMS.sig.ads> <20111104131437.M324@oikos.unam.mx> <1320421016070-3990516.post@n4.nabble.com> Message-ID: <1320422468588-3990598.post@n4.nabble.com> Why do any of that? Those procedures are statistically invalid. Frank AndreE wrote: > > Thanks Francisco - I've actually realized that ?step can do pretty much > exactly what I want. 
> Andre > ----- Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/Select-some-but-not-all-variables-stepwise-tp3990002p3990598.html Sent from the R help mailing list archive at Nabble.com. From jmarca at translab.its.uci.edu Fri Nov 4 17:34:07 2011 From: jmarca at translab.its.uci.edu (James Marca) Date: Fri, 4 Nov 2011 09:34:07 -0700 Subject: [R] zoo performance regression noticed (1.6-5 is faster...) Message-ID: <20111104163407.GA20540@translab.its.uci.edu> Good morning, I have discovered what I believe to be a performance regression between Zoo 1.6x and Zoo 1.7-6 in the application of rollapply. On zoo 1.6x, rollapply of my function over my data takes about 20 minutes. Using 1.7-6, the same code takes about 6 hours. R --version R version 2.13.1 (2011-07-08) Copyright (C) 2011 The R Foundation for Statistical Computing ISBN 3-900051-07-0 Platform: x86_64-pc-linux-gnu (64-bit) Two versions of zoo 1.6 run *fast* On one machine I am running less /usr/lib64/R/library/zoo/DESCRIPTION Package: zoo Version: 1.6-3 Date: 2010-04-23 Title: Z's ordered observations ... Packaged: 2010-04-23 07:28:47 UTC; zeileis Repository: CRAN Date/Publication: 2010-04-23 07:43:54 Built: R 2.10.1; ; 2010-04-25 06:41:34 UTC; unix (Thankfully I forgot to upgrade.packages() on this machine!) On the other Package: zoo Version: 1.6-5 Date: 2011-04-08 ... Packaged: 2011-04-08 17:13:47 UTC; zeileis Repository: CRAN Date/Publication: 2011-04-08 17:27:47 Built: R 2.13.1; ; 2011-11-04 15:49:54 UTC; unix I have stripped out zoo 1.7-6 from all my machines. I tried to ensure all libraries were identical on the two machines (using lsof), and after finally downgrading zoo I got the second machine to be as fast as the first, so I am quite certain the difference in speed is down to the Zoo version used. 
My code runs a fairly simple function over a time series using the following call to process a year of 30s data (9 columns, about a million rows): vals <- rollapply(data=ts.data[,c(n.3.cols, o.3.cols,volocc.cols)] ,width=40 ,FUN=rolling.function.fn(n.cols=n.3.cols,o.cols=o.3.cols,vo.cols=volocc.cols) ,by.column=FALSE ,align='right') (The rolling.function.fn call returns a function that is initialized with the initial call above (a trick I learned from Javascript)) If this is a known situation with the new 1.7 generation Zoo, my apologies and I'll go away. If my code could be turned into a useful test, I'd be happy to help out as much as I'm able. Given the extreme runtime difference though, I thought I should offer my help in this case, since zoo is such a useful package in my work. Regards, James Marca -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From ggrothendieck at gmail.com Fri Nov 4 17:56:22 2011 From: ggrothendieck at gmail.com (Gabor Grothendieck) Date: Fri, 4 Nov 2011 12:56:22 -0400 Subject: [R] zoo performance regression noticed (1.6-5 is faster...) In-Reply-To: <20111104163407.GA20540@translab.its.uci.edu> References: <20111104163407.GA20540@translab.its.uci.edu> Message-ID: On Fri, Nov 4, 2011 at 12:34 PM, James Marca wrote: > Good morning, > > I have discovered what I believe to be a performance regression > between Zoo 1.6x and Zoo 1.7-6 in the application of rollapply. > On zoo 1.6x, rollapply of my function over my data takes about 20 > minutes. Using 1.7-6, the same code takes about 6 hours. 
> [...]
?Given the extreme > runtime difference though, I thought I should offer my help in this > case, since zoo is such a useful package in my work. This was a known problem and was fixed but if its still there then there must be some other condition under which it can occur as well. If you can provide a small self contained reproducible example it would help in tracking it down. -- Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com From tal.galili at gmail.com Fri Nov 4 18:01:48 2011 From: tal.galili at gmail.com (Tal Galili) Date: Fri, 4 Nov 2011 19:01:48 +0200 Subject: [R] Decision tree model using rpart ( classification In-Reply-To: <1320395527772-3989320.post@n4.nabble.com> References: <1320388593513-3989162.post@n4.nabble.com> <1320395527772-3989320.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From ggrothendieck at gmail.com Fri Nov 4 18:02:24 2011 From: ggrothendieck at gmail.com (Gabor Grothendieck) Date: Fri, 4 Nov 2011 13:02:24 -0400 Subject: [R] zoo performance regression noticed (1.6-5 is faster...) In-Reply-To: References: <20111104163407.GA20540@translab.its.uci.edu> Message-ID: On Fri, Nov 4, 2011 at 12:56 PM, Gabor Grothendieck wrote: > On Fri, Nov 4, 2011 at 12:34 PM, James Marca > wrote: >> Good morning, >> >> I have discovered what I believe to be a performance regression >> between Zoo 1.6x and Zoo 1.7-6 in the application of rollapply. >> On zoo 1.6x, rollapply of my function over my data takes about 20 >> minutes. Using 1.7-6, the same code takes about 6 hours. 
>> [...]
?Given the extreme >> runtime difference though, I thought I should offer my help in this >> case, since zoo is such a useful package in my work. > > This was a known problem and was fixed but if its still there then > there must be some other condition under which it can occur as well. > If you can provide a small self contained reproducible example it > would help in tracking it down. > > -- > Statistics & Software Consulting > GKX Group, GKX Associates Inc. > tel: 1-877-GKX-GROUP > email: ggrothendieck at gmail.com > Also, as a workaround you can try this to use an old rollapply in a new version of zoo: library(zoo) source("http://r-forge.r-project.org/scm/viewvc.php/*checkout*/pkg/zoo/R/rollapply.R?revision=817&root=zoo") rollapply(...whatever...) -- Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com From dwinsemius at comcast.net Thu Nov 3 22:40:19 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Thu, 3 Nov 2011 17:40:19 -0400 Subject: [R] query about counting rows of a dataframe In-Reply-To: <631F8C7792124941838E6850D3A7802B032455D1DA6A@ERMES.regionemarche.intra> References: <631F8C7792124941838E6850D3A7802B032455D1DA6A@ERMES.regionemarche.intra> Message-ID: <77D3646A-22AA-408F-9EC5-9EDAF3FB8E94@comcast.net> On Nov 3, 2011, at 12:28 PM, Stefano Sofia wrote: > Dear R users, > I have got the following data frame, called my_df: > > gender day_birth month_birth year_birth labour > 1 F 22 10 > 2001 1 > 2 M 29 10 > 2001 2 > 3 M 1 11 > 2001 1 > 4 F 3 11 > 2001 1 > 5 M 3 11 > 2001 2 > 6 F 4 11 > 2001 1 > 7 F 4 11 > 2001 2 > 8 F 5 12 > 2001 2 > 9 M 22 14 > 2001 2 > 10 F 29 13 > 2001 2 > ... > > I need to count data in different ways: > > 1. count the births for each day (having 0 when necessary) > independently from the value of the "labour" column xtabs sometimes give better results. 
If you want all 31 days, then make day_birth a factor with levels=1:31.

> xtabs( ~ day_birth + month_birth + year_birth, data=dat)
, , year_birth = 2001

         month_birth
day_birth 10 11 12 13 14
       1   0  1  0  0  0
       3   0  2  0  0  0
       4   0  2  0  0  0
       5   0  0  1  0  0
       22  1  0  0  0  1
       29  1  0  0  1  0

>
> 2. count the births for each day (having 0 when necessary), divided
> by the value of "labour" (which can have two values, 1 or 2)

Cannot figure out what is being asked here. What to do with the two values? Just count them? This would give a partitioned count:

> xtabs( labour==1 ~ day_birth + month_birth , data=dat)
         month_birth
day_birth 10 11 12 13 14
       1   0  1  0  0  0
       3   0  1  0  0  0
       4   0  1  0  0  0
       5   0  0  0  0  0
       22  1  0  0  0  0
       29  0  0  0  0  0

> xtabs( labour==2 ~ day_birth + month_birth , data=dat)
         month_birth
day_birth 10 11 12 13 14
       1   0  0  0  0  0
       3   0  1  0  0  0
       4   0  1  0  0  0
       5   0  0  1  0  0
       22  0  0  0  0  1
       29  1  0  0  1  0

>
> 3. count the births for each day of all the years (i.e. the 22nd of
> October of all the years present in the data frame) independently
> from the value of "labour"

If I understand correctly:

> xtabs( ~ day_birth + month_birth + year_birth, data=dat)
, , year_birth = 2001

         month_birth
day_birth 10 11 12 13 14
       1   0  1  0  0  0
       3   0  2  0  0  0
       4   0  2  0  0  0
       5   0  0  1  0  0
       22  1  0  0  0  1
       29  1  0  0  1  0

>
> 4. count the births for each day of all the years (i.e. the 22nd of
> October of all the years present in the data frame), divided by the
> value of "labour"

Again confusing. Do you mean to use separate tables for labour==1 and labour==2? Perhaps some context explaining what these values represent would help. Some of us are "concrete". The results of xtabs are tables and can be divided like matrices.

>
> I tried with the command
>
> table(my_df$year_birth, my_df$month_birth, my_df$day_birth)
>
> which satisfies (partially) question number 1 (I am not able to have
> 0 in the not available days).
>
> Is there a smart way to do that without invoking too many loops?
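
A minimal sketch of the factor approach suggested above, to show how the zeros appear without any loops. It assumes a small made-up data frame (the values in `dat`, and the choice of 1:12 as month levels, are illustrative assumptions and not the poster's data):

```r
# Toy data with the same column names as my_df (values invented for illustration)
dat <- data.frame(
  gender      = c("F", "M", "M", "F", "M"),
  day_birth   = c(22, 29, 1, 3, 3),
  month_birth = c(10, 10, 11, 11, 11),
  year_birth  = 2001,
  labour      = c(1, 2, 1, 1, 2)
)

# Declare every possible day and month as a factor level, so that
# xtabs() reports empty cells as 0 instead of leaving them out
dat$day_birth   <- factor(dat$day_birth,   levels = 1:31)
dat$month_birth <- factor(dat$month_birth, levels = 1:12)

# Question 1: births per calendar day, ignoring labour
by_day <- xtabs(~ day_birth + month_birth + year_birth, data = dat)

# Question 2: the same counts split by labour (one extra dimension)
by_day_labour <- xtabs(~ day_birth + month_birth + year_birth + labour, data = dat)

# Questions 3 and 4: pool the years by leaving year_birth out of the formula
by_day_all_years        <- xtabs(~ day_birth + month_birth, data = dat)
by_day_all_years_labour <- xtabs(~ day_birth + month_birth + labour, data = dat)
```

Individual cells can then be read by name, e.g. by_day["3", "11", "2001"], and days with no births show up as 0.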
> > thank you for your help David Winsemius, MD Heritage Laboratories West Hartford, CT From michael.weylandt at gmail.com Fri Nov 4 18:57:24 2011 From: michael.weylandt at gmail.com (R. Michael Weylandt) Date: Fri, 4 Nov 2011 13:57:24 -0400 Subject: [R] Plotting skewed normal distribution with a bar plot In-Reply-To: References: Message-ID: You might want to check again: I'm running fGarch on 2.13.2, Mac OSX 10.5.8. Michael On Fri, Nov 4, 2011 at 9:36 AM, Steve Friedman wrote: > Hi Michael, > > Thanks for pointing me fGarch.? I actually started there, but it is not yet > available for 2.13.2 so I went directly to the (sn-package). > > I've briefly explored your suggestion and think it will work. > > Thanks > Steve > > On Nov 3, 2011 10:41 PM, "R. Michael Weylandt" > wrote: >> >> It seems like you'll need to apply some sort of MLE to estimate the >> parameters directly from the data before using dsn() to get the >> density. This might help with some of it: >> http://help.rmetrics.org/fGarch/html/snorm.html >> >> Michael >> >> On Thu, Nov 3, 2011 at 2:54 PM, ? wrote: >> > >> > Hi, >> > >> > I need to create a plot (type = "h") ?and then overlay a skewed-normal >> > curve on this distribution, but I'm not finding a procedure to >> > accomplish >> > this. I want to use the plot function here in order to control the bin >> > distributions. >> > >> > I have explored the sn library and found the dsn function. ?dsn uses >> > known >> > location, scaling and shape parameters associated with a given input >> > vector >> > of probabilities. ?However, how can I calculate the skewed-normal curve >> > if >> > I don't know these parameters in advance? >> > >> > Is there another function to calculate the skew-normal, perhaps in a >> > different package? >> > >> > >> > I'm working with R 2.13.2 on a windows based machine. >> > >> > Steve Friedman Ph. D. 
>> > Ecologist ?/ Spatial Statistical Analyst >> > Everglades and Dry Tortugas National Park >> > 950 N Krome Ave (3rd Floor) >> > Homestead, Florida 33034 >> > >> > Steve_Friedman at nps.gov >> > Office (305) 224 - 4282 >> > Fax ? ? (305) 224 - 4147 >> > >> > ______________________________________________ >> > R-help at r-project.org mailing list >> > https://stat.ethz.ch/mailman/listinfo/r-help >> > PLEASE do read the posting guide >> > http://www.R-project.org/posting-guide.html >> > and provide commented, minimal, self-contained, reproducible code. >> > >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > From jesse.r.brown at lmco.com Fri Nov 4 19:04:01 2011 From: jesse.r.brown at lmco.com (Jesse Brown) Date: Fri, 04 Nov 2011 14:04:01 -0400 Subject: [R] barplot as histogram Message-ID: <4EB42911.2080707@atl.lmco.com> Hello: I'm dealing with an issue currently that I'm not sure the best way to approach. I've got a very large (10G+) dataset that I'm trying to create a histogram for. I don't seem to be able to use hist directly as I can not create an R vector of size greater than 2.2G. I considered condensing the data previous to loading it into R and just plotting the frequencies as a barplot; unfortunately, barplot does not support plotting the values according to a set of x-axis positions. What I have is something similar to: ys <- c(12,3,7,22,10) xs <- c(1,30,35,39,60) and I'd like the bars (ys) to appear at the positions described by xs. I can get this to work on smaller sets by filling zero values in for missing ys for the entire range of xs but in my case this would again create a vector too large for R. Is there another way to use the two vectors to create a simulated frequency histogram? 
Is there a way to create a histogram object (as returned by hist) from the condensed data so that plot would handle it correctly? Thanks in advance, Jesse From michael.weylandt at gmail.com Fri Nov 4 19:18:04 2011 From: michael.weylandt at gmail.com (R. Michael Weylandt) Date: Fri, 4 Nov 2011 14:18:04 -0400 Subject: [R] barplot as histogram In-Reply-To: <4EB42911.2080707@atl.lmco.com> References: <4EB42911.2080707@atl.lmco.com> Message-ID: Perhaps plot(xs, ys, type = "h", lwd = 3) will work? I'm not sure that a direct call to hist(, plot = F) will get around the data problems. If you type getAnywhere(hist.default) you can see the code that runs hist(): perhaps you can extract the working bits you need. Michael On Fri, Nov 4, 2011 at 2:04 PM, Jesse Brown wrote: > Hello: > > I'm dealing with an issue currently that I'm not sure the best way to > approach. I've got a very large (10G+) dataset that I'm trying to create a > histogram for. I don't seem to be able to use hist directly as I can not > create an R vector of size greater than 2.2G. I considered condensing the > data ?previous to loading it into R ?and just plotting the frequencies as a > barplot; unfortunately, barplot does not support plotting the values > according to a set of x-axis positions. > > What I have is something similar to: > > ys <- c(12,3,7,22,10) > xs <- c(1,30,35,39,60) > > and I'd like the bars (ys) to appear at the positions described by xs. I can > get this to work on smaller sets by filling zero values in for missing ys > for the entire range of xs but in my case this would again create a vector > too large for R. > > Is there another way to use the two vectors to create a simulated frequency > histogram? Is there a way to create a histogram object (as returned by hist) > from the condensed data so that plot would handle it correctly? 
> > Thanks in advance, > > Jesse > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From murdoch.duncan at gmail.com Fri Nov 4 19:19:10 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Fri, 04 Nov 2011 14:19:10 -0400 Subject: [R] barplot as histogram In-Reply-To: <4EB42911.2080707@atl.lmco.com> References: <4EB42911.2080707@atl.lmco.com> Message-ID: <4EB42C9E.7010808@gmail.com> On 04/11/2011 2:04 PM, Jesse Brown wrote: > Hello: > > I'm dealing with an issue currently that I'm not sure the best way to > approach. I've got a very large (10G+) dataset that I'm trying to create > a histogram for. I don't seem to be able to use hist directly as I can > not create an R vector of size greater than 2.2G. I considered > condensing the data previous to loading it into R and just plotting > the frequencies as a barplot; unfortunately, barplot does not support > plotting the values according to a set of x-axis positions. > > What I have is something similar to: > > ys<- c(12,3,7,22,10) > xs<- c(1,30,35,39,60) > > and I'd like the bars (ys) to appear at the positions described by xs. I > can get this to work on smaller sets by filling zero values in for > missing ys for the entire range of xs but in my case this would again > create a vector too large for R. > > Is there another way to use the two vectors to create a simulated > frequency histogram? Is there a way to create a histogram object (as > returned by hist) from the condensed data so that plot would handle it > correctly? Follow your own last suggestion. Take a small subset of your data, and calculate x <- hist(data, plot=FALSE) str(x) will show you the structure of the object in x. Modify the entries to reflect your full dataset, and then plot(x) will show it. 
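A minimal sketch of the suggestion above, assuming the condensed data are counts (ys) at known positions (xs). The bin edges in `breaks` and the label in `xname` are hypothetical, chosen only for illustration; the list fields are the ones that str(hist(data, plot=FALSE)) displays for a "histogram" object, filled in by hand rather than computed by hist().

```r
# Condensed data: counts (ys) observed at positions (xs)
xs <- c(1, 30, 35, 39, 60)
ys <- c(12, 3, 7, 22, 10)

# Hypothetical bin edges (assumption: equal-width bins covering range(xs))
breaks <- seq(0, 60, by = 10)

# Sum the counts that fall in each bin; cut() uses right-closed
# intervals (a, b], matching the default of hist()
bin <- cut(xs, breaks = breaks, include.lowest = TRUE)
counts <- as.vector(tapply(ys, bin, sum))
counts[is.na(counts)] <- 0   # empty bins come back as NA from tapply()

# Assemble an object with the same fields as the result of hist()
h <- structure(list(
  breaks   = breaks,
  counts   = counts,
  density  = counts / (sum(counts) * diff(breaks)),
  mids     = head(breaks, -1) + diff(breaks) / 2,
  xname    = "condensed data",
  equidist = TRUE
), class = "histogram")

plot(h)   # dispatches to plot.histogram()
```

Only the bin counts ever need to be held in memory, so the full 10G+ vector never has to be loaded. If the bars should sit at the exact xs positions instead of in bins, plot(xs, ys, type = "h", lwd = 3) as suggested earlier in the thread is the simpler route.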
Duncan Murdoch From scott.raynaud at yahoo.com Fri Nov 4 19:51:11 2011 From: scott.raynaud at yahoo.com (Scott Raynaud) Date: Fri, 4 Nov 2011 11:51:11 -0700 (PDT) Subject: [R] Random multinomial variable Message-ID: <1320432671.43171.YahooMailNeo@web120618.mail.ne1.yahoo.com> I need some help interpreting the following code which is part of a multilevel model simulation with 2 levels. I've put in comments with my understanding of the code, but I'm not sure how [l2id] is functioning. It's defined in another part of the program as l2id<-rep(c(1:n2),each=n1) which looks like a number corresponding to the level 1 and 2 sample size (n1, n2). Can someone explain what [l2id] is telling x[,3] and x[,4]? macpred<-rmultinom(n2,1,c(0.15,0.30,0.55)) ## generate one multinomial variable of length=1 with probabilities of .15, .30 and .55 and do this n2 times x[,3]<-macpred[1,][l2id] ##assign the first multinomial value to column 3 x[,4]<-macpred[2,][l2id] ##assign the second multinomial value to column 4 From dwinsemius at comcast.net Fri Nov 4 19:56:48 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Fri, 4 Nov 2011 14:56:48 -0400 Subject: [R] Fit continuous distribution to truncated empirical values In-Reply-To: References: Message-ID: <4A1F1909-0DF1-4A02-AB1F-BE688107E8DB@comcast.net> On Nov 3, 2011, at 7:54 AM, Michele Mazzucco wrote: > Hi all, > > I am trying to fit a distribution to some data about survival times. > I am interested only in a specific interval, e.g., while the data > lies in the interval (0,...., 600), I want the best for the interval > (0,..., 24). > > I have tried both fitdistr (MASS package) and fitdist (from the > fitdistrplus package), but I could not get them working, e.g.
> > fitdistr(left, "weibull", upper=24) > Error in optim(x = c(529L, 528L, 527L, 526L, 525L, 524L, 523L, 522L, > 521L, : > L-BFGS-B needs finite values of 'fn' > In addition: Warning message: > In dweibull(x, shape, scale, log) : NaNs produced > > Am I doing something wrong? You didn't supply data to test, but shouldn't you supply a lower bound if you want to fit "weibull"? It is, after all, bounded at 0. > left <- c(529L, 528L, 527L, 526L, 525L, 524L, 523L, 522L, 521L, 50*runif(100)) > fitdistr(left, "weibull", upper=24) Error in optim(x = c(529, 528, 527, 526, 525, 524, 523, 522, 521, 18.3964251773432, : L-BFGS-B needs finite values of 'fn' In addition: Warning message: In dweibull(x, shape, scale, log) : NaNs produced > fitdistr(left, "weibull", upper=24, lower=0.5) shape scale 0.58195013 24.00000000 ( 0.04046087) ( 3.38621367) > > > Thanks, > Michele > > > p.s. I have seen similar posts, e.g., http://tolstoy.newcastle.edu.au/R/help/05/02/11558.html > , but I am not sure whether I can apply the same approach here. > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT From jesse.r.brown at lmco.com Fri Nov 4 20:09:18 2011 From: jesse.r.brown at lmco.com (Jesse Brown) Date: Fri, 04 Nov 2011 15:09:18 -0400 Subject: [R] barplot as histogram In-Reply-To: References: <4EB42911.2080707@atl.lmco.com> Message-ID: <4EB4385E.6030505@atl.lmco.com> I believe that plot(..., type='h') will do the trick. I had tried that earlier but forgot to play with the lwd parameter. Incidentally, I didn't know about getAnywhere(hist.default) - really handy. I was reading the code to find the details. Thanks! Jesse R. Michael Weylandt wrote: > Perhaps > > plot(xs, ys, type = "h", lwd = 3) > > will work? 
> > I'm not sure that a direct call to hist(, plot = F) will get around > the data problems. If you type getAnywhere(hist.default) you can see > the code that runs hist(): perhaps you can extract the working bits > you need. > > Michael > > On Fri, Nov 4, 2011 at 2:04 PM, Jesse Brown wrote: > >> Hello: >> >> I'm dealing with an issue currently that I'm not sure the best way to >> approach. I've got a very large (10G+) dataset that I'm trying to create a >> histogram for. I don't seem to be able to use hist directly as I can not >> create an R vector of size greater than 2.2G. I considered condensing the >> data previous to loading it into R and just plotting the frequencies as a >> barplot; unfortunately, barplot does not support plotting the values >> according to a set of x-axis positions. >> >> What I have is something similar to: >> >> ys <- c(12,3,7,22,10) >> xs <- c(1,30,35,39,60) >> >> and I'd like the bars (ys) to appear at the positions described by xs. I can >> get this to work on smaller sets by filling zero values in for missing ys >> for the entire range of xs but in my case this would again create a vector >> too large for R. >> >> Is there another way to use the two vectors to create a simulated frequency >> histogram? Is there a way to create a histogram object (as returned by hist) >> from the condensed data so that plot would handle it correctly? >> >> Thanks in advance, >> >> Jesse >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. 
>> >> > > From dwinsemius at comcast.net Fri Nov 4 20:32:46 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Fri, 4 Nov 2011 15:32:46 -0400 Subject: [R] 12th Root of a Square (Transition) Matrix In-Reply-To: <1320402848675-3989618.post@n4.nabble.com> References: <3080FA352049DF49B5FCD17896109AD576CBA2C3@BL2PRD0102MB008.prod.exchangelabs.com> <4C1B46C3.7010505@gmail.com> <1320402848675-3989618.post@n4.nabble.com> Message-ID: On Nov 4, 2011, at 6:34 AM, clangkamp wrote: > I have tried this method, but the result is not working, at least > not as I > expect: > I used the CreditMetrics package transition matrix > rc <- c("AAA", "AA", "A", "BBB", "BB", "B", "CCC", "D") > M <- matrix(c(90.81, 8.33, 0.68, 0.06, 0.08, 0.02, 0.01, 0.01, > 0.70, 90.65, 7.79, 0.64, 0.06, 0.13, 0.02, 0.01, > 0.09, 2.27, 91.05, 5.52, 0.74, 0.26, 0.01, 0.06, > 0.02, 0.33, 5.95, 85.93, 5.30, 1.17, 1.12, 0.18, > 0.03, 0.14, 0.67, 7.73, 80.53, 8.84, 1.00, 1.06, > 0.01, 0.11, 0.24, 0.43, 6.48, 83.46, 4.07, 5.20, > 0.21, 0, 0.22, 1.30, 2.38, 11.24, 64.86, 19.79, > 0, 0, 0, 0, 0, 0, 0, 100 > )/100, 8, 8, dimnames = list(rc, rc), byrow = TRUE) > > then followed through with the steps: > > nth_root <- X %*% L_star %*% X_inv Despite my (distant) physics training, I am no matrix mechanic, so I cannot comment on that method. 
I would instead search for an nth-root matrix function :::: http://search.r-project.org/cgi-bin/namazu.cgi?query=nth+root+of+matrix&max=100&result=normal&sort=score&idxname=functions&idxname=vignettes&idxname=views And finding one in package 'pracma', see if it succeeds: > nthroot(M, 12) AAA AA A BBB BB B CCC D AAA 0.9919988 0.8129311 0.6597444 0.5389055 0.5519810 0.4917592 0.4641589 0.4641589 AA 0.6613401 0.9918530 0.8084034 0.6564198 0.5389055 0.5747716 0.4917592 0.4641589 A 0.5574256 0.7294611 0.9922170 0.7855279 0.6644097 0.6089493 0.4641589 0.5389055 BBB 0.4917592 0.6211686 0.7904537 0.9874431 0.7828700 0.6902644 0.6877567 0.5905718 BB 0.5086590 0.5783322 0.6589304 0.8078827 0.9821168 0.8169667 0.6812921 0.6846083 B 0.4641589 0.5668255 0.6049010 0.6350224 0.7960944 0.9850460 0.7658309 0.7816283 CCC 0.5982072 0.0000000 0.6005307 0.6963517 0.7323434 0.8334839 0.9645648 0.8737164 D 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 1.0000000 > nthroot(M, 12)^12 AAA AA A BBB BB B CCC D AAA 0.9081 0.0833 0.0068 0.0006 0.0008 0.0002 0.0001 0.0001 AA 0.0070 0.9065 0.0779 0.0064 0.0006 0.0013 0.0002 0.0001 A 0.0009 0.0227 0.9105 0.0552 0.0074 0.0026 0.0001 0.0006 BBB 0.0002 0.0033 0.0595 0.8593 0.0530 0.0117 0.0112 0.0018 BB 0.0003 0.0014 0.0067 0.0773 0.8053 0.0884 0.0100 0.0106 B 0.0001 0.0011 0.0024 0.0043 0.0648 0.8346 0.0407 0.0520 CCC 0.0021 0.0000 0.0022 0.0130 0.0238 0.1124 0.6486 0.1979 D 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 1.0000 > all.equal(M , nthroot(M, 12)^12) [1] TRUE Success! -- David. > > But the check (going back 12 to the power again) doesn't yield the > original > matrix. Now some rounding errors can be expected, but I didn't > expect a > perfectly diagonal matrix, when the initial matrix isn't diagonal at > all. 
>> round(nth_root^12,4) > [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] > [1,] 0.9078 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0 > [2,] 0.0000 0.9053 0.0000 0.0000 0.0000 0.0000 0.0000 0 > [3,] 0.0000 0.0000 0.9079 0.0000 0.0000 0.0000 0.0000 0 > [4,] 0.0000 0.0000 0.0000 0.8553 0.0000 0.0000 0.0000 0 > [5,] 0.0000 0.0000 0.0000 0.0000 0.7998 0.0000 0.0000 0 > [6,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.8285 0.0000 0 > [7,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.6457 0 > [8,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 1 > > Any takers > -- David Winsemius, MD Heritage Laboratories West Hartford, CT From peter.langfelder at gmail.com Fri Nov 4 21:04:26 2011 From: peter.langfelder at gmail.com (Peter Langfelder) Date: Fri, 4 Nov 2011 13:04:26 -0700 Subject: [R] 12th Root of a Square (Transition) Matrix In-Reply-To: <1320402848675-3989618.post@n4.nabble.com> References: <3080FA352049DF49B5FCD17896109AD576CBA2C3@BL2PRD0102MB008.prod.exchangelabs.com> <4C1B46C3.7010505@gmail.com> <1320402848675-3989618.post@n4.nabble.com> Message-ID: Is it just me or are you confusing the 12th root of a matrix with taking the 12th root of each entry? Because your formula involving the eigenvectors and eigenvalues calculates the 12th root of the matrix, while round(nth_root^12,4) will print out a matrix whose components are powers of 12 of the nth_root, which is very different. To find the 12th power of a matrix, you can either search for an appropriate function, or do res = nth_root; for (p in 1:11) res = res %*% nth_root then compare res to your original matrix M. 
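A minimal sketch of the distinction drawn above, using a small hypothetical 2x2 transition matrix P (not the CreditMetrics matrix) so the result is easy to inspect. The names P, e, X, root and res are illustrative only. It assumes P is diagonalizable with real, positive eigenvalues, which holds for this example.

```r
# Small illustrative transition matrix (hypothetical values)
P <- matrix(c(0.9, 0.1,
              0.2, 0.8), nrow = 2, byrow = TRUE)

# 12th matrix root via the eigendecomposition P = X L X^{-1}:
# take the 12th root of each eigenvalue, then transform back
e <- eigen(P)
X <- e$vectors
root <- X %*% diag(e$values^(1/12)) %*% solve(X)

# Matrix power by repeated matrix multiplication (%*%)
res <- root
for (p in 1:11) res <- res %*% root

# The matrix power recovers P up to floating-point tolerance
print(all.equal(res, P))

# The elementwise power root^12 raises each entry to the 12th
# power separately, which is a different operation
print(isTRUE(all.equal(root^12, P)))
```

The `^` operator in R always works entry by entry, so checking a matrix root with root^12 compares against the wrong quantity; the check has to use %*% or a dedicated matrix-power function.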
Peter On Fri, Nov 4, 2011 at 3:34 AM, clangkamp wrote: > I have tried this method, but the result is not working, at least not as I > expect: > I used the CreditMetrics package transition matrix > rc <- c("AAA", "AA", "A", "BBB", "BB", "B", "CCC", "D") > M <- matrix(c(90.81, 8.33, 0.68, 0.06, 0.08, 0.02, 0.01, 0.01, > 0.70, 90.65, 7.79, 0.64, 0.06, 0.13, 0.02, 0.01, > 0.09, 2.27, 91.05, 5.52, 0.74, 0.26, 0.01, 0.06, > 0.02, 0.33, 5.95, 85.93, 5.30, 1.17, 1.12, 0.18, > 0.03, 0.14, 0.67, 7.73, 80.53, 8.84, 1.00, 1.06, > 0.01, 0.11, 0.24, 0.43, 6.48, 83.46, 4.07, 5.20, > 0.21, 0, 0.22, 1.30, 2.38, 11.24, 64.86, 19.79, > 0, 0, 0, 0, 0, 0, 0, 100 > )/100, 8, 8, dimnames = list(rc, rc), byrow = TRUE) > > then followed through with the steps: > > nth_root <- X %*% L_star %*% X_inv > > But the check (going back 12 to the power again) doesn't yield the original > matrix. Now some rounding errors can be expected, but I didn't expect a > perfectly diagonal matrix, when the initial matrix isn't diagonal at all. >> round(nth_root^12,4) > ? ? ? [,1] ? [,2] ? [,3] ? [,4] ? [,5] ? [,6] ? [,7] [,8] > [1,] 0.9078 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 ? ?0 > [2,] 0.0000 0.9053 0.0000 0.0000 0.0000 0.0000 0.0000 ? ?0 > [3,] 0.0000 0.0000 0.9079 0.0000 0.0000 0.0000 0.0000 ? ?0 > [4,] 0.0000 0.0000 0.0000 0.8553 0.0000 0.0000 0.0000 ? ?0 > [5,] 0.0000 0.0000 0.0000 0.0000 0.7998 0.0000 0.0000 ? ?0 > [6,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.8285 0.0000 ? ?0 > [7,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.6457 ? ?0 > [8,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 ? ?1 > > Any takers > > > ----- > Christian Langkamp > christian.langkamp-at-gmxpro.de > > -- > View this message in context: http://r.789695.n4.nabble.com/12th-Root-of-a-Square-Transition-Matrix-tp2259736p3989618.html > Sent from the R help mailing list archive at Nabble.com. 
> > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Sent from my Linux computer. Way better than iPad :) From Achim.Zeileis at uibk.ac.at Fri Nov 4 21:03:32 2011 From: Achim.Zeileis at uibk.ac.at (Achim Zeileis) Date: Fri, 4 Nov 2011 21:03:32 +0100 (CET) Subject: [R] nproc parameter in efpFunctional In-Reply-To: <1320402432656-3989598.post@n4.nabble.com> References: <1320228608589-3972419.post@n4.nabble.com> <1320306970603-3984605.post@n4.nabble.com> <1320402432656-3989598.post@n4.nabble.com> Message-ID: On Fri, 4 Nov 2011, bonda wrote: > The 2006 CSDA paper is really very informative, perhaps, I'm trying to > understand the things lying beyond. If we have e.g. k=3, then taking > nproc=3 for the functional maxBB we get a critical value (boundary) > > maxBB$computeCritval(0.05,nproc=3) > [1] 1.544421, > > and this for nproc=NULL (Bonferroni approximation) will be > > maxBB$computeCritval(0.05) > [1] 1.358099. No. In the latter case no Bonferroni approximation is applied. If you want to use it, you can do so via the rule of thumb R> maxBB$computeCritval(0.05/3, nproc = 1) [1] 1.547175 which essentially matches the critical value computed for nproc = 3. If you use the more precise value 1 - (1 - 0.05)^(1/3) instead of 0.05/3, you get a match (up to some small numerical differences). Setting nproc=NULL is only possible in efpFunctional(): efpFunctional() sets up the computeCritval() and computePval() functions via simulation methods (unless closed form solutions are supplied). For the simulation two strategies are available: Simulate nproc = 1, 2, 3, ... explicitly. Simulate only nproc = 1 and apply a Bonferroni correction. 
The last option is chosen if you set nproc=NULL -- it makes only sense if you aggregate via the maximum across the components. The resulting computeCritval() and computePval() function always need to have the correct nproc supplied (i.e., nproc=NULL makes no sense). > Aggregating 3 Brownian bridges first over components, we obtain time series > process. Now, we wonder if maximum value of the process (aggregation over > time) lies over boundary. Which boundary - 1.544421 or 1.358099 - should one > take? They look too different and, for instance, lead to "unfair computing" > of empirical size (as rejection rate of null hypothesis) or empirical power > (as acception rate of alternative). > > > > > -- > View this message in context: http://r.789695.n4.nabble.com/nproc-parameter-in-efpFunctional-tp3972419p3989598.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From dwinsemius at comcast.net Fri Nov 4 21:33:55 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Fri, 4 Nov 2011 16:33:55 -0400 Subject: [R] Lattice plots and missing x-axis labels on second page In-Reply-To: <00A769DD3A6D594C9902BC12EF1CAF8E064CE722@SOAANCMSG01.soa.alaska.gov> References: <00A769DD3A6D594C9902BC12EF1CAF8E064CE6EC@SOAANCMSG01.soa.alaska.gov> <00A769DD3A6D594C9902BC12EF1CAF8E064CE722@SOAANCMSG01.soa.alaska.gov> Message-ID: On Nov 2, 2011, at 8:23 PM, Evans, David G (DFG) wrote: > I should say I'm using Windows-7, R version 2.13.0 and lattice version > 0.19-33. I've pared down my code to this : > > pdat = read.table("RGRAPHSDGE.csv",header=T,sep=",",fill=T) > print(xyplot(pdat$NITRATE~pdat$DATEYR|pdat$WELL, I generally try to avoid building lattice plots this way. 
It is better to use data=pdat and shorten the formula specification. That way the functions can use the aggregate information in the dataframe. Try: xyplot( NITRATE~ DATEYR| WELL, data=pdat, > as.table=TRUE, > layout=c(3,4), You may want to try layout=c(3,4,2) See help(xyplot) subsection layout, where Sarkar says that pages are somethimes incorrectly calculated. > xlab="Year", > ylab="Nitrate mg / litre", > strip=FALSE > )) > > First 3 lines of pdat looks like this: > WELL DATEYR NITRATE > 1 ALASKA CHILDRENS SERVICES 1993.836 0.81 > 2 ALASKA CHILDRENS SERVICES 1994.850 0.91 > 3 ALASKA CHILDRENS SERVICES 1995.803 0.94 > .... > Thanks again. > > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org > ] > On Behalf Of Evans, David G (DFG) > Sent: Wednesday, November 02, 2011 3:24 PM > To: r-help at r-project.org > Subject: [R] Lattice plots and missing x-axis labels on second page > > Hello, > I'm trying to make a lattice plot (using xyplot()). I have included a > "layout=c(3,4)" statement, giving me 12 plots per page and an > "as.table=TRUE" statement, directing the way the plots are laid > out. I > have 18 plots altogether and so 6 of them end up on the second page. > Everything looks fine for the first page, but the x-axis labels (e.g. > 1993, 1994...) are all missing on the second page. The x-axis > variable > name ("Year") is there at the bottom, however. Any help is > appreciated. Thanks. > > David G. Evans > Biometrician > Division of Sport Fish > Alaska Dept . of Fish and Game > Anchorage, Ak 99518 > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
> > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT From dwinsemius at comcast.net Fri Nov 4 22:37:50 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Fri, 4 Nov 2011 17:37:50 -0400 Subject: [R] 12th Root of a Square (Transition) Matrix In-Reply-To: References: <3080FA352049DF49B5FCD17896109AD576CBA2C3@BL2PRD0102MB008.prod.exchangelabs.com> <4C1B46C3.7010505@gmail.com> <1320402848675-3989618.post@n4.nabble.com> Message-ID: <9C6FCB9A-8142-4248-85B9-2C37E8479759@comcast.net> On Nov 4, 2011, at 4:04 PM, Peter Langfelder wrote: > Is it just me or are you confusing the 12th root of a matrix with > taking the 12th root of each entry? I think I got confused as well. Thanks for clarifying. > Because your formula involving the > eigenvectors and eigenvalues calculates the 12th root of the matrix, > while > > round(nth_root^12,4) > > will print out a matrix whose components are powers of 12 of the > nth_root, which is very different. To find the 12th power of a matrix, > you can either search for an appropriate function, or do > > res = nth_root; > for (p in 1:11) > res = res %*% nth_root The 12th (matrix) root of M: e^( 1/n * log(M) ) > require(Matrix) > M1.12 <- expm( (1/12)*logm(M) ) > res = M1.12; nth_root=M1.12 > for (p in 1:11) + res = res %*% nth_root # Check accuracy > round(res, 4) [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [1,] 0.9081 0.0833 0.0068 0.0006 0.0008 0.0002 0.0001 0.0001 [2,] 0.0070 0.9065 0.0779 0.0064 0.0006 0.0013 0.0002 0.0001 [3,] 0.0009 0.0227 0.9105 0.0552 0.0074 0.0026 0.0001 0.0006 [4,] 0.0002 0.0033 0.0595 0.8593 0.0530 0.0117 0.0112 0.0018 [5,] 0.0003 0.0014 0.0067 0.0773 0.8053 0.0884 0.0100 0.0106 [6,] 0.0001 0.0011 0.0024 0.0043 0.0648 0.8346 0.0407 0.0520 [7,] 0.0021 0.0000 0.0022 0.0130 0.0238 0.1124 0.6486 0.1979 [8,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 1.0000 > M AAA AA A BBB BB B CCC D AAA 0.9081 0.0833 0.0068 0.0006 0.0008 0.0002 0.0001 0.0001 AA 0.0070 0.9065 0.0779 0.0064 0.0006 0.0013 0.0002 0.0001 A
0.0009 0.0227 0.9105 0.0552 0.0074 0.0026 0.0001 0.0006 BBB 0.0002 0.0033 0.0595 0.8593 0.0530 0.0117 0.0112 0.0018 BB 0.0003 0.0014 0.0067 0.0773 0.8053 0.0884 0.0100 0.0106 B 0.0001 0.0011 0.0024 0.0043 0.0648 0.8346 0.0407 0.0520 CCC 0.0021 0.0000 0.0022 0.0130 0.0238 0.1124 0.6486 0.1979 D 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 1.0000 Rather good agreement to 4 decimal places anyway. > > then compare res to your original matrix M. > > Peter > > On Fri, Nov 4, 2011 at 3:34 AM, clangkamp > wrote: >> I have tried this method, but the result is not working, at least >> not as I >> expect: >> I used the CreditMetrics package transition matrix >> rc <- c("AAA", "AA", "A", "BBB", "BB", "B", "CCC", "D") >> M <- matrix(c(90.81, 8.33, 0.68, 0.06, 0.08, 0.02, 0.01, 0.01, >> 0.70, 90.65, 7.79, 0.64, 0.06, 0.13, 0.02, 0.01, >> 0.09, 2.27, 91.05, 5.52, 0.74, 0.26, 0.01, 0.06, >> 0.02, 0.33, 5.95, 85.93, 5.30, 1.17, 1.12, 0.18, >> 0.03, 0.14, 0.67, 7.73, 80.53, 8.84, 1.00, 1.06, >> 0.01, 0.11, 0.24, 0.43, 6.48, 83.46, 4.07, 5.20, >> 0.21, 0, 0.22, 1.30, 2.38, 11.24, 64.86, 19.79, >> 0, 0, 0, 0, 0, 0, 0, 100 >> )/100, 8, 8, dimnames = list(rc, rc), byrow = TRUE) >> >> then followed through with the steps: >> >> nth_root <- X %*% L_star %*% X_inv >> >> But the check (going back 12 to the power again) doesn't yield the >> original >> matrix. Now some rounding errors can be expected, but I didn't >> expect a >> perfectly diagonal matrix, when the initial matrix isn't diagonal >> at all. 
>>> round(nth_root^12,4) >> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] >> [1,] 0.9078 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0 >> [2,] 0.0000 0.9053 0.0000 0.0000 0.0000 0.0000 0.0000 0 >> [3,] 0.0000 0.0000 0.9079 0.0000 0.0000 0.0000 0.0000 0 >> [4,] 0.0000 0.0000 0.0000 0.8553 0.0000 0.0000 0.0000 0 >> [5,] 0.0000 0.0000 0.0000 0.0000 0.7998 0.0000 0.0000 0 >> [6,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.8285 0.0000 0 >> [7,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.6457 0 >> [8,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 1 >> >> Any takers >> >> >> ----- >> Christian Langkamp >> christian.langkamp-at-gmxpro.de >> >> -- >> View this message in context: http://r.789695.n4.nabble.com/12th-Root-of-a-Square-Transition-Matrix-tp2259736p3989618.html >> Sent from the R help mailing list archive at Nabble.com. >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > > > -- > Sent from my Linux computer. Way better than iPad :) > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
David Winsemius, MD Heritage Laboratories West Hartford, CT From peter.langfelder at gmail.com Fri Nov 4 23:10:20 2011 From: peter.langfelder at gmail.com (Peter Langfelder) Date: Fri, 4 Nov 2011 15:10:20 -0700 Subject: [R] 12th Root of a Square (Transition) Matrix In-Reply-To: <9C6FCB9A-8142-4248-85B9-2C37E8479759@comcast.net> References: <3080FA352049DF49B5FCD17896109AD576CBA2C3@BL2PRD0102MB008.prod.exchangelabs.com> <4C1B46C3.7010505@gmail.com> <1320402848675-3989618.post@n4.nabble.com> <9C6FCB9A-8142-4248-85B9-2C37E8479759@comcast.net> Message-ID: On Fri, Nov 4, 2011 at 2:37 PM, David Winsemius wrote: > > The 12th (matrix) root of M: e^( 1/n * log(M) ) > >> require(Matrix) >> M1.12 <- expm( (1/12)*logm(M) ) I like this - haven't thought of the matrix algebra functions in Matrix. Thanks, Peter From Jose.Iparraguirre at ageuk.org.uk Fri Nov 4 16:18:13 2011 From: Jose.Iparraguirre at ageuk.org.uk (Jose Iparraguirre) Date: Fri, 4 Nov 2011 15:18:13 +0000 Subject: [R] How to delete only those rows in a dataframe in which all records are missing In-Reply-To: References: Message-ID: It does! Thanks, Jos? -----Original Message----- From: R. Michael Weylandt [mailto:michael.weylandt at gmail.com] Sent: 04 November 2011 15:18 To: Jose Iparraguirre Cc: r-help at r-project.org Subject: Re: [R] How to delete only those rows in a dataframe in which all records are missing Perhaps something like this will work. df[!(rowSums(is.na(df))==NCOL(df)),] Michael On Fri, Nov 4, 2011 at 9:27 AM, Jose Iparraguirre wrote: > Hi, > > Imagine I have the following data frame: > >> a <- c(1,NA,3) >> b <- c(2,NA,NA) >> c <- data.frame(cbind(a,b)) >> c > ? a ?b > 1 ?1 ?2 > 2 NA NA > 3 ?3 NA > > I want to delete the second row. If I use na.omit, that would also affect the third row. 
I tried to use a loop and an ifelse clause with is.na to get R identify that row in which all records are missing, as opposed to the first row in which no records are missing or the third one, in which only one record is missing. How can I get R identify the row in which all records are missing? Or, how can I get R delete/omit only this row? > Thanks in advance, > > Jos? > > > Jos? Iparraguirre > Chief Economist > Age UK > > T 020 303 31482 > E Jose.Iparraguirre at ageuk.org.uk > > Tavis House, 1- 6 Tavistock Square > London, WC1H 9NB > www.ageuk.org.uk | ageukblog.org.uk | @AgeUKPA > > > Age UK ?Improving later life > > www.ageuk.org.uk > > > > > > ------------------------------- > > Age UK is a registered charity and company limited by guarantee, (registered charity number 1128267, registered company number 6825798). Registered office: Tavis House, 1-6 Tavistock Square, London WC1H 9NA. > > For the purposes of promoting Age UK Insurance, Age UK is an Appointed Representative of Age UK Enterprises Limited, Age UK is an Introducer Appointed Representative of JLT Benefit Solutions Limited and Simplyhealth Access for the purposes of introducing potential annuity and health cash plans customers respectively. ?Age UK Enterprises Limited, JLT Benefit Solutions Limited and Simplyhealth Access are all authorised and regulated by the Financial Services Authority. > > > > > > ------------------------------ > > This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you receive a message in error, please advise the sender and delete immediately. > > > > Except where this email is sent in the usual course of our business, any opinions expressed in this email are those of the author and do not necessarily reflect the opinions of Age UK or its subsidiaries and associated companies. 
Age UK monitors all e-mail transmissions passing through its network and may block or modify mails which are deemed to be unsuitable. > > > > Age Concern England (charity number 261794) and Help the Aged (charity number 272786) and their trading and other associated companies merged on 1st April 2009. Together they have formed the Age UK Group, dedicated to improving the lives of people in later life. The three national Age Concerns in Scotland, Northern Ireland and Wales have also merged with Help the Aged in these nations to form three registered charities: Age Scotland, Age NI, Age Cymru.
> >
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

From katherinejstewart at gmail.com Fri Nov 4 17:55:04 2011
From: katherinejstewart at gmail.com (K Stewart)
Date: Fri, 4 Nov 2011 09:55:04 -0700 (PDT)
Subject: [R] Determining r2 values for a SEM
Message-ID: <1320425704598-3990855.post@n4.nabble.com>

Hello, I have been using the SEM package and developed 4 models that all have an adequate fit. I require r2 values for the variables in my SEM and cannot find a way to get r2 values for a SEM. I have attempted using the smc (squared multiple correlation) function, but this only takes the initial covariance matrix and not the fitted SEM as arguements, so the r2 values cannot be correct. Is there a way to determine r2 values for an SEM in the SEM package or another way to get these values in R? Thank you for your assistance, Katherine Stewart

--
View this message in context: http://r.789695.n4.nabble.com/Determining-r2-values-for-a-SEM-tp3990855p3990855.html
Sent from the R help mailing list archive at Nabble.com.
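As background to the question above, here is a minimal base-R sketch of what an R^2 value for a single dependent variable measures: one minus the ratio of error variation to total variation. It uses simulated data and an ordinary regression fitted with lm(); it is not the sem package's API, and the variable names (x1, x2, y) are made up for illustration only.

```r
# Simulated data: y depends on x1 and x2 plus noise (all names are illustrative)
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 0.5 * x1 - 0.3 * x2 + rnorm(n, sd = 0.8)

fit <- lm(y ~ x1 + x2)

# R^2 = 1 - (error sum of squares) / (total sum of squares)
err_ss <- sum(residuals(fit)^2)
tot_ss <- sum((y - mean(y))^2)
r2_manual <- 1 - err_ss / tot_ss

# Agrees with the value reported by summary()
print(all.equal(r2_manual, summary(fit)$r.squared))
```

The same reading (explained share of variance) is what an R^2 for an endogenous variable in a fitted structural equation model is meant to convey, which is why it has to come from the fitted model rather than from the input covariance matrix alone.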
From francois.pepin at sequentainc.com Fri Nov 4 18:50:41 2011
From: francois.pepin at sequentainc.com (Francois Pepin)
Date: Fri, 4 Nov 2011 10:50:41 -0700
Subject: [R] Reading parameters from dataframe and loading as objects
In-Reply-To: <1320387887417-3989150.post@n4.nabble.com>
References: <1320387887417-3989150.post@n4.nabble.com>
Message-ID:

Hi,

assign is your friend here:
apply(data,1,function(x)assign(x[1],x[2],envir = .GlobalEnv))

As a note, you probably don't want to use data as a variable because it overwrites the data function, leading to unwanted side-effects if you ever use it.

Cheers,

François Pepin
Scientist

Sequenta, Inc.
400 E. Jamie Court, Suite 301
South San Francisco, CA 94080

650 243 3929 p

francois.pepin at sequentainc.com
www.sequentainc.com

The contents of this e-mail message and any attachments are intended solely for the addressee(s) named in this message. This communication is intended to be and to remain confidential and may be subject to applicable attorney/client and/or work product privileges. If you are not the intended recipient of this message, or if this message has been addressed to you in error, please immediately alert the sender by reply e-mail and then delete this message and its attachments. Do not deliver, distribute or copy this message and/or any attachments and if you are not the intended recipient, do not disclose the contents or take any action in reliance upon the information contained in this communication or any attachments.

On Nov 3, 2011, at 23:24 , Aher wrote:

> Hi List,
>
> I want to read several parameters from data frame and load them as object
> into R session, Is there any package or function in R for this??
> > Here is example > > param <-c("clust_num", "minsamp_size", "maxsamp_size", "min_pct", "max_pct") > value <-c(15, 20000, 200000, 0.001, .999) > data <- data.frame ( cbind(param , value)) > data > param value > 1 clust_num 15 > 2 minsamp_size 20000 > 3 maxsamp_size 2e+05 > 4 min_pct 0.001 > 5 max_pct 0.999 > > My data contains many such parameters, I need to read each parameter and its > value from the data and load it as objects in R session as below: > > clust_num <- 15 > minsamp_size <-20000 > maxsamp_size <-2e+05 > min_pct <-0.001 > max_pct <-0.999 > > The way right now I am doing it is as creating as many variables as > parameters in the data frame and one observation for value of each > parameter. > example: > clust_num minsamp_size maxsamp_size min_pct max_pct > 15 20000 200000 0.001 0.999 > > data$ clust_num , data$minsamp_size, ..... > > Is there any better way for doing this? > > > -- > View this message in context: http://r.789695.n4.nabble.com/Reading-parameters-from-dataframe-and-loading-as-objects-tp3989150p3989150.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From jmf at ib.usp.br Fri Nov 4 19:06:18 2011 From: jmf at ib.usp.br (JulianaMF) Date: Fri, 4 Nov 2011 11:06:18 -0700 (PDT) Subject: [R] error message In-Reply-To: References: <1295368328987-3223412.post@n4.nabble.com> Message-ID: <1320429978716-3991100.post@n4.nabble.com> Hello, I am a PhD candidate at University of Sao Paulo, finishing the thesis and know nothing about R language... 
I am trying to analyze a structure data set (STRs) with 82 samples and 10 loci (missing data = -9) and I keep getting the same Error in text[(last line - n + 1):last line] : only 0's may be mixed with negative subscripts message. I read this thread and tried the traceback () function, but all I got after I typed it was 1: read.structure(file = "SimStru3.str")... When I import the data using the read.table function, it works well, but then I am not being able to transform the data frame in a genind object... I know I may be asking very stupid things, but I really could use the help... Thank you so much! Juliana -- View this message in context: http://r.789695.n4.nabble.com/error-message-tp3223412p3991100.html Sent from the R help mailing list archive at Nabble.com. From aziem at us.ci.org Fri Nov 4 19:31:08 2011 From: aziem at us.ci.org (Andrew Ziem) Date: Fri, 4 Nov 2011 18:31:08 +0000 Subject: [R] Decision tree model using rpart ( classification References: <1320388593513-3989162.post@n4.nabble.com> <1320395527772-3989320.post@n4.nabble.com> Message-ID: aajit75 yahoo.co.in> writes: > fit <- rpart(decile ~., method="class", > control=rpart.control(minsplit=min_obs_split, cp=c_c_factor), > data=dtm_ip) > > In A and B target variable 'segment' is from the clustering data using same > set of input variables , while in C target variable 'decile' is derived from > behavioural variables and input variables are from profile data. Number of > rows in the input table in all three cases are same. What is the value of modeling the deciles as the target? They are a lower resolution version of information you already have, and without this model that doesn't finish fitting you should already be able to assign a decile to every customer. 
Andrew From tmdalbey at gmail.com Fri Nov 4 19:55:20 2011 From: tmdalbey at gmail.com (TimothyDalbey) Date: Fri, 4 Nov 2011 11:55:20 -0700 (PDT) Subject: [R] HoltWinters in R 2.14.0 Message-ID: <1320432920314-3991247.post@n4.nabble.com> Hey All, First time on these forums. Thanks in advance. Soooo... I have a process that was functioning well before the 2.14 update. Now the HoltWinters function is throwing an error whereby I get the following: Error in HoltWinters(sales.ts) : optimization failure I've been looking around to determine why this happens (see if I can test the data beforehand) but I haven't come across anything. Any help appreciated! -- View this message in context: http://r.789695.n4.nabble.com/HoltWinters-in-R-2-14-0-tp3991247p3991247.html Sent from the R help mailing list archive at Nabble.com. From mark_difford at yahoo.co.uk Fri Nov 4 20:06:12 2011 From: mark_difford at yahoo.co.uk (Mark Difford) Date: Fri, 4 Nov 2011 12:06:12 -0700 (PDT) Subject: [R] Determining r2 values for a SEM In-Reply-To: <1320425704598-3990855.post@n4.nabble.com> References: <1320425704598-3990855.post@n4.nabble.com> Message-ID: <1320433572560-3991279.post@n4.nabble.com> On Nov 04, 2011 at 6:55pm Katherine Stewart wrote: > Is there a way to determine r2 values for an SEM in the SEM package or > another way to get > these values in R? Katherine, rsquare.sem() in package sem.additions will do it for you. Regards, Mark. ----- Mark Difford (Ph.D.) Research Associate Botany Department Nelson Mandela Metropolitan University Port Elizabeth, South Africa -- View this message in context: http://r.789695.n4.nabble.com/Determining-r2-values-for-a-SEM-tp3990855p3991279.html Sent from the R help mailing list archive at Nabble.com. From christopherleesimons at gmail.com Fri Nov 4 20:21:09 2011 From: christopherleesimons at gmail.com (Christopher Simons) Date: Fri, 4 Nov 2011 15:21:09 -0400 Subject: [R] Unused Arguments Error Even Though I'm Using Them? 
Message-ID:

Greetings,

I am running into an error I can't seem to get past; something tells me I am making an obvious mistake as I am new to R, but everything looks fine to me. Basically, I'm getting the message "unused arguments" even though my arguments are all being used. I have posted my code at the following URL:

http://textsnip.com/8af5b2

Thanks for any help,

CLS

From suzzyhenderson at gmail.com Fri Nov 4 21:07:10 2011
From: suzzyhenderson at gmail.com (suzzy)
Date: Fri, 4 Nov 2011 13:07:10 -0700 (PDT)
Subject: [R] ANCOVA with many levels of one factor
Message-ID: <1320437230132-3991474.post@n4.nabble.com>

I am trying to do an ANCOVA with ten sites that I want to compare condition at, with Length as a covariate. Most examples I have found only deal with two levels and I am unsure if the same code applies for more than two levels. Here is what I have, and I just wanted to double check that I am on the right track....

ancova<-lm(Condition~site+Length+site:Length)
summary(ancova)
anova(ancova)
ancova1<-update(ancova,~.-site:Length)
anova(ancova,ancova1)
summary(ancova1)

Any help/suggestions welcome! Thanks!

--
View this message in context: http://r.789695.n4.nabble.com/ANCOVA-with-many-levels-of-one-factor-tp3991474p3991474.html
Sent from the R help mailing list archive at Nabble.com.

From a.khaleghei at gmail.com Fri Nov 4 21:42:44 2011
From: a.khaleghei at gmail.com (Akram Khaleghei Ghosheh balagh)
Date: Fri, 4 Nov 2011 16:42:44 -0400
Subject: [R] how to interpret vglm output?
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From wang0501 at umn.edu Fri Nov 4 21:59:57 2011
From: wang0501 at umn.edu (Jeremy Wang)
Date: Fri, 4 Nov 2011 15:59:57 -0500
Subject: [R] Creating a sequence from two samples with several constraints (frequency and repeats)
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From martin.remo.studer at gmail.com Fri Nov 4 22:04:35 2011
From: martin.remo.studer at gmail.com (Martin Studer)
Date: Fri, 4 Nov 2011 14:04:35 -0700 (PDT)
Subject: [R] XLConnect Error
In-Reply-To: <1320335995789-3986491.post@n4.nabble.com>
References: <1309200534372-3628528.post@n4.nabble.com> <4E08ECF2.6020307@statistik.tu-dortmund.de> <1320335995789-3986491.post@n4.nabble.com>
Message-ID: <1320440675826-3991681.post@n4.nabble.com>

Hi Daniel,

you can get Java from http://www.java.com/en/download/manual.jsp?locale=en
Simply download and follow the instructions.

Hope that helps.
Martin

--
View this message in context: http://r.789695.n4.nabble.com/XLConnect-Error-tp3628528p3991681.html
Sent from the R help mailing list archive at Nabble.com.

From AZiem at us.ci.org Fri Nov 4 22:56:34 2011
From: AZiem at us.ci.org (Andrew Ziem )
Date: Fri, 4 Nov 2011 15:56:34 -0600
Subject: [R] bug calculating ROC with caret and earth?
Message-ID: <50486F3885905241A1890719DAFDD3447FE90A4E24@gmc0050.ci.org>

Does caret have a bug calculating ROC with earth? When using caret and earth on any of my data sets, caret's ROC never varies. This could mean earth is finding the same model (for example, because of using an nprune parameter that is too high). However, if that were true, sensitivity and specificity would also not vary, but they do vary. Also, I verified nprune is not too high.

I am attaching sample output from R 2.14.0 on Windows 7 64-bit with earth 3.2 and caret 5.07.

I don't have this problem with caret and ctree.

Andrew

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: caret-earth-ROC-invariant.log.txt URL: From norman at khine.net Fri Nov 4 23:06:33 2011 From: norman at khine.net (Norman Khine) Date: Fri, 4 Nov 2011 23:06:33 +0100 Subject: [R] representing wind date using windrose Message-ID: hello, i am new to R and want to use it for a small project to draw a wind data from a microclimate datasource, can someone give me an example of how i can represent this in a neat way? for example, i have: speed, direction 0.3,NNE 0.45,NNE 0.32,NE 0.28,N 0.30,NE how do i put this data to get a windrose graph? many thanks norman -- %>>> "".join( [ {'*':'@','^':'.'}.get(c,None) or chr(97+(ord(c)-83)%26) for c in ",adym,*)&uzq^zqf" ] ) From jfrabetti at sdsc.edu Fri Nov 4 23:18:28 2011 From: jfrabetti at sdsc.edu (jo) Date: Fri, 4 Nov 2011 15:18:28 -0700 (PDT) Subject: [R] How to use 'prcomp' with CLUSPLOT? In-Reply-To: References: <47CA6EA672AF4A4DA62A7DCCFEDC9ABB0BB34CF0@XMAIL-MBX-AH1.AD.UCSD.EDU> <47CA6EA672AF4A4DA62A7DCCFEDC9ABB0BB34D18@XMAIL-MBX-AH1.AD.UCSD.EDU> Message-ID: <1320445108437-3991868.post@n4.nabble.com> Hello Michael, Thank you for replying to my post! That was an interesting solution - good to know, but I am now getting a different error: /Error in if (length(clus) != n) stop("The clustering vector is of incorrect length") : argument is of length zero/ which brought me here: https://svn.r-project.org/R-packages/trunk/cluster/R/plotpart.q I am trying to figure that out now... FYI, as a test set, one could just delete columns until they are <= to the number of rows... clusplot has some nice extras, but I am also looking at just plotting w/pca... Thank you again, Jo -- View this message in context: http://r.789695.n4.nabble.com/How-to-use-prcomp-with-CLUSPLOT-tp3989022p3991868.html Sent from the R help mailing list archive at Nabble.com. 
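For readers who hit the same "argument is of length zero" message, here is a small base-R sketch of how an `if` test can fail that way. It assumes, without having checked the clusplot source, that the row count is taken with nrow(); under that assumption, passing the whole prcomp object (a list) rather than its score matrix gives a NULL row count. The clustering vector `clus` below is a made-up stand-in, and USArrests is just a built-in example data set.

```r
# A prcomp fit is a list; its scores live in the component 'x'
pc   <- prcomp(USArrests, scale. = TRUE)
clus <- rep(1:2, length.out = nrow(USArrests))   # illustrative clustering vector

n_bad <- nrow(pc)            # NULL: a list has no dim attribute
cmp   <- length(clus) != n_bad
print(length(cmp))           # 0: comparing with NULL gives logical(0)

# if() on a zero-length condition raises the error seen in the post
msg <- tryCatch(
  if (cmp) stop("The clustering vector is of incorrect length"),
  error = function(e) conditionMessage(e)
)
print(msg)                   # "argument is of length zero"

# The score matrix has a proper row count, so the same test is well defined
scores <- pc$x[, 1:2]
print(length(clus) == nrow(scores))
```

If this is indeed the cause, passing the score matrix (for example the first two principal components) instead of the prcomp object would be the thing to try; that remains a guess about this particular case.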
From dwinsemius at comcast.net Fri Nov 4 23:50:37 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Fri, 4 Nov 2011 18:50:37 -0400
Subject: [R] representing wind date using windrose
In-Reply-To:
References:
Message-ID:

There are several windrose functions in various packages. Try this:

RSiteSearch("windrose")

--
David

On Nov 4, 2011, at 6:06 PM, Norman Khine wrote:

> hello,
> i am new to R and want to use it for a small project to draw a wind
> data from a microclimate datasource, can someone give me an example of
> how i can represent this in a neat way?
>
> for example, i have:
> speed, direction
> 0.3,NNE
> 0.45,NNE
> 0.32,NE
> 0.28,N
> 0.30,NE
>
> how do i put this data to get a windrose graph?
>
> many thanks
>
> norman
>
> --
>
> %>>> "".join( [ {'*':'@','^':'.'}.get(c,None) or
> chr(97+(ord(c)-83)%26) for c in ",adym,*)&uzq^zqf" ] )
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

From dwinsemius at comcast.net Fri Nov 4 23:57:13 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Fri, 4 Nov 2011 18:57:13 -0400
Subject: [R] Reading parameters from dataframe and loading as objects
In-Reply-To:
References: <1320387887417-3989150.post@n4.nabble.com>
Message-ID: <782D7C7B-E7FE-41B7-83B9-9A0C711CFA0E@comcast.net>

On Nov 4, 2011, at 1:50 PM, "Francois Pepin" wrote:

> Hi,
>
> assign is your friend here:
> apply(data,1,function(x)assign(x[1],x[2],envir = .GlobalEnv))
>
> As a note, you probably don't want to use data as a variable because it overwrites the data function, leading to unwanted side-effects if you ever use it.

While it is true that using "data" as an object name is a bad choice, the specific reason offered is incorrect.
Functions are kept in a different list from other named objects and creating a "data" data.frame will NOT overwrite the data function. -- David. > > Cheers, > > Fran?ois Pepin > Scientist > > Sequenta, Inc. > 400 E. Jamie Court, Suite 301 > South San Francisco, CA 94080 > > 650 243 3929 p > > francois.pepin at sequentainc.com > www.sequentainc.com > > The contents of this e-mail message and any attachments are intended solely for the addressee(s) named in this message. This communication is intended to be and to remain confidential and may be subject to applicable attorney/client and/or work product privileges. If you are not the intended recipient of this message, or if this message has been addressed to you in error, please immediately alert the sender by reply e-mail and then delete this message and its attachments. Do not deliver, distribute or copy this message and/or any attachments and if you are not the intended recipient, do not disclose the contents or take any action in reliance upon the information contained in this communication or any attachments. > > On Nov 3, 2011, at 23:24 , Aher wrote: > >> Hi List, >> >> I want to read several parameters from data frame and load them as object >> into R session, Is there any package or function in R for this?? 
>> >> Here is example >> >> param <-c("clust_num", "minsamp_size", "maxsamp_size", "min_pct", "max_pct") >> value <-c(15, 20000, 200000, 0.001, .999) >> data <- data.frame ( cbind(param , value)) >> data >> param value >> 1 clust_num 15 >> 2 minsamp_size 20000 >> 3 maxsamp_size 2e+05 >> 4 min_pct 0.001 >> 5 max_pct 0.999 >> >> My data contains many such parameters, I need to read each parameter and its >> value from the data and load it as objects in R session as below: >> >> clust_num <- 15 >> minsamp_size <-20000 >> maxsamp_size <-2e+05 >> min_pct <-0.001 >> max_pct <-0.999 >> >> The way right now I am doing it is as creating as many variables as >> parameters in the data frame and one observation for value of each >> parameter. >> example: >> clust_num minsamp_size maxsamp_size min_pct max_pct >> 15 20000 200000 0.001 0.999 >> >> data$ clust_num , data$minsamp_size, ..... >> >> Is there any better way for doing this? >> >> >> -- >> View this message in context: http://r.789695.n4.nabble.com/Reading-parameters-from-dataframe-and-loading-as-objects-tp3989150p3989150.html >> Sent from the R help mailing list archive at Nabble.com. >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From michael.weylandt at gmail.com Sat Nov 5 00:15:52 2011 From: michael.weylandt at gmail.com (R. 
Michael Weylandt)
Date: Fri, 4 Nov 2011 19:15:52 -0400
Subject: [R] representing wind date using windrose
In-Reply-To:
References:
Message-ID:

Try this: http://rss.acs.unt.edu/Rdoc/library/climatol/html/rosavent.html

Michael

On Fri, Nov 4, 2011 at 6:06 PM, Norman Khine wrote:
> hello,
> i am new to R and want to use it for a small project to draw a wind
> data from a microclimate datasource, can someone give me an example of
> how i can represent this in a neat way?
>
> for example, i have:
> speed, direction
> 0.3,NNE
> 0.45,NNE
> 0.32,NE
> 0.28,N
> 0.30,NE
>
> how do i put this data to get a windrose graph?
>
> many thanks
>
> norman
>
> --
>
> %>>> "".join( [ {'*':'@','^':'.'}.get(c,None) or
> chr(97+(ord(c)-83)%26) for c in ",adym,*)&uzq^zqf" ] )
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

From michael.weylandt at gmail.com Sat Nov 5 00:21:55 2011
From: michael.weylandt at gmail.com (R. Michael Weylandt)
Date: Fri, 4 Nov 2011 19:21:55 -0400
Subject: [R] HoltWinters in R 2.14.0
In-Reply-To: <1320432920314-3991247.post@n4.nabble.com>
References: <1320432920314-3991247.post@n4.nabble.com>
Message-ID:

I believe there were some changes to Holt-Winters, specifically in re optimization, that probably led to your problem, but you'll have to provide more details. See the NEWS file for citations about the change.

If you put example code/data others may be able to help you -- I haven't updated yet so I can't be of much help.

Michael

On Fri, Nov 4, 2011 at 2:55 PM, TimothyDalbey wrote:
> Hey All,
>
> First time on these forums. Thanks in advance.
>
> Soooo... I have a process that was functioning well before the 2.14 update.
> Now the HoltWinters function is throwing an error whereby I get the > following: > > Error in HoltWinters(sales.ts) : optimization failure > > I've been looking around to determine why this happens (see if I can test > the data beforehand) but I haven't come across anything. > > Any help appreciated! > > > -- > View this message in context: http://r.789695.n4.nabble.com/HoltWinters-in-R-2-14-0-tp3991247p3991247.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From jfox at mcmaster.ca Sat Nov 5 00:23:31 2011 From: jfox at mcmaster.ca (John Fox) Date: Fri, 4 Nov 2011 19:23:31 -0400 Subject: [R] Determining r2 values for a SEM In-Reply-To: <1320433572560-3991279.post@n4.nabble.com> References: <1320425704598-3990855.post@n4.nabble.com> <1320433572560-3991279.post@n4.nabble.com> Message-ID: <009f01cc9b48$c4defb40$4e9cf1c0$@mcmaster.ca> Dear Mark and Katherine, As well, version 2.0-0 of the sem package on R-Forge, and soon to be on CRAN, includes R^2 values as part of the summary() output (along with many other enhancements). Best, John > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r- > project.org] On Behalf Of Mark Difford > Sent: November-04-11 3:06 PM > To: r-help at r-project.org > Subject: Re: [R] Determining r2 values for a SEM > > On Nov 04, 2011 at 6:55pm Katherine Stewart wrote: > > > Is there a way to determine r2 values for an SEM in the SEM package > > or another way to get these values in R? > > Katherine, > > rsquare.sem() in package sem.additions will do it for you. > > Regards, Mark. > > ----- > Mark Difford (Ph.D.) 
> Research Associate
> Botany Department
> Nelson Mandela Metropolitan University
> Port Elizabeth, South Africa
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Determining-r2-values-for-a-SEM-tp3990855p3991279.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

From michael.weylandt at gmail.com Sat Nov 5 00:26:41 2011
From: michael.weylandt at gmail.com (R. Michael Weylandt)
Date: Fri, 4 Nov 2011 19:26:41 -0400
Subject: [R] Unused Arguments Error Even Though I'm Using Them?
In-Reply-To:
References:
Message-ID:

I can't see anything in your code at first glance that would lead to that error: could you perhaps provide some self-contained code that reproduces that error?

Michael

On Fri, Nov 4, 2011 at 3:21 PM, Christopher Simons wrote:
> Greetings,
>
> I am running into an error I can't seem to get past; something tells
> me I am making an obvious mistake as I am new to R, but everything
> looks fine to me. Basically, I'm getting the message "unused
> arguments" even though my arguments are all being used. I have posted
> my code at the following URL:
>
> http://textsnip.com/8af5b2
>
> Thanks for any help,
>
>
> CLS
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
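In the spirit of the request for self-contained code, here is a minimal sketch of one way R ends up reporting an "unused argument" error: the call passes an argument name that the function's definition does not accept, often because of a misspelling. The function `scale_by` and its arguments are hypothetical and are not taken from the code posted at the URL above, so this may or may not be what happened there.

```r
# Hypothetical function with two formal arguments
scale_by <- function(x, factor = 2) {
  x * factor
}

# Correct call: the argument names match the formals
ok <- scale_by(1:3, factor = 10)
print(ok)

# Misspelled argument name: R rejects it as an unused argument
msg <- tryCatch(
  scale_by(1:3, fcator = 10),
  error = function(e) conditionMessage(e)
)
print(msg)

# formals() shows the argument names the function actually accepts
print(names(formals(scale_by)))
```

Comparing the names in the failing call against names(formals(f)) for the function being called is a quick first check; the "unused" in the message refers to the function's signature, not to whether the caller's variables are used elsewhere.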
> From clint at ecy.wa.gov Sat Nov 5 00:40:13 2011 From: clint at ecy.wa.gov (Clint Bowman) Date: Fri, 4 Nov 2011 16:40:13 -0700 (PDT) Subject: [R] representing wind date using windrose In-Reply-To: References: Message-ID: I'm also very impressed with openair , also Clint -- Clint Bowman INTERNET: clint at ecy.wa.gov Air Quality Modeler INTERNET: clint at math.utah.edu Department of Ecology VOICE: (360) 407-6815 PO Box 47600 FAX: (360) 407-7534 Olympia, WA 98504-7600 USPS: PO Box 47600, Olympia, WA 98504-7600 Parcels: 300 Desmond Drive, Lacey, WA 98503-1274 On Fri, 4 Nov 2011, R. Michael Weylandt wrote: > Try this: http://rss.acs.unt.edu/Rdoc/library/climatol/html/rosavent.html > > Michael > > On Fri, Nov 4, 2011 at 6:06 PM, Norman Khine wrote: >> hello, >> i am new to R and want to use it for a small project to draw a wind >> data from a microclimate datasource, can someone give me an example of >> how i can represent this in a neat way? >> >> for example, i have: >> speed, direction >> 0.3,NNE >> 0.45,NNE >> 0.32,NE >> 0.28,N >> 0.30,NE >> >> how do i put this data to get a windrose graph? >> >> many thanks >> >> norman >> >> -- >> >> %>>> "".join( [ {'*':'@','^':'.'}.get(c,None) or >> chr(97+(ord(c)-83)%26) for c in ",adym,*)&uzq^zqf" ] ) >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From michael.weylandt at gmail.com Sat Nov 5 00:56:04 2011 From: michael.weylandt at gmail.com (R. 
Michael Weylandt) Date: Fri, 4 Nov 2011 19:56:04 -0400 Subject: [R] Creating a sequence from two samples with several constraints (frequency and repeats) In-Reply-To: References: Message-ID: I believe the permute package is set up to generate restricted permutations. Michael On Fri, Nov 4, 2011 at 4:59 PM, Jeremy Wang wrote: > I'm attempting to create a sequence for an experiment and am hoping I can > use R to create it. It has several constraints: > > (1) It is made up of two sequences (red and green) that have 4 different > repeating triplets (e.g. T1=ABC T2=DEF T3=GHI JKL) > (2) Each sequence has the following constraints: (a) there cannot be > repeating triplets (e.g. T1 T1), (b) there cannot be repeating triplet > pairs (e.g. T1 T2 T1 T2) > (3) Triplets occur with the following frequency: T1=20, T2=23, T3=26, T4=36 > (same for red and green sequences) > (4) Red and green sequences are then interleaved, such that you never get > more than 6 in a row of one color. > > (For those that are interested, I'm trying to replicate Turke-Browne, > Junge, & Scholl, 2005, *JEP*, but with frequency manipulations to the > triplets) > > Thanks in advance for any help. > > ? ? ? ?[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From michael.weylandt at gmail.com Sat Nov 5 00:58:12 2011 From: michael.weylandt at gmail.com (R. Michael Weylandt) Date: Fri, 4 Nov 2011 19:58:12 -0400 Subject: [R] error message In-Reply-To: <1320429978716-3991100.post@n4.nabble.com> References: <1295368328987-3223412.post@n4.nabble.com> <1320429978716-3991100.post@n4.nabble.com> Message-ID: Could you provide an example of your code? 
The error is coming up because lastLine - n + 1 < 0 but obviously I can't tell you why it's happening in your code without seeing it. Michael On Fri, Nov 4, 2011 at 2:06 PM, JulianaMF wrote: > Hello, > I am a PhD candidate at University of Sao Paulo, finishing the thesis and > know nothing about R language... I am trying to analyze a structure data set > (STRs) with 82 samples and 10 loci (missing data = -9) and I keep getting > the same Error in text[(last line - n + 1):last line] : only 0's may be > mixed with negative subscripts message. I read this thread and tried the > traceback () function, but all I got after I typed it was 1: > read.structure(file = "SimStru3.str")... > When I import the data using the read.table function, it works well, but > then I am not being able to transform the data frame in a genind object... I > know I may be asking very stupid things, but I really could use the help... > Thank you so much! > Juliana > > -- > View this message in context: http://r.789695.n4.nabble.com/error-message-tp3223412p3991100.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From zndeana at ku.edu Sat Nov 5 01:38:20 2011 From: zndeana at ku.edu (Md Desa, Zairul Nor Deana Binti) Date: Sat, 5 Nov 2011 00:38:20 +0000 Subject: [R] set seed for random draws Message-ID: Hello, all! I need help on these two problems: 1) If I want to randomly draw numbers from standard normal (or other distributions) in loops e.g.: ty=0; ks=0 for (i in 1:5) { set.seed(14537+i) k<-rnorm(1) ks[i]<-.3*k+.9 if (ty==0) { while ((ks<.2)||(ks>3)) { #set.seed(13237+i*100) k<-rnorm(1) ks[i]-.3*k+.9 } } } .... .... .... 
} Question: Here I draw initial a, then if the drawn initial a satisfied 2 conditions I redraw a. I set.seed(13237) in the first draw of a, should I set.seed() in the redraw part? 2) I also have more loops after this i loop that also draw from normal(0,1). I want to randomly draws from normal(0,1) for loop j (inside loop j I draw another random numbers from N(0,1)) My question: Should I or shouldn't I set seed again and again for each loop? Why or why not. I guess this problem concerned about setting seed as I want to have different number for each i. Thanks! Deana From michael.weylandt at gmail.com Sat Nov 5 01:59:52 2011 From: michael.weylandt at gmail.com (R. Michael Weylandt) Date: Fri, 4 Nov 2011 20:59:52 -0400 Subject: [R] set seed for random draws In-Reply-To: References: Message-ID: This might be more fundamental, but why do you feel the need to reset the seed each loop? There's nothing that suggests you need to... Michael On Fri, Nov 4, 2011 at 8:38 PM, Md Desa, Zairul Nor Deana Binti wrote: > Hello, all! > I need help on these two problems: > > 1) If I want to randomly draw numbers from standard normal (or other distributions) in loops e.g.: > ?ty=0; ks=0 > for (i in 1:5) { > ? ? ? ?set.seed(14537+i) > ? ? ? ?k<-rnorm(1) > ? ? ? ?ks[i]<-.3*k+.9 > ? ? ? ?if (ty==0) { > ? ? ? ? ? ?while ((ks<.2)||(ks>3)) { > ? ? ? ? ? ?#set.seed(13237+i*100) > ? ? ? ? ? ?k<-rnorm(1) > ? ? ? ? ? ?ks[i]-.3*k+.9 } > ? ? ? ?} > ? ? } > .... > .... > .... > ? ? } > > Question: Here I draw initial a, then if the drawn initial a satisfied 2 conditions I redraw a. I set.seed(13237) in the first draw of a, should I set.seed() in the redraw part? > > 2) I also have more loops after this i loop that also draw from normal(0,1). I want to randomly draws from normal(0,1) for loop j (inside loop j I draw another random numbers from N(0,1)) > My question: Should I or shouldn't I set seed again and again for each loop? Why or why not. 
> > I guess this problem concerned about setting seed as I want to have different number for each i. > > Thanks! > > Deana > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From clint at ucsd.edu Sat Nov 5 00:33:46 2011 From: clint at ucsd.edu (clint) Date: Fri, 4 Nov 2011 16:33:46 -0700 (PDT) Subject: [R] How to avoid ifelse statement converting factor to character In-Reply-To: References: Message-ID: <1320449626143-3992067.post@n4.nabble.com> most all those posts were right in sourcing the problem....its just that nobody actually offered a viable solution the problem is that the new level "C" was not one of the original levels in $social status add "C" as a level and then just do the ol fashioned way and it works just fine do this: data$SOCIAL_STATUS <- factor(data$SOCIAL_STATUS, levels = c(levels(data$SOCIAL_STATUS), "C")) then this: data$SOCIAL_STATUS<-ifelse(data$SOCIAL_STATUS=="B" & data$MALE>4, "C", data$SOCIAL_STATUS) it is sooo much more helpful when someone who has addressed the specific problem being asked replies instead of people just throwing random ideas out there -- View this message in context: http://r.789695.n4.nabble.com/How-to-avoid-ifelse-statement-converting-factor-to-character-tp895726p3992067.html Sent from the R help mailing list archive at Nabble.com. From syen at utk.edu Sat Nov 5 00:26:03 2011 From: syen at utk.edu (Steven Yen) Date: Fri, 4 Nov 2011 19:26:03 -0400 Subject: [R] Matrix element-by-element multiplication Message-ID: <1988657a-bd29-47f6-a85e-a088c62b85b5@kedge2.utk.tennessee.edu> An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From nick_pan88 at yahoo.gr Sat Nov 5 00:49:14 2011 From: nick_pan88 at yahoo.gr (nick_pan) Date: Fri, 4 Nov 2011 16:49:14 -0700 (PDT) Subject: [R] nested "for" loops Message-ID: <1320450554839-3992089.post@n4.nabble.com> Hi all , I have written a code with nested "for" loops . The aim is to estimate the maximum likelihood by creating 3 vectors with the same length( sequence ) and then to utilize 3 "for" loops to make combinations among the 3 vectors , which are (length)^3 in number , and find the one that maximize the likelihood ( maximum likelihood estimator). The code I created, runs but I think something goes wrong...because when I change the length of the vectors but not the bounds the result is the same!!! I will give a simple example(irrelevant but proportional to the above) to make it more clear... Lets say we want to find the combination that maximize the multiplication of the entries of some vectors. V1<-c(1,2,3) V2<-c(5, 2 , 4) V3<-c( 4, 3, 6) The combination we look for is ( 3 , 5 , 6) that give us 3*5*6 = 90 If I apply the following in R , I won't take this result V1<-c(1,2,3) V2<-c(5, 2 , 4) V3<-c( 4, 3, 6) for( i in V1){ for( j in V2) { for( k in V3){ l<- i*j*k } } } l Then " l<- i*j*k " is number and not vector(of all multiplications of all the combinations) , and is 3*4*6 = 72. How can I fix the code? -- View this message in context: http://r.789695.n4.nabble.com/nested-for-loops-tp3992089p3992089.html Sent from the R help mailing list archive at Nabble.com. From michael.weylandt at gmail.com Sat Nov 5 02:10:27 2011 From: michael.weylandt at gmail.com (R. Michael Weylandt) Date: Fri, 4 Nov 2011 21:10:27 -0400 Subject: [R] Matrix element-by-element multiplication In-Reply-To: <1988657a-bd29-47f6-a85e-a088c62b85b5@kedge2.utk.tennessee.edu> References: <1988657a-bd29-47f6-a85e-a088c62b85b5@kedge2.utk.tennessee.edu> Message-ID: Did you even try? 
a <- 1:3
x <- matrix(c(1,2,3,2,4,6,3,6,9),3)
a*x
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    8   12
[3,]    9   18   27

Michael

On Fri, Nov 4, 2011 at 7:26 PM, Steven Yen wrote:
> is there a way to do element-by-element multiplication as in Gauss
> and MATLAB, as shown below? Thanks.
>
> ---
> a
>
>        1.0000000
>        2.0000000
>        3.0000000
> x
>
>        1.0000000        2.0000000        3.0000000
>        2.0000000        4.0000000        6.0000000
>        3.0000000        6.0000000        9.0000000
> a.*x
>
>        1.0000000        2.0000000        3.0000000
>        4.0000000        8.0000000        12.000000
>        9.0000000        18.000000        27.000000
>
>
> --
> Steven T. Yen, Professor of Agricultural Economics
> The University of Tennessee
> http://web.utk.edu/~syen/
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

From michael.weylandt at gmail.com Sat Nov 5 02:15:58 2011
From: michael.weylandt at gmail.com (R. Michael Weylandt)
Date: Fri, 4 Nov 2011 21:15:58 -0400
Subject: [R] nested "for" loops
In-Reply-To: <1320450554839-3992089.post@n4.nabble.com>
References: <1320450554839-3992089.post@n4.nabble.com>
Message-ID:

Your problem is that you redefine l each time through the loops and don't record old values; you could do so by using c() for concatenation, but perhaps this is what you are looking for:

exp(rowSums(log(expand.grid(V1, V2, V3))))

Hope this helps,

Michael

On Fri, Nov 4, 2011 at 7:49 PM, nick_pan wrote:
> Hi all , I have written a code with nested "for" loops .
> The aim is to estimate the maximum likelihood by creating 3 vectors with the > same length( sequence ) > and then to utilize 3 "for" loops to make combinations among the 3 vectors , > which are (length)^3 in number , and find the one that maximize the > likelihood ( maximum likelihood estimator). > > > The code I created, runs but I think something goes wrong...because when I > change the length of the vectors but not the bounds the result is the > same!!! > > I will give a simple example(irrelevant but proportional to the above) to > make it more clear... > > Lets say we want to find the combination that maximize the multiplication of > the entries of some vectors. > > V1<-c(1,2,3) > V2<-c(5, 2 , 4) > V3<-c( 4, 3, 6) > > The combination we look for is ( 3 , 5 , 6) that give us 3*5*6 = 90 > > If I apply the following in R , I won't take this result > > V1<-c(1,2,3) > V2<-c(5, 2 , 4) > V3<-c( 4, 3, 6) > > for( i in V1){ > ?for( j in V2) { > ? ? for( k in V3){ > > l<- i*j*k > > } > } > } > l > > Then " l<- i*j*k " is ?number and not vector(of all multiplications of all > the combinations) , and is 3*4*6 = 72. > > How can I fix the code? > > > > > -- > View this message in context: http://r.789695.n4.nabble.com/nested-for-loops-tp3992089p3992089.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
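
For readers who do need the loop form, the point above about recording every value instead of overwriting l can be sketched as follows. This is only an illustration under stated assumptions: the vectors are the ones from the question, while the names products, combos, idx and best are made up for this example and are not from the thread.

```r
# Illustrative sketch: V1, V2, V3 come from the original post; the names
# `products`, `combos`, `idx` and `best` are hypothetical.
V1 <- c(1, 2, 3)
V2 <- c(5, 2, 4)
V3 <- c(4, 3, 6)

n <- length(V1) * length(V2) * length(V3)
products <- numeric(n)                          # one slot per combination
combos <- matrix(NA_real_, nrow = n, ncol = 3)  # the (i, j, k) behind each slot
idx <- 0

for (i in V1) {
  for (j in V2) {
    for (k in V3) {
      idx <- idx + 1
      products[idx] <- i * j * k   # store the value instead of overwriting it
      combos[idx, ] <- c(i, j, k)
    }
  }
}

best <- which.max(products)
combos[best, ]    # 3 5 6
products[best]    # 90
```

Preallocating products avoids growing a vector with c() inside the innermost loop. The same bookkeeping would apply if i * j * k were replaced by a likelihood evaluated at (i, j, k).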
>

From dwinsemius at comcast.net Sat Nov 5 02:40:43 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Fri, 4 Nov 2011 21:40:43 -0400
Subject: [R] How to avoid ifelse statement converting factor to character
In-Reply-To: <1320449626143-3992067.post@n4.nabble.com>
References: <1320449626143-3992067.post@n4.nabble.com>
Message-ID: <31AFFC9B-C01C-4747-AF46-42B20D5BC792@comcast.net>

On Nov 4, 2011, at 7:33 PM, clint wrote:
> most all those posts were right in sourcing the problem....its just that
> nobody actually offered a viable solution

Very few problems that are posed with a test dataset go unanswered.

>
> the problem is that the new level "C" was not one of the original levels in
> $social status
>
> add "C" as a level and then just do the ol fashioned way and it works just
> fine
>
> do this:
> data$SOCIAL_STATUS <- factor(data$SOCIAL_STATUS, levels =
> c(levels(data$SOCIAL_STATUS), "C"))
>
> then this:
> data$SOCIAL_STATUS<-ifelse(data$SOCIAL_STATUS=="B" & data$MALE>4, "C",
> data$SOCIAL_STATUS)
>
> it is sooo much more helpful when someone who has addressed the specific
> problem being asked replies instead of people just throwing random ideas out
> there
>

Just a little Friday night trolling? This was a posting from 2 years ago.

-- David

>
> --
> View this message in context: http://r.789695.n4.nabble.com/How-to-avoid-ifelse-statement-converting-factor-to-character-tp895726p3992067.html
> Sent from the R help mailing list archive at Nabble.com.
>

This is a mailing list. Nabble users who fail to include context when replying to ancient threads should not be throwing stones.

> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
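
Since the thread has no test dataset, here is a minimal sketch of one way to recode a factor while keeping it a factor. The column names follow the quoted post, but the data values and the helper name change are invented for illustration. It relies on logical indexing rather than ifelse(), because ifelse() takes its attributes from the test and so does not, in general, hand back a factor.

```r
# Hypothetical data: column names follow the thread, values are invented.
data <- data.frame(
  SOCIAL_STATUS = factor(c("A", "B", "B", "A")),
  MALE = c(1, 5, 2, 6)
)

# Add "C" to the levels first; assigning a value that is not a level gives NA.
levels(data$SOCIAL_STATUS) <- c(levels(data$SOCIAL_STATUS), "C")

# Replace by logical indexing so the column stays a factor.
change <- data$SOCIAL_STATUS == "B" & data$MALE > 4
data$SOCIAL_STATUS[change] <- "C"

is.factor(data$SOCIAL_STATUS)      # TRUE
as.character(data$SOCIAL_STATUS)   # "A" "C" "B" "A"
```

Only the second row satisfies both conditions, so it alone becomes "C"; the other rows and the factor class are left untouched.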
From dwinsemius at comcast.net Sat Nov 5 03:02:13 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Fri, 4 Nov 2011 22:02:13 -0400 Subject: [R] 12th Root of a Square (Transition) Matrix In-Reply-To: References: <3080FA352049DF49B5FCD17896109AD576CBA2C3@BL2PRD0102MB008.prod.exchangelabs.com> <4C1B46C3.7010505@gmail.com> <1320402848675-3989618.post@n4.nabble.com> <9C6FCB9A-8142-4248-85B9-2C37E8479759@comcast.net> Message-ID: <4C01CD3E-C1E1-4A37-A116-5EACE98A9278@comcast.net> This is just one of many 12-th roots. (Peter knows this i'm sure.) The negative of this would also be an nth root, and I read that there are quite few others that arise from solutions based on permuting negatives of eigen values of a triangularized form. But as I said , I'm not a matrix mechanic, so no code for that. -- David. On Nov 4, 2011, at 6:10 PM, Peter Langfelder wrote: > On Fri, Nov 4, 2011 at 2:37 PM, David Winsemius wrote: > >> >> The 12th (matrix) root of M: e^( 1/n * log(M) ) >> >>> require(Matrix) >>> M1.12 <- expm( (1/12)*logm(M) ) > > I like this - haven't thought of the matrix algebra functions in Matrix. > > Thanks, > > Peter From ripley at stats.ox.ac.uk Sat Nov 5 04:34:41 2011 From: ripley at stats.ox.ac.uk (Prof Brian Ripley) Date: Sat, 5 Nov 2011 03:34:41 +0000 (GMT) Subject: [R] HoltWinters in R 2.14.0 In-Reply-To: References: <1320432920314-3991247.post@n4.nabble.com> Message-ID: On Fri, 4 Nov 2011, R. Michael Weylandt wrote: > I believe there were some changes to Holt-Winters, specifically in re > optimization that probably lead to your problem, but you'll have to > provide more details. See the NEWS file for citations about the > change. If you put example code/data others may be able to help you -- > I haven't updated yet so I can't be of much help. > > Michael > > > On Fri, Nov 4, 2011 at 2:55 PM, TimothyDalbey wrote: >> Hey All, >> >> First time on these forums. ?Thanks in advance. >> >> Soooo... 
?I have a process that was functioning well before the 2.14 update. >> Now the HoltWinters function is throwing an error whereby I get the >> following: >> >> Error in HoltWinters(sales.ts) : optimization failure Most likely it was incorrect before. You cannot assume that it was actually 'functioning well': all the cases where we have seen this message it was giving incorrect answers before and not detecting them. And in all those cases the model was a bad fit and using starting values for the optimization helped. >> I've been looking around to determine why this happens (see if I can test >> the data beforehand) but I haven't come across anything. >> >> Any help appreciated! -- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 From jim at bitwrit.com.au Sat Nov 5 05:01:34 2011 From: jim at bitwrit.com.au (Jim Lemon) Date: Sat, 05 Nov 2011 15:01:34 +1100 Subject: [R] barplot as histogram In-Reply-To: <4EB42911.2080707@atl.lmco.com> References: <4EB42911.2080707@atl.lmco.com> Message-ID: <4EB4B51E.9050603@bitwrit.com.au> On 11/05/2011 05:04 AM, Jesse Brown wrote: > Hello: > > I'm dealing with an issue currently that I'm not sure the best way to > approach. I've got a very large (10G+) dataset that I'm trying to create > a histogram for. I don't seem to be able to use hist directly as I can > not create an R vector of size greater than 2.2G. I considered > condensing the data previous to loading it into R and just plotting the > frequencies as a barplot; unfortunately, barplot does not support > plotting the values according to a set of x-axis positions. > > What I have is something similar to: > > ys <- c(12,3,7,22,10) > xs <- c(1,30,35,39,60) > > and I'd like the bars (ys) to appear at the positions described by xs. 
I
> can get this to work on smaller sets by filling zero values in for
> missing ys for the entire range of xs but in my case this would again
> create a vector too large for R.
>
> Is there another way to use the two vectors to create a simulated
> frequency histogram? Is there a way to create a histogram object (as
> returned by hist) from the condensed data so that plot would handle it
> correctly?
>

Hi Jesse,
I think that barp (plotrix) will get you out of trouble.

Jim

From rkevinburton at charter.net Sat Nov 5 05:26:53 2011
From: rkevinburton at charter.net (Kevin Burton)
Date: Fri, 4 Nov 2011 23:26:53 -0500
Subject: [R] acf?
Message-ID: <006d01cc9b73$255eb5d0$701c2170$@charter.net>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From ggrothendieck at gmail.com Sat Nov 5 06:13:16 2011
From: ggrothendieck at gmail.com (Gabor Grothendieck)
Date: Sat, 5 Nov 2011 01:13:16 -0400
Subject: [R] acf?
In-Reply-To: <006d01cc9b73$255eb5d0$701c2170$@charter.net>
References: <006d01cc9b73$255eb5d0$701c2170$@charter.net>
Message-ID:

On Sat, Nov 5, 2011 at 12:26 AM, Kevin Burton wrote:
> I started to check what I thought I knew with autocovariance and it doesn't
> jive with the calculations given by 'R'. I was wondering if there is
> some scaling or something that I am not aware of.
>
> Take the example
>
>     d <- 1:10
>     (a <- acf(d, type="covariance", demean=FALSE, plot=FALSE))
>
> Autocovariances of series 'd', by lag
>
>    0    1    2    3    4    5    6    7    8    9
> 38.5 33.0 27.6 22.4 17.5 13.0  9.0  5.6  2.9  1.0
>
> But when I calculate it manually (for lag of 1) like:
>
>     y1 <- d - mean(d)
>     dl <- c(d[-1], d[1])
>     y2 <- dl - mean(d)
>     mean(y1*y2)
> [1] 3.75
>
> What am I missing to get this basic concept? Isn't it E[(Yt - ut)(Ys - us)]?
>

Try this:

> d <- 1:10
> dm <- d - mean(d)
> sum(dm[-1] * dm[-10]) / 10
[1] 5.775
> acf(d, type = "cov", plot = FALSE)[1]

Autocovariances of series 'd', by lag

   1
5.78

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

From nick_pan88 at yahoo.gr Sat Nov 5 03:14:26 2011
From: nick_pan88 at yahoo.gr (nick_pan)
Date: Fri, 4 Nov 2011 19:14:26 -0700 (PDT)
Subject: [R] nested "for" loops
In-Reply-To:
References: <1320450554839-3992089.post@n4.nabble.com>
Message-ID: <1320459266708-3992324.post@n4.nabble.com>

Thank you , this works but I have to do it with nested for loops...

Could you suggest me a way ?

--
View this message in context: http://r.789695.n4.nabble.com/nested-for-loops-tp3992089p3992324.html
Sent from the R help mailing list archive at Nabble.com.

From tmdalbey at gmail.com Sat Nov 5 06:02:09 2011
From: tmdalbey at gmail.com (TimothyDalbey)
Date: Fri, 4 Nov 2011 22:02:09 -0700 (PDT)
Subject: [R] HoltWinters in R 2.14.0
In-Reply-To:
References: <1320432920314-3991247.post@n4.nabble.com>
Message-ID: <7505DDA2-521F-4499-885F-79E5C18EAA44@gmail.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From pburns at pburns.seanet.com Sat Nov 5 10:00:47 2011
From: pburns at pburns.seanet.com (Patrick Burns)
Date: Sat, 05 Nov 2011 09:00:47 +0000
Subject: [R] set seed for random draws
In-Reply-To:
References:
Message-ID: <4EB4FB3F.80604@pburns.seanet.com>

I'm suspecting this is confusion about default behavior. R automatically updates the random seed when random numbers are generated (or other random operations are performed).

The original poster may have experienced systems where it is up to the user to change the seed.

I'd suggest two rules of thumb when coming up against something in R that you aren't sure about:

1. If it is a mundane task, R probably takes care of it.

2. Experiment to see what happens.
Of course you could read documentation, but no one does that. On 05/11/2011 00:59, R. Michael Weylandt wrote: > This might be more fundamental, but why do you feel the need to reset > the seed each loop? There's nothing that suggests you need to... > > Michael > > On Fri, Nov 4, 2011 at 8:38 PM, Md Desa, Zairul Nor Deana Binti > wrote: >> Hello, all! >> I need help on these two problems: >> >> 1) If I want to randomly draw numbers from standard normal (or other distributions) in loops e.g.: >> ty=0; ks=0 >> for (i in 1:5) { >> set.seed(14537+i) >> k<-rnorm(1) >> ks[i]<-.3*k+.9 >> if (ty==0) { >> while ((ks<.2)||(ks>3)) { >> #set.seed(13237+i*100) >> k<-rnorm(1) >> ks[i]-.3*k+.9 } >> } >> } >> .... >> .... >> .... >> } >> >> Question: Here I draw initial a, then if the drawn initial a satisfied 2 conditions I redraw a. I set.seed(13237) in the first draw of a, should I set.seed() in the redraw part? >> >> 2) I also have more loops after this i loop that also draw from normal(0,1). I want to randomly draws from normal(0,1) for loop j (inside loop j I draw another random numbers from N(0,1)) >> My question: Should I or shouldn't I set seed again and again for each loop? Why or why not. >> >> I guess this problem concerned about setting seed as I want to have different number for each i. >> >> Thanks! >> >> Deana >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
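
In the spirit of "experiment to see what happens", the sketch below seeds the generator once and then lets R advance its state on every draw, including the redraws. It is only an illustration: the constants .3, .9, .2 and 3 are taken from the quoted question, while the function name draw_ks and its arguments are made up here.

```r
# Sketch only: the constants come from the quoted question; `draw_ks`,
# `n` and `seed` are hypothetical names for this example.
draw_ks <- function(n, seed) {
  set.seed(seed)                 # seed once, before any draws
  ks <- numeric(n)
  for (i in seq_len(n)) {
    ks[i] <- 0.3 * rnorm(1) + 0.9
    # Redraw until the value lies in [.2, 3]; no re-seeding is needed,
    # because each call to rnorm() moves the generator state forward.
    while (ks[i] < 0.2 || ks[i] > 3) {
      ks[i] <- 0.3 * rnorm(1) + 0.9
    }
  }
  ks
}

a <- draw_ks(5, seed = 14537)
b <- draw_ks(5, seed = 14537)
identical(a, b)   # TRUE: the same seed reproduces the whole sequence
```

A single set.seed() call is enough to make the run reproducible, and the five values still differ from one another because the state changes after each draw.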
> -- Patrick Burns pburns at pburns.seanet.com twitter: @portfolioprobe http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of 'Some hints for the R beginner' and 'The R Inferno') From mmstat at comcast.net Sat Nov 5 14:07:02 2011 From: mmstat at comcast.net (mmstat at comcast.net) Date: Sat, 5 Nov 2011 13:07:02 +0000 (UTC) Subject: [R] 3-D ellipsoid equations In-Reply-To: <825598475.1813011.1320466638440.JavaMail.root@sz0115a.westchester.pa.mail.comcast.net> Message-ID: <1570217015.1818095.1320498422219.JavaMail.root@sz0115a.westchester.pa.mail.comcast.net> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From rh at knut-krueger.de Sat Nov 5 14:07:52 2011 From: rh at knut-krueger.de (Knut Krueger) Date: Sat, 5 Nov 2011 14:07:52 +0100 Subject: [R] similar package in R like "SKEW CALCULATOR"? Message-ID: <4EB53528.7030905@knut-krueger.de> Hi to all is there a similar package like the SKEW CALCULATOR from Peter Nonacs (University of California - Department of Ecology and Evolutionary Biology) http://www.eeb.ucla.edu/Faculty/Nonacs/shareware.htm Kind Regards Knut From patrick.giraudoux at univ-fcomte.fr Sat Nov 5 14:27:29 2011 From: patrick.giraudoux at univ-fcomte.fr (Patrick Giraudoux) Date: Sat, 05 Nov 2011 14:27:29 +0100 Subject: [R] How to write a shapefile with projection Message-ID: <4EB539C1.8010807@univ-fcomte.fr> > Hi, > > Sorry i have put such a detailed question to the list about writing a shapefile with projection. I realized that if i use writeOGR from rgdal and not the other write shapefile functions i can get a shapefile with projection recognized by ArcGIS. The command is (in case anybody wonders): > > ?writeOGR(crest.sp, "I:\\LA_levee\\Shape", "llev_crest_pts6", driver = "ESRI Shapefile") > > where crest.sp is a spatial point data frame with projection. > > Thanks, > > Monica Indeed. 
writePointsShape() does not write the projection file, but using the function showWKT from rgdal, you can also create one like that: writePointsShape(crest.sp,"crest") cat(showWKT(proj4string(crest.sp)),file="crest.prj") Patrick From dwinsemius at comcast.net Sat Nov 5 15:00:06 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Sat, 5 Nov 2011 10:00:06 -0400 Subject: [R] similar package in R like "SKEW CALCULATOR"? In-Reply-To: <4EB53528.7030905@knut-krueger.de> References: <4EB53528.7030905@knut-krueger.de> Message-ID: <2B4E4E97-DFEB-4EF0-8E29-3CE5741E5150@comcast.net> From that hand waving description it would be difficult to tell. Sounds like a reinvention of the Pareto Index, for which you can find many packages that provide facilities: http://finzi.psych.upenn.edu/cgi-bin/namazu.cgi?query=Pareto+index&max=100&result=normal&sort=score&idxname=functions&idxname=vignettes&idxname=views -- David. On Nov 5, 2011, at 9:07 AM, Knut Krueger wrote: > > Hi to all > is there a similar package like the SKEW CALCULATOR from > Peter Nonacs (University of California - Department of Ecology and Evolutionary Biology) > > http://www.eeb.ucla.edu/Faculty/Nonacs/shareware.htm > > > Kind Regards Knut > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From julia.lira at hotmail.co.uk Sat Nov 5 15:05:07 2011 From: julia.lira at hotmail.co.uk (Julia Lira) Date: Sat, 5 Nov 2011 14:05:07 +0000 Subject: [R] linear against nonlinear alternatives - quantile regression Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From kristian.langgaard.lind at gmail.com Sat Nov 5 15:08:59 2011 From: kristian.langgaard.lind at gmail.com (Kristian Lind) Date: Sat, 5 Nov 2011 15:08:59 +0100 Subject: [R] Error in eigen(a$hessian) : infinite or missing values in 'x' Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From michael.weylandt at gmail.com Sat Nov 5 15:21:05 2011 From: michael.weylandt at gmail.com (R. Michael Weylandt ) Date: Sat, 5 Nov 2011 10:21:05 -0400 Subject: [R] nested "for" loops In-Reply-To: <1320459266708-3992324.post@n4.nabble.com> References: <1320450554839-3992089.post@n4.nabble.com> <1320459266708-3992324.post@n4.nabble.com> Message-ID: Why do you need to do it with nested for loops? It is of course possible - and I hinted how to do it in my first email - but there's no reason as far as I can see to do so, particularly as a means of MLE. Sounds suspiciously like homework... Michael On Nov 4, 2011, at 10:14 PM, nick_pan wrote: > Thank you , this works but I have to do it with nested for loops... > > Could you suggest me a way ? > > > > > > -- > View this message in context: http://r.789695.n4.nabble.com/nested-for-loops-tp3992089p3992324.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From ggrothendieck at gmail.com Sat Nov 5 15:32:45 2011 From: ggrothendieck at gmail.com (Gabor Grothendieck) Date: Sat, 5 Nov 2011 10:32:45 -0400 Subject: [R] zoo performance regression noticed (1.6-5 is faster...) 
In-Reply-To: References: <20111104163407.GA20540@translab.its.uci.edu> Message-ID: On Fri, Nov 4, 2011 at 1:02 PM, Gabor Grothendieck wrote: > On Fri, Nov 4, 2011 at 12:56 PM, Gabor Grothendieck > wrote: >> On Fri, Nov 4, 2011 at 12:34 PM, James Marca >> wrote: >>> Good morning, >>> >>> I have discovered what I believe to be a performance regression >>> between Zoo 1.6x and Zoo 1.7-6 in the application of rollapply. >>> On zoo 1.6x, rollapply of my function over my data takes about 20 >>> minutes. Using 1.7-6, the same code takes about 6 hours. >>> >>> R --version >>> R version 2.13.1 (2011-07-08) >>> Copyright (C) 2011 The R Foundation for Statistical Computing >>> ISBN 3-900051-07-0 >>> Platform: x86_64-pc-linux-gnu (64-bit) >>> >>> Two versions of zoo 1.6 run *fast* ?On one machine I am running >>> >>> ?less /usr/lib64/R/library/zoo/DESCRIPTION >>> ?Package: zoo >>> ?Version: 1.6-3 >>> ?Date: 2010-04-23 >>> ?Title: Z's ordered observations >>> ?... >>> ?Packaged: 2010-04-23 07:28:47 UTC; zeileis >>> ?Repository: CRAN >>> ?Date/Publication: 2010-04-23 07:43:54 >>> ?Built: R 2.10.1; ; 2010-04-25 06:41:34 UTC; unix >>> >>> (Thankfully I forgot to upgrade.packages() on this machine!) >>> >>> On the other >>> >>> ?Package: zoo >>> ?Version: 1.6-5 >>> ?Date: 2011-04-08 >>> ?... >>> ?Packaged: 2011-04-08 17:13:47 UTC; zeileis >>> ?Repository: CRAN >>> ?Date/Publication: 2011-04-08 17:27:47 >>> ?Built: R 2.13.1; ; 2011-11-04 15:49:54 UTC; unix >>> >>> I have stripped out zoo 1.7-6 from all my machines. >>> >>> I tried to ensure all libraries were identical on the two machines >>> (using lsof), and after finally downgrading zoo I got the second >>> machine to be as fast as the first, so I am quite certain the >>> difference in speed is down to the Zoo version used. >>> >>> My code runs a fairly simple function over a time series using the >>> following call to process a year of 30s data (9 columns, about a >>> million rows): >>> >>> ? 
?vals <- rollapply(data=ts.data[,c(n.3.cols, o.3.cols,volocc.cols)] >>> ? ? ? ? ? ? ? ? ?,width=40 >>> ? ? ? ? ? ? ? ? ?,FUN=rolling.function.fn(n.cols=n.3.cols,o.cols=o.3.cols,vo.cols=volocc.cols) >>> ? ? ? ? ? ? ? ? ?,by.column=FALSE >>> ? ? ? ? ? ? ? ? ?,align='right') >>> >>> >>> (The rolling.function.fn call returns a function that is initialized >>> with the initial call above (a trick I learned from Javascript)) >>> >>> If this is a known situation with the new 1.7 generation Zoo, my >>> apologies and I'll go away. ?If my code could be turned into a useful >>> test, I'd be happy to help out as much as I'm able. ?Given the extreme >>> runtime difference though, I thought I should offer my help in this >>> case, since zoo is such a useful package in my work. >> >> This was a known problem and was fixed but if its still there then >> there must be some other condition under which it can occur as well. >> If you can provide a small self contained reproducible example it >> would help in tracking it down. >> >> -- >> Statistics & Software Consulting >> GKX Group, GKX Associates Inc. >> tel: 1-877-GKX-GROUP >> email: ggrothendieck at gmail.com >> > > Also, as a workaround you can try this to use an old rollapply in a > new version of zoo: > > library(zoo) > source("http://r-forge.r-project.org/scm/viewvc.php/*checkout*/pkg/zoo/R/rollapply.R?revision=817&root=zoo") > rollapply(...whatever...) 
> Have looked at it and there is now a performance improvement in the development version of rollapply that gives an order of magnitude performance boost in the following case: > library(zoo) > n <- 10000 > z <- zoo(cbind(a = 1:n, b = 1:n)) > system.time(rollapplyr(z, 10, sum, by.column = FALSE)) user system elapsed 8.80 0.02 8.97 > > # download rollapply rev 913 from svn repo and rerun > source("http://r-forge.r-project.org/scm/viewvc.php/*checkout*/pkg/zoo/R/rollapply.R?revision=913&root=zoo") > system.time(rollapplyr(z, 10, sum, by.column = FALSE)) user system elapsed 0.52 0.02 0.53 -- Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com From jimmy_ba37 at hotmail.com Sat Nov 5 15:40:30 2011 From: jimmy_ba37 at hotmail.com (Jimmy Barrera) Date: Sat, 5 Nov 2011 09:40:30 -0500 Subject: [R] unsuscribe In-Reply-To: <4C01CD3E-C1E1-4A37-A116-5EACE98A9278@comcast.net> References: <3080FA352049DF49B5FCD17896109AD576CBA2C3@BL2PRD0102MB008.prod.exchangelabs.com>, <4C1B46C3.7010505@gmail.com>, <1320402848675-3989618.post@n4.nabble.com>, , <9C6FCB9A-8142-4248-85B9-2C37E8479759@comcast.net>, , <4C01CD3E-C1E1-4A37-A116-5EACE98A9278@comcast.net> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From f.harrell at vanderbilt.edu Sat Nov 5 15:43:15 2011 From: f.harrell at vanderbilt.edu (Frank Harrell) Date: Sat, 5 Nov 2011 07:43:15 -0700 (PDT) Subject: [R] linear against nonlinear alternatives - quantile regression In-Reply-To: References: Message-ID: <1320504195354-3993416.post@n4.nabble.com> Just to address a piece of this - in the case in which you are currently focusing on only one quantile, the rms package can help by fitting restricted cubic splines for covariate effects, and then run anova to test for nonlinearity (sometimes a dubious practice because if you then remove nonlinear terms you are mildly cheating). 
require(rms) f <- Rq(y ~ x1 + rcs(x2,4), tau=.25) anova(f) # tests associations and nonlinearity of x2 Frank Julia Lira wrote: > > Dear all, > > I would like to know whether any specification test for linear against > nonlinear model hypothesis has been implemented in R using the quantreg > package. > > I could read papers concerning this issue, but they haven't been > implemented at R. As far as I know, we only have two specification tests > in this line: anova.rq and Khmaladze.test. The first one test equality and > significance of the slopes across quantiles and the latter one test if the > linear specification is model of location or location and scale shift. > > Do you have any suggestion? > > Thanks a lot! > > Best regards, > > Julia > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help@ mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > ----- Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/linear-against-nonlinear-alternatives-quantile-regression-tp3993327p3993416.html Sent from the R help mailing list archive at Nabble.com. From girit at biopticon.com Sat Nov 5 14:37:53 2011 From: girit at biopticon.com (Cem Girit) Date: Sat, 5 Nov 2011 09:37:53 -0400 Subject: [R] List of user installed packages Message-ID: <010f01cc9bc0$200bc490$60234db0$@biopticon.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From bbolker at gmail.com Sat Nov 5 17:06:57 2011 From: bbolker at gmail.com (Ben Bolker) Date: Sat, 5 Nov 2011 16:06:57 +0000 Subject: [R] similar package in R like "SKEW CALCULATOR"? 
References: <4EB53528.7030905@knut-krueger.de> <2B4E4E97-DFEB-4EF0-8E29-3CE5741E5150@comcast.net>
Message-ID:

David Winsemius <dwinsemius at comcast.net> writes:

>
> From that hand waving description it would be difficult to tell. Sounds like a reinvention of the Pareto
> Index, for which you can find many packages that provide facilities:
>
> http://finzi.psych.upenn.edu/cgi-bin/namazu.cgi?query=Pareto+index&max=100&
> result=normal&sort=score&idxname=functions&idxname=vignettes&idxname=views
>

I think the author is looking for specific measures of "reproductive skew", a term from behavioral ecology/evolutionary biology. (PS for first-time questioners: you should not assume that general readers of the R-help list know much about your particular subject area. Short definitions and web references are helpful.)

Based on a quick

  RSiteSearch("{reproductive skew}")
  library(sos)
  findFn('{reproductive skew}')
  googling "reproductive skew CRAN"
  searching for "reproductive skew" at http://rseek.org

I don't think so ...

There isn't an R list for evolutionary or behavioral biology, as far as I know, but you might try asking this question on the r-sig-ecology mailing list.

I suspect it would not be terribly hard to implement these methods in R, but I can't find any evidence that anyone has done it and made it publicly available.

From michael.weylandt at gmail.com Sat Nov 5 17:11:15 2011
From: michael.weylandt at gmail.com (R. Michael Weylandt)
Date: Sat, 5 Nov 2011 12:11:15 -0400
Subject: [R] List of user installed packages
In-Reply-To: <010f01cc9bc0$200bc490$60234db0$@biopticon.com>
References: <010f01cc9bc0$200bc490$60234db0$@biopticon.com>
Message-ID:

I think the installed.packages() function can give you what you need; specifically, look at the priority argument.

Also check this out:
http://stackoverflow.com/questions/1401904/painless-way-to-install-a-new-version-of-r

Michael

On Sat, Nov 5, 2011 at 9:37 AM, Cem Girit wrote:
> Hello,
>
> I am going to install the new version of R 2.14.1. After the
> installation, I want to copy my installed packages to the new library. But
> since over time I forgot which ones I installed I want to get a list of all
> the packages I installed among the packages installed initially by the
> R-installer. Is this possible?
>
> Cem
>
> Cem Girit, PhD
>
> Biopticon Corporation
>
> 182 Nassau Street, Suite 204
>
> Princeton, NJ 08542
>
> Tel: (609)-853-0231
>
> Email:girit at biopticon.com
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

From mmstat at comcast.net Sat Nov 5 18:15:37 2011
From: mmstat at comcast.net (mmstat at comcast.net)
Date: Sat, 5 Nov 2011 17:15:37 +0000 (UTC)
Subject: [R] 3-D ellipsoid equations update2. Error message when I run R code.
In-Reply-To: <1570217015.1818095.1320498422219.JavaMail.root@sz0115a.westchester.pa.mail.comcast.net>
Message-ID: <1889696942.1826936.1320513337588.JavaMail.root@sz0115a.westchester.pa.mail.comcast.net>

Hello,

I want to delete prior questions online but am getting an error message? Please see R code in enclosed file. I don't understand the error message.

The parametric equations of an ellipsoid can be written in terms of spherical coordinates. The three spherical coordinates are converted to Cartesian coordinates by

  X = a cos(θ) sin(φ)
  Y = b sin(θ) sin(φ)
  Z = c cos(φ)

The parameter θ varies from 0 to 2π and φ varies from 0 to π.

Here (X0, Y0, Z0) is the center of the ellipsoid, and θ is the angle of rotation. I need to come up with an expression for the ellipsoid expressed parametrically as the path of a point in 3-space.
I think that it is something like the following: x (alpha)<- x0 + a * cos(theta) * cos(alpha) - b * sin(theta) * sin(alpha) y(alpha) <- y0 + a * cos(theta) * sin(alpha) + b * sin(theta) * cos(alpha) z (alpha)<- z0 + a * cos(theta) * sin(alpha) + c * sin(theta) * cos(alpha) Do I have these equations correct? Most of the books I have read use eigenvectors. The eigenvectors of course consist of the direction cosines. My difficulty is going from that approach to the approach that Alberto Monteiro took in his message on the 9 October 2006. I understand the R code and am using it for a two-dimensional ellipse problem. There does not seem to be allowance for the new coordinates of the center of the ellipsoid under the transformation when using direction cosines. By that I mean adding the centroid coordinates would not be necessary. I need to come up with an example where I do it both ways(as above and using direction cosines). My confusion lies in the fact that rather than one rotational angle theta there are 9 direction cosines. Can you assist with this. Sincerely, Mary A. Marion -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: 3d.r.txt URL: From dwinsemius at comcast.net Sat Nov 5 18:32:31 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Sat, 5 Nov 2011 13:32:31 -0400 Subject: [R] unsuscribe In-Reply-To: References: <3080FA352049DF49B5FCD17896109AD576CBA2C3@BL2PRD0102MB008.prod.exchangelabs.com> <4C1B46C3.7010505@gmail.com> <1320402848675-3989618.post@n4.nabble.com> <9C6FCB9A-8142-4248-85B9-2C37E8479759@comcast.net> <4C01CD3E-C1E1-4A37-A116-5EACE98A9278@comcast.net> Message-ID: <4928B3A5-6B3E-44AB-B085-21D4F2B515D3@comcast.net> An embedded and charset-unspecified text was scrubbed... 
Name: not available
URL:

From dwinsemius at comcast.net Sat Nov 5 18:42:34 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Sat, 5 Nov 2011 13:42:34 -0400
Subject: [R] linear against nonlinear alternatives - quantile regression
In-Reply-To: <1320504195354-3993416.post@n4.nabble.com>
References: <1320504195354-3993416.post@n4.nabble.com>
Message-ID:

I suppose this constitutes thread drift, but your simple example, Frank, made me wonder if Rq() accepts a vector argument for tau. I seem to remember that Koenker's rq() does. Normally I would consult the help page, but the power is still out here in Central Connecticut and I am corresponding with a less capable device. I am guessing that if Rq() does accept such a vector, the form of the nonlinearity would be imposed at all levels of tau.

--
David

On Nov 5, 2011, at 10:43 AM, Frank Harrell wrote:

> Just to address a piece of this - in the case in which you are currently
> focusing on only one quantile, the rms package can help by fitting
> restricted cubic splines for covariate effects, and then run anova to test
> for nonlinearity (sometimes a dubious practice because if you then remove
> nonlinear terms you are mildly cheating).
>
> require(rms)
> f <- Rq(y ~ x1 + rcs(x2,4), tau=.25)
> anova(f) # tests associations and nonlinearity of x2
>
> Frank
>
> Julia Lira wrote:
>>
>> Dear all,
>>
>> I would like to know whether any specification test for linear against
>> nonlinear model hypothesis has been implemented in R using the quantreg
>> package.
>>
>> I could read papers concerning this issue, but they haven't been
>> implemented at R. As far as I know, we only have two specification tests
>> in this line: anova.rq and Khmaladze.test. The first one test equality and
>> significance of the slopes across quantiles and the latter one test if the
>> linear specification is model of location or location and scale shift.
>>
>> Do you have any suggestion?
>>
>> Thanks a lot!
>>
>> Best regards,
>>
>> Julia
>>
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help@ mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
> -----
> Frank Harrell
> Department of Biostatistics, Vanderbilt University
> --
> View this message in context: http://r.789695.n4.nabble.com/linear-against-nonlinear-alternatives-quantile-regression-tp3993327p3993416.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

From carl at witthoft.com Sat Nov 5 18:57:31 2011
From: carl at witthoft.com (Carl Witthoft)
Date: Sat, 05 Nov 2011 13:57:31 -0400
Subject: [R] nested "for" loops
Message-ID: <4EB5790B.8030604@witthoft.com>

If in fact this is homework, you will do yourself, your classmates, and possibly your teacher a favor if you let them know that, at least in R, almost anything you can do in a for() loop can be done easier and faster with vectorization. If your teacher can't comprehend this, get him fired.

a <- c(4,6,3)
b <- c(9,4,1)
d <- c(4,7,2,8)
winning.value <- max(outer(a,outer(b,d,"*"),"*"))

From: R. Michael Weylandt
Date: Sat, 05 Nov 2011 10:21:05 -0400

Why do you need to do it with nested for loops? It is of course possible - and I hinted how to do it in my first email - but there's no reason as far as I can see to do so, particularly as a means of MLE. Sounds suspiciously like homework...

Michael

On Nov 4, 2011, at 10:14 PM, nick_pan wrote:

> Thank you , this works but I have to do it with nested for loops...
>
> Could you suggest me a way ?
> -- Sent from my Cray XK6 "Pendeo-navem mei anguillae plena est." From erich.neuwirth at univie.ac.at Sat Nov 5 19:00:33 2011 From: erich.neuwirth at univie.ac.at (Erich Neuwirth) Date: Sat, 5 Nov 2011 19:00:33 +0100 Subject: [R] List of user installed packages In-Reply-To: <010f01cc9bc0$200bc490$60234db0$@biopticon.com> References: <010f01cc9bc0$200bc490$60234db0$@biopticon.com> Message-ID: <80BE218D-BBF3-482D-9E5F-D6C641DE89DF@univie.ac.at> Running rownames(installed.packages()) will tell you the names of all packages of the version of R in which you are running the command. http://cran.r-project.org/doc/FAQ/R-FAQ.html#R-Add_002dOn-Packages tells you the names of the packages which were installed with R itself. On Nov 5, 2011, at 2:37 PM, Cem Girit wrote: > Hello, > > > > I am going to install the new version of R 2.14.1. After the > installation, I want to copy my installed packages to the new library. But > since over time I forgot which ones I installed I want to get a list of all > the packages I installed among the packages installed initially by the > R-installer. Is this possible? > > > > Cem > > > > Cem Girit, PhD > > > > Biopticon Corporation > > 182 Nassau Street, Suite 204 > > Princeton, NJ 08542 > > Tel: (609)-853-0231 > > Email:girit at biopticon.com > > > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc
Type: application/pgp-signature
Size: 455 bytes
Desc: Message signed with OpenPGP using GPGMail
URL:

From Greg.Snow at imail.org Sat Nov 5 19:01:53 2011
From: Greg.Snow at imail.org (Greg Snow)
Date: Sat, 5 Nov 2011 12:01:53 -0600
Subject: [R] Export to .txt
In-Reply-To: <1320185704350-3965699.post@n4.nabble.com>
References: <1320185704350-3965699.post@n4.nabble.com>
Message-ID:

Look at the txtStart function in the TeachingDemos package. It works like sink but includes commands as well as output, though I have never tried it with browser() (and it does not always include the results of errors).

Another option is to use some type of editor that links with R, such as emacs/ESS or tinn-R (or other), and then save the entire transcript.

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of stat.kk
Sent: Tuesday, November 01, 2011 4:15 PM
To: r-help at r-project.org
Subject: [R] Export to .txt

Hi,

I would like to export all my workspace (even with the evaluation of commands) to a text file. I know about the sink() function but it doesn't work as I would like. My R function looks like this: there are instructions for the user displayed by the cat() command and browser() commands for fulfilling them. While using the sink() command the instructions don't display :( Can anyone help me with an equivalent command to the File - Save to file... option?

Thank you very much.

--
View this message in context: http://r.789695.n4.nabble.com/Export-to-txt-tp3965699p3965699.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
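A possible base-R workaround for the "Export to .txt" question above, offered only as a sketch: sink() has a split argument that sends output to the file and keeps it on the console, so cat() instructions should stay visible while being logged. The file location (log_file) is a hypothetical choice, and unlike txtStart this records output only, not the commands typed. How it behaves together with browser() has not been checked here.

```r
# Sketch: mirror console output to a text file with sink(split = TRUE).
log_file <- tempfile(fileext = ".txt")    # hypothetical log location

sink(log_file, split = TRUE)              # split = TRUE: write to file AND console
cat("Step 1: enter the sample size\n")    # instruction stays visible and is logged
print(summary(c(1, 2, 3, 4)))             # printed results are logged as well
sink()                                    # stop diverting output

logged <- readLines(log_file)             # what ended up in the file
```

Inspecting logged afterwards shows the instruction line followed by the printed summary.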
From jmf at ib.usp.br Sat Nov 5 20:12:17 2011
From: jmf at ib.usp.br (JulianaMF)
Date: Sat, 5 Nov 2011 12:12:17 -0700 (PDT)
Subject: [R] error message
In-Reply-To:
References: <1295368328987-3223412.post@n4.nabble.com> <1320429978716-3991100.post@n4.nabble.com>
Message-ID: <1320520337578-3994067.post@n4.nabble.com>

Hello Michael,

Sorry, I am just starting to learn all this.

Here is one of my input files (from a .str file) in which the first column is the individuals, the second is the pop info (in this case I am stating that I have one pop because I am still trying to find out which are the clusters in my sample) and the others are the 10 loci. The genotypes for each individual are in 2 rows and missing data is -9. I tried both R and R studio and I am working from a Mac. I followed all the routine

SimStru <- read.structure(file = "SimStru.str")

and then I answered all the prompted questions (I am writing from memory because the files are back at the lab): 82 individuals (820 genotypes), 10 loci, no column with marker name, column with pop info = 2, other column = 1 (names for each sample), row with marker names = 1, missing data = -9. I didn't change any of the other default settings. And once I answered all the questions, I got that error message.

Thank you so much for being willing to help and sorry about my ignorance, I am new to this!

Ind Pop 134 220 28 18 414 24 42 58 423 12 S3 1 349 163 267 316 287 412 275 234 164 351 S3 1 369 165 267 336 287 424 275 238 188 351 S5 1 345 163 271 316 287 360 187 234 152 343 S5 1 365 163 283 336 287 388 187 246 152 615 S9 1 353 163 275 300 287 400 231 234 164 347 S9 1 361 163 279 336 287 416 275 234 170 351 S10 1 325 -9 259 316 287 384 299 234 140 331 S10 1 357 -9 279 328 287 400 299 234 152 351 S15 1 377 163 267 316 287 416 259 234 134 339 S15 1 385 163 283 344 287 416 267 254 164 363 S17 1 333 163 263 316 285 380 179 234 164 355 S17 1 381 163 287 328 287 400 179 238 164 399 S21 1 333 163 271 356 285 388 219 250 158 335 S21 1 377 165 271 360 285 416 219 270 200 355 S22 1 373 163 251 316 285 404 211 234 158 335 S22 1 377 163 279 352 287 404 211 234 158 355 S26 1 377 163 259 324 285 424 -9 254 170 327 S26 1 405 163 283 324 287 424 -9 254 188 351 S28 1 333 163 267 324 287 380 187 246 164 351 S28 1 333 163 267 336 287 416 295 246 164 355 S32 1 321 163 259 300 285 396 291 250 152 343 S32 1 365 165 259 348 285 396 291 250 164 351 S33 1 325 163 263 -9 287 408 231 238 140 371 S33 1 357 163 263 -9 287 432 251 246 146 371 S37 1 361 163 267 320 285 416 195 254 164 343 S37 1 361 165 275 320 287 420 195 254 170 355 S38 1 377 163 267 348 287 416 223 234 164 339 S38 1 385 165 275 388 287 416 223 254 164 363 S40 1 373 163 255 336 287 384 191 234 158 347 S40 1 381 163 267 348 287 416 239 250 170 495 S44 1 333 163 279 300 287 408 255 238 146 351 S44 1 333 163 283 316 287 416 255 246 146 387 S45 1 345 163 -9 -9 287 412 187 234 158 -9 S45 1 389 163 -9 -9 287 416 215 234 176 -9 S49 1 349 163 271 312 285 388 191 238 164 351 S49 1 357 163 283 344 287 400 191 238 182 495 S52 1 353 163 259 300 287 380 195 258 -9 335 S52 1 385 163 279 320 287 424 195 258 -9 351 S56 1 325 163 267 300 287 400 263 226 146 355 S56 1 389 163 283 316 287 400 263 238 188 407 S57 1 357 163 263 300 287 412 199 234 146 343 S57 1 389 163 271 316 287 412 199 242 176 391 S61 1 369 163 287 316 287 380 175 246 164 339 
S61 1 393 163 287 316 287 432 191 250 182 347 S62 1 377 163 275 316 287 392 -9 254 164 363 S62 1 385 165 275 348 287 392 -9 254 164 363 S63 1 333 163 267 332 285 388 175 234 158 347 S63 1 401 163 271 340 287 408 199 234 164 355 S67 1 377 163 251 320 287 412 187 246 158 351 S67 1 377 163 275 352 287 416 187 246 158 379 S68 1 377 163 287 332 287 404 291 238 164 339 S68 1 405 163 287 348 287 420 291 246 176 591 S69 1 325 163 255 308 285 364 191 226 -9 343 S69 1 353 163 279 316 285 416 307 242 -9 359 S71 1 385 163 255 328 285 384 251 230 164 343 S71 1 385 165 267 352 287 424 251 230 168 363 S73 1 341 159 259 308 285 416 191 230 164 335 S73 1 369 163 259 324 285 416 191 230 176 467 S74 1 -9 -9 271 320 287 384 183 226 152 347 S74 1 -9 -9 271 348 287 408 227 226 170 455 S75 1 373 163 251 308 285 380 195 234 158 351 S75 1 385 163 279 348 287 400 195 234 164 591 S79 1 333 163 275 304 287 400 211 218 170 339 S79 1 369 163 279 348 287 400 267 234 176 339 S80 1 393 163 259 316 287 388 255 230 170 407 S80 1 401 163 267 332 287 388 255 266 176 587 S86 1 369 163 283 316 285 388 203 242 152 355 S86 1 377 163 283 324 287 396 203 242 170 359 S87 1 333 163 279 300 287 408 255 242 146 351 S87 1 333 163 283 316 287 416 255 246 146 387 S92 1 321 163 259 316 287 384 211 230 164 343 S92 1 393 163 259 352 287 416 247 234 170 435 S97 1 325 163 263 316 287 380 151 230 152 343 S97 1 373 163 287 320 287 400 151 230 152 347 S98 1 357 159 283 316 287 408 191 246 152 355 S98 1 369 163 283 352 287 416 259 250 152 499 S99 1 373 163 255 332 285 372 195 234 140 351 S99 1 389 163 283 332 287 392 219 258 152 403 S100 1 365 163 263 316 287 384 203 238 152 355 S100 1 381 165 287 336 287 416 271 242 158 371 S101 1 325 163 263 320 285 416 243 246 158 355 S101 1 361 163 279 340 287 416 259 250 158 503 S102 1 361 163 271 316 285 384 239 250 152 355 S102 1 393 165 287 356 287 420 239 254 176 359 S103 1 353 163 263 304 287 388 135 246 146 343 S103 1 365 163 283 320 287 420 187 250 146 359 S104 1 361 163 287 
304 -9 360 179 230 146 363 S104 1 365 163 291 312 -9 396 179 250 176 411 S106 1 333 163 263 316 287 416 295 234 140 327 S106 1 361 163 271 356 287 428 295 238 170 367 S109 1 333 163 271 316 287 404 151 250 152 351 S109 1 389 165 283 316 287 424 151 250 158 355 S110 1 365 163 279 316 285 404 195 250 188 343 S110 1 365 163 287 340 287 404 255 254 188 347 S111 1 329 163 275 320 287 420 203 238 140 399 S111 1 361 167 275 340 287 420 203 262 176 475 S112 1 325 163 251 316 287 416 -9 238 146 367 S112 1 353 165 287 316 287 416 -9 246 176 367 S113 1 353 163 271 352 -9 404 211 242 140 327 S113 1 373 163 291 352 -9 408 211 250 152 355 S116 1 361 163 279 316 287 388 227 226 152 355 S116 1 361 163 279 316 287 420 255 230 182 371 S117 1 353 163 291 336 287 388 191 234 140 343 S117 1 377 163 291 336 287 428 195 254 158 415 S118 1 325 163 271 336 287 384 179 238 140 367 S118 1 361 163 271 344 287 384 203 254 158 603 S122 1 325 163 259 300 -9 388 219 234 158 347 S122 1 385 163 279 320 -9 400 227 242 164 483 S124 1 329 163 255 312 287 412 211 246 158 343 S124 1 353 163 267 312 287 420 299 262 158 347 S125 1 337 163 263 344 287 380 191 242 140 363 S125 1 365 165 275 344 287 380 191 242 152 423 S126 1 357 162 263 316 287 400 183 234 152 347 S126 1 357 163 263 320 287 416 239 242 164 351 S128 1 325 163 279 312 287 388 215 250 146 359 S128 1 357 163 287 344 287 408 259 250 170 359 S129 1 305 161 -9 -9 285 396 -9 230 152 355 S129 1 305 165 -9 -9 285 396 -9 250 152 355 S130 1 353 163 263 316 287 400 271 238 152 375 S130 1 377 163 271 316 287 412 271 254 158 375 S131 1 325 163 263 316 -9 408 187 230 158 351 S131 1 377 163 267 316 -9 408 219 234 170 355 S133 1 325 163 279 308 287 408 279 230 164 335 S133 1 397 163 283 316 287 416 279 238 182 351 S134 1 345 163 279 308 287 388 235 238 176 347 S134 1 365 163 279 340 287 388 235 250 188 499 S135 1 325 155 275 308 287 412 251 234 170 339 S135 1 361 163 283 332 287 432 251 254 176 351 S136 1 349 155 271 344 287 400 195 238 140 351 S136 1 365 
163 275 360 287 408 223 250 176 355 S138 1 369 165 275 336 285 396 187 234 164 331 S138 1 397 165 283 360 287 396 187 254 188 343 S139 1 373 163 279 360 285 388 231 238 140 387 S139 1 373 163 279 376 287 420 231 250 170 483 S141 1 365 163 275 308 287 412 179 234 140 395 S141 1 365 163 283 348 287 420 203 250 170 399 S142 1 333 163 267 300 287 408 179 238 164 455 S142 1 393 163 267 340 287 420 179 246 170 507 S143 1 325 163 255 324 287 388 255 234 176 343 S143 1 373 165 267 324 287 420 255 246 188 343 S147 1 369 163 271 300 287 388 227 238 170 351 S147 1 385 163 271 324 287 416 295 250 182 351 S149 1 325 163 283 304 287 404 187 238 164 343 S149 1 345 163 295 304 287 404 219 246 176 395 S150 1 353 163 263 300 285 388 191 226 158 347 S150 1 377 163 287 312 287 420 247 250 182 347 S151 1 365 163 259 316 285 360 175 234 134 447 S151 1 389 163 259 320 287 396 239 238 176 447 S154 1 333 163 275 340 285 380 195 234 158 363 S154 1 361 163 279 356 285 408 215 234 194 395 S155 1 361 163 255 308 287 416 151 234 134 331 S155 1 373 163 267 320 287 428 195 234 182 347 S156 1 361 -9 251 300 287 408 207 238 158 343 S156 1 381 -9 287 316 287 416 235 238 158 355 S157 1 365 163 271 348 285 396 195 230 152 347 S157 1 385 163 275 348 285 416 231 254 170 455 S158 1 377 163 279 308 285 408 183 234 140 323 S158 1 445 163 283 352 287 416 259 234 182 355 S159 1 357 163 263 324 285 380 211 246 152 327 S159 1 357 163 283 356 287 408 255 246 164 347 S161 1 329 163 259 308 287 384 191 234 140 359 S161 1 329 163 263 308 287 400 243 250 164 359 S162 1 353 163 283 316 287 396 183 242 182 -9 S162 1 385 163 283 316 287 404 183 242 182 -9 -- View this message in context: http://r.789695.n4.nabble.com/error-message-tp3223412p3994067.html Sent from the R help mailing list archive at Nabble.com. From michael.weylandt at gmail.com Sat Nov 5 20:20:28 2011 From: michael.weylandt at gmail.com (R. 
Michael Weylandt ) Date: Sat, 5 Nov 2011 15:20:28 -0400 Subject: [R] error message In-Reply-To: <1320520337578-3994067.post@n4.nabble.com> References: <1295368328987-3223412.post@n4.nabble.com> <1320429978716-3991100.post@n4.nabble.com> <1320520337578-3994067.post@n4.nabble.com> Message-ID: What prompted the questions? If it was a specifically biological/genetic package the bioconductor list can probably provide better help than this list. Michael On Nov 5, 2011, at 3:12 PM, JulianaMF wrote: > Hello Michael, > Sorry, I am just starting to lear all this. > Here is one of my input files (from a .str file) in which the first column > are the individuals, the second is the pop info (in this case I am stating > that I have one pop because I am still trying to find out which are the > clusters in my sample) and the others are the 10 loci. The genotypes for > each individual are in 2 rows and missing data is -9. I tried both R and R > studio and I am working from a Mac. I followed all the routine > SimStru <- read.structure(file = "SimStru.str") > and then I answered all the prompted questions (I am writing from my memory > because the files are back at the lab): 82 individuals (820 genotypes), 10 > loci, no column with marker name, column with pop info = 2, other column = 1 > (names for each sample), row with marker names = 1, missing data = -9. I > didn't change any of the other default settings. And once I answered all the > questions, I got that error message. > Thank you so much to be willing to help and sorry about my ignorance, I am > new to this! 
> [genotype table snipped; identical to the one in the message above]
>
> --
> View this message in context:
http://r.789695.n4.nabble.com/error-message-tp3223412p3994067.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From jmf at ib.usp.br Sat Nov 5 21:00:29 2011 From: jmf at ib.usp.br (JulianaMF) Date: Sat, 5 Nov 2011 13:00:29 -0700 (PDT) Subject: [R] error message In-Reply-To: References: <1295368328987-3223412.post@n4.nabble.com> <1320429978716-3991100.post@n4.nabble.com> <1320520337578-3994067.post@n4.nabble.com> Message-ID: <1320523229682-3994158.post@n4.nabble.com> Humm... I was using adegenet / ade4 packages and both R and R studio prompted the questions. Sorry, I did so many searches on R help and Adegenet help that I forgot to mention the packages I was using... Juliana -- View this message in context: http://r.789695.n4.nabble.com/error-message-tp3223412p3994158.html Sent from the R help mailing list archive at Nabble.com. From dwinsemius at comcast.net Sat Nov 5 21:04:51 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Sat, 5 Nov 2011 16:04:51 -0400 Subject: [R] nested "for" loops In-Reply-To: <1320450554839-3992089.post@n4.nabble.com> References: <1320450554839-3992089.post@n4.nabble.com> Message-ID: <94EF479A-7C57-4396-B667-D00565CA96D6@comcast.net> You need to define "l" as a dimensioned object , either a vector or an array, and then assign the value of your calculation to the correctly indexed "location" in that object. Otherwise you are just overwriting the value each time through the loop. Use these help pages (and review "Introduction to R" ?array ?vector ?"[" On Nov 4, 2011, at 7:49 PM, nick_pan wrote: > Hi all , I have written a code with nested "for" loops . 
> The aim is to estimate the maximum likelihood by creating 3 vectors with the > same length( sequence ) > and then to utilize 3 "for" loops to make combinations among the 3 vectors , > which are (length)^3 in number , and find the one that maximize the > likelihood ( maximum likelihood estimator). > > > The code I created, runs but I think something goes wrong...because when I > change the length of the vectors but not the bounds the result is the > same!!! > > I will give a simple example(irrelevant but proportional to the above) to > make it more clear... > > Lets say we want to find the combination that maximize the multiplication of > the entries of some vectors. > > V1<-c(1,2,3) > V2<-c(5, 2 , 4) > V3<-c( 4, 3, 6) > > The combination we look for is ( 3 , 5 , 6) that give us 3*5*6 = 90 > > If I apply the following in R , I won't take this result > > V1<-c(1,2,3) > V2<-c(5, 2 , 4) > V3<-c( 4, 3, 6) > > for( i in V1){ > for( j in V2) { > for( k in V3){ > > l<- i*j*k > > } > } > } > l > > Then " l<- i*j*k " is number and not vector(of all multiplications of all > the combinations) , and is 3*4*6 = 72. > > How can I fix the code? > > > > > -- > View this message in context: http://r.789695.n4.nabble.com/nested-for-loops-tp3992089p3992089.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
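The advice in this reply (give "l" dimensions and assign into the indexed location) can be sketched on the toy vectors from the question. This is a minimal illustration, not the poster's likelihood code; the names best and best_values are made up for the example.

```r
V1 <- c(1, 2, 3)
V2 <- c(5, 2, 4)
V3 <- c(4, 3, 6)

# One cell per (i, j, k) combination instead of a single overwritten number
l <- array(NA_real_, dim = c(length(V1), length(V2), length(V3)))

for (i in seq_along(V1)) {
  for (j in seq_along(V2)) {
    for (k in seq_along(V3)) {
      l[i, j, k] <- V1[i] * V2[j] * V3[k]   # store at the indexed location
    }
  }
}

# Row(s) of index triples where the maximum occurs
best <- which(l == max(l), arr.ind = TRUE)
best_values <- c(V1[best[1, 1]], V2[best[1, 2]], V3[best[1, 3]])

max(l)        # 90
best_values   # 3 5 6
```

The same array comes out of outer(outer(V1, V2), V3) without any loops, which is the vectorized route suggested elsewhere in the thread.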
From rolf.turner at xtra.co.nz Sat Nov 5 21:22:26 2011
From: rolf.turner at xtra.co.nz (Rolf Turner)
Date: Sun, 06 Nov 2011 09:22:26 +1300
Subject: [R] set seed for random draws
In-Reply-To: <4EB4FB3F.80604@pburns.seanet.com>
References: <4EB4FB3F.80604@pburns.seanet.com>
Message-ID: <4EB59B02.4030801@xtra.co.nz>

On 05/11/11 22:00, Patrick Burns wrote:

> I'd suggest two rules of thumb when coming
> up against something in R that you aren't
> sure about:
>
> 1. If it is a mundane task, R probably
> takes care of it.
>
> 2. Experiment to see what happens.
>
>
> Of course you could read documentation, but
> no one does that.

Fortune nomination!

    cheers,

        Rolf

From fmora at oikos.unam.mx Sat Nov 5 22:01:53 2011
From: fmora at oikos.unam.mx (Francisco Mora Ardila)
Date: Sat, 5 Nov 2011 15:01:53 -0600
Subject: [R] testing significance of axis loadings from multivariate dudi.mix
Message-ID: <20111105205325.M26251@oikos.unam.mx>

Hi all

I'm trying to test the significance of loadings from an ordination of 46 variables (categorical, ordinal and nominal). I used dudi.mix from ade4 for the ordination. Some years ago Jari Oksanen wrote this script implementing the Peres-Neto et al. 2003 (Ecology) bootstrapping method:

netoboot <- function (x, permutations=1000, ...)
{
    pcnull <- princomp(x, cor = TRUE, ...)
    res <- pcnull$loadings
    out <- matrix(0, nrow=nrow(res), ncol=ncol(res))
    N <- nrow(x)
    for (i in 1:permutations) {
        pc <- princomp(x[sample(N, replace=TRUE), ], cor = TRUE, ...)
        pred <- predict(pc, newdata = x)
        r <- cor(pcnull$scores, pred)
        k <- apply(abs(r), 2, which.max)
        reve <- sign(diag(r[k,]))
        sol <- pc$loadings[ ,k]
        sol <- sweep(sol, 2, reve, "*")
        out <- out + ifelse(res > 0, sol <= 0, sol >= 0)
    }
    out/permutations
}

I tried to apply it to the case of dudi.mix instead of princomp this way:

library(ade4)    # provides dudi.mix

netoboot1 <- function (x, permutations=1000, ...)
{
    dudinull <- dudi.mix(x, scannf = FALSE, nf = 3)
    res <- dudinull$c1
    out <- matrix(0, nrow=nrow(res), ncol=ncol(res))
    N <- nrow(x)
    for (i in 1:permutations) {
        dudi <- dudi.mix(x[sample(N, replace=TRUE), ], scannf = FALSE, nf = 3)
        pred <- predict(dudi, newdata = x)
        r <- cor(dudinull$li, pred)
        k <- apply(abs(r), 2, which.max)
        reve <- sign(diag(r[k,]))
        sol <- dudi$c1[ ,k]
        sol <- sweep(sol, 2, reve, "*")
        out <- out + ifelse(res > 0, sol <= 0, sol >= 0)
    }
    out/permutations
}

But a problem arose with the predict function: it doesn't seem to work with an object from dudi.mix and I don't understand why.

Can somebody tell me why? Any suggestions to modify the script or to use another method?

Thanks in advance.

Francisco

Francisco Mora Ardila
Laboratorio de Biodiversidad y Funcionamiento del Ecosistema
Centro de Investigaciones en Ecosistemas
UNAM-Campus Morelia
Tel 3222777 ext. 42621
Morelia, Michoacán, México.

--
Open WebMail Project (http://openwebmail.org)

From zndeana at ku.edu Sun Nov 6 00:22:43 2011
From: zndeana at ku.edu (Md Desa, Zairul Nor Deana Binti)
Date: Sat, 5 Nov 2011 23:22:43 +0000
Subject: [R] set seed for random draws
In-Reply-To: <4EB59B02.4030801@xtra.co.nz>
References: <4EB4FB3F.80604@pburns.seanet.com>,<4EB59B02.4030801@xtra.co.nz>
Message-ID:

Thank you everybody for the helpful advice.
Basically, I was trying to figure out why I get different numbers when there is more than one seed for a loop within a loop. Well, I guess I've got it now: every time the random seed is called or specified, it outputs different random numbers, as requested.

Thanks!

D ________________________________________ From: r-help-bounces at r-project.org [r-help-bounces at r-project.org] on behalf of Rolf Turner [rolf.turner at xtra.co.nz] Sent: Saturday, November 05, 2011 3:22 PM To: Patrick Burns Cc: r-help at r-project.org; Achim.Zeileis at uibk.ac.at Subject: Re: [R] set seed for random draws On 05/11/11 22:00, Patrick Burns wrote: > I'd suggest two rules of thumb when coming > up against something in R that you aren't > sure about: > > 1. If it is a mundane task, R probably > takes care of it. > > 2. Experiment to see what happens. > > > Of course you could read documentation, but > no one does that. Fortune nomination! cheers, Rolf ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. From gunter.berton at gene.com Sun Nov 6 00:52:29 2011 From: gunter.berton at gene.com (Bert Gunter) Date: Sat, 5 Nov 2011 16:52:29 -0700 Subject: [R] nested "for" loops In-Reply-To: <4EB5790B.8030604@witthoft.com> References: <4EB5790B.8030604@witthoft.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From ericstrom at aol.com Sat Nov 5 21:47:30 2011 From: ericstrom at aol.com (eric) Date: Sat, 5 Nov 2011 13:47:30 -0700 (PDT) Subject: [R] install.packages problem Message-ID: <1320526050344-3994239.post@n4.nabble.com> I'm trying to install the rdatamarket package. I did an install.packages('rdatamarket') command but got an error about half way through the install as follows: * installing *source* package ?RCurl? ... checking for curl-config... no Cannot find curl-config ERROR: configuration failed for package ?RCurl? The install continued after the error but looks like it was completed. I'm trying to figure out what the error means and how I fix it. 
Here's what I'm seeing ...ideas on how to address this would be appreciated : install.packages('rdatamarket') Installing package(s) into ?/home/eric/R/i486-pc-linux-gnu-library/2.13? (as ?lib? is unspecified) --- Please select a CRAN mirror for use in this session --- also installing the dependencies ?RCurl?, ?RJSONIO? trying URL 'http://lib.stat.cmu.edu/R/CRAN/src/contrib/RCurl_1.7-0.tar.gz' Content type 'application/x-gzip' length 813252 bytes (794 Kb) opened URL ================================================== downloaded 794 Kb trying URL 'http://lib.stat.cmu.edu/R/CRAN/src/contrib/RJSONIO_0.96-0.tar.gz' Content type 'application/x-gzip' length 1144519 bytes (1.1 Mb) opened URL ================================================== downloaded 1.1 Mb trying URL 'http://lib.stat.cmu.edu/R/CRAN/src/contrib/rdatamarket_0.6.3.tar.gz' Content type 'application/x-gzip' length 12432 bytes (12 Kb) opened URL ================================================== downloaded 12 Kb * installing *source* package ?RCurl? ... checking for curl-config... no Cannot find curl-config ERROR: configuration failed for package ?RCurl? * removing ?/home/eric/R/i486-pc-linux-gnu-library/2.13/RCurl? * installing *source* package ?RJSONIO? ... Trying to find libjson.h header file checking for gcc... gcc checking whether the C compiler works... yes checking for C compiler default output file name... a.out checking for suffix of executables... checking whether we are cross compiling... no checking for suffix of object files... o checking whether we are using the GNU C compiler... yes checking whether gcc accepts -g... yes checking for gcc option to accept ISO C89... none needed USE_LOCAL = "" Using local libjson code. Copying files /tmp/RtmpFw9QeX/R.INSTALL4ebf657f/RJSONIO configure: creating ./config.status config.status: creating src/Makevars config.status: creating cleanup ** libs gcc -I/usr/share/R/include -I. 
-Ilibjson -Ilibjson/Source -DNDEBUG=1 -DJSON_NO_EXCEPTIONS=1 -fpic -std=gnu99 -O3 -pipe -g -c ConvertUTF.c -o ConvertUTF.o g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1 -DJSON_NO_EXCEPTIONS=1 -fpic -O3 -pipe -g -c JSONChildren.cpp -o JSONChildren.o g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1 -DJSON_NO_EXCEPTIONS=1 -fpic -O3 -pipe -g -c JSONDebug.cpp -o JSONDebug.o g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1 -DJSON_NO_EXCEPTIONS=1 -fpic -O3 -pipe -g -c JSONIterators.cpp -o JSONIterators.o g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1 -DJSON_NO_EXCEPTIONS=1 -fpic -O3 -pipe -g -c JSONMemory.cpp -o JSONMemory.o g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1 -DJSON_NO_EXCEPTIONS=1 -fpic -O3 -pipe -g -c JSONNode.cpp -o JSONNode.o g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1 -DJSON_NO_EXCEPTIONS=1 -fpic -O3 -pipe -g -c JSONNode_Mutex.cpp -o JSONNode_Mutex.o g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1 -DJSON_NO_EXCEPTIONS=1 -fpic -O3 -pipe -g -c JSONStream.cpp -o JSONStream.o g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1 -DJSON_NO_EXCEPTIONS=1 -fpic -O3 -pipe -g -c JSONValidator.cpp -o JSONValidator.o g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1 -DJSON_NO_EXCEPTIONS=1 -fpic -O3 -pipe -g -c JSONWorker.cpp -o JSONWorker.o g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1 -DJSON_NO_EXCEPTIONS=1 -fpic -O3 -pipe -g -c JSONWriter.cpp -o JSONWriter.o g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1 -DJSON_NO_EXCEPTIONS=1 -fpic -O3 -pipe -g -c JSON_Base64.cpp -o JSON_Base64.o gcc -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1 -DJSON_NO_EXCEPTIONS=1 -fpic -std=gnu99 -O3 -pipe -g -c JSON_parser.c -o JSON_parser.o gcc -I/usr/share/R/include -I. 
-Ilibjson -Ilibjson/Source -DNDEBUG=1 -DJSON_NO_EXCEPTIONS=1 -fpic -std=gnu99 -O3 -pipe -g -c RJSON.c -o RJSON.o g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1 -DJSON_NO_EXCEPTIONS=1 -fpic -O3 -pipe -g -c internalJSONNode.cpp -o internalJSONNode.o g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1 -DJSON_NO_EXCEPTIONS=1 -fpic -O3 -pipe -g -c libjson.cpp -o libjson.o gcc -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1 -DJSON_NO_EXCEPTIONS=1 -fpic -std=gnu99 -O3 -pipe -g -c rlibjson.c -o rlibjson.o g++ -shared -o RJSONIO.so ConvertUTF.o JSONChildren.o JSONDebug.o JSONIterators.o JSONMemory.o JSONNode.o JSONNode_Mutex.o JSONStream.o JSONValidator.o JSONWorker.o JSONWriter.o JSON_Base64.o JSON_parser.o RJSON.o internalJSONNode.o libjson.o rlibjson.o -L/usr/lib/R/lib -lR installing to /home/eric/R/i486-pc-linux-gnu-library/2.13/RJSONIO/libs ** R ** inst ** preparing package for lazy loading in method for ?toJSON? with signature ?"AsIs"?: no definition for class "AsIs" ** help *** installing help indices ** building package indices ... ** testing if installed package can be loaded * DONE (RJSONIO) ERROR: dependency ?RCurl? is not available for package ?rdatamarket? * removing ?/home/eric/R/i486-pc-linux-gnu-library/2.13/rdatamarket? The downloaded packages are in ?/tmp/Rtmpz8PqWE/downloaded_packages? > require(rdatamarket) Loading required package: rdatamarket Warning message: In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE, : there is no package called 'rdatamarket' -- View this message in context: http://r.789695.n4.nabble.com/install-packages-problem-tp3994239p3994239.html Sent from the R help mailing list archive at Nabble.com. 
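The RCurl failure in the log above comes from its configure step not finding the curl-config program. A minimal sketch of how one might check for it from within R before retrying the install; this only tests whether curl-config is on the PATH, and the wording of the messages is illustrative, not taken from the thread:

```r
# Sys.which() returns "" when the named program is not found on the PATH
curl_config <- Sys.which("curl-config")

if (nzchar(curl_config)) {
  message("curl-config found at ", curl_config)
} else {
  message("curl-config not found; install the libcurl development ",
          "package for your distribution before installing RCurl")
}
```

If the check reports that curl-config is missing, install.packages('rdatamarket') can be expected to fail in the same way until the system package is installed.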
From johndarrenwood at gmail.com Sat Nov 5 17:33:12 2011
From: johndarrenwood at gmail.com (John Darrenwood)
Date: Sat, 5 Nov 2011 18:33:12 +0200
Subject: [R] ANESRAKE package: Inappropriate error message, given the data
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From kindlychung at gmail.com Sat Nov 5 18:49:51 2011
From: kindlychung at gmail.com (Kaiyin Zhong)
Date: Sun, 6 Nov 2011 01:49:51 +0800
Subject: [R] Correlation between matrices
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From julia.lira at hotmail.co.uk Sat Nov 5 19:02:46 2011
From: julia.lira at hotmail.co.uk (Julia Lira)
Date: Sat, 5 Nov 2011 18:02:46 +0000
Subject: [R] linear against nonlinear alternatives - quantile regression
In-Reply-To:
References: , <1320504195354-3993416.post@n4.nabble.com>,
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From nick_pan88 at yahoo.gr Sat Nov 5 19:20:48 2011
From: nick_pan88 at yahoo.gr (nick_pan)
Date: Sat, 5 Nov 2011 11:20:48 -0700 (PDT)
Subject: [R] nested "for" loops
In-Reply-To:
References: <1320450554839-3992089.post@n4.nabble.com> <1320459266708-3992324.post@n4.nabble.com>
Message-ID: <1320517248611-3993917.post@n4.nabble.com>

I found the way out: the bounds of the vectors were close together, which is why I got the same result as I added points to the sequence. The example I gave was not the real problem, but I made it in order to find out what the problem was.
Thank you all for your answers.

--
View this message in context: http://r.789695.n4.nabble.com/nested-for-loops-tp3992089p3993917.html
Sent from the R help mailing list archive at Nabble.com.

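For the toy example in this nested "for" loops thread (V1, V2, V3 and their products), here is a minimal sketch of storing every product in a pre-allocated array rather than overwriting a single value. The names prod_arr, best, best_combo and grid are illustrative assumptions and do not come from the original posts:

```r
V1 <- c(1, 2, 3)
V2 <- c(5, 2, 4)
V3 <- c(4, 3, 6)

# Pre-allocate a 3-d array with one cell per (i, j, k) combination
prod_arr <- array(NA_real_, dim = c(length(V1), length(V2), length(V3)))

for (i in seq_along(V1)) {
  for (j in seq_along(V2)) {
    for (k in seq_along(V3)) {
      # Store each product at its own index instead of overwriting one value
      prod_arr[i, j, k] <- V1[i] * V2[j] * V3[k]
    }
  }
}

# Locate the largest product and the entries that produced it
best <- which(prod_arr == max(prod_arr), arr.ind = TRUE)
best_combo <- c(V1[best[1, 1]], V2[best[1, 2]], V3[best[1, 3]])

# Loop-free alternative: enumerate all combinations with expand.grid()
grid <- expand.grid(v1 = V1, v2 = V2, v3 = V3)
grid$prod <- with(grid, v1 * v2 * v3)
grid[which.max(grid$prod), ]
```

Here max(prod_arr) is 90 and best_combo is 3, 5, 6. For a real likelihood, the product would be replaced by the likelihood evaluated at each parameter combination.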
From scottdaniel25 at gmail.com Sun Nov 6 00:20:36 2011
From: scottdaniel25 at gmail.com (ScottDaniel)
Date: Sat, 5 Nov 2011 16:20:36 -0700 (PDT)
Subject: [R] Doing dist on separate objects in a text file
Message-ID: <1320535236339-3994515.post@n4.nabble.com>

So I have a text file that looks like this:

        "Label"                         "X"     "Y"     "Slice"
1       "Field_1_R3D_D3D_PRJ_w617.tif"  348     506     1
2       "Field_1_R3D_D3D_PRJ_w617.tif"  359     505     1
3       "Field_1_R3D_D3D_PRJ_w617.tif"  356     524     1
4       "Field_1_R3D_D3D_PRJ_w617.tif"  2       0       1
5       "Field_1_R3D_D3D_PRJ_w617.tif"  412     872     1
6       "Field_1_R3D_D3D_PRJ_w617.tif"  422     863     1
7       "Field_1_R3D_D3D_PRJ_w617.tif"  429     858     1
8       "Field_1_R3D_D3D_PRJ_w617.tif"  429     880     1
9       "Field_1_R3D_D3D_PRJ_w617.tif"  437     865     1
10      "Field_1_R3D_D3D_PRJ_w617.tif"  447     855     1
11      "Field_1_R3D_D3D_PRJ_w617.tif"  450     868     1
12      "Field_1_R3D_D3D_PRJ_w617.tif"  447     875     1
13      "Field_1_R3D_D3D_PRJ_w617.tif"  439     885     1
14      "Field_1_R3D_D3D_PRJ_w617.tif"  2       8       1

What it represents are the locations of centromeres per nucleus in a microscope image. What I need to do is do a dist() on each grouping (the grouping being separated by the low values of x and y's) and then compute an average. The part that I'm having trouble with is writing code that will allow R to separate these objects. Do I have to find some way of creating separate data frames for each object? Or is there a way to parse the file and generate a single data frame of all the pairwise distances? Any suggestions or example code would be much appreciated. Thanks!

--
View this message in context: http://r.789695.n4.nabble.com/Doing-dist-on-separate-objects-in-a-text-file-tp3994515p3994515.html
Sent from the R help mailing list archive at Nabble.com.

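One possible way to approach the grouping described above, sketched under a loudly stated assumption: rows whose X and Y are both below 10 are treated as separators between nuclei. That threshold is a guess from the sample rows (2, 0) and (2, 8), not something the poster specified, and the names pts, is_sep, grp, nuclei and mean_dist are illustrative:

```r
# The X and Y columns from the sample data in the message
pts <- data.frame(
  X = c(348, 359, 356, 2, 412, 422, 429, 429, 437, 447, 450, 447, 439, 2),
  Y = c(506, 505, 524, 0, 872, 863, 858, 880, 865, 855, 868, 875, 885, 8)
)

# ASSUMPTION: a row with both coordinates below 10 marks the end of a group
is_sep <- pts$X < 10 & pts$Y < 10

# The group id increases by one after each separator row
grp <- cumsum(is_sep) + 1

# Drop the separator rows and split the remaining points by group
nuclei <- split(pts[!is_sep, ], grp[!is_sep])

# Mean pairwise Euclidean distance within each nucleus
mean_dist <- sapply(nuclei, function(g) mean(dist(g[, c("X", "Y")])))
mean_dist
```

With the sample data this yields two groups, of 3 and 9 points. In practice pts would come from read.table() on the file, and the separator rule would need to match how the file was actually produced.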
From jdnewmil at dcn.davis.ca.us Sun Nov 6 03:11:30 2011
From: jdnewmil at dcn.davis.ca.us (Jeff Newmiller)
Date: Sat, 05 Nov 2011 19:11:30 -0700
Subject: [R] nested "for" loops
In-Reply-To:
References: <4EB5790B.8030604@witthoft.com>
Message-ID: <229f4379-8d5b-42a1-9406-399541cc752c@email.android.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From michael.weylandt at gmail.com Sun Nov 6 03:57:35 2011
From: michael.weylandt at gmail.com (R. Michael Weylandt)
Date: Sat, 5 Nov 2011 22:57:35 -0400
Subject: [R] Doing dist on separate objects in a text file
In-Reply-To: <1320535236339-3994515.post@n4.nabble.com>
References: <1320535236339-3994515.post@n4.nabble.com>
Message-ID:

Perhaps split() directly or more abstractly tapply() from base or one of the d_ply() from plyr?

Michael

On Sat, Nov 5, 2011 at 7:20 PM, ScottDaniel wrote:
> So I have a text file that looks like this:
>         "Label"                         "X"     "Y"     "Slice"
> 1       "Field_1_R3D_D3D_PRJ_w617.tif"  348     506     1
> 2       "Field_1_R3D_D3D_PRJ_w617.tif"  359     505     1
> 3       "Field_1_R3D_D3D_PRJ_w617.tif"  356     524     1
> 4       "Field_1_R3D_D3D_PRJ_w617.tif"  2       0       1
> 5       "Field_1_R3D_D3D_PRJ_w617.tif"  412     872     1
> 6       "Field_1_R3D_D3D_PRJ_w617.tif"  422     863     1
> 7       "Field_1_R3D_D3D_PRJ_w617.tif"  429     858     1
> 8       "Field_1_R3D_D3D_PRJ_w617.tif"  429     880     1
> 9       "Field_1_R3D_D3D_PRJ_w617.tif"  437     865     1
> 10      "Field_1_R3D_D3D_PRJ_w617.tif"  447     855     1
> 11      "Field_1_R3D_D3D_PRJ_w617.tif"  450     868     1
> 12      "Field_1_R3D_D3D_PRJ_w617.tif"  447     875     1
> 13      "Field_1_R3D_D3D_PRJ_w617.tif"  439     885     1
> 14      "Field_1_R3D_D3D_PRJ_w617.tif"  2       8       1
>
> What it represents are the locations of centromeres per nucleus in a
> microscope image. What I need to do is do a dist() on each grouping (the
> grouping being separated by the low values of x and y's) and then compute an
> average. The part that I'm having trouble with is writing code that will
> allow R to separate these objects. Do I have to find some way of creating
> separate data frames for each object? Or is there a way to parse the file
> and generate a single data frame of all the pairwise distances? Any
> suggestions or example code would be much appreciated. Thanks!
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Doing-dist-on-separate-objects-in-a-text-file-tp3994515p3994515.html
> Sent from the R help mailing list archive at Nabble.com.
>

From ahz001 at gmail.com Sun Nov 6 04:57:06 2011
From: ahz001 at gmail.com (Andrew Z)
Date: Sat, 5 Nov 2011 21:57:06 -0600
Subject: [R] install.packages problem
In-Reply-To: <1320526050344-3994239.post@n4.nabble.com>
References: <1320526050344-3994239.post@n4.nabble.com>
Message-ID:

On Sat, Nov 5, 2011 at 2:47 PM, eric wrote:
> I'm trying to install the rdatamarket package. I did an
> install.packages('rdatamarket') command but got an error about half way
> through the install as follows:
>
> * installing *source* package ‘RCurl’ ...
> checking for curl-config... no
> Cannot find curl-config
> ERROR: configuration failed for package ‘RCurl’
>
> The install continued after the error but looks like it was completed. I'm
> trying to figure out what the error means and how I fix it.

I think you are on Linux and missing the library headers to build against the curl.
On Fedora or RedHat you would do something like 'sudo yum -y install curl-devel' and on Ubuntu it may be 'sudo apt-get install curl-dev'

Andrew

From michael.weylandt at gmail.com Sun Nov 6 05:02:17 2011
From: michael.weylandt at gmail.com (R. Michael Weylandt)
Date: Sun, 6 Nov 2011 00:02:17 -0400
Subject: [R] nested "for" loops
In-Reply-To: <1320517248611-3993917.post@n4.nabble.com>
References: <1320450554839-3992089.post@n4.nabble.com> <1320459266708-3992324.post@n4.nabble.com> <1320517248611-3993917.post@n4.nabble.com>
Message-ID:

No idea how this relates to what you said originally but glad you got it all worked out.

And let us all reiterate: really, don't use nested for loops...there's a better way: promise!

Michael

On Sat, Nov 5, 2011 at 2:20 PM, nick_pan wrote:
> I found the way out - it was because the borders of the vectors was close
> enough thats why I had the same result while I was adding points to the
> sequence. The example I gave was irrelevant but I made in order to find out
> that the problem was.
> Thank you all for your answers.
>
>
> --
> View this message in context: http://r.789695.n4.nabble.com/nested-for-loops-tp3992089p3993917.html
> Sent from the R help mailing list archive at Nabble.com.
>

From michael.weylandt at gmail.com Sun Nov 6 05:21:34 2011
From: michael.weylandt at gmail.com (R. Michael Weylandt)
Date: Sun, 6 Nov 2011 00:21:34 -0400
Subject: [R] Matrix element-by-element multiplication
In-Reply-To: <3a6945a1-67cc-4a87-b795-542e67942f98@kedge2.utk.tennessee.edu>
References: <1988657a-bd29-47f6-a85e-a088c62b85b5@kedge2.utk.tennessee.edu> <3a6945a1-67cc-4a87-b795-542e67942f98@kedge2.utk.tennessee.edu>
Message-ID:

There are a few (nasty?) side-effects to c(), one of which is stripping a matrix of its dimensionality. E.g.,

x <- matrix(1:4, 2)
c(x)
[1] 1 2 3 4

So that's probably what happened to you. R has a somewhat odd feature of not really considering a pure vector as a column or row vector but being willing to change it to either: e.g.

y <- 1:2

x %*% y
y %*% x
y %*% y

while matrix(y) %*% x throws an error, which can also trip folks up. You might also note that x * y and y*x return the same thing in this problem.

Getting back to your problem: what are v and b and what are you hoping to get done? Specifically, what happened when you tried v*b (give the exact error message). It seems likely that they are non-conformable matrices, but here non-conformable for element-wise multiplication doesn't mean the same thing as it does for matrix multiplication. E.g.,

x <- matrix(1:4,2)
y <- matrix(1:6,2)

dim(x)
[1] 2 2

dim(y)
[1] 2 3

x * y -- here R seems to want matrices with identical dimensions, but I can't promise that.

x %*% y does work.

Hope this helps and yes I know it can seem crazy at first, but there really is reason behind it at the end of the tunnel,

Michael

On Sun, Nov 6, 2011 at 12:11 AM, Steven Yen wrote:
> My earlier attempt
>
>    dp <- v*b
>
> did not work. Then,
>
>    dp <- c(v)*b
>
> worked.
>
> Confused,
>
> Steven
>
> At 09:10 PM 11/4/2011, you wrote:
>
> Did you even try?
>
> a <- 1:3
> x <- matrix(c(1,2,3,2,4,6,3,6,9),3)
> a*x
>
>      [,1] [,2] [,3]
> [1,]    1    2    3
> [2,]    4    8   12
> [3,]    9   18   27
>
> Michael
>
> On Fri, Nov 4, 2011 at 7:26 PM, Steven Yen wrote:
>> is there a way to do element-by-element multiplication as in Gauss
>> and MATLAB, as shown below? Thanks.
>>
>> ---
>> a
>>
>>        1.0000000
>>        2.0000000
>>        3.0000000
>> x
>>
>>        1.0000000        2.0000000        3.0000000
>>        2.0000000        4.0000000        6.0000000
>>        3.0000000        6.0000000        9.0000000
>> a.*x
>>
>>        1.0000000        2.0000000        3.0000000
>>        4.0000000        8.0000000        12.000000
>>        9.0000000        18.000000        27.000000
>>
>>
>> --
>> Steven T. Yen, Professor of Agricultural Economics
>> The University of Tennessee
>> http://web.utk.edu/~syen/
>>        [[alternative HTML version deleted]]
>>
>
> --
> Steven T. Yen, Professor of Agricultural Economics
> The University of Tennessee
> http://web.utk.edu/~syen/

From michael.weylandt at gmail.com Sun Nov 6 05:27:33 2011
From: michael.weylandt at gmail.com (R. Michael Weylandt)
Date: Sun, 6 Nov 2011 00:27:33 -0400
Subject: [R] set seed for random draws
In-Reply-To:
References: <4EB4FB3F.80604@pburns.seanet.com> <4EB59B02.4030801@xtra.co.nz>
Message-ID:

I think it's easier than you are making it: the random seed is created in a "pretty-random" way when you first use it and then it is updated with each call to rDIST(). For example,

set.seed(1)
x1 <- .Random.seed
rnorm(1)
x2 <- .Random.seed
rnorm(1)
x3 <- .Random.seed

identical(x1, x2)
FALSE

identical(x1, x3)
FALSE

identical(x2, x3)
FALSE

set.seed(1)
identical(x1, .Random.seed)
TRUE

rnorm(2)
identical(x3, .Random.seed)
TRUE

But the period for the random seed to repeat is very, very long so you don't have to think about it unless you really need to (or for reproducible simulations).

Michael

On Sat, Nov 5, 2011 at 7:22 PM, Md Desa, Zairul Nor Deana Binti wrote:
> Thank you everybody for the helpful advices.
> Basically, I try to figure out why I get different numbers as there are more than one seed for a loop within a loop. Well, I guest I got it now. Because every time random seed is called or specified it'll output different random numbers, as it's requested.
>
> Thanks!
> > D > ________________________________________ > From: r-help-bounces at r-project.org [r-help-bounces at r-project.org] on behalf of Rolf Turner [rolf.turner at xtra.co.nz] > Sent: Saturday, November 05, 2011 3:22 PM > To: Patrick Burns > Cc: r-help at r-project.org; Achim.Zeileis at uibk.ac.at > Subject: Re: [R] set seed for random draws > > On 05/11/11 22:00, Patrick Burns wrote: > > >> I'd suggest two rules of thumb when coming >> up against something in R that you aren't >> sure about: >> >> 1. If it is a mundane task, R probably >> takes care of it. >> >> 2. Experiment to see what happens. >> >> >> Of course you could read documentation, but >> no one does that. > > > > Fortune nomination! > > ? ? cheers, > > ? ? ? ? Rolf > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From djmuser at gmail.com Sun Nov 6 05:53:34 2011 From: djmuser at gmail.com (Dennis Murphy) Date: Sat, 5 Nov 2011 21:53:34 -0700 Subject: [R] Correlation between matrices In-Reply-To: References: Message-ID: Hi: I don't think you want to keep these objects separate; it's better to combine everything into a data frame. Here's a variation of your example - the x variable ends up being a mouse, but you may have another variable that's more appropriate to plot so take this as a starting point. One plot uses the ggplot2 package, the other uses the lattice and latticeExtra packages. 
library('ggplot2')
library('plyr')    # for ddply()

regions = c('cortex', 'hippocampus', 'brain_stem', 'mid_brain', 'cerebellum')
mice = paste('mouse', 1:5, sep='')
elem <- c('Cu', 'Fe', 'Zn', 'Ca', 'Enzyme')

# Generate a data frame from the combinations of
# mice, regions and elem:
d <- data.frame(expand.grid(mice = mice, regions = regions, elem = elem), y = rnorm(125))

# Create a numeric version of mice
d$mouse <- as.numeric(d$mice)

# A function to return regression coefficients
coefun <- function(df) coef(lm(y ~ mouse, data = df))

# Apply to all regions * elem combinations
coefs <- ddply(d, .(regions, elem), coefun)
names(coefs) <- c('regions', 'elem', 'b0', 'b1')

# Generate the plot using package ggplot2:
ggplot(d, aes(x = mouse, y = y)) +
    geom_point(size = 2.5) +
    geom_abline(data = coefs, aes(intercept = b0, slope = b1), size = 1) +
    facet_grid(elem ~ regions)

# Same plot in lattice:
library('lattice')
library('latticeExtra')
p <- xyplot(y ~ mouse | elem + regions, data = d, type = c('p', 'r'), layout = c(5, 5))

HTH,
Dennis

On Sat, Nov 5, 2011 at 10:49 AM, Kaiyin Zhong wrote:
>> regions = c('cortex', 'hippocampus', 'brain_stem', 'mid_brain',
> 'cerebellum')
>> mice = paste('mouse', 1:5, sep='')
>> for (n in c('Cu', 'Fe', 'Zn', 'Ca', 'Enzyme')) {
> +   assign(n, as.data.frame(replicate(5, rnorm(5))))
> + }
>> names(Cu) = names(Zn) = names(Fe) = names(Ca) = names(Enzyme) = regions
>> row.names(Cu) = row.names(Zn) = row.names(Fe) = row.names(Ca) =
> row.names(Enzyme) = mice
>> Cu
>            cortex hippocampus brain_stem  mid_brain cerebellum
> mouse1 -0.5436573 -0.31486713  0.1039148 -0.3908665 -1.0849112
> mouse2  1.4559136  1.75731752 -2.1195118 -0.9894767  0.3609033
> mouse3 -0.6735427 -0.04666507  0.9641000  0.4683339  0.7419944
> mouse4  0.6926557 -0.47820023  1.3560802  0.9967562 -1.3727874
> mouse5  0.2371585  0.20031393 -1.4978517  0.7535148  0.5632443
>> Zn
>             cortex hippocampus brain_stem  mid_brain  cerebellum
> mouse1 -0.66424043   0.6664478  1.1983546  0.0319403  0.41955740
> mouse2 -1.14510448   1.5612235  0.3210821  0.4094753  1.01637466
> mouse3 -0.85954416   2.8275458 -0.6922565 -0.8182307 -0.06961242
> mouse4  0.03606034  -0.7177256  0.7067217  0.2036655 -0.25542524
> mouse5  0.67427572   0.6171704  0.1044267 -1.8636174 -0.07654666
>> Fe
>            cortex hippocampus  brain_stem  mid_brain cerebellum
> mouse1  1.8337008   2.0884261  0.29730413 -1.6884804  0.8336137
> mouse2 -0.2734139  -0.5728439  0.63791556 -0.6232828 -1.1352224
> mouse3 -0.4795082   0.1627235  0.21775206  1.0751584 -0.5581422
> mouse4  1.7125147  -0.5830600  1.40597896 -0.2815305  0.3776360
> mouse5 -0.3469067  -0.4813120 -0.09606797  1.0970077 -1.1234038
>> Ca
>            cortex hippocampus  brain_stem   mid_brain cerebellum
> mouse1 -0.7663354   0.8595091  1.33803798 -1.17651576  0.8299963
> mouse2 -0.7132260  -0.2626811  0.08025079 -2.40924271  0.7883005
> mouse3 -0.7988904  -0.1144639 -0.65901136  0.42462227  0.7068755
> mouse4  0.3880393   0.5570068 -0.49969135  0.06633009 -1.3497228
> mouse5  1.0077684   0.6023264 -0.57387762  0.25919461 -0.9337281
>> Enzyme
>            cortex hippocampus  brain_stem  mid_brain cerebellum
> mouse1  1.3430936   0.5335819 -0.56992947  1.3565803 -0.8323391
> mouse2  1.0520850  -1.0201124  0.89600005  1.4719880  1.0854768
> mouse3 -0.2802482   0.6863323 -1.37483570 -0.7790174  0.2446761
> mouse4 -0.1916415  -0.4566571  1.93365932  1.3493848  0.2130424
> mouse5 -1.0349593  -0.1940268 -0.07216321 -0.2968288  1.7406905
>
> In each anatomic region, I would like to calculate the correlation between
> Enzyme activity and each of the concentrations of Cu, Zn, Fe, and Ca, and
> do a scatter plot with a tendency line, organizing those plots into a grid.
> See the image below for the desired effect:
> http://postimage.org/image/62brra6jn/
> How can I achieve this?
>
> Thank you in advance.
>
>        [[alternative HTML version deleted]]
>

From ripley at stats.ox.ac.uk Sun Nov 6 07:24:14 2011
From: ripley at stats.ox.ac.uk (Prof Brian Ripley)
Date: Sun, 6 Nov 2011 06:24:14 +0000 (GMT)
Subject: [R] install.packages problem
In-Reply-To: <1320526050344-3994239.post@n4.nabble.com>
References: <1320526050344-3994239.post@n4.nabble.com>
Message-ID:

This is something missing from your (unstated) Linux installation. curl-config is part of the original libcurl sources, but Linux distributors tend to separate it out. *How* they do so is non-standard:

Fedora and other RPM-based distributions tend to use libcurl-devel
Debian and related tend to use libcurl-dev

You need to figure this out for your distribution and install the missing piece.

On Sat, 5 Nov 2011, eric wrote:

> I'm trying to install the rdatamarket package. I did an
> install.packages('rdatamarket') command but got an error about half way
> through the install as follows:
>
> * installing *source* package ‘RCurl’ ...
> checking for curl-config... no
> Cannot find curl-config
> ERROR: configuration failed for package ‘RCurl’
>
> The install continued after the error but looks like it was completed. I'm
> trying to figure out what the error means and how I fix it.
>
> Here's what I'm seeing ...ideas on how to address this would be appreciated
> :
>
> install.packages('rdatamarket')
> Installing package(s) into ‘/home/eric/R/i486-pc-linux-gnu-library/2.13’
> (as ‘lib’ is unspecified)
> --- Please select a CRAN mirror for use in this session ---
> also installing the dependencies ‘RCurl’, ‘RJSONIO’
> > trying URL 'http://lib.stat.cmu.edu/R/CRAN/src/contrib/RCurl_1.7-0.tar.gz' > Content type 'application/x-gzip' length 813252 bytes (794 Kb) > opened URL > ================================================== > downloaded 794 Kb > > trying URL > 'http://lib.stat.cmu.edu/R/CRAN/src/contrib/RJSONIO_0.96-0.tar.gz' > Content type 'application/x-gzip' length 1144519 bytes (1.1 Mb) > opened URL > ================================================== > downloaded 1.1 Mb > > trying URL > 'http://lib.stat.cmu.edu/R/CRAN/src/contrib/rdatamarket_0.6.3.tar.gz' > Content type 'application/x-gzip' length 12432 bytes (12 Kb) > opened URL > ================================================== > downloaded 12 Kb > > * installing *source* package ?RCurl? ... > checking for curl-config... no > Cannot find curl-config > ERROR: configuration failed for package ?RCurl? > * removing ?/home/eric/R/i486-pc-linux-gnu-library/2.13/RCurl? > * installing *source* package ?RJSONIO? ... > Trying to find libjson.h header file > checking for gcc... gcc > checking whether the C compiler works... yes > checking for C compiler default output file name... a.out > checking for suffix of executables... > checking whether we are cross compiling... no > checking for suffix of object files... o > checking whether we are using the GNU C compiler... yes > checking whether gcc accepts -g... yes > checking for gcc option to accept ISO C89... none needed > USE_LOCAL = "" > Using local libjson code. Copying files > /tmp/RtmpFw9QeX/R.INSTALL4ebf657f/RJSONIO > configure: creating ./config.status > config.status: creating src/Makevars > config.status: creating cleanup > ** libs > gcc -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1 > -DJSON_NO_EXCEPTIONS=1 -fpic -std=gnu99 -O3 -pipe -g -c ConvertUTF.c > -o ConvertUTF.o > g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1 > -DJSON_NO_EXCEPTIONS=1 -fpic -O3 -pipe -g -c JSONChildren.cpp -o > JSONChildren.o > g++ -I/usr/share/R/include -I. 
-Ilibjson -Ilibjson/Source -DNDEBUG=1 > -DJSON_NO_EXCEPTIONS=1 -fpic -O3 -pipe -g -c JSONDebug.cpp -o > JSONDebug.o > g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1 > -DJSON_NO_EXCEPTIONS=1 -fpic -O3 -pipe -g -c JSONIterators.cpp -o > JSONIterators.o > g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1 > -DJSON_NO_EXCEPTIONS=1 -fpic -O3 -pipe -g -c JSONMemory.cpp -o > JSONMemory.o > g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1 > -DJSON_NO_EXCEPTIONS=1 -fpic -O3 -pipe -g -c JSONNode.cpp -o > JSONNode.o > g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1 > -DJSON_NO_EXCEPTIONS=1 -fpic -O3 -pipe -g -c JSONNode_Mutex.cpp -o > JSONNode_Mutex.o > g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1 > -DJSON_NO_EXCEPTIONS=1 -fpic -O3 -pipe -g -c JSONStream.cpp -o > JSONStream.o > g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1 > -DJSON_NO_EXCEPTIONS=1 -fpic -O3 -pipe -g -c JSONValidator.cpp -o > JSONValidator.o > g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1 > -DJSON_NO_EXCEPTIONS=1 -fpic -O3 -pipe -g -c JSONWorker.cpp -o > JSONWorker.o > g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1 > -DJSON_NO_EXCEPTIONS=1 -fpic -O3 -pipe -g -c JSONWriter.cpp -o > JSONWriter.o > g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1 > -DJSON_NO_EXCEPTIONS=1 -fpic -O3 -pipe -g -c JSON_Base64.cpp -o > JSON_Base64.o > gcc -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1 > -DJSON_NO_EXCEPTIONS=1 -fpic -std=gnu99 -O3 -pipe -g -c JSON_parser.c > -o JSON_parser.o > gcc -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1 > -DJSON_NO_EXCEPTIONS=1 -fpic -std=gnu99 -O3 -pipe -g -c RJSON.c -o > RJSON.o > g++ -I/usr/share/R/include -I. 
-Ilibjson -Ilibjson/Source -DNDEBUG=1 > -DJSON_NO_EXCEPTIONS=1 -fpic -O3 -pipe -g -c internalJSONNode.cpp -o > internalJSONNode.o > g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1 > -DJSON_NO_EXCEPTIONS=1 -fpic -O3 -pipe -g -c libjson.cpp -o libjson.o > gcc -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1 > -DJSON_NO_EXCEPTIONS=1 -fpic -std=gnu99 -O3 -pipe -g -c rlibjson.c -o > rlibjson.o > g++ -shared -o RJSONIO.so ConvertUTF.o JSONChildren.o JSONDebug.o > JSONIterators.o JSONMemory.o JSONNode.o JSONNode_Mutex.o JSONStream.o > JSONValidator.o JSONWorker.o JSONWriter.o JSON_Base64.o JSON_parser.o > RJSON.o internalJSONNode.o libjson.o rlibjson.o -L/usr/lib/R/lib -lR > installing to /home/eric/R/i486-pc-linux-gnu-library/2.13/RJSONIO/libs > ** R > ** inst > ** preparing package for lazy loading > in method for ‘toJSON’ with signature ‘"AsIs"’: no definition for class > "AsIs" > ** help > *** installing help indices > ** building package indices ... > ** testing if installed package can be loaded > > * DONE (RJSONIO) > ERROR: dependency ‘RCurl’ is not available for package ‘rdatamarket’ > * removing ‘/home/eric/R/i486-pc-linux-gnu-library/2.13/rdatamarket’ > > The downloaded packages are in > ‘/tmp/Rtmpz8PqWE/downloaded_packages’ > >> require(rdatamarket) > Loading required package: rdatamarket > Warning message: > In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return > = TRUE, : > there is no package called 'rdatamarket' > > -- > View this message in context: http://r.789695.n4.nabble.com/install-packages-problem-tp3994239p3994239.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Brian D.
Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 From ripley at stats.ox.ac.uk Sun Nov 6 07:30:18 2011 From: ripley at stats.ox.ac.uk (Prof Brian Ripley) Date: Sun, 6 Nov 2011 06:30:18 +0000 (GMT) Subject: [R] set seed for random draws In-Reply-To: References: <4EB4FB3F.80604@pburns.seanet.com> <4EB59B02.4030801@xtra.co.nz> Message-ID: On Sun, 6 Nov 2011, R. Michael Weylandt wrote: > I think it's easier than you are making it: the random seed is created > in a "pretty-random" way when you first use it and then it is updated Ah: It is unless you then save the workspace. If you do, then every subsequent session starts with the same seed until you save the workspace again. So never saving and always saving works fine, but occasional saving can lead to puzzlement. That "pretty-random" way is as unpredictable as pseudo-random numbers. (It uses a PRNG internally.) > with each call to rDIST(). > > For example, > > set.seed(1) > x1 <- .Random.seed > rnorm(1) > x2 <- .Random.seed > rnorm(1) > x3 <- .Random.seed > > identical(x1, x2) > FALSE > > identical(x1, x3) > FALSE > > identical(x2, x3) > FALSE > > set.seed(1) > identical(x1, .Random.seed) > TRUE > > rnorm(2) > identical(x3, .Random.seed) > TRUE > > But the period for the random seed to repeat is very, very long so you > don't have to think about it unless you really need to (or for > reproducible simulations) > > Michael > > On Sat, Nov 5, 2011 at 7:22 PM, Md Desa, Zairul Nor Deana Binti > wrote: >> Thank you everybody for the helpful advice. >> Basically, I try to figure out why I get different numbers as there are more than one seed for a loop within a loop. Well, I guess I got it now. Because every time random seed is called or specified it'll output different random numbers, as it's requested. >> >> Thanks!
>> >> D >> ________________________________________ >> From: r-help-bounces at r-project.org [r-help-bounces at r-project.org] on behalf of Rolf Turner [rolf.turner at xtra.co.nz] >> Sent: Saturday, November 05, 2011 3:22 PM >> To: Patrick Burns >> Cc: r-help at r-project.org; Achim.Zeileis at uibk.ac.at >> Subject: Re: [R] set seed for random draws >> >> On 05/11/11 22:00, Patrick Burns wrote: >> >> >>> I'd suggest two rules of thumb when coming >>> up against something in R that you aren't >>> sure about: >>> >>> 1. If it is a mundane task, R probably >>> takes care of it. >>> >>> 2. Experiment to see what happens. >>> >>> >>> Of course you could read documentation, but >>> no one does that. >> >> >> >> Fortune nomination! >> >> cheers, >> >> Rolf >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Brian D.
Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 From nick_pan88 at yahoo.gr Sun Nov 6 05:22:25 2011 From: nick_pan88 at yahoo.gr (nick_pan) Date: Sat, 5 Nov 2011 21:22:25 -0700 (PDT) Subject: [R] nested "for" loops In-Reply-To: References: <1320450554839-3992089.post@n4.nabble.com> <1320459266708-3992324.post@n4.nabble.com> <1320517248611-3993917.post@n4.nabble.com> Message-ID: <1320553345829-3994996.post@n4.nabble.com> I sent a post yesterday that I found out why my function didn't work. It's ok now it works. Thank you all. -- View this message in context: http://r.789695.n4.nabble.com/nested-for-loops-tp3992089p3994996.html Sent from the R help mailing list archive at Nabble.com. From kindlychung at gmail.com Sun Nov 6 07:06:50 2011 From: kindlychung at gmail.com (Kaiyin Zhong) Date: Sun, 6 Nov 2011 14:06:50 +0800 Subject: [R] Correlation between matrices In-Reply-To: References: Message-ID: Thank you Dennis, your tips are really helpful. I don't quite understand the lm(y~mouse) part; my intention was -- in pseudo code -- lm(y(Enzyme) ~ y(each elem)). In addition, attach(d) seems necessary before using lm(y~mouse), and since d$mouse has a length 125, while each elem for each region has a length 5, it generates the following error: > coefs = ddply(d, .(regions, elem), coefun) Error in model.frame.default(formula = y ~ mouse, drop.unused.levels = TRUE) : variable lengths differ (found for 'mouse') On Sun, Nov 6, 2011 at 12:53 PM, Dennis Murphy wrote: > > Hi: > > I don't think you want to keep these objects separate; it's better to > combine everything into a data frame. Here's a variation of your > example - the x variable ends up being a mouse, but you may have > another variable that's more appropriate to plot so take this as a > starting point. 
One plot uses the ggplot2 package, the other uses the > lattice and latticeExtra packages. > > library('ggplot2') > regions = c('cortex', 'hippocampus', 'brain_stem', 'mid_brain', > 'cerebellum') > mice = paste('mouse', 1:5, sep='') > elem <- c('Cu', 'Fe', 'Zn', 'Ca', 'Enzyme') > > # Generate a data frame from the combinations of > # mice, regions and elem: > d <- data.frame(expand.grid(mice = mice, regions = regions, > elem = elem), y = rnorm(125)) > # Create a numeric version of mice > d$mouse <- as.numeric(d$mice) > > # A function to return regression coefficients > coefun <- function(df) coef(lm(y ~ mouse), data = df) > # Apply to all regions * elem combinations > coefs <- ddply(d, .(regions, elem), coefun) > names(coefs) <- c('regions', 'elem', 'b0', 'b1') > > # Generate the plot using package ggplot2: > ggplot(d, aes(x = mouse, y = y)) + > geom_point(size = 2.5) + > geom_abline(data = coefs, aes(intercept = b0, slope = b1), > size = 1) + > facet_grid(elem ~ regions) > > # Same plot in lattice: > library('lattice') > library('latticeExtra') > p <- xyplot(y ~ mouse | elem + regions, data = d, type = c('p', 'r'), > layout = c(5, 5)) > > > HTH, > Dennis > > On Sat, Nov 5, 2011 at 10:49 AM, Kaiyin Zhong wrote: > >> regions = c('cortex', 'hippocampus', 'brain_stem', 'mid_brain', > > 'cerebellum') > >> mice = paste('mouse', 1:5, sep='') > >> for (n in c('Cu', 'Fe', 'Zn', 'Ca', 'Enzyme')) { > > + assign(n, as.data.frame(replicate(5, rnorm(5)))) > > + } > >> names(Cu) = names(Zn) = names(Fe) = names(Ca) = names(Enzyme) = regions > >> row.names(Cu) = row.names(Zn) = row.names(Fe) = row.names(Ca) = > > row.names(Enzyme) = mice > >> Cu > > 
cortex hippocampus brain_stem mid_brain cerebellum > > mouse1 -0.5436573 -0.31486713 0.1039148 -0.3908665 -1.0849112 > > mouse2 1.4559136 1.75731752 -2.1195118 -0.9894767 0.3609033 > > mouse3 -0.6735427 -0.04666507 0.9641000 0.4683339 0.7419944 > > mouse4 0.6926557 -0.47820023 1.3560802 0.9967562 -1.3727874 > > mouse5 0.2371585 0.20031393 -1.4978517 0.7535148 0.5632443 > >> Zn > > cortex hippocampus brain_stem mid_brain cerebellum > > mouse1 -0.66424043 0.6664478 1.1983546 0.0319403 0.41955740 > > mouse2 -1.14510448 1.5612235 0.3210821 0.4094753 1.01637466 > > mouse3 -0.85954416 2.8275458 -0.6922565 -0.8182307 -0.06961242 > > mouse4 0.03606034 -0.7177256 0.7067217 0.2036655 -0.25542524 > > mouse5 0.67427572 0.6171704 0.1044267 -1.8636174 -0.07654666 > >> Fe > > cortex hippocampus brain_stem mid_brain cerebellum > > mouse1 1.8337008 2.0884261 0.29730413 -1.6884804 0.8336137 > > mouse2 -0.2734139 -0.5728439 0.63791556 -0.6232828 -1.1352224 > > mouse3 -0.4795082 0.1627235 0.21775206 1.0751584 -0.5581422 > > mouse4 1.7125147 -0.5830600 1.40597896 -0.2815305 0.3776360 > > mouse5 -0.3469067 -0.4813120 -0.09606797 1.0970077 -1.1234038 > >> Ca > > cortex hippocampus brain_stem mid_brain cerebellum > > mouse1 -0.7663354 0.8595091 1.33803798 -1.17651576 0.8299963 > > mouse2 -0.7132260 -0.2626811 0.08025079 -2.40924271 0.7883005 > > mouse3 -0.7988904 -0.1144639 -0.65901136 0.42462227 0.7068755 > > mouse4 0.3880393 0.5570068 -0.49969135 0.06633009 -1.3497228 > > mouse5 1.0077684 0.6023264 -0.57387762 0.25919461 -0.9337281 > >> Enzyme > > cortex hippocampus brain_stem mid_brain cerebellum > > mouse1 1.3430936 0.5335819 -0.56992947 1.3565803 -0.8323391 > > mouse2 1.0520850 -1.0201124 0.89600005 1.4719880 1.0854768 > > mouse3 -0.2802482 
0.6863323 -1.37483570 -0.7790174 0.2446761 > > mouse4 -0.1916415 -0.4566571 1.93365932 1.3493848 0.2130424 > > mouse5 -1.0349593 -0.1940268 -0.07216321 -0.2968288 1.7406905 > > > > In each anatomic region, I would like to calculate the correlation between > > Enzyme activity and each of the concentrations of Cu, Zn, Fe, and Ca, and > > do a scatter plot with a tendency line, organizing those plots into a grid. > > See the image below for the desired effect: > > http://postimage.org/image/62brra6jn/ > > How can I achieve this? > > > > Thank you in advance. > > > > [[alternative HTML version deleted]] > > > > ______________________________________________ > > R-help at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > From mark_difford at yahoo.co.uk Sun Nov 6 10:33:21 2011 From: mark_difford at yahoo.co.uk (Mark Difford) Date: Sun, 6 Nov 2011 01:33:21 -0800 (PST) Subject: [R] testing significance of axis loadings from multivariate dudi.mix In-Reply-To: <20111105205325.M26251@oikos.unam.mx> References: <20111105205325.M26251@oikos.unam.mx> Message-ID: <1320572001109-3995350.post@n4.nabble.com> On Nov 05, 2011 at 11:01pm Francisco Mora Ardila wrote: > But a problem arose with the predict function: it doesn't seem to work > with an object > from dudi.mix and I don't understand why. Francisco, There is no predict() method for dudi.mix() or for any of the dudi objects in ade4. I don't see why you can't get around this by doing something like the following, but you need to take account of any scaling/centring that you might do to your data before calling dudi.mix().
## Does a dudi.mix on continuous data, so really equals a dudi.pca/princomp/PCA library(ade4) data(deug) deug.dudi <- dudi.mix(deug$tab, scann=F, nf=2) tt <- as.matrix(deug.dudi$tab) %*% as.matrix(deug.dudi$c1) ## see note below qqplot(deug.dudi$li[,1], tt[,1]) qqplot(deug.dudi$li[,2], tt[,2]) deug.princ <- princomp(deug$tab, cor=T) qqplot(predict(deug.princ)[,1], tt[,1]) ## scaling not accounted for: deug.princ <- princomp(deug$tab, cor=F) qqplot(predict(deug.princ)[,1], tt[,1]) rm(tt, deug.dudi, deug.princ) Note that in the code given above, "as.matrix(deug.dudi$tab) %*% as.matrix(deug.dudi$c1)" is based on how stats:::predict.princomp does it. Regards, Mark. ----- Mark Difford (Ph.D.) Research Associate Botany Department Nelson Mandela Metropolitan University Port Elizabeth, South Africa -- View this message in context: http://r.789695.n4.nabble.com/testing-significance-of-axis-loadings-from-multivariate-dudi-mix-tp3994281p3995350.html Sent from the R help mailing list archive at Nabble.com. From benkjellson at gmail.com Sun Nov 6 12:08:03 2011 From: benkjellson at gmail.com (Ben K) Date: Sun, 6 Nov 2011 03:08:03 -0800 (PST) Subject: [R] fGarch: garchFit and include.shape/shape parameters Message-ID: <1320577683947-3995466.post@n4.nabble.com> Hello, The function garchFit in the package fGarch allows for choosing a conditional distribution, one of which is the t-distribution. The function allows specification of the shape parameter of the distribution (equal to the degrees of freedom for the t-distribution), for which the default is set to 4. The function also includes an option "include.shape", which is "a logical flag which determines if the parameter for the shape of the conditional distribution will be estimated or not." Further, it says that "if include.shape=FALSE then the shape parameter will be kept fixed during the process of parameter optimization." 
If I have understood things correctly, I should then set include.shape = TRUE if I want the degrees of freedom (shape parameter) to be estimated when I use garchFit with conditional distribution set to "std". *Problem:* garchFit appears to keep using the default "shape = 4" even when include.shape is set to TRUE. I have tried this in a loop where garchFit is used on a different data set in each iteration, and inspecting the saved shape parameter estimates from the model (i.e. extracted by modelname@fit$params$shape), I see that they all have the value 4. I have also tried setting "shape = NULL" (error), and shape = FALSE does not help since FALSE == 0. Have I missed something here, or is this a bug of some sort? Thanks in advance. P.S. I noticed that there are some discrepancies between the package manual and the package as it is run, concerning which conditional distributions are allowed for the garchFit function, but it is perhaps a smaller matter. -- View this message in context: http://r.789695.n4.nabble.com/fGarch-garchFit-and-include-shape-shape-parameters-tp3995466p3995466.html Sent from the R help mailing list archive at Nabble.com. From rkoenker at illinois.edu Sun Nov 6 14:38:43 2011 From: rkoenker at illinois.edu (Roger Koenker) Date: Sun, 6 Nov 2011 07:38:43 -0600 Subject: [R] linear against nonlinear alternatives - quantile regression In-Reply-To: References: , <1320504195354-3993416.post@n4.nabble.com>, Message-ID: Roger Koenker rkoenker at illinois.edu On Nov 5, 2011, at 1:02 PM, Julia Lira wrote: > > Dear David, > > Indeed rq() accepts a vector of tau. I used the example given by > Frank to run > > fitspl4 <- summary(rq(b1 ~ rcs(x,4), tau=c(a1,a2,a3,a4))) > > and it works. > > I can even use anova() to test equality of slopes jointly across > quantiles. However, it would be interesting to test among different > specifications, e.g. rcs(x,4) against rcs(x,3), but it does not work. Probably because the models aren't nested...
> > Thanks for all suggestions! > > Julia > >> From: dwinsemius at comcast.net >> Date: Sat, 5 Nov 2011 13:42:34 -0400 >> To: f.harrell at vanderbilt.edu >> CC: r-help at r-project.org >> Subject: Re: [R] linear against nonlinear alternatives - quantile >> regression >> >> I suppose this constitutes thread drift, but your simple example, >> Frank, made wonder if Rq() accepts a vector argument for tau. I >> seem to remember that Koencker's rq() does.. Normally I would >> consult the help page, but the power is still out here in Central >> Connecticut and I am corresponding with a less capable device. I am >> guessing that if Rq() does accept such a vector that the form of >> the nonlinearity would be imposed at all levels of tau. >> >> -- >> David >> >> On Nov 5, 2011, at 10:43 AM, Frank Harrell >> wrote: >> >>> Just to address a piece of this - in the case in which you are >>> currently >>> focusing on only one quantile, the rms package can help by fitting >>> restricted cubic splines for covariate effects, and then run anova >>> to test >>> for nonlinearity (sometimes a dubious practice because if you then >>> remove >>> nonlinear terms you are mildly cheating). >>> >>> require(rms) >>> f <- Rq(y ~ x1 + rcs(x2,4), tau=.25) >>> anova(f) # tests associations and nonlinearity of x2 >>> >>> Frank >>> >>> Julia Lira wrote: >>>> >>>> Dear all, >>>> >>>> I would like to know whether any specification test for linear >>>> against >>>> nonlinear model hypothesis has been implemented in R using the >>>> quantreg >>>> package. >>>> >>>> I could read papers concerning this issue, but they haven't been >>>> implemented at R. As far as I know, we only have two >>>> specification tests >>>> in this line: anova.rq and Khmaladze.test. The first one test >>>> equality and >>>> significance of the slopes across quantiles and the latter one >>>> test if the >>>> linear specification is model of location or location and scale >>>> shift. >>>> >>>> Do you have any suggestion? 
>>>> >>>> Thanks a lot! >>>> >>>> Best regards, >>>> >>>> Julia >>>> >>>> [[alternative HTML version deleted]] >>>> >>>> ______________________________________________ >>>> R-help@ mailing list >>>> https://stat.ethz.ch/mailman/listinfo/r-help >>>> PLEASE do read the posting guide >>>> http://www.R-project.org/posting-guide.html >>>> and provide commented, minimal, self-contained, reproducible code. >>>> >>> >>> >>> ----- >>> Frank Harrell >>> Department of Biostatistics, Vanderbilt University >>> -- >>> View this message in context: http://r.789695.n4.nabble.com/linear-against-nonlinear-alternatives-quantile-regression-tp3993327p3993416.html >>> Sent from the R help mailing list archive at Nabble.com. >>> >>> ______________________________________________ >>> R-help at r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From chee.chen at yahoo.com Sun Nov 6 15:07:06 2011 From: chee.chen at yahoo.com (Chee Chen) Date: Sun, 6 Nov 2011 09:07:06 -0500 Subject: [R] Request for Help: remove zero in fraction from tick labeling Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From GodinA at dal.ca Sun Nov 6 15:11:38 2011 From: GodinA at dal.ca (Aurelie Cosandey Godin) Date: Sun, 6 Nov 2011 10:11:38 -0400 Subject: [R] Deleting rows dataframe in R conditional to “if any of (a specific variable) is equal to” Message-ID: <5BDEAADA-4558-45FA-B779-E11BA8C6D26F@dal.ca> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dwinsemius at comcast.net Sun Nov 6 15:14:12 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Sun, 6 Nov 2011 09:14:12 -0500 Subject: [R] Doing dist on separate objects in a text file In-Reply-To: <1320535236339-3994515.post@n4.nabble.com> References: <1320535236339-3994515.post@n4.nabble.com> Message-ID: <0AD472CB-6536-4A03-83BD-1390433C8F11@comcast.net> On Nov 5, 2011, at 7:20 PM, ScottDaniel wrote: > So I have a text file that looks like this: > "Label" "X" "Y" "Slice" > 1 "Field_1_R3D_D3D_PRJ_w617.tif" 348 506 1 > 2 "Field_1_R3D_D3D_PRJ_w617.tif" 359 505 1 > 3 "Field_1_R3D_D3D_PRJ_w617.tif" 356 524 1 > 4 "Field_1_R3D_D3D_PRJ_w617.tif" 2 0 1 > 5 "Field_1_R3D_D3D_PRJ_w617.tif" 412 872 1 > 6 "Field_1_R3D_D3D_PRJ_w617.tif" 422 863 1 > 7 "Field_1_R3D_D3D_PRJ_w617.tif" 429 858 1 > 8 "Field_1_R3D_D3D_PRJ_w617.tif" 429 880 1 > 9 "Field_1_R3D_D3D_PRJ_w617.tif" 437 865 1 > 10 "Field_1_R3D_D3D_PRJ_w617.tif" 447 855 1 > 11 "Field_1_R3D_D3D_PRJ_w617.tif" 450 868 1 > 12 "Field_1_R3D_D3D_PRJ_w617.tif" 447 875 1 > 13 "Field_1_R3D_D3D_PRJ_w617.tif" 439 885 1 > 14 "Field_1_R3D_D3D_PRJ_w617.tif" 2 8 1 > > What it represents are the locations of centromeres per nucleus in a > microscope image. What I need to do is do a dist() on each grouping > (the > grouping being separated by the low values of x and y's) and then > compute an > average. The part that I'm having trouble with is writing code that > will > allow R to separate these objects. I'm having trouble figuring out what you mean by "separating the objects".
Each row is a separate reading, and I think you just want pairwise distances, right? > Do I have to find some way of creating > separate data frames for each object? I don't think so. You need to read this file into a data.frame, which should be fairly trivial with read.table if you specify the header=TRUE parameter. > Or is there a way to parse the file > and generate a single data frame of all the pairwise distances? Then assuming there is now a data.frame named "dat" with those values: dist( cbind(dat$X, dat$Y)) One stumbling block might have been recognizing that the dist function will not work with two x and y arguments but rather requires a matrix (or something coercible to a matrix) as its first argument. This would also have worked: dist(dat[ , c("X", "Y")]) -- David. > Any > suggestions or example code would be much appreciated. Thanks! > > -- > View this message in context: http://r.789695.n4.nabble.com/Doing-dist-on-separate-objects-in-a-text-file-tp3994515p3994515.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT From f.harrell at vanderbilt.edu Sun Nov 6 15:22:14 2011 From: f.harrell at vanderbilt.edu (Frank Harrell) Date: Sun, 6 Nov 2011 06:22:14 -0800 (PST) Subject: [R] linear against nonlinear alternatives - quantile regression In-Reply-To: References: <1320504195354-3993416.post@n4.nabble.com> Message-ID: <1320589334704-3995819.post@n4.nabble.com> Roger, It's nice to see a reply from the leader in quantile regression. I wonder if I might ask a somewhat unrelated question. A few recent papers have developed ways to force quantile regression curves not to cross. Do you have plans to implement this capability in quantreg?
Thanks very much for developing such a fantastic package. Frank Roger Koenker-3 wrote: > > Roger Koenker > rkoenker@ > > > > > On Nov 5, 2011, at 1:02 PM, Julia Lira wrote: > >> >> Dear David, >> >> Indeed rq() accepts a vector fo tau. I used the example given by >> Frank to run >> >> fitspl4 <- summary(rq(b1 ~ rcs(x,4), tau=c(a1,a2,a3,a4))) >> >> and it works. >> >> I even can use anova() to test equality of slopes jointly across >> quantiles. however, it would be interesting to test among different >> specifications, e.g. rcs(x,4) against rcs(x,3). but it does not work. > > Probably because the models aren't nested... >> >> Thanks for all suggestions! >> >> Julia >> >>> From: dwinsemius@ >>> Date: Sat, 5 Nov 2011 13:42:34 -0400 >>> To: f.harrell@ >>> CC: r-help@ >>> Subject: Re: [R] linear against nonlinear alternatives - quantile >>> regression >>> >>> I suppose this constitutes thread drift, but your simple example, >>> Frank, made wonder if Rq() accepts a vector argument for tau. I >>> seem to remember that Koencker's rq() does.. Normally I would >>> consult the help page, but the power is still out here in Central >>> Connecticut and I am corresponding with a less capable device. I am >>> guessing that if Rq() does accept such a vector that the form of >>> the nonlinearity would be imposed at all levels of tau. >>> >>> -- >>> David >>> >>> On Nov 5, 2011, at 10:43 AM, Frank Harrell >>> <f.harrell@> wrote: >>> >>>> Just to address a piece of this - in the case in which you are >>>> currently >>>> focusing on only one quantile, the rms package can help by fitting >>>> restricted cubic splines for covariate effects, and then run anova >>>> to test >>>> for nonlinearity (sometimes a dubious practice because if you then >>>> remove >>>> nonlinear terms you are mildly cheating). 
>>>> >>>> require(rms) >>>> f <- Rq(y ~ x1 + rcs(x2,4), tau=.25) >>>> anova(f) # tests associations and nonlinearity of x2 >>>> >>>> Frank >>>> >>>> Julia Lira wrote: >>>>> >>>>> Dear all, >>>>> >>>>> I would like to know whether any specification test for linear >>>>> against >>>>> nonlinear model hypothesis has been implemented in R using the >>>>> quantreg >>>>> package. >>>>> >>>>> I could read papers concerning this issue, but they haven't been >>>>> implemented at R. As far as I know, we only have two >>>>> specification tests >>>>> in this line: anova.rq and Khmaladze.test. The first one test >>>>> equality and >>>>> significance of the slopes across quantiles and the latter one >>>>> test if the >>>>> linear specification is model of location or location and scale >>>>> shift. >>>>> >>>>> Do you have any suggestion? >>>>> >>>>> Thanks a lot! >>>>> >>>>> Best regards, >>>>> >>>>> Julia >>>>> >>>>> [[alternative HTML version deleted]] >>>>> >>>>> ______________________________________________ >>>>> R-help@ mailing list >>>>> https://stat.ethz.ch/mailman/listinfo/r-help >>>>> PLEASE do read the posting guide >>>>> http://www.R-project.org/posting-guide.html >>>>> and provide commented, minimal, self-contained, reproducible code. >>>>> >>>> >>>> >>>> ----- >>>> Frank Harrell >>>> Department of Biostatistics, Vanderbilt University >>>> -- >>>> View this message in context: >>>> http://r.789695.n4.nabble.com/linear-against-nonlinear-alternatives-quantile-regression-tp3993327p3993416.html >>>> Sent from the R help mailing list archive at Nabble.com. >>>> >>>> ______________________________________________ >>>> R-help@ mailing list >>>> https://stat.ethz.ch/mailman/listinfo/r-help >>>> PLEASE do read the posting guide >>>> http://www.R-project.org/posting-guide.html >>>> and provide commented, minimal, self-contained, reproducible code. 
>>> >>> ______________________________________________ >>> R-help@ mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide >>> http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >> >> [[alternative HTML version deleted]] >> >> ______________________________________________ >> R-help@ mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > ______________________________________________ > R-help@ mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > ----- Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/linear-against-nonlinear-alternatives-quantile-regression-tp3993327p3995819.html Sent from the R help mailing list archive at Nabble.com. From dwinsemius at comcast.net Sun Nov 6 15:24:12 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Sun, 6 Nov 2011 09:24:12 -0500 Subject: [R] Matrix element-by-element multiplication In-Reply-To: References: <1988657a-bd29-47f6-a85e-a088c62b85b5@kedge2.utk.tennessee.edu> <3a6945a1-67cc-4a87-b795-542e67942f98@kedge2.utk.tennessee.edu> Message-ID: <1B7DB122-9F87-4A88-A40C-A0612EFFF4C1@comcast.net> On Nov 6, 2011, at 12:21 AM, R. Michael Weylandt wrote: > There are a few (nasty?) side-effects to c(), one of which is > stripping a matrix of its dimensionality. E.g., > > x <- matrix(1:4, 2) > c(x) > [1] 1 2 3 4 > > So that's probably what happened to you. R has a somewhat odd feature > of not really considering a pure vector as a column or row vector but > being willing to change it to either: > > e.g. 
> > y <- 1:2 > > x %*% y > y %*% x > y %*% y > > while matrix(y) %*% x throws an error, which can also trip folks up. > You might also note that x * y and y*x return the same thing in this > problem. > > Getting back to your problem: what are v and b and what are you hoping > to get done? Specifically, what happened when you tried v*b (give the > exact error message). It seems likely that they are non-conformable > matrices, but here non-conformable for element-wise multiplication > doesn't mean the same thing as it does for matrix multiplication. > E.g., > > x <- matrix(1:4,2) > y <- matrix(1:6,2) > > dim(x) > [1] 2 2 > > dim(y) > [1] 2 3 > > x * y -- here R seems to want matrices with identical dimensions, but > i can't promise that. > > x %*% y does work. > > Hope this helps and yes I know it can seem crazy at first, but there > really is reason behind it at the end of the tunnel, > > Michael > > > On Sun, Nov 6, 2011 at 12:11 AM, Steven Yen wrote: >> My earlier attempt >> >> dp <- v*b >> >> did not work. Because the dimensions did not work. dim(v)[1] (the rows) did not equal dim(b)[2] (the columns) since b did not have a dimension. >> Then, >> >> dp <- c(v)*b >> >> worked. It worked because of argument recycling. It did not give you a matrix result, however, because of what Michael said. c() turns a matrix into a vector, which it was all along anyway. Here's an example of argument recycling: > c(1, 2, 3) * 1:12 [1] 1 4 9 4 10 18 7 16 27 10 22 36 The 1,2,3 vector gets implicitly lengthened as would have happened with rep(c(1,2,3), 4) and then -- David. >> >> Confused, >> >> Steven >> >> At 09:10 PM 11/4/2011, you wrote: >> >> Did you even try? >> >> a <- 1:3 >> x <- matrix(c(1,2,3,2,4,6,3,6,9),3) >> a*x >> >> [,1] [,2] [,3] >> [1,] 1 2 3 >> [2,] 4 8 12 >> [3,] 9 18 27 >> >> Michael >> >> On Fri, Nov 4, 2011 at 7:26 PM, Steven Yen wrote: >>> is there a way to do element-by-element multiplication as in Gauss >>> and MATLAB, as shown below? Thanks. 
>>>
>>> ---
>>> a
>>>
>>> 1.0000000
>>> 2.0000000
>>> 3.0000000
>>> x
>>>
>>> 1.0000000 2.0000000 3.0000000
>>> 2.0000000 4.0000000 6.0000000
>>> 3.0000000 6.0000000 9.0000000
>>> a.*x
>>>
>>> 1.0000000 2.0000000 3.0000000
>>> 4.0000000 8.0000000 12.000000
>>> 9.0000000 18.000000 27.000000
>>>
>>> --

David Winsemius, MD
West Hartford, CT

From dwinsemius at comcast.net Sun Nov 6 15:50:18 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Sun, 6 Nov 2011 09:50:18 -0500
Subject: [R] Request for Help: remove zero in fraction from tick labeling
In-Reply-To:
References:
Message-ID: <126B3723-3AF8-4434-9480-7B3136A52698@comcast.net>

On Nov 6, 2011, at 9:07 AM, Chee Chen wrote:

> Dear All,
> I would like to know how to do the following:
> 1. Suppose I have x values 0, 0.5, 1, in that order, and
> would like to label these three points on the x-axis.
> 2. However, R labels them as 0.0, 0.5, 1.0. But I want them to be
> 0, .5, 1, since the former way uses up the limited space of a
> multi-subgraph plot by adding extra zeros.

as.character(c(0, .5, 1))
[1] "0"   "0.5" "1"

I'm guessing that you are doing some sort of plotting and using these
as axis labels, but without code that remains a guess.

>
> Thank you,
> Chee
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT

From ted.harding at wlandres.net Sun Nov 6 16:18:05 2011
From: ted.harding at wlandres.net ( (Ted Harding))
Date: Sun, 06 Nov 2011 15:18:05 -0000 (GMT)
Subject: [R] Request for Help: remove zero in fraction from tick labe
In-Reply-To: <126B3723-3AF8-4434-9480-7B3136A52698@comcast.net>
Message-ID:

On 06-Nov-11 14:50:18, David Winsemius wrote:
>
> On Nov 6, 2011, at 9:07 AM, Chee Chen wrote:
>
>> Dear All,
>> I would like to know how to do the following:
>> 1. Suppose I have x values 0, 0.5, 1, in that order, and
>> would like to label these three points on the x-axis.
>> 2. However, R labels them as 0.0, 0.5, 1.0. But I want them to be
>> 0, .5, 1, since the former way uses up the limited space of a
>> multi-subgraph plot by adding extra zeros.
>
> as.character(c(0, .5, 1))
> [1] "0"   "0.5" "1"
>
> I'm guessing that you are doing some sort of plotting and using these
> as axis labels, but without code that remains a guess.
>
>>
>> Thank you,
>> Chee

A general solution is exemplified by the code below. Indications
of how to do this (and other customisations) by setting plot
parameters can be found in the output of ?plot.default (and
see also ?par). The code below is a modification (and simplification)
of the code for the final example "##--- Log-Log Plot with custom axes"
in ?plot.default.

  x <- sort(runif(20,0,3))
  y <- x^2
  plot(x, y, type="o", pch='+', col="blue",
       main="Plot with custom axes",
       ylab="Y = X^2", xlab="X",
       axes = FALSE, frame.plot = TRUE)
  x.at <- 0.5*(0:6)
  axis(1, at = x.at, labels = formatC(x.at, format="fg"))
  y.at <- (0:9)
  axis(2, at = y.at, labels = formatC(y.at, format="fg"))

This sort of customisation does, however, usually require that
you tailor the details to the specific plot you are drawing:
in general, R cannot be persuaded to get it "right" automatically
(in particular, you will need to know the full ranges of the axes
in order to get the labels right).

Hoping this helps,
Ted.
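
The original request was for labels of the form ".5" rather than "0.5". A minimal sketch of one way to get them, assuming the labels are built with as.character() as in the earlier reply; the helper name strip_leading_zero is hypothetical and not part of any package:

```r
# Hypothetical helper: drop the zero in front of the decimal point,
# so 0.5 is labelled ".5" while 0 and 1 are left as "0" and "1".
strip_leading_zero <- function(x) {
  lab <- as.character(x)
  # "^(-?)0\\." matches an optional minus sign followed by "0." at the
  # start of the label; the sign (if any) is kept, the zero is dropped.
  sub("^(-?)0\\.", "\\1.", lab)
}

x.at <- c(0, 0.5, 1)
x.labels <- strip_leading_zero(x.at)

# Possible use with a custom axis (after plot(..., axes = FALSE)):
# axis(1, at = x.at, labels = x.labels)
```

The labels are plain character strings, so they can be passed to the labels argument of axis() in place of a formatC() call.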
-------------------------------------------------------------------- E-Mail: (Ted Harding) Fax-to-email: +44 (0)870 094 0861 Date: 06-Nov-11 Time: 15:17:56 ------------------------------ XFMail ------------------------------ From djmuser at gmail.com Sun Nov 6 16:53:03 2011 From: djmuser at gmail.com (Dennis Murphy) Date: Sun, 6 Nov 2011 07:53:03 -0800 Subject: [R] Correlation between matrices In-Reply-To: References: Message-ID: Hi: On Sat, Nov 5, 2011 at 11:06 PM, Kaiyin Zhong wrote: > Thank you Dennis, your tips are really helpful. > I don't quite understand the lm(y~mouse) part; my intention was -- in > pseudo code -- lm(y(Enzyme) ~ y(each elem)). As I said in my first response, I didn't quite understand what you were trying to regress so I used the mouse as a way of showing you how the code works. I think I understand what you want now, though. I'll create a data set in two ways: the first assumes you have the data as constructed in your original post and the second generates random numbers after erecting a 'scaffold' data frame. The game is to separate the enzyme data from the element data and put them into the final data frame as separate columns. Then the regression is easy if that's what you need to do. 
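
A minimal base-R sketch of the layout just described, with the element values and the enzyme values held as separate columns, used here to compute the per-region, per-element correlations asked about in the original post. The simulated data and the names d, enz and cors are assumptions made for illustration only:

```r
set.seed(1)
regions <- c("cortex", "hippocampus", "brain_stem", "mid_brain", "cerebellum")
elems   <- c("Cu", "Fe", "Zn", "Ca")

# One row per mouse x region x element, holding the element concentration
d <- expand.grid(mouse = 1:5, region = regions, elem = elems,
                 stringsAsFactors = FALSE)
d$value <- rnorm(nrow(d))

# One row per mouse x region, holding the enzyme activity
enz <- expand.grid(mouse = 1:5, region = regions, stringsAsFactors = FALSE)
enz$Enzyme <- rnorm(nrow(enz))

# Attach the enzyme value to every element row of the same mouse and region
d <- merge(d, enz, by = c("mouse", "region"))

# Correlation between Enzyme and the element value in each region/element cell
cors <- vapply(split(d, list(d$region, d$elem)),
               function(g) cor(g$Enzyme, g$value),
               numeric(1))
```

Each entry of cors is named region.element (for example cortex.Cu) and is based on the five mice in that cell.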

# Method 1: Generate the data as you did into separate data frames

elem0 <- c('Cu', 'Fe', 'Zn', 'Ca', 'Enzyme')
regions <- c('cortex', 'hippocampus', 'brain_stem', 'mid_brain', 'cerebellum')

# Creates five 5 x 5 data frames with names V1-V5:
for (n in c('Cu', 'Fe', 'Zn', 'Ca', 'Enzyme')) {
    assign(n, as.data.frame(replicate(5, rnorm(5))))
}

# Stack the chemical element data using melt() from
# the reshape2 package. The data frames are stacked in the
# same order as elem0 so that the elem labels below line up:
library('reshape2')
d1 <- rbind(melt(Cu), melt(Fe), melt(Zn), melt(Ca))

# Relabel V1 - V5 with brain region names, add a factor
# to distinguish individual elements and tack on the melted
# Enzyme data so that it repeats in each element block
d1 <- within(d1, {
    variable <- factor(d1$variable, labels = regions)
    elem <- factor(rep(elem0[1:4], each = 25))
    Enzyme <- melt(Enzyme)[, 2]
} )

# Plot the data using lattice and latticeExtra:
library('lattice')
library('latticeExtra')
p <- xyplot(Enzyme ~ value | variable + elem, data = d1, type = c('p', 'r'))
useOuterStrips(p)

###########################################
## Method 2: Generate the random data after setting
## up the element/region/mouse combinations
##

# Generate a data frame from the combinations of
# mice, regions and elem:
library('plyr')      # provides ddply(); load it explicitly
library('ggplot2')
mice <- paste('mouse', 1:5, sep = '')
regions <- c('cortex', 'hippocampus', 'brain_stem', 'mid_brain', 'cerebellum')
elem <- elem0[1:4]
d0 <- data.frame(expand.grid(mice = mice, regions = regions, elem = elem))
d0 <- within(d0, {
    value <- rnorm(100)    # generate element values
    Enzyme <- rnorm(25)    # generate enzyme values
} )
# the Enzyme values are recycled through all element blocks.

# You can either adapt the lattice code above to plot d0, or you
# can do the following to get an analogous plot in ggplot2.
# It's easier to compute the slopes and intercepts and put
# them into a data frame that ggplot() can import, so that's
# what we'll do first.

# A function to return regression coefficients from a
# generic data frame. Since this function goes into ddply(),
# the argument df is a (generic) data frame and the output
# will be converted to a one-line data frame.
coefun <- function(df) coef(lm(Enzyme ~ value, data = df))

# Apply the function to all regions * elem combinations.
# Output is a data frame of coefficients corresponding to
# each region/element combination
coefs <- ddply(d0, .(regions, elem), coefun)

# Rename the columns
names(coefs) <- c('regions', 'elem', 'b0', 'b1')

# Generate the plot using package ggplot2:
ggplot(d0, aes(x = value, y = Enzyme)) +
   geom_point(size = 2.5) +
   geom_abline(data = coefs, aes(intercept = b0, slope = b1),
               size = 1) +
   xlab("") +
   facet_grid(elem ~ regions)

> In addition, attach(d) seems necessary before using lm(y~mouse), and
> since d$mouse has a length 125, while each elem for each region has a
> length 5, it generates the following error:

You should never need to use attach() - use the data = argument in
lm() instead, where the value of data is the name of a data frame.
It's always easier to use the modeling functions in R having formula
interfaces with data frames.

>
>> coefs = ddply(d, .(regions, elem), coefun)
> Error in model.frame.default(formula = y ~ mouse, drop.unused.levels = TRUE) :
>   variable lengths differ (found for 'mouse')

You're clearly doing something here that's messing up the structure of
the data. Study what the code (and its output) above are telling you,
particularly if you're not familiar with plyr, lattice and/or ggplot2.
Writing functions to insert into a **ply() function in plyr can be
tricky. If you continue to have problems, please provide a
reproducible example as you did here.

HTH,
Dennis

>
>
> On Sun, Nov 6, 2011 at 12:53 PM, Dennis Murphy wrote:
>>
>> Hi:
>>
>> I don't think you want to keep these objects separate; it's better to
>> combine everything into a data frame.
Here's a variation of your >> example - the x variable ends up being a mouse, but you may have >> another variable that's more appropriate to plot so take this as a >> starting point. One plot uses the ggplot2 package, the other uses the >> lattice and latticeExtra packages. >> >> library('ggplot2') >> regions = c('cortex', 'hippocampus', 'brain_stem', 'mid_brain', >> ? ? ? ? ? ?'cerebellum') >> mice = paste('mouse', 1:5, sep='') >> elem <- c('Cu', 'Fe', 'Zn', 'Ca', 'Enzyme') >> >> # Generate a data frame from the combinations of >> # mice, regions and elem: >> d <- data.frame(expand.grid(mice = mice, regions = regions, >> ? ? ? ? ? ? ? ? ? ? ? ? ? ?elem = elem), y = rnorm(125)) >> # Create a numeric version of mice >> d$mouse <- as.numeric(d$mice) >> >> # A function to return regression coefficients >> coefun <- function(df) coef(lm(y ~ mouse), data = df) >> # Apply to all regions * elem combinations >> coefs <- ddply(d, .(regions, elem), coefun) >> names(coefs) <- c('regions', 'elem', 'b0', 'b1') >> >> # Generate the plot using package ggplot2: >> ggplot(d, aes(x = mouse, y = y)) + >> ? geom_point(size = 2.5) + >> ? geom_abline(data = coefs, aes(intercept = b0, slope = b1), >> ? ? ? ? ? ? ? ? ? ? ? ? ? ? size = 1) + >> ? facet_grid(elem ~ regions) >> >> # Same plot in lattice: >> library('lattice') >> library('latticeExtra') >> p <- xyplot(y ~ mouse | elem + regions, data = d, type = c('p', 'r'), >> ? ? ? ? layout = c(5, 5)) >> >> >> HTH, >> Dennis >> >> On Sat, Nov 5, 2011 at 10:49 AM, Kaiyin Zhong wrote: >> >> regions = c('cortex', 'hippocampus', 'brain_stem', 'mid_brain', >> > 'cerebellum') >> >> mice = paste('mouse', 1:5, sep='') >> >> for (n in c('Cu', 'Fe', 'Zn', 'Ca', 'Enzyme')) { >> > + ? assign(n, as.data.frame(replicate(5, rnorm(5)))) >> > + } >> >> names(Cu) = names(Zn) = names(Fe) = names(Ca) = names(Enzyme) = regions >> >> row.names(Cu) = row.names(Zn) = row.names(Fe) = row.names(Ca) = >> > row.names(Enzyme) = mice >> >> Cu >> > ? ? ? ? ? 
cortex hippocampus brain_stem ?mid_brain cerebellum >> > mouse1 -0.5436573 -0.31486713 ?0.1039148 -0.3908665 -1.0849112 >> > mouse2 ?1.4559136 ?1.75731752 -2.1195118 -0.9894767 ?0.3609033 >> > mouse3 -0.6735427 -0.04666507 ?0.9641000 ?0.4683339 ?0.7419944 >> > mouse4 ?0.6926557 -0.47820023 ?1.3560802 ?0.9967562 -1.3727874 >> > mouse5 ?0.2371585 ?0.20031393 -1.4978517 ?0.7535148 ?0.5632443 >> >> Zn >> > ? ? ? ? ? ?cortex hippocampus brain_stem ?mid_brain ?cerebellum >> > mouse1 -0.66424043 ? 0.6664478 ?1.1983546 ?0.0319403 ?0.41955740 >> > mouse2 -1.14510448 ? 1.5612235 ?0.3210821 ?0.4094753 ?1.01637466 >> > mouse3 -0.85954416 ? 2.8275458 -0.6922565 -0.8182307 -0.06961242 >> > mouse4 ?0.03606034 ?-0.7177256 ?0.7067217 ?0.2036655 -0.25542524 >> > mouse5 ?0.67427572 ? 0.6171704 ?0.1044267 -1.8636174 -0.07654666 >> >> Fe >> > ? ? ? ? ? cortex hippocampus ?brain_stem ?mid_brain cerebellum >> > mouse1 ?1.8337008 ? 2.0884261 ?0.29730413 -1.6884804 ?0.8336137 >> > mouse2 -0.2734139 ?-0.5728439 ?0.63791556 -0.6232828 -1.1352224 >> > mouse3 -0.4795082 ? 0.1627235 ?0.21775206 ?1.0751584 -0.5581422 >> > mouse4 ?1.7125147 ?-0.5830600 ?1.40597896 -0.2815305 ?0.3776360 >> > mouse5 -0.3469067 ?-0.4813120 -0.09606797 ?1.0970077 -1.1234038 >> >> Ca >> > ? ? ? ? ? cortex hippocampus ?brain_stem ? mid_brain cerebellum >> > mouse1 -0.7663354 ? 0.8595091 ?1.33803798 -1.17651576 ?0.8299963 >> > mouse2 -0.7132260 ?-0.2626811 ?0.08025079 -2.40924271 ?0.7883005 >> > mouse3 -0.7988904 ?-0.1144639 -0.65901136 ?0.42462227 ?0.7068755 >> > mouse4 ?0.3880393 ? 0.5570068 -0.49969135 ?0.06633009 -1.3497228 >> > mouse5 ?1.0077684 ? 0.6023264 -0.57387762 ?0.25919461 -0.9337281 >> >> Enzyme >> > ? ? ? ? ? cortex hippocampus ?brain_stem ?mid_brain cerebellum >> > mouse1 ?1.3430936 ? 0.5335819 -0.56992947 ?1.3565803 -0.8323391 >> > mouse2 ?1.0520850 ?-1.0201124 ?0.89600005 ?1.4719880 ?1.0854768 >> > mouse3 -0.2802482 ? 
0.6863323 -1.37483570 -0.7790174 ?0.2446761 >> > mouse4 -0.1916415 ?-0.4566571 ?1.93365932 ?1.3493848 ?0.2130424 >> > mouse5 -1.0349593 ?-0.1940268 -0.07216321 -0.2968288 ?1.7406905 >> > >> > In each anatomic region, I would like to calculate the correlation between >> > Enzyme activity and each of the concentrations of Cu, Zn, Fe, and Ca, and >> > do a scatter plot with a tendency line, organizing those plots into a grid. >> > See the image below for the desired effect: >> > http://postimage.org/image/62brra6jn/ >> > How can I achieve this? >> > >> > Thank you in advance. >> > >> > ? ? ? ?[[alternative HTML version deleted]] >> > >> > ______________________________________________ >> > R-help at r-project.org mailing list >> > https://stat.ethz.ch/mailman/listinfo/r-help >> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> > and provide commented, minimal, self-contained, reproducible code. >> > > From ligges at statistik.tu-dortmund.de Sun Nov 6 17:37:30 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Sun, 06 Nov 2011 17:37:30 +0100 Subject: [R] List of user installed packages In-Reply-To: <80BE218D-BBF3-482D-9E5F-D6C641DE89DF@univie.ac.at> References: <010f01cc9bc0$200bc490$60234db0$@biopticon.com> <80BE218D-BBF3-482D-9E5F-D6C641DE89DF@univie.ac.at> Message-ID: <4EB6B7CA.6060003@statistik.tu-dortmund.de> Well, you could simply use everything from the old library and just apply update.packages(checkBuilt=TRUE) in order to get the packages updated for the new release. Uwe Ligges On 05.11.2011 19:00, Erich Neuwirth wrote: > Running > rownames(installed.packages()) > will tell you the names of all packages of the version of R in which you are running the command. > http://cran.r-project.org/doc/FAQ/R-FAQ.html#R-Add_002dOn-Packages > tells you the names of the packages which were installed with R itself. 
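
One way to turn the two hints quoted above into a single list of user-installed packages is sketched below. It relies on the Priority column of installed.packages(), which is "base" or "recommended" for packages shipped with R and NA otherwise; the function name user_installed is hypothetical:

```r
# Hypothetical helper: names of installed packages that did not ship with R.
user_installed <- function(ip = installed.packages()) {
  # Packages distributed with R have Priority "base" or "recommended";
  # everything else has a missing Priority.
  rownames(ip)[is.na(ip[, "Priority"])]
}

pkgs <- user_installed()
```

Run in the old R installation, the resulting character vector could be saved and then passed to install.packages() in the new one. Note that a recommended package the user has updated by hand is still classed as "recommended" and so is not listed.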
>
> On Nov 5, 2011, at 2:37 PM, Cem Girit wrote:
>
>> Hello,
>>
>> I am going to install the new version of R 2.14.1. After the
>> installation, I want to copy my installed packages to the new library. But
>> since over time I forgot which ones I installed I want to get a list of all
>> the packages I installed among the packages installed initially by the
>> R-installer. Is this possible?
>>
>> Cem
>>
>> Cem Girit, PhD
>>
>> Biopticon Corporation
>> 182 Nassau Street, Suite 204
>> Princeton, NJ 08542
>> Tel: (609)-853-0231
>> Email:girit at biopticon.com
>>
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

From carl at witthoft.com Sun Nov 6 17:47:52 2011
From: carl at witthoft.com (Carl Witthoft)
Date: Sun, 06 Nov 2011 11:47:52 -0500
Subject: [R] Deleting rows dataframe in R conditional to “if any of (a specific variable) is equal to”
Message-ID: <4EB6BA38.3060209@witthoft.com>

Not too difficult.

Rgames> bar<-c(3,5,7)
Rgames> foo<-c(1,3,5,6,8,9,7)
Rgames> match(bar,foo)
[1] 2 3 7
# these are the matching positions
Rgames> foo[-(match(bar,foo))]
[1] 1 6 8 9

Dear list,

I have been struggling for some time now with this code...
I have this vector of unique ID "EID" of length 821 extracted from one
of my dataframe (skate).
It looks like this:

> head(skate$EID)
[1] "896-19" "895-8" "899-1" "899-5" "899-8" "895-7"

I would like to remove the complete rows in another dataframe (t5) if
any of the t5$EID is equal to (a duplicate of) skate$EID.
I was able to get my 'duplicated' dataframe in t5 of all my matching EID
as follows:

> xx<-skate$EID
> t5[match(xx,t5[,26]), ]   #gives me a dataframe of all matching EID in skate$EID
     record.t trip set month stratum NAFO unit.area time dur.set distance
8948        5  896  19    11     221   2J       N12  908      15        8
8849        5  895   8    10     766   3O       R36 1650      16        8
9289        5  899   1    12     743   3L       V26 2052      15        8
9299        5  899   5    12     746   3L       W27 1129      14        7

Where t5[,26] corresponds to the t5$EID column.
I'm sure it's simple, but I'm not sure how to remove all of these now
from my t5 dataframe!
Tips would be very much appreciated!

Thank you!

Aurelie Cosandey-Godin
Ph.D. student, Department of Biology
Industrial Graduate Fellow, WWF-Canada
Dalhousie University | Email: godina_at_dal.ca

--
Sent from my Cray XK6
"Pendeo-navem mei anguillae plena est."

From rhelpacc at gmail.com Sun Nov 6 18:15:14 2011
From: rhelpacc at gmail.com (Robert A'gata)
Date: Sun, 6 Nov 2011 12:15:14 -0500
Subject: [R] Double integration using R
Message-ID:

Hi,

I have a function that I need to do double integration:

\int^T_0 \int^t_0 N(\delta / \sigma \sqrt(u)) (1-N(\delta / \sigma \sqrt(u))) du dt

where N(x) is a standard normal probability of x.

I start off by writing an inner integral into a function. Meaning
\int^t_0 N(\delta,\sigma \sqrt(u)) (1-N(\delta,\sigma \sqrt(u))) du.
Then calling integrate function on this function. This straightforward
way does not seem to work. I am not sure if there is any sample code
to do such integration? Thank you.

Regards,

Robert

From huoxintong123 at 163.com Sun Nov 6 16:25:21 2011
From: huoxintong123 at 163.com (cloris)
Date: Sun, 6 Nov 2011 07:25:21 -0800 (PST)
Subject: [R] VAR and VECM in multivariate time series
Message-ID: <1320593121202-3995951.post@n4.nabble.com>

Hello to everyone!
I am working on my final year project about multivariate time series.
There are three variables in the multivariate time series model. I have a
few questions:

1. I used acf and pacf plots and find my variables are nonstationary. But
according to adf.test() and pp.test(), the data are stationary. Why?

2. I use VAR to get a model. y is the matrix of the data set, and I have
differenced it once to make it stationary.

library(tsDyn)
VARselect(y,lag.max=20,type="const",season = NULL, exogen = NULL)
y1=VAR(y, p = 16, type = c("const"), season = NULL, exogen = NULL,
       lag.max = NULL,ic = c("AIC"))
summary(y1)
plot(y1)

How can I get the estimate of AIC for this model?

3. I also get a VECM model

v1=VECM(y, lag=16,beta=NULL, estim="ML")

What does ETC mean in the output? And what is the number of cointegrating
relationships?

I want to make a forecast with the VECM.

j=ca.jo(y,K=16,type='trace',season = NULL)
j.var=vec2var(j)
predict(j.var,n.ahead=80)

Is this a correct way to predict with a VECM in R?

Could anyone help me? Thank you very much

--
View this message in context: http://r.789695.n4.nabble.com/VAR-and-VECM-in-multivariate-time-series-tp3995951p3995951.html
Sent from the R help mailing list archive at Nabble.com.

From godina at dal.ca Sun Nov 6 20:23:54 2011
From: godina at dal.ca (Aurelie Cosandey Godin)
Date: Sun, 6 Nov 2011 15:23:54 -0400
Subject: [R] Combining some duplicated rows & summing one of their column
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From michael.weylandt at gmail.com Sun Nov 6 20:56:37 2011
From: michael.weylandt at gmail.com (R.
Michael Weylandt )
Date: Sun, 6 Nov 2011 14:56:37 -0500
Subject: [R] Double integration using R
In-Reply-To:
References:
Message-ID:

There do exist packages for multivariate integration in R, but
sticking to base functions, what you've described should work, provided
the inner integral is vectorized before it is passed to the outer
integral: Vectorize() can do this directly, but it won't be
particularly fast since it's not true vectorization.

Send real code if this doesn't help and we can take a look at it.

Michael

On Nov 6, 2011, at 12:15 PM, "Robert A'gata" wrote:

> Hi,
>
> I have a function that I need to do double integration:
>
> \int^T_0 \int^t_0 N(\delta / \sigma \sqrt(u)) (1-N(\delta / \sigma
> \sqrt(u))) du dt
>
> where N(x) is a standard normal probability of x.
>
> I start off by writing an inner integral into a function. Meaning
> \int^t_0 N(\delta,\sigma \sqrt(u)) (1-N(\delta,\sigma \sqrt(u))) du.
> Then calling integrate function on this function. This straightforward
> way does not seem to work. I am not sure if there is any sample code
> to do such integration? Thank you.
>
> Regards,
>
> Robert
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

From jholtman at gmail.com Sun Nov 6 21:00:32 2011
From: jholtman at gmail.com (jim holtman)
Date: Sun, 6 Nov 2011 15:00:32 -0500
Subject: [R] Combining some duplicated rows & summing one of their column
In-Reply-To:
References:
Message-ID:

Since you did not supply a subset of data, or indicate if you wanted
all the other values transformed in some way, here is a simple use of
tapply to get your data with respect to 'catch'. If this is not what you
wanted, you need to be more clear in your request and also give some
hints as to what you tried.

> x <- data.frame(EID = sample(10, 20, TRUE), catch = 1:20)
> tapply(x$catch, x$EID, sum)
 1  2  3  4  5  6  7  8  9 10
18 12 20  5 54 19 20  4 56  2

On Sun, Nov 6, 2011 at 2:23 PM, Aurelie Cosandey Godin wrote:
> Dear list,
>
> I have this dataframe:
>> names(events)
> [1] "EID"    "X"      "Y"      "trip"   "tow"    "catch"  "effort" "depth"
> [9] "season"
> Where some of my unique ID "EID" appears more than once in 162 cases.
>
>> length(events$EID)-length(unique(events$EID))
> [1] 162
> I would like to combine each replicate EID together and sum their "catch". I've been trying a few things with the plyr package... but can't find a rather straightforward command.
>
> Any tips would be much appreciated! Thank you very much!
>
>
>
> Aurelie Cosandey-Godin
> Ph.D. student, Department of Biology
> Industrial Graduate Fellow, WWF-Canada
> Dalhousie University | Email: godina at dal.ca
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

From chee.chen at yahoo.com Sun Nov 6 21:19:14 2011
From: chee.chen at yahoo.com (Chee Chen)
Date: Sun, 6 Nov 2011 15:19:14 -0500
Subject: [R] Request for Help: y-axis label overlapped by x-axis in subplots in big plot
Message-ID: <47AF5D3B829C4802B124CD01C0B4892E@XbiT>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From bschwab at anest.ufl.edu Sun Nov 6 21:53:55 2011
From: bschwab at anest.ufl.edu (Schwab,Wilhelm K)
Date: Sun, 6 Nov 2011 20:53:55 +0000
Subject: [R] Self-describing data files
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From scottdaniel25 at gmail.com Sun Nov 6 21:49:09 2011
From: scottdaniel25 at gmail.com (ScottDaniel)
Date: Sun, 6 Nov 2011 12:49:09 -0800 (PST)
Subject: [R] Doing dist on separate objects in a text file
In-Reply-To: <0AD472CB-6536-4A03-83BD-1390433C8F11@comcast.net>
References: <1320535236339-3994515.post@n4.nabble.com> <0AD472CB-6536-4A03-83BD-1390433C8F11@comcast.net>
Message-ID: <-166607997442399404@unknownmsgid>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From Zuofeng.Shang.5 at nd.edu Sun Nov 6 22:15:47 2011
From: Zuofeng.Shang.5 at nd.edu (JeffND)
Date: Sun, 6 Nov 2011 13:15:47 -0800 (PST)
Subject: [R] how to use quadrature to integrate some complicated functions
Message-ID: <1320614147229-3996765.post@n4.nabble.com>

Hello to all,

I am having trouble with integrating a complicated one-dimensional function
of the following form

Phi(x-a_1)*Phi(x-a_2)*...*Phi(x-a_{n-1})*phi(x-a_n).

Here n is about 5000, Phi is the cumulative distribution function of the
standard normal, phi is the density function of the standard normal, and x
ranges over (-infty,infty).

My idea is to use quadrature to handle this integral. But since Phi has
no closed form, I don't know how to do this efficiently. I would appreciate
it very much if someone has any ideas about it.

Thanks!

Jeff

--
View this message in context: http://r.789695.n4.nabble.com/how-to-use-quadrature-to-integrate-some-complicated-functions-tp3996765p3996765.html
Sent from the R help mailing list archive at Nabble.com.
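
A minimal sketch of how the integral in this question might be attempted with base R's integrate(), working on the log scale so that a product of many Phi terms does not have to be formed directly. The function name prod_phi_integral is hypothetical, and whether the default tolerances are adequate for n near 5000 is an assumption that would need checking:

```r
# Hypothetical helper: integral over the real line of
#   Phi(x - a_1) * ... * Phi(x - a_{n-1}) * phi(x - a_n)
prod_phi_integral <- function(a) {
  n <- length(a)
  integrand <- function(x) {
    # integrate() supplies a vector of x values; handle one point at a time
    vapply(x, function(xi) {
      log_cdf <- sum(pnorm(xi - a[-n], log.p = TRUE))  # log of the Phi product
      log_pdf <- dnorm(xi - a[n], log = TRUE)          # log of the phi factor
      exp(log_cdf + log_pdf)
    }, numeric(1))
  }
  integrate(integrand, lower = -Inf, upper = Inf)$value
}

# Sanity check: with all shifts equal the integral is 1/n by symmetry,
# so this value should be close to 1/5.
check <- prod_phi_integral(rep(0, 5))
```

For widely spread a_i the integrand can be sharply peaked, in which case restricting the limits to a finite interval around a_n may be more reliable than integrating over the whole real line.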
From murdoch.duncan at gmail.com Sun Nov 6 22:51:48 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Sun, 06 Nov 2011 16:51:48 -0500 Subject: [R] Request for Help: y-axis label overlapped by x-axis in subplots in big plot In-Reply-To: <47AF5D3B829C4802B124CD01C0B4892E@XbiT> References: <47AF5D3B829C4802B124CD01C0B4892E@XbiT> Message-ID: <4EB70174.2030706@gmail.com> On 11-11-06 3:19 PM, Chee Chen wrote: > Dear All, > I would like to seek for help on this issue: > 1. I set par(mfrow=c(2,2)), hoping to plot 4 subgraphs in a whole graph > 2. Each subgraph has its own x,y axes and each has x-axis label and y-axis label > 3. moreover, subgraphs in the left column of the whole graph are all 3D, and have z axes and labels for z axes > 4. subgraphs in the right column of the whole graph are all 2D > 5. In each subgraph, x-axis label is at the bottom, y-axis label the left side, z-axis label the right side > > 5. Issue > Now all subgraphs are plotted successfully, except that > * the y-axis labels for subgraphs in the right column of the whole graph are overlapped by the z-axis labels of the subgraphs in the left column of the whole graph. (meaning that y axis labels for the 2D subplots in the right column of the whole graph are not shown) > > 6. what I tried > 6.1 When plot each graph in its own plot, everything displayed correctly. > 6.2 I switched the order of this subgraphs in the whole graph, so that 3D were in the right column of the whole graph, 2D the left column. But in this case, the y axis labels of the 2D graphs are not shown (because I guess they went out of range of graphical area). > > Any suggestions? Change the margins of your plots. Duncan Murdoch From buiduyminh at gmail.com Sun Nov 6 22:58:55 2011 From: buiduyminh at gmail.com (Minh Bui) Date: Sun, 6 Nov 2011 16:58:55 -0500 Subject: [R] Correlation analysis Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From bps0002 at auburn.edu Sun Nov 6 23:27:02 2011 From: bps0002 at auburn.edu (B77S) Date: Sun, 6 Nov 2011 14:27:02 -0800 (PST) Subject: [R] Correlation analysis In-Reply-To: <1320616308691-3996877.post@n4.nabble.com> References: <1320616308691-3996877.post@n4.nabble.com> Message-ID: <1320618422902-3996961.post@n4.nabble.com> I would start by reading one or more of the introduction manuals available here: http://mirrors.ibiblio.org/pub/mirrors/CRAN/ wizi wrote: > > Hi everyone, > > I am new to R-project. I did search through the list for my problem but i > can't find it. I am sorry if this question has been asked. > > I would like to perform a correlation analysis between a hiv data and gene > expression. > > Basically, i have a file that contains: hiv_name, start_position, > end_position, chromosome. I would like to see if these data has anything > to do with the location of our genes (I also have another file contains > gene_name, start_position, end_position, chromosome). > > What functions that allow me to do this? > > I am very new to R and hopefully someone can guide me to the right > direction. > > > Thank you very much, > -- View this message in context: http://r.789695.n4.nabble.com/Correlation-analysis-tp3996877p3996961.html Sent from the R help mailing list archive at Nabble.com. From michael.weylandt at gmail.com Sun Nov 6 23:32:45 2011 From: michael.weylandt at gmail.com (R. Michael Weylandt ) Date: Sun, 6 Nov 2011 17:32:45 -0500 Subject: [R] Correlation analysis In-Reply-To: References: Message-ID: <91B017AE-5B87-4DEA-8346-DDB5D2E13347@gmail.com> You might find useful tools if you look at Bioconductor as well. M On Nov 6, 2011, at 4:58 PM, Minh Bui wrote: > Hi everyone, > > I am new to R-project. I did search through the list for my problem but i > can't find it. I am sorry if this question has been asked. > > I would like to perform a correlation analysis between a hiv data and gene > expression. 
> > Basically, i have a file that contains: hiv_name, start_position, > end_position, chromosome. I would like to see if these data has anything to > do with the location of our genes (I also have another file contains > gene_name, start_position, end_position, chromosome). > > What functions that allow me to do this? > > I am very new to R and hopefully someone can guide me to the right > direction. > > > Thank you very much, > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From michael.weylandt at gmail.com Mon Nov 7 01:02:47 2011 From: michael.weylandt at gmail.com (R. Michael Weylandt ) Date: Sun, 6 Nov 2011 19:02:47 -0500 Subject: [R] Matrix element-by-element multiplication In-Reply-To: <6d43a7f7-9a81-4d06-ba09-21ec8678079c@kedge1.utk.tennessee.edu> References: <1988657a-bd29-47f6-a85e-a088c62b85b5@kedge2.utk.tennessee.edu> <3a6945a1-67cc-4a87-b795-542e67942f98@kedge2.utk.tennessee.edu> <6d43a7f7-9a81-4d06-ba09-21ec8678079c@kedge1.utk.tennessee.edu> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From jwiley.psych at gmail.com Mon Nov 7 02:01:03 2011 From: jwiley.psych at gmail.com (Joshua Wiley) Date: Sun, 6 Nov 2011 17:01:03 -0800 Subject: [R] Matrix element-by-element multiplication In-Reply-To: References: <1988657a-bd29-47f6-a85e-a088c62b85b5@kedge2.utk.tennessee.edu> <3a6945a1-67cc-4a87-b795-542e67942f98@kedge2.utk.tennessee.edu> <6d43a7f7-9a81-4d06-ba09-21ec8678079c@kedge1.utk.tennessee.edu> Message-ID: Hi, R may not have a special "scalar", but it is common, if informal, in linear algebra to refer to a 1 x 1 matrix as a scalar. Indeed, something like: 1:10 * matrix(2) or matrix(2) * 1:10 are both valid. 
Even matrix(2) %*% 1:10 and 1:10 %*% matrix(2) work, where the vector seems to be silently coerced to a matrix. R even seems to work hard to convert to a conformable matrix:

## works:
1:10 %*% matrix(1:10)
## does not work
matrix(1:10) %*% matrix(1:10)
## works
t(matrix(1:10)) %*% matrix(1:10)

Interestingly, there is actually a (rather old) comment in arithmetic.c

/* If either x or y is a matrix with length 1 and the other is a
   vector, we want to coerce the matrix to be a vector.
   Do we want to? We don't do it! BDR 2004-03-06 */

Given the coercion that already occurs with vectors to matrices for %*% and matrices to vectors for *, it seems not unreasonable to convert a 1 x 1 matrix to a vector _for_ * so that the following yields identical results:

matrix(1:9, 3) * matrix(2)
matrix(1:9, 3) * 2

Of course in the meantime, or in general, it is a good habit to create or explicitly coerce objects yourself rather than relying on R to make smart guesses about what should be happening.

Cheers,

Josh

On Sun, Nov 6, 2011 at 4:02 PM, R. Michael Weylandt wrote:
> It looks like pdf is not a "scalar" (that term actually has no meaning in R but I know what you mean) but is rather a 1x1 matrix, as attested by the fact it has dimensions. If you give dnorm() a matrix it will return one, as it did here.
>
> Perhaps you should look at the is.matrix() and as.vector() functions rather than abusing a side-effect of c(), which makes it much more difficult to see R's internal logic, which, while quirky, is useful at the end of the day.
>
> Michael
>
> PS - It's good form to cc the list at each step so others can follow along and contribute when I say something wrong. It also helps you get quicker answers.
>
> On Nov 6, 2011, at 1:06 AM, Steven Yen wrote:
>
>> I am trying to multiply what I know is a scalar (pdf(xb)) to a column vector of coefficients (bb).
>> In the following, pdf is a scalar and bb is 5 x 1. I first show what worked and then what did not work.
>> If my pdf is a scalar, why would I need c(pdf) to be able to pre-multiply it by a 5 x 1 vector?
>>
>> ---
>>
>> > x      <- as.matrix(colMeans(x))
>> > xb     <- t(x)%*%bb
>> > pdf    <- dnorm(xb)
>>
>> > dim(bb)
>> [1] 5 1
>>
>> >
>> > cpdf  <- c(pdf)
>> > dim(cpdf)
>> NULL
>> > cpdf
>> [1] 0.304201
>> > (dphat <- cpdf*bb)
>>                    [,1]
>> (Intercept)  0.32744753
>> xrage       -0.00599225
>> xryr         0.01758431
>> xrrate      -0.08217250
>> xrrel       -0.05695434
>> >
>> > pdf    <- dnorm(xb)
>> > dim(pdf)
>> [1] 1 1
>> > pdf
>>          [,1]
>> [1,] 0.304201
>> > (dphat <-  pdf*bb)
>> Error in pdf * bb : non-conformable arrays
>> >
>>
>> At 12:21 AM 11/6/2011, you wrote:
>>> There are a few (nasty?) side-effects to c(), one of which is
>>> stripping a matrix of its dimensionality. E.g.,
>>>
>>> x <- matrix(1:4, 2)
>>> c(x)
>>> [1] 1 2 3 4
>>>
>>> So that's probably what happened to you. R has a somewhat odd feature
>>> of not really considering a pure vector as a column or row vector but
>>> being willing to change it to either:
>>>
>>> e.g.
>>>
>>> y <- 1:2
>>>
>>> x %*% y
>>> y %*% x
>>> y %*% y
>>>
>>> while matrix(y) %*% x throws an error, which can also trip folks up.
>>> You might also note that x * y and y*x return the same thing in this
>>> problem.
>>>
>>> Getting back to your problem: what are v and b and what are you hoping
>>> to get done? Specifically, what happened when you tried v*b (give the
>>> exact error message). It seems likely that they are non-conformable
>>> matrices, but here non-conformable for element-wise multiplication
>>> doesn't mean the same thing as it does for matrix multiplication.
>>> E.g.,
>>>
>>> x <- matrix(1:4,2)
>>> y <- matrix(1:6,2)
>>>
>>> dim(x)
>>> [1] 2 2
>>>
>>> dim(y)
>>> [1] 2 3
>>>
>>> x * y -- here R seems to want matrices with identical dimensions, but
>>> i can't promise that.
>>>
>>> x %*% y does work.
>>>
>>> Hope this helps and yes I know it can seem crazy at first, but there
>>> really is reason behind it at the end of the tunnel,
>>>
>>> Michael
>>>
>>>
>>> On Sun, Nov 6, 2011 at 12:11 AM, Steven Yen wrote:
>>> > My earlier attempt
>>> >
>>> >    dp <- v*b
>>> >
>>> > did not work. Then,
>>> >
>>> >    dp <- c(v)*b
>>> >
>>> > worked.
>>> >
>>> > Confused,
>>> >
>>> > Steven
>>> >
>>> > At 09:10 PM 11/4/2011, you wrote:
>>> >
>>> > Did you even try?
>>> >
>>> > a <- 1:3
>>> > x <-  matrix(c(1,2,3,2,4,6,3,6,9),3)
>>> > a*x
>>> >
>>> >      [,1] [,2] [,3]
>>> > [1,]    1    2    3
>>> > [2,]    4    8   12
>>> > [3,]    9   18   27
>>> >
>>> > Michael
>>> >
>>> > On Fri, Nov 4, 2011 at 7:26 PM, Steven Yen wrote:
>>> >> is there a way to do element-by-element multiplication as in Gauss
>>> >> and MATLAB, as shown below? Thanks.
>>> >>
>>> >> ---
>>> >> a
>>> >>
>>> >>        1.0000000
>>> >>        2.0000000
>>> >>        3.0000000
>>> >> x
>>> >>
>>> >>        1.0000000        2.0000000        3.0000000
>>> >>        2.0000000        4.0000000        6.0000000
>>> >>        3.0000000        6.0000000        9.0000000
>>> >> a.*x
>>> >>
>>> >>        1.0000000        2.0000000        3.0000000
>>> >>        4.0000000        8.0000000        12.000000
>>> >>        9.0000000        18.000000        27.000000
>>> >>
>>> >>
>>> >> --
>>> >> Steven T. Yen, Professor of Agricultural Economics
>>> >> The University of Tennessee
>>> >> http://web.utk.edu/~syen/
>>> >>        [[alternative HTML version deleted]]
>>> >>
>>> >> ______________________________________________
>>> >> R-help at r-project.org mailing list
>>> >> https://stat.ethz.ch/mailman/listinfo/r-help
>>> >> PLEASE do read the posting guide
>>> >> http://www.R-project.org/posting-guide.html
>>> >> and provide commented, minimal, self-contained, reproducible code.
>>> >>
>>> >
>>> > --
>>> > Steven T.
Yen, Professor of Agricultural Economics >>> > The University of Tennessee >>> > http://web.utk.edu/~syen/ >> -- >> Steven T. Yen, Professor of Agricultural Economics >> The University of Tennessee >> http://web.utk.edu/~syen/ > > ? ? ? ?[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Joshua Wiley Ph.D. Student, Health Psychology Programmer Analyst II, ATS Statistical Consulting Group University of California, Los Angeles https://joshuawiley.com/ From p.murrell at auckland.ac.nz Mon Nov 7 02:26:36 2011 From: p.murrell at auckland.ac.nz (Paul Murrell) Date: Mon, 07 Nov 2011 14:26:36 +1300 Subject: [R] Help combining cell labelling and multiple mosaic plots In-Reply-To: References: Message-ID: <4EB733CC.6000100@auckland.ac.nz> Hi The problem is that BOTH mosaic() and labeling_cells() are calling seekViewport() to find the right viewport to draw into and for BOTH plots they are finding the same viewports (on the left side of the page). The following code solves the problem (for me anyway) by specifying a different 'prefix' for each mosaic() and labeling_cells() call ... 
grid.newpage()
pushViewport(viewport(layout=grid.layout(1,2)))
pushViewport(viewport(layout.pos.col=1))
mosaic(.test, gp=shading_hsv, pop=FALSE, split_vertical=FALSE,
       newpage=FALSE,
       labeling_args=list(offset_varnames=c(top=3),
                          offset_labels=c(top=2)),
       prefix="plot1")
labeling_cells(text=round(prop.table(.test, 1), 2)*100,
               clip=FALSE)(.test, prefix="plot1")
upViewport()
pushViewport(viewport(layout.pos.col=2))
mosaic(.test1, gp=shading_hsv, newpage=FALSE, pop=FALSE,
       split_vertical=FALSE,
       labeling_args=list(offset_varnames=c(top=3),
                          offset_labels=c(top=2)),
       prefix="plot2")
labeling_cells(text=round(prop.table(.test1, 1), 2)*100,
               clip=FALSE)(.test1, prefix="plot2")
popViewport(2)

... hope that helps.

Paul

On 1/11/2011 5:21 a.m., Simon Kiss wrote:
> Dear colleagues I'm using data that looks like .test and .test1 below
> to draw two mosaic plots with cell labelling (the row percentages
> from the tables). When I take out the pop=FALSE commands in the
> mosaic commands and comment out the two lines labelling the cells,
> then the plots are laid out exactly as I'd like: side-by-side. But I
> do require the cell labelling and the pop=FALSE arguments. I suspect
> I need to add in a call to pushViewport or an upViewport command, but
> I'm not sure. Any advice is welcome.
> > > library(vcd) library(grid) > > > .test<-as.table(matrix(c(1, 2, 3, 4, 5, 6), nrow=3, ncol=2, > byrow=TRUE)) .test<-prop.table(.test, 1) .test1<-as.table(matrix(c(1, > 2, 3, 4), nrow=2, ncol=2, byrow=TRUE)) .test1<-prop.table(.test1, 1) > > dimnames(.test)<-list("Fluoride Cluster"=c('Beneficial\nand Safe', > 'Mixed Opinion', 'Harmful With No Benefits'), "Governments Should Not > Impose Treatment"=c('Agree', 'Disagree')) > dimnames(.test1)<-list("Vaccines Are Too Much To Handle"= c('Agree' , > 'Disagree'), "Governments Should Not Oblige Treatment" =c('Agree', > 'Disagree')) grid.newpage() > pushViewport(viewport(layout=grid.layout(1,2))) > pushViewport(viewport(layout.pos.col=1)) mosaic(.test, > gp=shading_hsv, pop=FALSE, split_verticaL=FALSE, newpage=FALSE, > labeling_args=list(offset_varnames=c(top=3), > offset_labels=c(top=2))) labeling_cells(text=round(prop.table(.test, > 1), 2)*100, clip=FALSE)(.test) popViewport() > > pushViewport(viewport(layout.pos.col=2)) mosaic(.test1, > gp=shading_hsv, newpage=FALSE,pop=FALSE, split_vertical=FALSE, > labeling_args=list(offset_varnames=c(top=3), > offset_labels=c(top=2))) labeling_cells(text=round(prop.table(.test1, > 1), 2)*100, clip=FALSE)(.test1) popViewport(2) > ********************************* Simon J. Kiss, PhD Assistant > Professor, Wilfrid Laurier University 73 George Street Brantford, > Ontario, Canada N3T 2C9 Cell: +1 905 746 7606 > > ______________________________________________ R-help at r-project.org > mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do > read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
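
The naming issue described above can be seen with grid alone. What follows is only a minimal sketch, not what vcd does internally: the viewport names "page", "panel1" and "panel2" are made up for illustration, and the point is simply that seekViewport() can land in the intended panel once each panel has a name of its own.

```r
library(grid)

grid.newpage()
# Parent viewport with a 1 x 2 layout, as in the code above
pushViewport(viewport(layout = grid.layout(1, 2), name = "page"))

# Give each panel its own (hypothetical) name, then step back up.
# upViewport() leaves the viewport on the tree; popViewport() would remove it.
for (i in 1:2) {
  pushViewport(viewport(layout.pos.col = i, name = paste0("panel", i)))
  grid.rect(gp = gpar(col = "grey"))
  upViewport()
}

# Because the names differ, seekViewport() lands in the intended panel
seekViewport("panel2")
where <- current.viewport()$name
grid.text("right-hand panel")
upViewport(0)  # back to the top level
```

If both panels had been pushed under the same name, a search by that name could only ever return one of them, which is presumably why distinct prefixes help in the mosaic case.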
-- Dr Paul Murrell Department of Statistics The University of Auckland Private Bag 92019 Auckland New Zealand 64 9 3737599 x85392 paul at stat.auckland.ac.nz http://www.stat.auckland.ac.nz/~paul/ From francoisrousseu at hotmail.com Mon Nov 7 03:55:58 2011 From: francoisrousseu at hotmail.com (Francois Rousseu) Date: Sun, 6 Nov 2011 21:55:58 -0500 Subject: [R] tcltk window freezes when using locator( ) Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From syen at utk.edu Mon Nov 7 01:56:40 2011 From: syen at utk.edu (Steven Yen) Date: Sun, 6 Nov 2011 19:56:40 -0500 Subject: [R] Matrix element-by-element multiplication In-Reply-To: References: <1988657a-bd29-47f6-a85e-a088c62b85b5@kedge2.utk.tennessee.edu> <3a6945a1-67cc-4a87-b795-542e67942f98@kedge2.utk.tennessee.edu> <6d43a7f7-9a81-4d06-ba09-21ec8678079c@kedge1.utk.tennessee.edu> Message-ID: <7b8af230-352f-4947-bcff-b277e79b1760@kedge3.utk.tennessee.edu> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From philip.dilts at gmail.com Mon Nov 7 01:59:54 2011 From: philip.dilts at gmail.com (Philip Dilts) Date: Sun, 6 Nov 2011 17:59:54 -0700 Subject: [R] partial dependence plots in 'party' Message-ID: Hello, I can't seem to figure out how to generate partial dependence plots for random forest models generated with the 'party' package. Is there a function for this that I just haven't found yet? Thanks -Philip Dilts From francoisrousseu at hotmail.com Mon Nov 7 04:11:03 2011 From: francoisrousseu at hotmail.com (Francois Rousseu) Date: Sun, 6 Nov 2011 22:11:03 -0500 Subject: [R] tcltk window freezes when using locator( ) In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From lannajin at gmail.com Mon Nov 7 04:08:44 2011 From: lannajin at gmail.com (Lanna Jin) Date: Sun, 6 Nov 2011 19:08:44 -0800 (PST) Subject: [R] adjusting levelplot color scale to data Message-ID: <1320635324457-3997342.post@n4.nabble.com> Hi guys, I have a matrix with values varying from approximately -0.7 to 0.33 that I want to create a heatmap/levelplot with. When I execute the levelplot function for my matrix, I end up getting colors that are adjusted to the max and min rather than around 0. In other words, ideally I would like to have a color ramp that goes from red (negative number), to white (0), to blue (positive); however, right now the value 0 is in the blue. Any insight on how to address this problem? Thanks in advance! example... my matrix "y" looks something like this A B C D E row1 -0.5046406 -0.021579587 -0.4419101 -0.2999195330 -0.4845047 row2 -0.3070091 -0.059065936 0.3329806 -0.0519335420 -0.5766368 row3 -0.7271707 0.073282855 -0.3181990 -0.2485017700 -0.5732781 row4 0.3329806 -0.017762750 -0.1513197 -0.1016354970 0.2528442 levelplot(y) yields a color scale from red (-0.8 to 0.2) to blue (0.2 to 0.4) I'd want the color scale to be from red (-0.8 to 0) to blue (0 to 0.4) ----- Lanna -- View this message in context: http://r.789695.n4.nabble.com/adjusting-levelplot-color-scale-to-data-tp3997342p3997342.html Sent from the R help mailing list archive at Nabble.com. 
From baptiste.auguie at googlemail.com Mon Nov 7 06:05:12 2011 From: baptiste.auguie at googlemail.com (baptiste auguie) Date: Mon, 7 Nov 2011 18:05:12 +1300 Subject: [R] adjusting levelplot color scale to data In-Reply-To: <1320635324457-3997342.post@n4.nabble.com> References: <1320635324457-3997342.post@n4.nabble.com> Message-ID: Hi, Try specifying explicit break points together with their corresponding colors using at and col.regions, levelplot(m, at= unique(c(seq(-2, 0, length=100), seq(0, 10, length=100))), col.regions = colorRampPalette(c("blue", "white", "red"))(1e3)) HTH, baptiste On 7 November 2011 16:08, Lanna Jin wrote: > Hi guys, > > I have a matrix with values varying from approximately -0.7 to 0.33 that I > want to create a heatmap/levelplot with. > > When I execute the levelplot function for my matrix, I end up getting colors > that are adjusted to the max and min rather than around 0. In other words, > ideally I would like to have a color ramp that goes from red (negative > number), to white (0), to blue (positive); however, right now the value 0 is > in the blue. > > Any insight on how to address this problem? > > Thanks in advance! > > example... > my matrix "y" looks something like this > ? ? ? ? ? ? ? ? A ? ? ? ? ? ? ? ? B ? ? ? ? ? ? ? ? ? ? ?C > D ? ? ? ? ? ? ? ? ? ?E > ?row1 ? ? ? ? -0.5046406 -0.021579587 ? ?-0.4419101 -0.2999195330 > -0.4845047 > ?row2 ? ? ? ? -0.3070091 -0.059065936 ? ? 0.3329806 -0.0519335420 > -0.5766368 > ?row3 ? ? ? ? -0.7271707 ?0.073282855 ? ?-0.3181990 -0.2485017700 > -0.5732781 > ?row4 ? ? ? ? ?0.3329806 -0.017762750 ? 
?-0.1513197 -0.1016354970 > 0.2528442 > > levelplot(y) yields a color scale from red (-0.8 to 0.2) to blue (0.2 to > 0.4) > I'd want the color scale to be from red (-0.8 to 0) to blue (0 to 0.4) > > > > ----- > Lanna > -- > View this message in context: http://r.789695.n4.nabble.com/adjusting-levelplot-color-scale-to-data-tp3997342p3997342.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From johjeffrey at hotmail.com Mon Nov 7 06:09:53 2011 From: johjeffrey at hotmail.com (Jeffrey Joh) Date: Sun, 6 Nov 2011 21:09:53 -0800 Subject: [R] Graph binned data Message-ID: I have a table that looks like this: structure(list(speed = c(3,9,14,8,7,6), result = c(0.697, 0.011, 0.015, 0.012, 0.018, 0.019), house = c(1, 1, 1, 1, 1, 1), date = c(719, 1027, 1027, 1027, 1030, 1030), id = c("1000", "10000", "10001", "10002", "10003", "10004")), .Names = c("speed", "result", "house", "date", "id"), class = "data.frame", row.names = c("1000", "10000", "10001", "10002", "10003", "10004")) I would like to bin the data by speed, 0-4, 5-9, 10-14, 15-20, etc. Then I would like to make a graph of speed vs result. The graph should show the average result of each bin, and error bars to represent the standard deviation of the result in each bin. What kind of code can I use to make this? 
Jeffrey From dwinsemius at comcast.net Mon Nov 7 06:10:21 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Mon, 7 Nov 2011 00:10:21 -0500 Subject: [R] adjusting levelplot color scale to data In-Reply-To: <1320635324457-3997342.post@n4.nabble.com> References: <1320635324457-3997342.post@n4.nabble.com> Message-ID: <3566815B-1382-44EC-BE6A-1381424947A4@comcast.net> On Nov 6, 2011, at 10:08 PM, Lanna Jin wrote: > Hi guys, > > I have a matrix with values varying from approximately -0.7 to 0.33 > that I > want to create a heatmap/levelplot with. > > When I execute the levelplot function for my matrix, I end up > getting colors > that are adjusted to the max and min rather than around 0. In other > words, > ideally I would like to have a color ramp that goes from red (negative > number), to white (0), to blue (positive); however, right now the > value 0 is > in the blue. > > Any insight on how to address this problem? ?levelplot # which leads to" ?level.colors # which in turn leads to: ? colorRamp levelplot(as.matrix(dat), at=seq( -.8, .4, length=31), col=color.palette(30) ) And next time, please post the output of dput rather than a mangled print() output. -- David. > > Thanks in advance! > > example... > my matrix "y" looks something like this > A B C > D E > row1 -0.5046406 -0.021579587 -0.4419101 -0.2999195330 > -0.4845047 > row2 -0.3070091 -0.059065936 0.3329806 -0.0519335420 > -0.5766368 > row3 -0.7271707 0.073282855 -0.3181990 -0.2485017700 > -0.5732781 > row4 0.3329806 -0.017762750 -0.1513197 -0.1016354970 > 0.2528442 > > levelplot(y) yields a color scale from red (-0.8 to 0.2) to blue > (0.2 to > 0.4) > I'd want the color scale to be from red (-0.8 to 0) to blue (0 to 0.4) > > > > ----- > Lanna > -- > View this message in context: http://r.789695.n4.nabble.com/adjusting-levelplot-color-scale-to-data-tp3997342p3997342.html > Sent from the R help mailing list archive at Nabble.com. 
> > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT From akpbond007 at gmail.com Mon Nov 7 06:06:14 2011 From: akpbond007 at gmail.com (arunkumar1111) Date: Sun, 6 Nov 2011 21:06:14 -0800 (PST) Subject: [R] Intercepts is coming as Zero in the Mixed Models Message-ID: <1320642374061-3997498.post@n4.nabble.com> Hi I'm getting the intercepts of the Random effects as 0. Please help me to understand why this is coming Zero This is my R code Data<- read.csv("C:/FE and RE.csv") Formula="Y~X2+X3+X4 + (1|State) + (0+X5|State)" fit=lmer(formula=Formula,data=Data) ranef(fit). My sample Data State Year Y X2 X3 X4 X5 X6 S2 1960 27.8 397.5 42.2 50.7 78.3 65.8 S1 1960 29.9 413.3 38.1 52 79.2 66.9 S2 1961 29.8 439.2 40.3 54 79.2 67.8 S1 1961 30.8 459.7 39.5 55.3 79.2 69.6 S2 1962 31.2 492.9 37.3 54.7 77.4 68.7 S1 1962 33.3 528.6 38.1 63.7 80.2 73.6 S2 1963 35.6 560.3 39.3 69.8 80.4 76.3 S1 1963 36.4 624.6 37.8 65.9 83.9 77.2 S2 1964 36.7 666.4 38.4 64.5 85.5 78.1 S1 1964 38.4 717.8 40.1 70 93.7 84.7 S2 1965 40.4 768.2 38.6 73.2 106.1 93.3 S1 1965 40.3 843.3 39.8 67.8 104.8 89.7 S2 1966 41.8 911.6 39.7 79.1 114 100.7 S1 1966 40.4 931.1 52.1 95.4 124.1 113.5 S2 1967 40.7 1021.5 48.9 94.2 127.6 115.3 S1 1967 40.1 1165.9 58.3 123.5 142.9 136.7 S2 1968 42.7 1349.6 57.9 129.9 143.6 139.2 S1 1968 44.1 1449.4 56.5 117.6 139.2 132 S2 1969 46.7 1575.5 63.7 130.9 165.5 132.1 S1 1969 50.6 1759.1 61.6 129.8 203.3 154.4 S2 1970 50.1 1994.2 58.9 128 219.6 174.9 S1 1970 51.7 2258.1 66.4 141 221.6 180.8 S2 1971 52.9 2478.7 70.4 168.2 232.6 189.4 -- View this message in context: http://r.789695.n4.nabble.com/Intercepts-is-coming-as-Zero-in-the-Mixed-Models-tp3997498p3997498.html Sent from the R help mailing list archive at 
Nabble.com. From dwinsemius at comcast.net Mon Nov 7 06:13:20 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Mon, 7 Nov 2011 00:13:20 -0500 Subject: [R] adjusting levelplot color scale to data In-Reply-To: <3566815B-1382-44EC-BE6A-1381424947A4@comcast.net> References: <1320635324457-3997342.post@n4.nabble.com> <3566815B-1382-44EC-BE6A-1381424947A4@comcast.net> Message-ID: <695EE97C-6175-42C2-A201-57083D29DD19@comcast.net> On Nov 7, 2011, at 12:10 AM, David Winsemius wrote: > > On Nov 6, 2011, at 10:08 PM, Lanna Jin wrote: > >> Hi guys, >> >> I have a matrix with values varying from approximately -0.7 to 0.33 >> that I >> want to create a heatmap/levelplot with. >> >> When I execute the levelplot function for my matrix, I end up >> getting colors >> that are adjusted to the max and min rather than around 0. In other >> words, >> ideally I would like to have a color ramp that goes from red >> (negative >> number), to white (0), to blue (positive); however, right now the >> value 0 is >> in the blue. >> >> Any insight on how to address this problem? > > > ?levelplot # which leads to" > ?level.colors # which in turn leads to: > ? colorRamp > color.palette = colorRampPalette(c("red", "white", "blue")) > levelplot(as.matrix(dat), at=seq( -.8, .4, length=31), > col=color.palette(30) ) > > > And next time, please post the output of dput rather than a mangled > print() output. > -- > David. >> >> Thanks in advance! >> >> example... 
>> my matrix "y" looks something like this >> A B C >> D E >> row1 -0.5046406 -0.021579587 -0.4419101 -0.2999195330 >> -0.4845047 >> row2 -0.3070091 -0.059065936 0.3329806 -0.0519335420 >> -0.5766368 >> row3 -0.7271707 0.073282855 -0.3181990 -0.2485017700 >> -0.5732781 >> row4 0.3329806 -0.017762750 -0.1513197 -0.1016354970 >> 0.2528442 >> >> levelplot(y) yields a color scale from red (-0.8 to 0.2) to blue >> (0.2 to >> 0.4) >> I'd want the color scale to be from red (-0.8 to 0) to blue (0 to >> 0.4) >> >> >> >> ----- >> Lanna >> -- >> View this message in context: http://r.789695.n4.nabble.com/adjusting-levelplot-color-scale-to-data-tp3997342p3997342.html >> Sent from the R help mailing list archive at Nabble.com. >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > David Winsemius, MD > West Hartford, CT > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
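
One way to make the scale pivot on zero, sketched under a few assumptions: the matrix below is made-up data in roughly the range described in the question, the names lim, n_col, brks and pal are purely illustrative, and the breaks are made symmetric about 0 so that the white middle of a red-white-blue ramp falls at 0. The colors are passed through col.regions here.

```r
library(lattice)

# Toy matrix with roughly the same range as in the question (made-up values)
set.seed(1)
y <- matrix(runif(20, min = -0.7, max = 0.33), nrow = 4,
            dimnames = list(paste0("row", 1:4), LETTERS[1:5]))

# Symmetric limit so that the middle break sits at 0
lim   <- max(abs(y))
n_col <- 30                                   # even number of color bins
brks  <- seq(-lim, lim, length.out = n_col + 1)
pal   <- colorRampPalette(c("red", "white", "blue"))(n_col)

# Negative cells come out reddish, cells near 0 nearly white, positive bluish
p <- levelplot(y, at = brks, col.regions = pal)
print(p)
```

The price of symmetric breaks is that the color key extends as far above zero as below it, even though the data stop near 0.33. If that is unwanted, the two halves of the ramp would need to be built separately with different numbers of steps.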
David Winsemius, MD West Hartford, CT From dwinsemius at comcast.net Mon Nov 7 06:26:15 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Mon, 7 Nov 2011 00:26:15 -0500 Subject: [R] Graph binned data In-Reply-To: References: Message-ID: <62246F90-6D3F-45BE-8C03-90DB01E68C65@comcast.net> On Nov 7, 2011, at 12:09 AM, Jeffrey Joh wrote: > > I have a table that looks like this: > > structure(list(speed = c(3,9,14,8,7,6), result = c(0.697, 0.011, > 0.015, 0.012, 0.018, 0.019), house = c(1, > 1, 1, 1, 1, 1), date = c(719, 1027, 1027, 1027, 1030, 1030), > id = c("1000", "10000", > "10001", "10002", "10003", "10004")), .Names = c("speed", > "result", "house", "date", "id"), class = "data.frame", row.names = > c("1000", > "10000", "10001", "10002", "10003", "10004")) > > I would like to bin the data by speed, 0-4, 5-9, 10-14, 15-20, etc. ?cut > Then I would like to make a graph of speed vs result. The graph > should show the average result of each bin, ?tapply ?mean dat$sgrp <- cut(dat$speed, c(0,5,10, 15, 20), include.lowest=TRUE, right=TRUE) plot( tapply(dat$speed, dat$sgrp, mean), xaxt="n", ylim=c(0,20)) axis(1, at= 1:4, labels = levels(dat$sgrp) ) > and error bars to represent the standard deviation of the result in > each bin. What kind of code can I use to make this? (This would seem to be pretty basic material. Why don't you do further study of whatever introductory texts you are using.) The CI's can be added with one of the functions in package 'plotrix'. > > Jeffrey > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
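
For the error bars, a base-graphics sketch along these lines may be enough. It assumes that the quantity to average within each bin is result (not speed), that the bins are closed on the left so the labels match the 0-4, 5-9, ... wording of the question, and that the names bin_mean, bin_sd, lo, hi and ok are just illustrative. Only two columns of the posted data are used.

```r
# Data from the question, rebuilt as a plain data frame
dat <- data.frame(
  speed  = c(3, 9, 14, 8, 7, 6),
  result = c(0.697, 0.011, 0.015, 0.012, 0.018, 0.019)
)

# Bins [0,5), [5,10), [10,15), [15,20]; labels follow the question's wording
dat$sgrp <- cut(dat$speed, breaks = c(0, 5, 10, 15, 20),
                right = FALSE, include.lowest = TRUE,
                labels = c("0-4", "5-9", "10-14", "15-20"))

bin_mean <- tapply(dat$result, dat$sgrp, mean)
bin_sd   <- tapply(dat$result, dat$sgrp, sd)   # NA for bins with fewer than 2 values

x  <- seq_along(bin_mean)
lo <- bin_mean - bin_sd
hi <- bin_mean + bin_sd

plot(x, bin_mean, xaxt = "n", pch = 19,
     xlab = "speed bin", ylab = "mean result",
     ylim = range(c(lo, hi, bin_mean), na.rm = TRUE))
axis(1, at = x, labels = names(bin_mean))

# Error bars (mean +/- 1 sd), drawn only where the sd is defined and non-zero
ok <- !is.na(bin_sd) & bin_sd > 0
arrows(x[ok], lo[ok], x[ok], hi[ok], angle = 90, code = 3, length = 0.05)
```

With only six rows, most bins hold a single value or none, so only the 5-9 bin gets a bar; with the full data set every populated bin should. Non-integer speeds between 4 and 5 would fall in the bin labelled 0-4.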
David Winsemius, MD West Hartford, CT From loyolite270 at gmail.com Mon Nov 7 06:29:25 2011 From: loyolite270 at gmail.com (loyolite270) Date: Sun, 6 Nov 2011 21:29:25 -0800 (PST) Subject: [R] Structural equation modeling in R(lavaan,sem) In-Reply-To: References: <1301253139729-3409642.post@n4.nabble.com> <4D9073F7.2040309@gmail.com> <1301426701835-3415954.post@n4.nabble.com> Message-ID: <1320643765766-3997527.post@n4.nabble.com> I am new to both sem and lavaan package ... I dint exactly get the difference between sem from sem package and sem from lavaan package... , -- View this message in context: http://r.789695.n4.nabble.com/Structural-equation-modeling-in-R-lavaan-sem-tp3409642p3997527.html Sent from the R help mailing list archive at Nabble.com. From alaios at yahoo.com Mon Nov 7 11:09:14 2011 From: alaios at yahoo.com (Alaios) Date: Mon, 7 Nov 2011 02:09:14 -0800 (PST) Subject: [R] How much time a process need? Message-ID: <1320660554.33057.YahooMailNeo@web120120.mail.ne1.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From AEasom at SportingIndex.com Mon Nov 7 11:09:18 2011 From: AEasom at SportingIndex.com (AndreE) Date: Mon, 7 Nov 2011 02:09:18 -0800 (PST) Subject: [R] Select some, but not all, variables stepwise In-Reply-To: <1320422468588-3990598.post@n4.nabble.com> References: <5648FEE24754234D80F8E5BBE1A22BE51219BC9C01@GBGH-EXCH-CMS.sig.ads> <20111104131437.M324@oikos.unam.mx> <1320421016070-3990516.post@n4.nabble.com> <1320422468588-3990598.post@n4.nabble.com> Message-ID: <1320660558177-3998065.post@n4.nabble.com> Thanks for the advice - I will consider some alternative strategies. -- View this message in context: http://r.789695.n4.nabble.com/Select-some-but-not-all-variables-stepwise-tp3990002p3998065.html Sent from the R help mailing list archive at Nabble.com. 
From petr.pikal at precheza.cz Mon Nov 7 11:21:29 2011
From: petr.pikal at precheza.cz (Petr PIKAL)
Date: Mon, 7 Nov 2011 11:21:29 +0100
Subject: [R] How to delete only those rows in a dataframe in which all records are missing
In-Reply-To:
References:
Message-ID:

> > Perhaps something like this will work.
> > df[!(rowSums(is.na(df))==NCOL(df)),]

Or, if dropping every row that contains any NA is acceptable (like na.omit, this also removes the third row in the example),

df[complete.cases(df),]

Regards
Petr

> > Michael
> >
> > On Fri, Nov 4, 2011 at 9:27 AM, Jose Iparraguirre wrote:
> > Hi,
> >
> > Imagine I have the following data frame:
> >
> >> a <- c(1,NA,3)
> >> b <- c(2,NA,NA)
> >> c <- data.frame(cbind(a,b))
> >> c
> >    a  b
> > 1  1  2
> > 2 NA NA
> > 3  3 NA
> >
> > I want to delete the second row. If I use na.omit, that would also affect the third row. I tried to use a loop and an ifelse clause with is.na to get R to identify the row in which all records are missing, as opposed to the first row, in which no records are missing, or the third one, in which only one record is missing. How can I get R to identify the row in which all records are missing? Or, how can I get R to delete/omit only this row?
> > Thanks in advance,
> >
> > José
> >
> > José Iparraguirre
> > Chief Economist
> > Age UK
> >
> > T 020 303 31482
> > E Jose.Iparraguirre at ageuk.org.uk
> >
> > Tavis House, 1-6 Tavistock Square
> > London, WC1H 9NB
> > www.ageuk.org.uk | ageukblog.org.uk | @AgeUKPA
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > [[alternative HTML version deleted]] > > > > > > ______________________________________________ > > R-help at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > > > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From virgilio.gomez at uclm.es Mon Nov 7 12:22:40 2011 From: virgilio.gomez at uclm.es (Virgilio =?ISO-8859-1?Q?G=F3mez?= Rubio) Date: Mon, 07 Nov 2011 12:22:40 +0100 Subject: [R] R and Google Code-In Message-ID: <1320664960.2939.21.camel@virgil-VirtualBox> Dear all, An application has been put forward for R to participate in Google Code-in. This is a Google's contest to introduce pre-university students (age 13-18) to the many kinds of contributions that make open source software development possible. We are looking for possible mentors and ideas for tasks to be developed by the students. All the proposed tasks are available in this page: http://rwiki.sciviews.org/doku.php?id=developers:projects:gci2011&s=google%20code Possible tasks range from code development to translation of manual pages, GUIs, etc. A mailing list has been set up to discuss issues regarding this contest: http://groups.google.com/group/gci-r?hl=en If you are interested in participating in Google Code-In, please, join the mailing list to discuss your ideas and add your ideas to the wiki page. We are still waiting to know whether the R Foundation will be accepted as a participating institution but the tasks list will be taken into account when considering our application. 
Best regards,

--
Virgilio Gómez-Rubio
Departamento de Matemáticas
Escuela de Ingenieros Industriales
Universidad de Castilla-La Mancha
Avda España s/n
02071 Albacete - SPAIN

From ligges at statistik.tu-dortmund.de Mon Nov 7 13:17:46 2011
From: ligges at statistik.tu-dortmund.de (Uwe Ligges)
Date: Mon, 07 Nov 2011 13:17:46 +0100
Subject: [R] error message
In-Reply-To: <1320523229682-3994158.post@n4.nabble.com>
References: <1295368328987-3223412.post@n4.nabble.com> <1320429978716-3991100.post@n4.nabble.com> <1320520337578-3994067.post@n4.nabble.com> <1320523229682-3994158.post@n4.nabble.com>
Message-ID: <4EB7CC6A.8050306@statistik.tu-dortmund.de>

On 05.11.2011 21:00, JulianaMF wrote:
> Humm... I was using adegenet / ade4 packages and both R and R studio prompted
> the questions.
> Sorry, I did so many searches on R help and Adegenet help that I forgot to
> mention the packages I was using...

Errr, what are you referring to? You also forgot to read the posting guide of this mailing list that asks you to cite the original thread,

Uwe Ligges

> Juliana
>
> --
> View this message in context: http://r.789695.n4.nabble.com/error-message-tp3223412p3994158.html
> Sent from the R help mailing list archive at Nabble.com.

From ligges at statistik.tu-dortmund.de Mon Nov 7 13:20:52 2011
From: ligges at statistik.tu-dortmund.de (Uwe Ligges)
Date: Mon, 07 Nov 2011 13:20:52 +0100
Subject: [R] Error in eigen(a$hessian) : infinite or missing values in 'x'
In-Reply-To:
References:
Message-ID: <4EB7CD24.2010901@statistik.tu-dortmund.de>

On 05.11.2011 15:08, Kristian Lind wrote:
> Dear R-users,
>
> I'm estimating a two-dimensional state-space model using the FKF package.
> The resulting log likelihood function is maximized using auglag from the > Alabama package. The procedure works well for a subset of my data, but if I > try to use the entire data set I get the following error message. > > Error in eigen(a$hessian) : infinite or missing values in 'x' > > What's even more confusing is that if I estimate the model for a sample say > data[1:200,] then there's convergence. If I estimate it for data[1:300, ] > then I get the error message. Since there is a missing or infinite value in it? > But if I estimate the model for > data[201,300] it once again converges. data[201,300] is exactly one entry (row 201, column 300). Uwe Ligges > > Can anyone please enlighten me; where does this error stem from and what > can I do about it? > > Thank you in advance. > > Kristian > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From bbolker at gmail.com Mon Nov 7 13:21:03 2011 From: bbolker at gmail.com (Ben Bolker) Date: Mon, 7 Nov 2011 12:21:03 +0000 Subject: [R] Intercepts is coming as Zero in the Mixed Models References: <1320642374061-3997498.post@n4.nabble.com> Message-ID: arunkumar1111 gmail.com> writes: > I'm getting the intercepts of the Random effects as 0. Please help me to > understand why this is coming Zero > > This is my R code > > Data<- read.csv("C:/FE and RE.csv") > Formula="Y~X2+X3+X4 + (1|State) + (0+X5|State)" > fit=lmer(formula=Formula,data=Data) > ranef(fit). > This question is more suited for the r-sig-mixed-models mailing list ... You are getting an estimate of zero variance because lmer is computing that as the best estimate. 
The reason is that it's really completely impractical to try to estimate the variance among levels of a factor with only two levels. There is more discussion of this issue at http://glmm.wikidot.com/faq#fixed_vs_random Thanks for including your data so we could see the problem. From jwiley.psych at gmail.com Mon Nov 7 13:24:35 2011 From: jwiley.psych at gmail.com (Joshua Wiley) Date: Mon, 7 Nov 2011 04:24:35 -0800 Subject: [R] Structural equation modeling in R(lavaan,sem) In-Reply-To: <1320643765766-3997527.post@n4.nabble.com> References: <1301253139729-3409642.post@n4.nabble.com> <4D9073F7.2040309@gmail.com> <1301426701835-3415954.post@n4.nabble.com> <1320643765766-3997527.post@n4.nabble.com> Message-ID: Hi, Your question is so broad as to be unanswerable, but see the help pages for the function from both packages. Here is how you can load them both and then look at the help for a specific package: require(sem) require(lavaan) help("sem", package = "sem") help("sem", package = "lavaan") If a particular aspect of their implementation or use is confusing, feel free to ask. Cheers, Josh On Sun, Nov 6, 2011 at 9:29 PM, loyolite270 wrote: > I am new to both sem and lavaan package ... > > I dint exactly get the difference between sem from sem package and sem from > lavaan package... , > > > > -- > View this message in context: http://r.789695.n4.nabble.com/Structural-equation-modeling-in-R-lavaan-sem-tp3409642p3997527.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Joshua Wiley Ph.D. 
Student, Health Psychology Programmer Analyst II, ATS Statistical Consulting Group University of California, Los Angeles https://joshuawiley.com/ From ligges at statistik.tu-dortmund.de Mon Nov 7 13:27:30 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Mon, 07 Nov 2011 13:27:30 +0100 Subject: [R] How much time a process need? In-Reply-To: <1320660554.33057.YahooMailNeo@web120120.mail.ne1.yahoo.com> References: <1320660554.33057.YahooMailNeo@web120120.mail.ne1.yahoo.com> Message-ID: <4EB7CEB2.9030401@statistik.tu-dortmund.de> On 07.11.2011 11:09, Alaios wrote: > Dear all, > I have finished a large function that takes around 1 day to finish. > I was using system.time(callmyfunction()) to measure how much time it needed to finish, my problem is that I do not know how to interpret their numbers. > I was looking to convert these results to something more readably like. > "This function took 1 Day 2 hours and 35 minutes to complete." > > How I can convert the system.time output to something like that? system.time() responds in seconds, hence you can apply simple arithmetic to get days, hours and minutes. Uwe Ligges > > B.R > Alex > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From Patrick.Palmier at developpement-durable.gouv.fr Mon Nov 7 11:49:06 2011 From: Patrick.Palmier at developpement-durable.gouv.fr (PALMIER Patrick (Responsable de groupe) - CETE NP/TM/ST) Date: Mon, 07 Nov 2011 11:49:06 +0100 Subject: [R] R in batch mode packages loading question Message-ID: <4EB7B7A2.3030209@developpement-durable.gouv.fr> An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From Holger.Taschenberger at mpi-bpc.mpg.de Mon Nov 7 11:59:19 2011 From: Holger.Taschenberger at mpi-bpc.mpg.de (Holger Taschenberger) Date: Mon, 07 Nov 2011 11:59:19 +0100 Subject: [R] one sample Wilcoxon test using 'coin' Message-ID: <20111107115918.1092.F023FAF3@mpi-bpc.mpg.de> Hi, I'm trying to use the package 'coin' to run a one sample Wilcoxon test equivalent to this: x1<-c(1,3.5,2.1,4,1.5,5) wilcox.test(x1, mu=2, exact=TRUE) I assume that I can do this like so: x2<-rep(2,length(x1)) wilcoxsign_test(x1 ~ x2,distribution = exact()) But I'm not sure if this is really the correct way. Can someone please advise? (BTW: The reason to use 'coin' is it's ability to compute exact p-values even in the presence of ties in the ranks.) Thanks a lot, Holger From swaggertj at gmail.com Mon Nov 7 12:30:41 2011 From: swaggertj at gmail.com (Jing Tian) Date: Mon, 7 Nov 2011 19:30:41 +0800 Subject: [R] help with programming Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From jonas.lorson at ebs.de Mon Nov 7 12:38:05 2011 From: jonas.lorson at ebs.de (jolo999) Date: Mon, 7 Nov 2011 03:38:05 -0800 (PST) Subject: [R] How to do a target value search analogous to Excel Solver Message-ID: <1320665885789-3998347.post@n4.nabble.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From ypying88 at hotmail.com Mon Nov 7 12:39:51 2011 From: ypying88 at hotmail.com (jesslim) Date: Mon, 7 Nov 2011 03:39:51 -0800 (PST) Subject: [R] how save R file in exe file? Message-ID: <1320665991954-3998352.post@n4.nabble.com> I am the beginner of R. I just want to know how can i save my .R file into exe.file and that later will be execute in java. -- View this message in context: http://r.789695.n4.nabble.com/how-save-R-file-in-exe-file-tp3998352p3998352.html Sent from the R help mailing list archive at Nabble.com. 
From jmf at ib.usp.br Mon Nov 7 13:35:37 2011
From: jmf at ib.usp.br (JulianaMF)
Date: Mon, 7 Nov 2011 04:35:37 -0800 (PST)
Subject: [R] error message
In-Reply-To: <4EB7CC6A.8050306@statistik.tu-dortmund.de>
References: <1295368328987-3223412.post@n4.nabble.com> <1320429978716-3991100.post@n4.nabble.com> <1320520337578-3994067.post@n4.nabble.com> <1320523229682-3994158.post@n4.nabble.com> <4EB7CC6A.8050306@statistik.tu-dortmund.de>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From jwiley.psych at gmail.com Mon Nov 7 13:37:40 2011
From: jwiley.psych at gmail.com (Joshua Wiley)
Date: Mon, 7 Nov 2011 04:37:40 -0800
Subject: [R] help with programming
In-Reply-To:
References:
Message-ID:

Hi J. Tian,

This list is not for helping with homework problems. Please see your instructor or teacher for assistance as it is what she or he is paid for.

Cheers,

Josh

On Mon, Nov 7, 2011 at 3:30 AM, Jing Tian wrote:
> Dear moderators,
>
> Please help me encode the program instructed by follows.
> Thank u!
>
> Apply the methods introduced in Sections 4.2.1 and 4.2.2, say the
>> rank-based variable selection and BIC criterions, to the Boston housing
>> data.
>>
> - The Boston housing data contains 506 observations, and is publicly
> available in the R package mlbench (dataset "BostonHousing").
> - The response variable Y is the median value of owner-occupied homes
> (MEDV) in each of the 506 census tracts in the Boston Standard Metropolitan
> Statistical Areas, and there are thirteen predictor variables.
> - We are interested in the relationship between MEDV and the other
> predictor variables.
>
>> - Setup your parametric model and use rank-based regression methodology to
>> select and estimate the parameters.
>>
>
> Best regards,
> J.Tian
>
--
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, ATS Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/

From felix at nfrac.org Mon Nov 7 13:38:24 2011
From: felix at nfrac.org (Felix Andrews)
Date: Mon, 7 Nov 2011 23:38:24 +1100
Subject: [R] Help: stemming and stem completion with package tm in R
In-Reply-To: <69BB5AF3-4131-4792-92C6-17FFEE0A7E60@gmail.com>
References: <69BB5AF3-4131-4792-92C6-17FFEE0A7E60@gmail.com>
Message-ID:

Hi Yanchang,

The problem seems to be that stemCompletion only looks for words that begin with "mine", and "mining" does not strictly begin with "mine". I don't think there is any easy way to modify stemCompletion to get around that.

However, maybe you could substitute the most prevalent word in your document for each of the stemmed words, then you would not need to use stemCompletion at all: e.g.

topfreq <- function(x) rev(names(sort(table(x))))[1]
(d <- ave(a, b, FUN = topfreq))
# [1] "mining" "miners" "mining"

Cheers
Felix

On 4 November 2011 12:28, Yanchang Zhao wrote:
> Hi All
>
> I came across a problem below when doing stemming and stem completion
> with package tm in R. Word "mining" was stemmed to "mine" with
> stemDocument(), and then completed to "miners" with stemCompletion().
> However, I prefer to keep "mining" intact.
>
> For stemCompletion(), the default type of completion is "prevalent",
> which takes the most frequent match as completion. Although "mining"
> is much more frequent than "miners" in my text, it still completed
> "mine" to "miners".
>
> An example is shown below.
>
> ############################################
> library(tm)
> (a <- c("mining", "miners", "mining"))
> (b <- stemDocument(a))
> (d <- stemCompletion(b, dictionary=a))
> ############################################
>
> Some possible solutions are:
> 1) to change the options or dictionary in stemDocument(), so that
> "mining" is not stemmed to "mine", which I think is the best way;
> 2) to change the options or dictionary in stemCompletion(), so that
> "mine" is completed to "mining";
> 3) to manually correct this after stem completion, which is the last
> option.
>
> I am looking for a solution for above 1) or 2), but cannot find the
> way to do it with stemDocument() in package tm.
>
> Any help will be appreciated.
>
> Thanks,
> Yanchang Zhao
> Email: yanchangzhao(at)gmail.com
>
> RDataMining:        http://www.rdatamining.com
> Twitter:            http://twitter.com/RDataMining
> Group on Linkedin:  http://group2.rdatamining.com

--
Felix Andrews
http://www.neurofractal.org/felix/

From dwinsemius at comcast.net Mon Nov 7 13:39:02 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Mon, 7 Nov 2011 07:39:02 -0500
Subject: [R] help with programming
In-Reply-To:
References:
Message-ID:

On Nov 7, 2011, at 6:30 AM, Jing Tian wrote:
> Dear moderators,
>
> Please help me encode the program instructed by follows.
> Thank u!
>
> Apply the methods introduced in Sections 4.2.1 and 4.2.2, say the
>> rank-based variable selection and BIC criterions, to the Boston
>> housing
>> data.
>>
> - The Boston housing data contains 506 observations, and is publicly
> available in the R package mlbench (dataset "BostonHousing").
> - The response variable Y is the median value of owner-occupied homes
> (MEDV) in each of the 506 census tracts in the Boston Standard Metropolitan
> Statistical Areas, and there are thirteen predictor variables.
> - We are interested in the relationship between MEDV and the other
> predictor variables.
>
>> - Setup your parametric model and use rank-based regression methodology to
>> select and estimate the parameters.

Please read the Posting Guide and then use a search facility on your browser to search out what it says about homework.

> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html

> Best regards,
> J.Tian

David Winsemius, MD
West Hartford, CT

From jwiley.psych at gmail.com Mon Nov 7 13:45:49 2011
From: jwiley.psych at gmail.com (Joshua Wiley)
Date: Mon, 7 Nov 2011 04:45:49 -0800
Subject: [R] How to do a target value search analogous to Excel Solver
In-Reply-To: <1320665885789-3998347.post@n4.nabble.com>
References: <1320665885789-3998347.post@n4.nabble.com>
Message-ID:

Hi,

Try this:

set.seed(1)
z <- rnorm(1,0,1)
y <- function(x, z, xc){2*x - 1 + z - xc}
## uniroot finds a 0 value, so offset function by 5.5
uniroot(y, z = z, xc = 5.5, interval = c(-100, 100))
## use the root and now no offset
y(3.563227, z, 0)

see ?uniroot for help

Cheers,

Josh

On Mon, Nov 7, 2011 at 3:38 AM, jolo999 wrote:
> Hi all,
>
> i'm trying to find a solver possibility analogous to the Excel Solver in R.
> Since i just started with R, I have only little knowledge. Can someone help
> me by solving the problem?
>
> I have the following 'starting position':
>
> z = rnorm(1,0,1)
> y <- function(x,z){2*x - 1 + z}
>
> I am looking for a certain "x" in such a way that the result of the function
> 'y' equals 5.5
>
> Thanks a lot!
>
> --
> View this message in context: http://r.789695.n4.nabble.com/How-to-do-a-target-value-search-analogous-to-Excel-Solver-tp3998347p3998347.html
> Sent from the R help mailing list archive at Nabble.com.

--
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, ATS Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/

From ligges at statistik.tu-dortmund.de Mon Nov 7 13:46:35 2011
From: ligges at statistik.tu-dortmund.de (Uwe Ligges)
Date: Mon, 07 Nov 2011 13:46:35 +0100
Subject: [R] how save R file in exe file?
In-Reply-To: <1320665991954-3998352.post@n4.nabble.com>
References: <1320665991954-3998352.post@n4.nabble.com>
Message-ID: <4EB7D32B.3030500@statistik.tu-dortmund.de>

On 07.11.2011 12:39, jesslim wrote:
> I am the beginner of R. I just want to know how can i save my .R file into
> exe.file and that later will be execute in java.

Short answer: You cannot.
Long answer: R is an interpreted language, read the manuals!

Uwe Ligges

> --
> View this message in context: http://r.789695.n4.nabble.com/how-save-R-file-in-exe-file-tp3998352p3998352.html
> Sent from the R help mailing list archive at Nabble.com.
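
Since R is interpreted, a .R file is usually run by handing it to the interpreter rather than by compiling it. The lines below are only a minimal sketch of that idea, not something proposed in the thread: they assume R is installed on the machine that runs the Java program, and the file name square.R and the helper parse_number() are hypothetical. Another program could then start the script with something like "Rscript square.R 3" and read the printed result.

```r
# square.R (hypothetical file name); run from a shell as:  Rscript square.R 3

# Hypothetical helper: return the first argument as a number,
# or `default` when it is missing or not numeric.
parse_number <- function(args, default = 2) {
  if (length(args) >= 1) {
    value <- suppressWarnings(as.numeric(args[1]))
    if (!is.na(value)) return(value)
  }
  default
}

# commandArgs(trailingOnly = TRUE) holds the arguments given after the script name
x <- parse_number(commandArgs(trailingOnly = TRUE))

# print the result to standard output so that the calling program can read it
cat(x^2, "\n")
```

Writing the result to standard output keeps the script independent of whatever launches it.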
> > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From ligges at statistik.tu-dortmund.de Mon Nov 7 13:50:32 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Mon, 07 Nov 2011 13:50:32 +0100 Subject: [R] HoltWinters in R 2.14.0 In-Reply-To: <7505DDA2-521F-4499-885F-79E5C18EAA44@gmail.com> References: <1320432920314-3991247.post@n4.nabble.com> <7505DDA2-521F-4499-885F-79E5C18EAA44@gmail.com> Message-ID: <4EB7D418.2080009@statistik.tu-dortmund.de> On 05.11.2011 06:02, TimothyDalbey wrote: > You are 100% correct by my estimation however I suppose I am looking for the conditions in the data that might cause the optim() or optimize() functions to fail. Yes. > I took a brief tour of the HoltWinters source but the code available (readable) online seemed outdated (by way of conflicting descriptions in versioning.). The sources are available on CRAN for all versions that are around. > I'll have another poke around the source - that is unless there is someone out there that can clearly state why optimize() fails within the context of the HoltWinters class v. 2.14.0. It always failed, but it did not always report it before by an ERROR message as Brian Ripley explained already. Best, Uwe Ligges > > On Nov 4, 2011, at 8:38 PM, "Prof Brian Ripley [via R]" wrote: > >> On Fri, 4 Nov 2011, R. Michael Weylandt wrote: >> >>> I believe there were some changes to Holt-Winters, specifically in re >>> optimization that probably lead to your problem, but you'll have to >>> provide more details. See the NEWS file for citations about the >>> change. If you put example code/data others may be able to help you -- >>> I haven't updated yet so I can't be of much help. 
>>> >>> Michael >>> >>> >>> On Fri, Nov 4, 2011 at 2:55 PM, TimothyDalbey<[hidden email]> wrote: >>>> Hey All, >>>> >>>> First time on these forums. Thanks in advance. >>>> >>>> Soooo... I have a process that was functioning well before the 2.14 update. >>>> Now the HoltWinters function is throwing an error whereby I get the >>>> following: >>>> >>>> Error in HoltWinters(sales.ts) : optimization failure >> Most likely it was incorrect before. You cannot assume that it was >> actually 'functioning well': all the cases where we have seen this >> message it was giving incorrect answers before and not detecting them. >> And in all those cases the model was a bad fit and using starting >> values for the optimization helped. >> >>>> I've been looking around to determine why this happens (see if I can test >>>> the data beforehand) but I haven't come across anything. >>>> >>>> Any help appreciated! >> >> -- >> Brian D. Ripley, [hidden email] >> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ >> University of Oxford, Tel: +44 1865 272861 (self) >> 1 South Parks Road, +44 1865 272866 (PA) >> Oxford OX1 3TG, UK Fax: +44 1865 272595 >> ______________________________________________ >> [hidden email] mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> >> >> If you reply to this email, your message will be added to the discussion below: >> http://r.789695.n4.nabble.com/HoltWinters-in-R-2-14-0-tp3991247p3992395.html >> To unsubscribe from HoltWinters in R 2.14.0, click here. > > > -- > View this message in context: http://r.789695.n4.nabble.com/HoltWinters-in-R-2-14-0-tp3991247p3992497.html > Sent from the R help mailing list archive at Nabble.com. 
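
The advice quoted above, that supplying starting values for the optimization helped, can be tried along the following lines. This is only a sketch under stated assumptions: the built-in co2 series stands in for the poster's sales.ts, the wrapper fit_hw() is a hypothetical name, and the starting values are illustrative guesses rather than recommendations. A fit that only succeeds after such a retry may still describe the data poorly.

```r
# Sketch: try the default HoltWinters fit first and, if it stops with an error
# (for example "optimization failure"), retry from other starting values.
fit_hw <- function(x, start = c(alpha = 0.5, beta = 0.05, gamma = 0.5)) {
  tryCatch(
    HoltWinters(x),                          # default optim.start values
    error = function(e) {
      message("default fit failed: ", conditionMessage(e))
      HoltWinters(x, optim.start = start)    # second attempt, illustrative start
    }
  )
}

fit <- fit_hw(co2)   # co2 is a stand-in for sales.ts
fit$SSE              # sum of squared one-step-ahead errors of the fitted model
```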
> [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From therneau at mayo.edu Mon Nov 7 14:06:10 2011 From: therneau at mayo.edu (Terry Therneau) Date: Mon, 07 Nov 2011 07:06:10 -0600 Subject: [R] Kaplan Meier - not for dates Message-ID: <1320671170.762.9.camel@nemo> --- begin included message -- I have some data which is censored and I want to determine the median. Its actually cost data for a cohort of patients, many of whom are still on treatment and so are censored. I can do the same sort of analysis for a survival curve and get the median survival... ...but can I just use the survival curve functions to plot an X axis that is $ rather than date? If not is there some other way to achieve this? -- end inclusion -- 1. The survfit routines will work, and the results that you plot will indeed be on the dollar scale, BUT 2. The answer will be wrong. The reason is that the censoring occurs on a time scale, not a $ scale: you don't stop observing someone because total cost hits a threshold, but because calendar time does. The KM routines assume that the censoring process and the event process are on the same scale. The result can be an overestimation of cost. See Dan-Yu Lin, Biometrics 1997, "Estimating medical costs from incomplete follow-up data". Terry Therneau From calum.polwart at nhs.net Mon Nov 7 14:15:06 2011 From: calum.polwart at nhs.net (Polwart Calum (COUNTY DURHAM AND DARLINGTON NHS FOUNDATION TRUST)) Date: Mon, 7 Nov 2011 13:15:06 +0000 Subject: [R] Kaplan Meier - not for dates In-Reply-To: <20111107130613.33F2A449CDB@nhs-pd1e-esg006.ad1.nhs.net> References: <20111107130613.33F2A449CDB@nhs-pd1e-esg006.ad1.nhs.net> Message-ID: <20111107131507.AAEBF448A35@nhs-pd1e-esg110.ad1.nhs.net> > 2. 
The answer will be wrong. The reason is that the censoring occurs on a time scale, not a $ scale: you don't stop observing someone because
> total cost hits a threshold, but because calendar time does. The KM routines assume that the censoring process and the event process are on the
> same scale.
> The result can be an overestimation of cost. See Dan-Yu Lin, Biometrics 1997, "Estimating medical costs from incomplete follow-up data".
>
> Terry Therneau

Thanks that's extremely useful. I'll dig out that reference. You are correct my censoring is happening on an event - (dis)continuation of treatment - not on reaching a cumulative cost.

Calum

From michael.weylandt at gmail.com Mon Nov 7 14:18:39 2011
From: michael.weylandt at gmail.com (R. Michael Weylandt)
Date: Mon, 7 Nov 2011 08:18:39 -0500
Subject: [R] How to delete only those rows in a dataframe in which all records are missing
In-Reply-To:
References:
Message-ID:

Good morning Peter,

No, I don't think complete cases gets what the OP wants. He wants to only throw out those rows that are entirely NA while complete.cases() gets rows with any NA's.

Best,

Michael

2011/11/7 Petr PIKAL :
>>
>> Perhaps something like this will work.
>>
>> df[!(rowSums(is.na(df))==NCOL(df)),]
>
> Or
>
> df[complete.cases(df),]
>
> Regards
> Petr
>
>>
>> Michael
>>
>> On Fri, Nov 4, 2011 at 9:27 AM, Jose Iparraguirre
>> wrote:
>> > Hi,
>> >
>> > Imagine I have the following data frame:
>> >
>> >> a <- c(1,NA,3)
>> >> b <- c(2,NA,NA)
>> >> c <- data.frame(cbind(a,b))
>> >> c
>> >    a  b
>> > 1  1  2
>> > 2 NA NA
>> > 3  3 NA
>> >
>> > I want to delete the second row. If I use na.omit, that would also
>> affect the third row.
I tried to use a loop and an ifelse clause with
>> is.na to get R identify that row in which all records are missing, as
>> opposed to the first row in which no records are missing or the third one,
>> in which only one record is missing. How can I get R identify the row in
>> which all records are missing? Or, how can I get R delete/omit only this row?
>> > Thanks in advance,
>> >
>> > José
>> >
>> > José Iparraguirre
>> > Chief Economist
>> > Age UK
>> >
>> > T 020 303 31482
>> > E Jose.Iparraguirre at ageuk.org.uk
>> >
>> > Tavis House, 1-6 Tavistock Square
>> > London, WC1H 9NB
>> > www.ageuk.org.uk | ageukblog.org.uk | @AgeUKPA
From therneau at mayo.edu Mon Nov 7 14:31:38 2011 From: therneau at mayo.edu (Terry Therneau) Date: Mon, 07 Nov 2011 07:31:38 -0600 Subject: [R] survfit function? Message-ID: <1320672698.762.19.camel@nemo> Two thoughts. First, prediction with time dependent covariates is always an issue.
If you had unemployment as a month-by-month time-dependent covariate in the first model, then for prediction you will need to provide a month-by-month future unemployment scenario. Doing this is easy in the code, but how to choose which scenario is "relevant" and/or "interesting" is hard. See section 10.2.4 of Therneau and Grambsch for more discussion.

Second, I think your time intervals will be ok. Given what you know now, the question is "will there be failure in the next 24". I'd think of "the next 24" as the time scale, and not a particular slice of calendar time such as "1/1/2003 - 1/1/2005".

Terry Therneau

From calum.polwart at nhs.net Mon Nov 7 14:31:32 2011
From: calum.polwart at nhs.net (Polwart Calum (COUNTY DURHAM AND DARLINGTON NHS FOUNDATION TRUST))
Date: Mon, 7 Nov 2011 13:31:32 +0000
Subject: [R] Kaplan Meier - not for dates
In-Reply-To: <20111107130613.33F2A449CDB@nhs-pd1e-esg006.ad1.nhs.net>
References: <20111107130613.33F2A449CDB@nhs-pd1e-esg006.ad1.nhs.net>
Message-ID: <20111107133133.D4FAA44AB59@nhs-pd1e-esg104.ad1.nhs.net>

> 2. The answer will be wrong. The reason is that the censoring occurs on a time scale, not a $ scale: you don't stop observing someone because
> total cost hits a threshold, but because calendar time does. The KM routines assume that the censoring process and the event process are on the
> same scale.
> The result can be an overestimation of cost. See Dan-Yu Lin, Biometrics 1997, "Estimating medical costs from incomplete follow-up data".

Having now skimmed the paper this is long term follow-up. In my particular case the patients are getting treatment for relatively short periods (median time to stopping treatment will be ~9 months) and will discontinue treatment relatively quickly (I'd be surprised if anyone is still on treatment 3-4 years out). I only want the costs of that treatment, not the costs for their overall care to death. I'm not sure how that affects things but hoping it makes life simpler.
Calum ******************************************************************************************************************** This message may contain confidential information. If yo...{{dropped:21}} From alaios at yahoo.com Mon Nov 7 14:32:53 2011 From: alaios at yahoo.com (Alaios) Date: Mon, 7 Nov 2011 05:32:53 -0800 (PST) Subject: [R] How much time a process need? In-Reply-To: <4EB7CEB2.9030401@statistik.tu-dortmund.de> References: <1320660554.33057.YahooMailNeo@web120120.mail.ne1.yahoo.com> <4EB7CEB2.9030401@statistik.tu-dortmund.de> Message-ID: <1320672773.23840.YahooMailNeo@web120105.mail.ne1.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From alaios at yahoo.com Mon Nov 7 14:40:04 2011 From: alaios at yahoo.com (Alaios) Date: Mon, 7 Nov 2011 05:40:04 -0800 (PST) Subject: [R] function that load variables Message-ID: <1320673204.93299.YahooMailNeo@web120104.mail.ne1.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From jwiley.psych at gmail.com Mon Nov 7 14:40:34 2011 From: jwiley.psych at gmail.com (Joshua Wiley) Date: Mon, 7 Nov 2011 05:40:34 -0800 Subject: [R] How much time a process need? In-Reply-To: <1320672773.23840.YahooMailNeo@web120105.mail.ne1.yahoo.com> References: <1320660554.33057.YahooMailNeo@web120120.mail.ne1.yahoo.com> <4EB7CEB2.9030401@statistik.tu-dortmund.de> <1320672773.23840.YahooMailNeo@web120105.mail.ne1.yahoo.com> Message-ID: On Mon, Nov 7, 2011 at 5:32 AM, Alaios wrote: > So I just need to get the > > ?? user? system elapsed > ? 0.460?? 0.048? 67.366 > > > user value and convert the seconds to days and then to hours ? Right? > > What about this elapsed field? It's all in seconds. Convert whatever fields you want. Josh > > ________________________________ > From: Uwe Ligges > > Cc: "R-help at r-project.org" > Sent: Monday, November 7, 2011 1:27 PM > Subject: Re: [R] How much time a process need? 
> > > > On 07.11.2011 11:09, Alaios wrote: >> Dear all, >> I have finished a large function that takes around 1 day to finish. >> I was using system.time(callmyfunction()) to measure how much time it needed to finish, my problem is that I do not know how to interpret their numbers. >> I was looking to convert these results to something more readably like. >> "This function took 1 Day 2 hours and 35 minutes to complete." >> >> How I can convert the system.time output to something like that? > > > system.time() responds in seconds, hence you can apply simple arithmetic > to get days, hours and minutes. > > Uwe Ligges > > > >> >> B.R >> Alex >> >> ??? [[alternative HTML version deleted]] >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > ? ? ? ?[[alternative HTML version deleted]] > > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > -- Joshua Wiley Ph.D. Student, Health Psychology Programmer Analyst II, ATS Statistical Consulting Group University of California, Los Angeles https://joshuawiley.com/ From Patrick.Palmier at developpement-durable.gouv.fr Mon Nov 7 14:41:03 2011 From: Patrick.Palmier at developpement-durable.gouv.fr (PALMIER Patrick (Responsable de groupe) - CETE NP/TM/ST) Date: Mon, 07 Nov 2011 14:41:03 +0100 Subject: [R] R in batch mode packages loading question Message-ID: <4EB7DFEF.8010300@developpement-durable.gouv.fr> Hello, I use R in batch mode. Each time, I execute a script, R is loading each packages I need in my script. 
That's Ok But, I had to execute many scripts , and each time R is re-loading the corresponding packages, which take to much time Is it possible ask R to load the packages only once, and stay in memory in background for further scripts, which would avoid to load the packages in each script, or if you have another solution that need to only load packages once in the first scripts, so that further scripts do not need to load these packages too. Thanks in advance -- *Patrick PALMIER** **Centre d'?tudes Techniques de l'?quipement Nord - Picardie D?partement Transport Mobilit?s */*Responsable du groupe Syst?mes de Transports*//* */2, rue de Bruxelles, BP 275 59019 Lille cedex FRANCE T?l: +33 (0) 3 20 49 60 70 Fax: +33 (0) 3 20 49 63 69 From michael.weylandt at gmail.com Mon Nov 7 15:12:49 2011 From: michael.weylandt at gmail.com (R. Michael Weylandt) Date: Mon, 7 Nov 2011 09:12:49 -0500 Subject: [R] How much time a process need? In-Reply-To: References: <1320660554.33057.YahooMailNeo@web120120.mail.ne1.yahoo.com> <4EB7CEB2.9030401@statistik.tu-dortmund.de> <1320672773.23840.YahooMailNeo@web120105.mail.ne1.yahoo.com> Message-ID: Alaios: Generally I would think you would look at the elapsed field (at least I do): consider the example you ran to give that data. Did it take about half a second or a minute? Gabor showed this example once to illustrate the difference: system.time(Sys.sleep(20)) Michael PS -- If you really want to dig into this, try this set of tools: https://code.google.com/p/rbenchmark/ On Mon, Nov 7, 2011 at 8:40 AM, Joshua Wiley wrote: > On Mon, Nov 7, 2011 at 5:32 AM, Alaios wrote: >> So I just need to get the >> >> ?? user? system elapsed >> ? 0.460?? 0.048? 67.366 >> >> >> user value and convert the seconds to days and then to hours ? Right? >> >> What about this elapsed field? > > It's all in seconds. ?Convert whatever fields you want. 
> > Josh > >> >> ________________________________ >> From: Uwe Ligges >> >> Cc: "R-help at r-project.org" >> Sent: Monday, November 7, 2011 1:27 PM >> Subject: Re: [R] How much time a process need? >> >> >> >> On 07.11.2011 11:09, Alaios wrote: >>> Dear all, >>> I have finished a large function that takes around 1 day to finish. >>> I was using system.time(callmyfunction()) to measure how much time it needed to finish, my problem is that I do not know how to interpret their numbers. >>> I was looking to convert these results to something more readably like. >>> "This function took 1 Day 2 hours and 35 minutes to complete." >>> >>> How I can convert the system.time output to something like that? >> >> >> system.time() responds in seconds, hence you can apply simple arithmetic >> to get days, hours and minutes. >> >> Uwe Ligges >> >> >> >>> >>> B.R >>> Alex >>> >>> ??? [[alternative HTML version deleted]] >>> >>> ______________________________________________ >>> R-help at r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >> ? ? ? ?[[alternative HTML version deleted]] >> >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> >> > > > > -- > Joshua Wiley > Ph.D. 
Student, Health Psychology > Programmer Analyst II, ATS Statistical Consulting Group > University of California, Los Angeles > https://joshuawiley.com/ > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From michael.weylandt at gmail.com Mon Nov 7 15:17:26 2011 From: michael.weylandt at gmail.com (R. Michael Weylandt) Date: Mon, 7 Nov 2011 09:17:26 -0500 Subject: [R] function that load variables In-Reply-To: <1320673204.93299.YahooMailNeo@web120104.mail.ne1.yahoo.com> References: <1320673204.93299.YahooMailNeo@web120104.mail.ne1.yahoo.com> Message-ID: Perhaps you mean to use load(...., envir = .GlobalEnv) Currently you load up the variables in the function environment but then they are thrown away when the function ends. Michael On Mon, Nov 7, 2011 at 8:40 AM, Alaios wrote: > Dear all, > I have saved few variable names into local files, > I wanted to make a function that load this files and "generates" the variable names into my working environment. I have tried to do that as a function but my problem is > > that this function does not return the variable names > > > load_data<-function(path,Reload=FALSE){ > > ??? if (Reload==TRUE){ > > ??? ? print("Loading results") > > ??? ? # FirstSet > ??? ? load(file=paste(path,'first',sep="")) > ??? ? first<-Set > > ??? ? # SecondSet > ??? ? load(file=paste(path,'second',sep="")) > ??? ? second<-Set > > ..................(part omittted here) > > ??? ? save( first, second,....(part omitted here)...,???? file=paste(path,'Results',sep="")) > ??? } > > ??? return (load(file=paste(path,'Results',sep=""))) > > } > > so my idea was the following: > I call the function and it retuns the values first,second,... loaded into the current working space. 
> If I want to refresh them, something has changed to the firstDataSet I set the function's variable Reload=True and thus all the data are refreshed. > The problem is not that the return statement I ahve at the end of the function does not return the loaded variable to the working environment but only the status of the load command. > > Do you know how I can change that so my function returns also Loaded Variable names to the environment? > > Alex > ? ? ? ?[[alternative HTML version deleted]] > > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > From pisicandru at hotmail.com Mon Nov 7 15:18:55 2011 From: pisicandru at hotmail.com (Monica Pisica) Date: Mon, 7 Nov 2011 14:18:55 +0000 Subject: [R] How to write a shapefile with projection In-Reply-To: <4EB539C1.8010807@univ-fcomte.fr> References: <4EB539C1.8010807@univ-fcomte.fr> Message-ID: Hi Patrick, Thanks for letting me know. I mostly use rgdal to read and write rasters so until now i kind of ignore other functionality. Unfortunately i supposed that a package dedicated to shapefiles would be the answer and had the functionality i needed. But rgdal does a nice job in saving my files as i need. It is good to know how to add the projection file to the shapefiles for the future, if it is not generated from the onset. Thanks again, Monica ---------------------------------------- > Date: Sat, 5 Nov 2011 14:27:29 +0100 > From: patrick.giraudoux at univ-fcomte.fr > To: pisicandru at hotmail.com > CC: r-help at r-project.org > Subject: re: How to write a shapefile with projection > > > Hi, > > > > Sorry i have put such a detailed question to the list about writing a shapefile with projection. 
I realized that if i use writeOGR from rgdal and not the other write shapefile functions i can get a shapefile with projection recognized by ArcGIS. The command is (in case anybody wonders): > > > > ?writeOGR(crest.sp, "I:\\LA_levee\\Shape", "llev_crest_pts6", driver = "ESRI Shapefile") > > > > where crest.sp is a spatial point data frame with projection. > > > > Thanks, > > > > Monica > > Indeed. > > writePointsShape() does not write the projection file, but using the > function showWKT from rgdal, you can also create one like that: > > writePointsShape(crest.sp,"crest") > cat(showWKT(proj4string(crest.sp)),file="crest.prj") > > Patrick > > From michael.weylandt at gmail.com Mon Nov 7 15:19:36 2011 From: michael.weylandt at gmail.com (R. Michael Weylandt) Date: Mon, 7 Nov 2011 09:19:36 -0500 Subject: [R] How to use 'prcomp' with CLUSPLOT? In-Reply-To: <1320445108437-3991868.post@n4.nabble.com> References: <47CA6EA672AF4A4DA62A7DCCFEDC9ABB0BB34CF0@XMAIL-MBX-AH1.AD.UCSD.EDU> <47CA6EA672AF4A4DA62A7DCCFEDC9ABB0BB34D18@XMAIL-MBX-AH1.AD.UCSD.EDU> <1320445108437-3991868.post@n4.nabble.com> Message-ID: Happy to look at it further, but I don't have access to "fitnw$cluster" so i can't run clusplot, modified or unmodified. If you would, create a test data set using dput() for all the needed objects. Michael On Fri, Nov 4, 2011 at 6:18 PM, jo wrote: > Hello Michael, > > Thank you for replying to my post! ?That was an interesting solution - good > to know, but I am now getting a different error: > /Error in if (length(clus) != n) stop("The clustering vector is of incorrect > length") : > ?argument is of length zero/ > which brought me here: > https://svn.r-project.org/R-packages/trunk/cluster/R/plotpart.q > I am trying to figure that out now... > > FYI, as a test set, one could just delete columns until they are <= to the > number of rows... > > clusplot has some nice extras, but I am also looking at just plotting > w/pca... 
> > > Thank you again, > Jo > > -- > View this message in context: http://r.789695.n4.nabble.com/How-to-use-prcomp-with-CLUSPLOT-tp3989022p3991868.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From murdoch.duncan at gmail.com Mon Nov 7 15:20:02 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Mon, 07 Nov 2011 09:20:02 -0500 Subject: [R] R in batch mode packages loading question In-Reply-To: <4EB7B7A2.3030209@developpement-durable.gouv.fr> References: <4EB7B7A2.3030209@developpement-durable.gouv.fr> Message-ID: <4EB7E912.6090902@gmail.com> On 07/11/2011 5:49 AM, PALMIER Patrick (Responsable de groupe) - CETE NP/TM/ST wrote: > Hello, > > > I use R in batch mode. Each time, I execute a script, R is loading each > packages I need in my script. That's Ok > But, I had to execute many scripts , and each time R is re-loading the > corresponding packages, which take to much time > > Is it possible ask R to load the packages only once, and stay in memory > in background for further scripts, which would avoid to load the > packages in each script, or if you have another solution that need to > only load packages once in the first scripts, so that further scripts do > not need to load these packages too. Write one script that has a sequence of calls to source() to run the other scripts. You'll need to be careful that unintentional leftover objects and settings from one script don't affect the others; you may also want to use the "echo=TRUE" option when you source, so you see the commands as they are executed. 
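A minimal sketch of the master-script idea described above. The two script files and their contents are hypothetical stand-ins written to a temporary directory, and the package loaded is only a placeholder. Each script is sourced into its own fresh environment so that leftover objects from one script cannot affect the next, while packages are attached only once.

```r
# Hypothetical stand-ins for the generated scripts (assumption: real scripts
# would already exist on disk and would not need to be written here).
script_dir <- tempfile("scripts")
dir.create(script_dir)
writeLines("tab1 <- table(c('a', 'b', 'a'))", file.path(script_dir, "crosstab1.R"))
writeLines("tab2 <- table(c('x', 'x', 'y'))", file.path(script_dir, "crosstab2.R"))

# Load the packages once, before any script runs (placeholder package).
library(stats)

scripts <- list.files(script_dir, pattern = "\\.R$", full.names = TRUE)

# Run each script in its own environment and collect what it created.
results <- lapply(scripts, function(f) {
  env <- new.env()
  source(f, local = env, echo = TRUE)  # echo = TRUE shows commands as they run
  as.list(env)
})
names(results) <- basename(scripts)
```

Whether this fits depends on the workflow: it only helps when all scripts can be run from a single R session.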
Duncan Murdoch From pjmiller_57 at yahoo.com Mon Nov 7 15:39:06 2011 From: pjmiller_57 at yahoo.com (Paul Miller) Date: Mon, 7 Nov 2011 06:39:06 -0800 (PST) Subject: [R] Problem working with dates In-Reply-To: Message-ID: <1320676746.74842.YahooMailClassic@web161604.mail.bf1.yahoo.com> Hello All, I've been reading books about R for awhile now and am in the process of replicating the SAS analyses from an old report. I want to be sure that I can do all the things I need to in R before using it in my daily work. So far, I've managed to read in all my data and have done some data manipulation. I'm having trouble with fixing an error in a date variable though, and was hoping someone could help. One of the patients in my data has a DOB incorrectly entered as: '11/23/21931' Their DOB should be: '11/23/1931' How can I correct this problem before calculating age in the code below? DOB starts out as a factor in the Demo dataframe but then is converted into a date. So I had thought the ifelse that follows could be used to correct the problem, but this doesn't seem to be the case. Thanks, Paul Demo_Char <- within(Demo, { DateCompleted <- as.Date(DateCompleted, format = "%m/%d/%Y") DOB <- as.Date(DOB, format = "%m/%d/%Y") DOB <- ifelse(Subject==108945, as.Date("1931-11-23"), DOB) Age <- as.integer((DateCompleted - DOB) / 365.25) }) From Patrick.Palmier at developpement-durable.gouv.fr Mon Nov 7 15:38:47 2011 From: Patrick.Palmier at developpement-durable.gouv.fr (PALMIER Patrick (Responsable de groupe) - CETE NP/TM/ST) Date: Mon, 07 Nov 2011 15:38:47 +0100 Subject: [R] R in batch mode packages loading question In-Reply-To: <4EB7E912.6090902@gmail.com> References: <4EB7B7A2.3030209@developpement-durable.gouv.fr> <4EB7E912.6090902@gmail.com> Message-ID: <4EB7ED77.7090706@developpement-durable.gouv.fr> An embedded and charset-unspecified text was scrubbed... 
Name: not available
URL:

From dwinsemius at comcast.net Mon Nov 7 15:58:39 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Mon, 7 Nov 2011 09:58:39 -0500
Subject: [R] Problem working with dates
In-Reply-To: <1320676746.74842.YahooMailClassic@web161604.mail.bf1.yahoo.com>
References: <1320676746.74842.YahooMailClassic@web161604.mail.bf1.yahoo.com>
Message-ID:

On Nov 7, 2011, at 9:39 AM, Paul Miller wrote:

> Hello All,
>
> I've been reading books about R for awhile now and am in the process
> of replicating the SAS analyses from an old report. I want to be
> sure that I can do all the things I need to in R before using it in
> my daily work.
>
> So far, I've managed to read in all my data and have done some data
> manipulation. I'm having trouble with fixing an error in a date
> variable though, and was hoping someone could help.
>
> One of the patients in my data has a DOB incorrectly entered as:
>
> '11/23/21931'
>
> Their DOB should be:
>
> '11/23/1931'
>
> How can I correct this problem before calculating age in the code
> below?
> DOB starts out as a factor in the Demo dataframe but then is
> converted into a date. So I had thought the ifelse that follows
> could be used to correct the problem, but this doesn't seem to be
> the case.
>
> Thanks,
>
> Paul
>

Why not fix the single error first, while DOB is still text?

Demo$DOB <- as.character(Demo$DOB)   # DOB is a factor, so convert before assigning a new value
Demo[Demo$Subject == 108945, "DOB"] <- '11/23/1931'

Then you can skip the time-consuming ifelse() inside the within() call.
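A minimal sketch of that fix on made-up data. The two-row Demo data frame below is an assumption for illustration only; the column names follow the quoted code. One likely reason the ifelse() attempt fails is that ifelse() drops the Date class and returns plain numbers, so correcting the text before converting avoids the problem.

```r
# Made-up example data; only the column names come from the thread.
Demo <- data.frame(
  Subject       = c(108944, 108945),
  DOB           = factor(c("05/02/1940", "11/23/21931")),
  DateCompleted = factor(c("06/15/2010", "07/01/2010"))
)

# ifelse() does not keep the Date class: the result here is numeric.
class(ifelse(TRUE, as.Date("1931-11-23"), NA))

# Correct the mistyped value while DOB is still text.
Demo$DOB <- as.character(Demo$DOB)
Demo$DOB[Demo$Subject == 108945] <- "11/23/1931"

# Convert to Date and compute age in whole years.
Demo_Char <- within(Demo, {
  DateCompleted <- as.Date(as.character(DateCompleted), format = "%m/%d/%Y")
  DOB <- as.Date(DOB, format = "%m/%d/%Y")
  Age <- as.integer(as.numeric(DateCompleted - DOB) / 365.25)
})
Demo_Char$Age
```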
-- David > Demo_Char <- within(Demo, { > DateCompleted <- as.Date(DateCompleted, format = "%m/%d/%Y") > DOB <- as.Date(DOB, format = "%m/%d/%Y") > DOB <- ifelse(Subject==108945, as.Date("1931-11-23"), DOB) > Age <- as.integer((DateCompleted - DOB) / 365.25) > }) > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT From jonas.lorson at ebs.de Mon Nov 7 15:07:40 2011 From: jonas.lorson at ebs.de (jolo999) Date: Mon, 7 Nov 2011 06:07:40 -0800 (PST) Subject: [R] How to do a target value search analogous to Excel Solver In-Reply-To: References: <1320665885789-3998347.post@n4.nabble.com> Message-ID: <1320674860383-3998729.post@n4.nabble.com> Works fine. Thanks for the quick reply!! -- View this message in context: http://r.789695.n4.nabble.com/How-to-do-a-target-value-search-analogous-to-Excel-Solver-tp3998347p3998729.html Sent from the R help mailing list archive at Nabble.com. From qqh5011 at gmail.com Mon Nov 7 14:06:53 2011 From: qqh5011 at gmail.com (qqh5011) Date: Mon, 7 Nov 2011 07:06:53 -0600 Subject: [R] How do I return to the row values of a matrix after computing distances Message-ID: <20111107070653785925130@gmail.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From bonda at hsu-hh.de Mon Nov 7 15:15:49 2011 From: bonda at hsu-hh.de (bonda) Date: Mon, 7 Nov 2011 06:15:49 -0800 (PST) Subject: [R] nproc parameter in efpFunctional In-Reply-To: References: <1320228608589-3972419.post@n4.nabble.com> <1320306970603-3984605.post@n4.nabble.com> <1320402432656-3989598.post@n4.nabble.com> Message-ID: <1320675349553-3998747.post@n4.nabble.com> Thank you very much, it works now! 
Best regards, J

--
View this message in context: http://r.789695.n4.nabble.com/nproc-parameter-in-efpFunctional-tp3972419p3998747.html
Sent from the R help mailing list archive at Nabble.com.

From horseatingweeds at gmail.com Mon Nov 7 15:30:39 2011
From: horseatingweeds at gmail.com (horseatingweeds)
Date: Mon, 7 Nov 2011 06:30:39 -0800 (PST)
Subject: [R] Error: could not find function "MLearn"
Message-ID: <1320676239513-3998805.post@n4.nabble.com>

I'm getting this error when I try to run the function MLearn():

Error: could not find function "MLearn"

I have the MLInterface tools installed. But when I search for MLearn with "??MLearn", I don't find it. The closest thing I find is the method MLearn_new() under MLInterfaces. I've tried replacing MLearn() with MLearn_new() in my script, but I still get the same error, this time:

Error: could not find function "MLearn_new"

Any ideas what I'm doing wrong here? Thanks.

R version 2.13.4, Biocinstall version 2.8.4, on Ubuntu 10.04

--
View this message in context: http://r.789695.n4.nabble.com/Error-could-not-find-function-MLearn-tp3998805p3998805.html
Sent from the R help mailing list archive at Nabble.com.

From michael.weylandt at gmail.com Mon Nov 7 16:02:15 2011
From: michael.weylandt at gmail.com (R. Michael Weylandt)
Date: Mon, 7 Nov 2011 10:02:15 -0500
Subject: [R] Problem working with dates
In-Reply-To: <1320676746.74842.YahooMailClassic@web161604.mail.bf1.yahoo.com>
References: <1320676746.74842.YahooMailClassic@web161604.mail.bf1.yahoo.com>
Message-ID:

I think you are making the transform much more complicated than it needs to be:

Suppose you have a vector x with a bunch of things that look like dates but are really factors. Then the following transform should work from factor to Date:

x <- as.Date(as.character(x), format = "%m/%d/%Y")

and to address the mistyped element (on the character version, since a factor will not accept a value that is not already one of its levels):

x <- as.character(x)
x[x == "11/23/21931"] <- "11/23/1931"

You should probably do this before the conversion to date type.
If you want to do it in a look up-ish sort of way, this is probably better: within(Demo, DOB[Subject == 108945] <- "11/23/1931") Michael On Mon, Nov 7, 2011 at 9:39 AM, Paul Miller wrote: > Hello All, > > I've been reading books about R for awhile now and am in the process of replicating the SAS analyses from an old report. I want to be sure that I can do all the things I need to in R before using it in my daily work. > > So far, I've managed to read in all my data and have done some data manipulation. I'm having trouble with fixing an error in a date variable though, and was hoping someone could help. > > One of the patients in my data has a DOB incorrectly entered as: > > '11/23/21931' > > Their DOB should be: > > '11/23/1931' > > How can I correct this problem before calculating age in the code below? > DOB starts out as a factor in the Demo dataframe but then is converted into a date. So I had thought the ifelse that follows could be used to correct the problem, but this doesn't seem to be the case. > > Thanks, > > Paul > > Demo_Char <- within(Demo, { > DateCompleted <- as.Date(DateCompleted, format = "%m/%d/%Y") > DOB <- as.Date(DOB, format = "%m/%d/%Y") > DOB <- ifelse(Subject==108945, as.Date("1931-11-23"), DOB) > Age <- as.integer((DateCompleted - DOB) / 365.25) > }) > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From anate25 at gmail.com Mon Nov 7 16:05:14 2011 From: anate25 at gmail.com (anat) Date: Mon, 7 Nov 2011 07:05:14 -0800 (PST) Subject: [R] Plotting a network in a "star" mode Message-ID: <1320678314155-3998920.post@n4.nabble.com> Dear R experts, I have a network constructed through an adjacency matrix (square matrix) using the "network" package in R. 
I'm interested in plotting this network, but in a star-mode, which means I have one node (the first column or row)i want to be located in the center of the network and all the other around it. I read the network.plot documentation, but there I could only find 3 available methods (the options available for "mode"), which did not meet my needs. My matrix looks as follows: 6 nodes, "1" appears when an edge is connecting 2 nodes. [,1] [,2] [,3] [,4] [,5] [,6] [1,] 0 1 1 1 1 1 [2,] 1 0 1 1 1 1 [3,] 1 1 0 1 1 1 [4,] 1 1 1 0 1 0 [5,] 1 1 1 1 0 1 [6,] 1 1 1 0 1 0 I would like to plot a network with 6 nodes, so that the first one is in the middle. Can you please advice me how to do that? Thank you, Anat -- View this message in context: http://r.789695.n4.nabble.com/Plotting-a-network-in-a-star-mode-tp3998920p3998920.html Sent from the R help mailing list archive at Nabble.com. From sbpurohit at gmail.com Mon Nov 7 16:19:51 2011 From: sbpurohit at gmail.com (1Rnwb) Date: Mon, 7 Nov 2011 07:19:51 -0800 (PST) Subject: [R] help with formula for clogit Message-ID: <1320679191169-3998967.post@n4.nabble.com> I would like to know if clogit function can be used as below clogit(group~., data=dataframe) When I try to use in above format it takes a long time, I would appreciate some pointers to get multiple combinations tested. 
set.seed(100)
d=data.frame(x=rnorm(20)+5, x1=rnorm(20)+5, x2=rnorm(20)+5, x3=rnorm(20)+5,
             x4=rnorm(20)+5, x5=rnorm(20)+5, x6=rnorm(20)+5, x7=rnorm(20)+5,
             x8=rnorm(20)+5, group=rep(c(1,2),10), Age=rnorm(20)+35,
             strata=c(rep(1,10), rep(2,10)))
nam=names(d)[1:9]
results <- c("Protein", "OR", "p-val")
pc3=combinations(n=length(nam),r=2)
for (len in 1:dim(pc3)[2])
{
  prs=pc3[len,]
  mols=nam[prs]
  d2=d[,c(mols,'group','Age','strata')]
  log.reg<-clogit(group~.,data=d2)
  a = summary(log.reg)$conf.int
  z= summary(log.reg)$coefficients[1,4] #ncol in coefficients is 3 * number of parameters
  pval = 2*pnorm(-abs(z))
  res2 = c(paste('IL8+',molecule,sep=''), paste (round(a[1,1],2), " (" , round(a[1,3],2), " - " , round(a[1,4],2), ")" , sep=""), pval)
  results = rbind (results ,res2 )
}

Thanks
Sharad

--
View this message in context: http://r.789695.n4.nabble.com/help-with-formula-for-clogit-tp3998967p3998967.html
Sent from the R help mailing list archive at Nabble.com.

From dwinsemius at comcast.net Mon Nov 7 16:28:39 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Mon, 7 Nov 2011 10:28:39 -0500
Subject: [R] Error: could not find function "MLearn"
In-Reply-To: <1320676239513-3998805.post@n4.nabble.com>
References: <1320676239513-3998805.post@n4.nabble.com>
Message-ID:

On Nov 7, 2011, at 9:30 AM, horseatingweeds wrote:

> I'm getting this error when I try to run the function MLearn():
>
> Error: could not find function "MLearn"

The usual reason for a newbie not getting a function is that they failed to do one of these:

library(MLInterfaces)
require(MLInterfaces)

You didn't refer to MLInterfaces as a package but that is my guess as to what you installed. Installation puts it in your library but you also need to load it in an open workspace.

>
> I have the MLInterface tools installed. But when I look for MLearn
> "??MLearn" but I don't find it. The closest thing I find is the method
> MLearn_new() under MLInterfaces.
I've tried replacing MLearn() with > MLearn_new() in my script, but I still get the same error, this time: > > Error: could not find function "MLearn_new" > > Any ideas what I'm doing wrong here? Thanks. > > R version 2.13.4, Biocinstall version 2.8.4, on Ubuntu 10.04 > > -- > View this message in context: http://r.789695.n4.nabble.com/Error-could-not-find-function-MLearn-tp3998805p3998805.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT From adele_thompson at cargill.com Mon Nov 7 16:29:58 2011 From: adele_thompson at cargill.com (Schatzi) Date: Mon, 7 Nov 2011 07:29:58 -0800 (PST) Subject: [R] Lower bounds on selfStart function not working Message-ID: <1320679798626-3999231.post@n4.nabble.com> I adapted a selfStart function and the lower bounds are not working. The parameter "b" is negative, whereas I would like the lower bound to be zero. Any ideas? Thanks. 
Here is my code (I am still figuring out how to easily make replicable examples):

A<-1.75
mu<-.2
l<-2
b<-0
x<-seq(0,18,.25)
create.y<-function(x){
  y<-b+A/(1+exp(4*mu/A*(l-x)+2))
  return(y)
}
ys<-create.y(x)
yvec<-(rep(ys,5))*(.95+runif(length(x)*5)/10)
Trt<-factor(c(rep("A1",length(x)),rep("A2",length(x)),rep("A3",length(x)),rep("A4",length(x)),rep("A5",length(x))))
Data<-data.frame(Trt,rep(x,5),yvec)
names(Data)<-c("Trt","x","y")
NewData<-groupedData(y~x|Trt,data=Data)

powrDpltInit <- function(mCall, LHS, data) {
  xy <- sortedXyData(mCall[["x"]],LHS,data)
  A.s <- max(xy$y)-min(xy$y)
  mu.s <- A.s/7.5
  l.s <- 0
  b.s <- max(min(xy$y),0.00001)
  value <- c(A.s, l.s, mu.s, b.s)
  #function to optimize
  func1 <- function(value) {
    A.s <- value[1]
    mu.s <- value[2]
    l.s <- value[3]
    b.s <- value[4]
    y1<-rep(0,length(xy$x)) # generate vector for predicted y (y1) to evaluate against observed y
    for(cnt in 1:length(xy$x)){
      y1[cnt]<- b.s+A.s/(1+exp(4*mu.s/A.s*(l.s-x[cnt])+2))} #predicting y1 for values of y
    evl<-sum((xy$y-y1)^2) #sum of squares is function to minimize
    return(evl)}
  #optimizing
  oppar<-optim(c(A.s , mu.s , l.s , b.s),func1,method="L-BFGS-B",
               lower=c(0.0001,0.0,0.0,0.0),
               control=list(maxit=2000,trace=TRUE))
  #saving optimized parameters
  value<-c(oppar$par[1L],oppar$par[2L],oppar$par[3L],oppar$par[4L])
  names(value) <- mCall[c("A","mu","l","b")]
  value
}

SSpowrDplt<-selfStart(~b+A/(1+exp(4*mu/A*(l-x)+2)),initial=powrDpltInit,
                      parameters=c("A","mu","l","b"))
test1<-nlsList(SSpowrDplt,NewData)
coef(test1)

-----
In theory, practice and theory are the same. In practice, they are not - Albert Einstein

--
View this message in context: http://r.789695.n4.nabble.com/Lower-bounds-on-selfStart-function-not-working-tp3999231p3999231.html
Sent from the R help mailing list archive at Nabble.com.
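One possible reading of the problem above, offered as a sketch rather than a confirmed diagnosis: the bounds passed to optim() inside the initial function only constrain the starting values, and the subsequent least-squares fit is free to move "b" below zero. Base R's nls() accepts lower and upper bounds when algorithm = "port" is used. The example below fits a single simulated group with that option; the simulated data, seed, and starting values are assumptions chosen for illustration.

```r
# Simulated single-group data following the model in the post
# (seed and noise are assumptions for illustration).
set.seed(1)
x <- seq(0, 18, 0.25)
y <- 1.75 / (1 + exp(4 * 0.2 / 1.75 * (2 - x) + 2)) * (0.95 + runif(length(x)) / 10)
dat <- data.frame(x = x, y = y)

# Bounds are honoured only with the "port" algorithm; they are given in
# the same order as the entries of 'start'.
fit <- nls(
  y ~ b + A / (1 + exp(4 * mu / A * (l - x) + 2)),
  data      = dat,
  start     = list(A = 1.5, mu = 0.25, l = 1, b = 0.01),
  algorithm = "port",
  lower     = c(A = 1e-4, mu = 0, l = 0, b = 0)
)
coef(fit)
```

To apply the same idea per treatment group, one option would be to split the data by Trt and call nls() on each piece, for example with lapply(split(Data, Data$Trt), ...).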
From dwinsemius at comcast.net Mon Nov 7 16:39:54 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Mon, 7 Nov 2011 10:39:54 -0500 Subject: [R] How do I return to the row values of a matrix after computing distances In-Reply-To: <20111107070653785925130@gmail.com> References: <20111107070653785925130@gmail.com> Message-ID: <6B077195-341A-4B55-B722-4B1B510CA490@comcast.net> On Nov 7, 2011, at 8:06 AM, qqh5011 wrote: > ## Package Needed > library(fields) > > ## Assumptions > set.seed(123) > nsim<-5 > p<-2 > > ## Generate Random Matrix G > G <- matrix(runif(p*nsim),nsim,p) > > ## Set Empty Matraces dmax and dmin > dmax<- matrix(data=NA,nrow=nsim,ncol=p) > dmin<- matrix(data=NA,nrow=nsim,ncol=p) > > ## Loop to Fill dmax and dmin > for(i in 1:nsim) { > > dmax[i]<- max(rdist(G[i,,drop=FALSE],G)) > dmin[i]<- min(rdist(G[i,,drop=FALSE],G[-i,])) } > > I filled the dmax and dmin with the distance values I calculated, > but what I really want to fill them with are the rows in G. What > should I do? I tried "which" function but it did not work. Thank you > in advance!!!! Dear qq5011 ... AKA user1033763; The practice of cross-posting questions to StackOverFlow and rhelp within hours of each other is deprecated, at least on on rhelp. http://stackoverflow.com/questions/8036831/how-to-return-to-the-qualifying-rows-of-original-data If you don't get an answer within some reasonable interval (measured in days, not hours) then feel free to post a follow-up ... with citation of the first posting. > qqh5011 > [[alternative HTML version deleted]] Posting in HTML is likewise deprecated. Please read the Posting Guide were both of these issues and many other sensible requests for professional behavior are detailed > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. (You did do that, and please continue to do so.) 
-- David Winsemius, MD West Hartford, CT From ligges at statistik.tu-dortmund.de Mon Nov 7 16:40:04 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Mon, 07 Nov 2011 16:40:04 +0100 Subject: [R] How much time a process need? In-Reply-To: <1320672773.23840.YahooMailNeo@web120105.mail.ne1.yahoo.com> References: <1320660554.33057.YahooMailNeo@web120120.mail.ne1.yahoo.com> <4EB7CEB2.9030401@statistik.tu-dortmund.de> <1320672773.23840.YahooMailNeo@web120105.mail.ne1.yahoo.com> Message-ID: <4EB7FBD4.50108@statistik.tu-dortmund.de> On 07.11.2011 14:32, Alaios wrote: > So I just need to get the > > user system elapsed > 0.460 0.048 67.366 > > > user value and convert the seconds to days and then to hours ? Right? > > What about this elapsed field? Yes, the elapsed time in seconds. Uwe Ligges > > > > ________________________________ > From: Uwe Ligges > To: Alaios > Cc: "R-help at r-project.org" > Sent: Monday, November 7, 2011 1:27 PM > Subject: Re: [R] How much time a process need? > > > > On 07.11.2011 11:09, Alaios wrote: >> Dear all, >> I have finished a large function that takes around 1 day to finish. >> I was using system.time(callmyfunction()) to measure how much time it needed to finish, my problem is that I do not know how to interpret their numbers. >> I was looking to convert these results to something more readably like. >> "This function took 1 Day 2 hours and 35 minutes to complete." >> >> How I can convert the system.time output to something like that? > > > system.time() responds in seconds, hence you can apply simple arithmetic > to get days, hours and minutes. > > Uwe Ligges > > > >> >> B.R >> Alex >> >> [[alternative HTML version deleted]] >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. 
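The arithmetic Uwe refers to can be sketched as follows. The helper name format_elapsed is made up for illustration, and the sketch assumes that the "elapsed" entry (wall-clock seconds) is the figure of interest rather than "user" (CPU seconds).

```r
# Split a number of seconds into days, hours, minutes and seconds.
format_elapsed <- function(seconds) {
  total   <- as.integer(round(seconds))
  days    <- total %/% 86400L
  hours   <- (total %% 86400L) %/% 3600L
  minutes <- (total %% 3600L) %/% 60L
  secs    <- total %% 60L
  sprintf("%d d %d h %d min %d s", days, hours, minutes, secs)
}

timing <- system.time(Sys.sleep(0.2))   # stand-in for the long-running call
format_elapsed(timing[["elapsed"]])

format_elapsed(95700)    # "1 d 2 h 35 min 0 s"
format_elapsed(67.366)   # "0 d 0 h 1 min 7 s"
```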
From Patrick.Palmier at developpement-durable.gouv.fr Mon Nov 7 16:40:24 2011 From: Patrick.Palmier at developpement-durable.gouv.fr (PALMIER Patrick (Responsable de groupe) - CETE NP/TM/ST) Date: Mon, 07 Nov 2011 16:40:24 +0100 Subject: [R] R in batch mode packages loading question In-Reply-To: <4EB7E912.6090902@gmail.com> References: <4EB7B7A2.3030209@developpement-durable.gouv.fr> <4EB7E912.6090902@gmail.com> Message-ID: <4EB7FBE8.8030608@developpement-durable.gouv.fr> Thank you for your response, but this is not exactly what I need We are working on a tool that automatically generate R scripts adapted to our surveys databases. When we want to do a table, we select interactively fields, associated labels, functions for an automatic crosstable for example Then, the tool executes R in batch mode with the script produced by the tool The "many scripts" correspond to the many crosstables people want to do. It is not possible produce these different scripts and execute all of them in a unique script with source. For example, people want to do a crosstable, the tool produce a r script that generate the result. Then, the user want to do the same crosstable but on a subset or with different labels. Then, the tools produces another r script, and execute R which reloads at each time all the neeeded packages. Is there a way to avoid this thing and to load packages only once? Than you in advance *Patrick PALMIER** *** Le 07/11/2011 15:20, > Duncan Murdoch (par Internet) a ?crit : > On 07/11/2011 5:49 AM, PALMIER Patrick (Responsable de groupe) - CETE > NP/TM/ST wrote: >> Hello, >> >> >> I use R in batch mode. Each time, I execute a script, R is loading each >> packages I need in my script. 
That's Ok >> But, I had to execute many scripts , and each time R is re-loading the >> corresponding packages, which take to much time >> >> Is it possible ask R to load the packages only once, and stay in memory >> in background for further scripts, which would avoid to load the >> packages in each script, or if you have another solution that need to >> only load packages once in the first scripts, so that further scripts do >> not need to load these packages too. > > Write one script that has a sequence of calls to source() to run the > other scripts. > > You'll need to be careful that unintentional leftover objects and > settings from one script don't affect the others; you may also want to > use the "echo=TRUE" option when you source, so you see the commands as > they are executed. > > Duncan Murdoch > > From adele_thompson at cargill.com Mon Nov 7 16:58:25 2011 From: adele_thompson at cargill.com (Schatzi) Date: Mon, 7 Nov 2011 07:58:25 -0800 (PST) Subject: [R] Lower bounds on selfStart function not working In-Reply-To: <1320679798626-3999231.post@n4.nabble.com> References: <1320679798626-3999231.post@n4.nabble.com> Message-ID: <1320681505588-4001986.post@n4.nabble.com> I tested the "optim" function and that is returning non-negative parameter values (meaning they are bound by the lower limits), but I think those are the starting estimates for the nlsList model which is then finding negative values for the solution. ----- In theory, practice and theory are the same. In practice, they are not - Albert Einstein -- View this message in context: http://r.789695.n4.nabble.com/Lower-bounds-on-selfStart-function-not-working-tp3999231p4001986.html Sent from the R help mailing list archive at Nabble.com. 
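If that diagnosis is right (the bounded optim() result only supplies starting values and the later fit is unconstrained), one option is to put the bounds on the fit itself. This is a minimal sketch for a single group with plain nls(), assuming data simulated from the same curve. algorithm = "port" is the nls() algorithm that accepts lower and upper; whether bounds can be passed through nlsList() is not checked here.

```r
# One simulated group from the curve in the original post.
set.seed(1)
x <- seq(0, 18, 0.25)
y <- (1.75 / (1 + exp(4 * 0.2 / 1.75 * (2 - x) + 2))) * (0.95 + runif(length(x)) / 10)
d <- data.frame(x = x, y = y)

# Bounded least squares: the "port" algorithm keeps every parameter
# inside [lower, upper] during the fit, not only at the start.
fit <- nls(y ~ b + A / (1 + exp(4 * mu / A * (l - x) + 2)),
           data      = d,
           start     = c(A = 1.5, mu = 0.25, l = 1, b = 0.01),
           algorithm = "port",
           lower     = c(A = 1e-4, mu = 0, l = 0, b = 0),
           control   = nls.control(warnOnly = TRUE))  # warn instead of stop on non-convergence
coef(fit)
```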
From sallysims at earthlink.net Mon Nov 7 16:58:25 2011 From: sallysims at earthlink.net (Sally Ann Sims) Date: Mon, 7 Nov 2011 10:58:25 -0500 Subject: [R] logistric regression: model revision Message-ID: <644C0EDF0C01413DA7203BEC4D21286A@SASlptp> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From AZiem at us.ci.org Mon Nov 7 16:59:05 2011 From: AZiem at us.ci.org (Andrew Ziem ) Date: Mon, 7 Nov 2011 08:59:05 -0700 Subject: [R] R in batch mode packages loading question In-Reply-To: <4EB7B7A2.3030209@developpement-durable.gouv.fr> References: <4EB7B7A2.3030209@developpement-durable.gouv.fr> Message-ID: <50486F3885905241A1890719DAFDD3447FE90A4F4D@gmc0050.ci.org> Try the fork() function in the package multicore (if your system supports it) Andrew -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of PALMIER Patrick (Responsable de groupe) - CETE NP/TM/ST Sent: Monday, November 07, 2011 3:49 AM To: r-help at r-project.org Subject: [R] R in batch mode packages loading question Hello, I use R in batch mode. Each time, I execute a script, R is loading each packages I need in my script. That's Ok But, I had to execute many scripts , and each time R is re-loading the corresponding packages, which take to much time Is it possible ask R to load the packages only once, and stay in memory in background for further scripts, which would avoid to load the packages in each script, or if you have another solution that need to only load packages once in the first scripts, so that further scripts do not need to load these packages too. 
Thanks in advance -- Patrick PALMIER, Centre d'Études Techniques de l'Équipement Nord - Picardie, Département Transport Mobilités, Responsable du groupe Systèmes de Transports, 2, rue de Bruxelles, BP 275 59019 Lille cedex FRANCE Tél: +33 (0) 3 20 49 60 70 Fax: +33 (0) 3 20 49 63 69 From dwinsemius at comcast.net Mon Nov 7 17:05:46 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Mon, 7 Nov 2011 11:05:46 -0500 Subject: [R] logistric regression: model revision In-Reply-To: <644C0EDF0C01413DA7203BEC4D21286A@SASlptp> References: <644C0EDF0C01413DA7203BEC4D21286A@SASlptp> Message-ID: <01D685B1-E8E6-484D-8643-499F4EBE65B3@comcast.net> On Nov 7, 2011, at 10:58 AM, Sally Ann Sims wrote: > Hello, > > I am working on fitting a logistic regression model to my dataset. > I removed the squared term in the second version of the model, but > my model output is exactly the same. > > Model version 1: GRP_GLM<-glm(HB_NHB~elev > +costdis1^2,data=glm_1,family=binomial(link=logit)) > summary(GRP_GLM) > > > Model version 2: QM_1<-glm(HB_NHB~elev > +costdis1,data=glm_2,family=binomial(link=logit)) > summary(QM_1) > > > The call in version 2 has changed: > Call: > glm(formula = HB_NHB ~ elev + costdis1, family = binomial(link = > logit), > data = glm_2) > But I'm getting the exact same results as I did in the model where > costdis1 is squared. Are you sure that you got output that correctly modeled the costdis1^2? I would have guessed that you would have needed to use: GRP_GLM<-glm(HB_NHB~elev+I(costdis1^2), data=glm_1, family=binomial(link=logit)) ?I The "^" in model formulas is for composing interactions. ?formula > > Any ideas what I might do to correct this? Thank you. > > Sally > [[alternative HTML version deleted]] And please post in plain text.
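The effect of "^" inside a formula can be checked on simulated data. This is a minimal sketch; the data frame d and its values are invented, and only the column names follow the post. It compares the plain term, the bare "^2" term and the I() version.

```r
# Simulated stand-in for the poster's data.
set.seed(42)
d <- data.frame(elev = rnorm(50), costdis1 = runif(50, 1, 10))
d$HB_NHB <- rbinom(50, 1, plogis(0.5 * d$elev - 0.02 * d$costdis1^2 + 0.5))

m_plain <- glm(HB_NHB ~ elev + costdis1,      data = d, family = binomial(link = logit))
m_caret <- glm(HB_NHB ~ elev + costdis1^2,    data = d, family = binomial(link = logit))
m_sq    <- glm(HB_NHB ~ elev + I(costdis1^2), data = d, family = binomial(link = logit))

# In a formula, costdis1^2 is "costdis1 crossed with itself", which is
# just costdis1, so the first two fits are the same model.
all.equal(coef(m_plain), coef(m_caret))   # TRUE
names(coef(m_caret))                      # "(Intercept)" "elev" "costdis1"

# I() protects the arithmetic, so this fit really uses the squared values.
names(coef(m_sq))                         # "(Intercept)" "elev" "I(costdis1^2)"
```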
-- David Winsemius, MD West Hartford, CT From ripley at stats.ox.ac.uk Mon Nov 7 17:08:17 2011 From: ripley at stats.ox.ac.uk (Prof Brian Ripley) Date: Mon, 7 Nov 2011 16:08:17 +0000 (GMT) Subject: [R] R in batch mode packages loading question In-Reply-To: <4EB7FBE8.8030608@developpement-durable.gouv.fr> References: <4EB7B7A2.3030209@developpement-durable.gouv.fr> <4EB7E912.6090902@gmail.com> <4EB7FBE8.8030608@developpement-durable.gouv.fr> Message-ID: On Mon, 7 Nov 2011, PALMIER Patrick (Responsable de groupe) - CETE NP/TM/ST wrote: > Thank you for your response, but this is not exactly what I need > > We are working on a tool that automatically generate R scripts adapted > to our surveys databases. > When we want to do a table, we select interactively fields, associated > labels, functions for an automatic crosstable for example > Then, the tool executes R in batch mode with the script produced by the tool > The "many scripts" correspond to the many crosstables people want to do. > It is not possible produce these different scripts and execute all of > them in a unique script with source. > > For example, people want to do a crosstable, the tool produce a r script > that generate the result. Then, the user want to do the same crosstable > but on a subset or with different labels. Then, the tools produces > another r script, and execute R which reloads at each time all the > neeeded packages. > > Is there a way to avoid this thing and to load packages only once? You have to load them once per R session. So you could organize your work to use a single session, as Duncan suggested. And that session could wait for input (using something like Rserve). If your packages are taking too long to load, fix the packages. Well-written packages load in milliseconds: after all R itself manages to load several large packages in 100ms or so. 
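A single long-lived session that runs each generated script in its own environment could look roughly like this. It is a minimal sketch: run_scripts and the two temporary script files are invented for illustration, and how the session receives new scripts (Rserve or anything else) is left out.

```r
# Load packages once, then run each script in a fresh environment so that
# leftover objects from one script cannot leak into the next.
run_scripts <- function(script_paths, packages = character()) {
  for (pkg in packages) library(pkg, character.only = TRUE)
  results <- list()
  for (path in script_paths) {
    env <- new.env(parent = globalenv())
    source(path, local = env)
    results[[basename(path)]] <- as.list(env)   # objects the script created
  }
  results
}

# Two throwaway scripts standing in for the generated crosstable scripts.
script1 <- tempfile(fileext = ".R")
writeLines("tab <- table(c('a', 'b', 'a'))", script1)
script2 <- tempfile(fileext = ".R")
writeLines("n <- 3", script2)

out <- run_scripts(c(script1, script2), packages = "stats")
out[[basename(script2)]]$n   # 3
```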
> > > Than you in advance > > *Patrick PALMIER** > *** > > > Le 07/11/2011 15:20, > Duncan Murdoch (par Internet) a ?crit : >> On 07/11/2011 5:49 AM, PALMIER Patrick (Responsable de groupe) - CETE >> NP/TM/ST wrote: >>> Hello, >>> >>> >>> I use R in batch mode. Each time, I execute a script, R is loading each >>> packages I need in my script. That's Ok >>> But, I had to execute many scripts , and each time R is re-loading the >>> corresponding packages, which take to much time >>> >>> Is it possible ask R to load the packages only once, and stay in memory >>> in background for further scripts, which would avoid to load the >>> packages in each script, or if you have another solution that need to >>> only load packages once in the first scripts, so that further scripts do >>> not need to load these packages too. >> >> Write one script that has a sequence of calls to source() to run the other >> scripts. >> >> You'll need to be careful that unintentional leftover objects and settings >> from one script don't affect the others; you may also want to use the >> "echo=TRUE" option when you source, so you see the commands as they are >> executed. >> >> Duncan Murdoch >> >> > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 From dasolexa at hotmail.com Mon Nov 7 17:17:01 2011 From: dasolexa at hotmail.com (David A.) Date: Mon, 7 Nov 2011 17:17:01 +0100 Subject: [R] R-bash beginner Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From jdnewmil at dcn.davis.ca.us Mon Nov 7 17:29:43 2011 From: jdnewmil at dcn.davis.ca.us (Jeff Newmiller) Date: Mon, 07 Nov 2011 08:29:43 -0800 Subject: [R] R-bash beginneR In-Reply-To: References: Message-ID: <009fb483-ff67-4397-9c0e-1e1470b72420@email.android.com> This is not an R question. Use the print function in R and use backticks in bash. --------------------------------------------------------------------------- Jeff Newmiller The ..... ..... Go Live... DCN: Basics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/Batteries O.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k --------------------------------------------------------------------------- Sent from my phone. Please excuse my brevity. "David A." wrote: > >Hi, > >I am trying to run some R commands into my bash scripts and want to use >shell variables in the R commands and store the output of R objects >into shell variables for further usage in downstream analyses. So far I >have managed the first, but how to get values out of R script? I am >using "here documents" (as a starter, maybe something else is simpler >or better; suggestions greatly appreciated). > >A basic random example: > >#!/bin/sh >MYVAR=2 >R --slave --quiet --no-save <x<-5 > >z<-x/$MYVAR >zz<-x*$MYVAR >EEE > > >How get the values of z and zz into shell variables? > > >Thanks > >D > > [[alternative HTML version deleted]] > >______________________________________________ >R-help at r-project.org mailing list >https://stat.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide >http://www.R-project.org/posting-guide.html >and provide commented, minimal, self-contained, reproducible code. 
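On the R side, the suggestion amounts to writing the values to standard output in a form the shell can split. This is a minimal sketch; emit_values and the script name calc.R are hypothetical.

```r
# Print the two results on one line, separated by a space.
emit_values <- function(myvar, x = 5) {
  cat(x / myvar, x * myvar, "\n")
}

emit_values(2)   # prints: 2.5 10

# In a script run as:  Rscript calc.R 2   (calc.R is a hypothetical file name)
# args <- commandArgs(trailingOnly = TRUE)
# emit_values(as.numeric(args[1]))
```

With those last two lines uncommented in calc.R, the shell side can capture the line with OUT=$(Rscript calc.R "$MYVAR") or the equivalent backtick form, and then split OUT into two variables.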
From pjmiller_57 at yahoo.com Mon Nov 7 17:30:10 2011 From: pjmiller_57 at yahoo.com (Paul Miller) Date: Mon, 7 Nov 2011 08:30:10 -0800 (PST) Subject: [R] Problem working with dates In-Reply-To: Message-ID: <1320683410.1343.YahooMailClassic@web161604.mail.bf1.yahoo.com> Hi Michael and David, Thank you both for your reply to my question. Problem solved. I'm finding that my level of success with R is a little uneven thus far. I'm sometimes surprised by the things I can do, but then am even more surprised by the simple things I struggle with. Appreciate your help. Paul From jdnewmil at dcn.davis.ca.us Mon Nov 7 17:35:32 2011 From: jdnewmil at dcn.davis.ca.us (Jeff Newmiller) Date: Mon, 07 Nov 2011 08:35:32 -0800 Subject: [R] function that load variables In-Reply-To: References: <1320673204.93299.YahooMailNeo@web120104.mail.ne1.yahoo.com> Message-ID: <64744638-ecbe-4693-9352-bdfc3a9fe61b@email.android.com> A much better solution is to make separate functions for each object you import, and return an object from the function to be assigned in the calling environment. This will be far less confusing to read later. --------------------------------------------------------------------------- Jeff Newmiller The ..... ..... Go Live... DCN: Basics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/Batteries O.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k --------------------------------------------------------------------------- Sent from my phone. Please excuse my brevity. "R. Michael Weylandt" wrote: >Perhaps you mean to use > >load(...., envir = .GlobalEnv) > >Currently you load up the variables in the function environment but >then they are thrown away when the function ends. > >Michael > >On Mon, Nov 7, 2011 at 8:40 AM, Alaios wrote: >> Dear all, >> I have saved few variable names into local files, >> I wanted to make a function that load this files and "generates" the >variable names into my working environment. 
I have tried to do that as >a function but my problem is >> >> that this function does not return the variable names >> >> >> load_data<-function(path,Reload=FALSE){ >> >> if (Reload==TRUE){ >> >> print("Loading results") >> >> # FirstSet >> load(file=paste(path,'first',sep="")) >> first<-Set >> >> # SecondSet >> load(file=paste(path,'second',sep="")) >> second<-Set >> >> ..................(part omitted here) >> >> save( first, second,....(part omitted here)..., >file=paste(path,'Results',sep="")) >> } >> >> return (load(file=paste(path,'Results',sep=""))) >> >> } >> >> so my idea was the following: >> I call the function and it returns the values first,second,... loaded >into the current working space. >> If I want to refresh them, something has changed to the firstDataSet >I set the function's variable Reload=True and thus all the data are >refreshed. >> The problem is now that the return statement I have at the end of the >function does not return the loaded variable to the working environment >but only the status of the load command. >> >> Do you know how I can change that so my function returns also Loaded >Variable names to the environment? >> >> Alex >> [[alternative HTML version deleted]] >> >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> >> > >______________________________________________ >R-help at r-project.org mailing list >https://stat.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide >http://www.R-project.org/posting-guide.html >and provide commented, minimal, self-contained, reproducible code.
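Returning the loaded objects from the function, as suggested at the top of this reply, might be sketched like this. The helper load_as_list and the temporary file are invented for illustration. The idea is to load into a private environment and hand the contents back as a list, so the caller decides what gets assigned where.

```r
# Load an .RData file into a private environment and return its contents.
load_as_list <- function(file) {
  env <- new.env()
  load(file, envir = env)
  as.list(env)
}

# Demonstration with a throwaway file.
first  <- 1:3
second <- c("a", "b")
rdata  <- tempfile(fileext = ".RData")
save(first, second, file = rdata)

res <- load_as_list(rdata)
sort(names(res))   # "first" "second"
res$first          # 1 2 3
```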
From gunter.berton at gene.com Mon Nov 7 17:46:56 2011 From: gunter.berton at gene.com (Bert Gunter) Date: Mon, 7 Nov 2011 08:46:56 -0800 Subject: [R] function that load variables In-Reply-To: <64744638-ecbe-4693-9352-bdfc3a9fe61b@email.android.com> References: <1320673204.93299.YahooMailNeo@web120104.mail.ne1.yahoo.com> <64744638-ecbe-4693-9352-bdfc3a9fe61b@email.android.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From michele.donato at wayne.edu Mon Nov 7 17:49:43 2011 From: michele.donato at wayne.edu (michele donato) Date: Mon, 07 Nov 2011 11:49:43 -0500 Subject: [R] Dunif and Punif Message-ID: <4EB80C27.1010902@wayne.edu> Hi, I am trying to use dunif and punif; however, I have two problems: if I do dunif(1:10, min=1, max=10) I get 10 values, which summed give me 1.1111 I understand that the probability is computed as f(x) = 1 / (max-min) but in this case it looks wrong: I have 10 values, each one equiprobable, and the probability for each one should be 0.1 and not 0.11111 (which is, consistently with the definition, 1/9) It looks like one of the extremes is not considered in the computation of the probability, but then it's assigned a probability anyway. Similar problem with punif. if I do punif(1, min=1, max=10) I get 0 as result, as if the lower extreme is not considered, which is not consistent with the description where min <= x <= max If the lower extreme is not considered because cdf(x) = p(X<x) {not p(X<=x)} the problem stands in p(X<11) which should be the sum of everything. ( P(1) + P(2) + ... + P(10) ) What is happening here? From: michael.weylandt at gmail.com (R. Michael Weylandt) Subject: [R] Dunif and Punif References: <4EB80C27.1010902@wayne.edu> Message-ID: In short, the unif() distribution corresponds to the continuous uniform distribution, not the discrete. Longer: dDIST() doesn't give a pmf so summing it isn't what you are looking for: it gives a pdf. For punif() consider P\{X <= 1\} when X is distributed on [1, 10]. Clearly this has probability zero because it can only occur for one out of uncountably many values -- though this notion should be made more precise using a little bit of measure theory.
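The difference between a density and a probability mass function can be checked numerically. This is a minimal sketch; the variable names dens, area and pmf are only for illustration.

```r
# dunif() returns the height of the density, 1/(10 - 1), at each point.
dens <- dunif(1:10, min = 1, max = 10)
sum(dens)                      # 10/9: a sum of heights, not a probability

# The density integrates to 1 over [1, 10].
area <- integrate(dunif, lower = 1, upper = 10, min = 1, max = 10)$value
area                           # 1 (up to numerical error)

# punif() is the cumulative probability P(X <= q) of the continuous variable.
punif(c(1, 5.5, 10), min = 1, max = 10)   # 0.0 0.5 1.0

# A discrete uniform on 1..10 has to be written out by hand.
pmf <- rep(1 / 10, 10)
sum(pmf)                       # 1
cumsum(pmf)[10]                # 1, i.e. P(X <= 10)
```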
Michael On Mon, Nov 7, 2011 at 11:49 AM, michele donato wrote: > Hi, > I am trying to use dunif and runif > however, I have two problems: > if I do > > dunif(1:10, min=1, max=10) > > I get 10 values, which summed give me 1.1111 > I understand that the probability is computed as f(x) = 1 / (max-min) > but in this case it looks wrong: I have 10 values, each one > equiprobable, and the probability for each one should be 0.1 and not > 0.11111 (which is, consistently with the definition, 1/9) > > It looks like one of the extremes is not considered in the computation > of the probability, but then it's assigned a probability anyway. > > Similar problem with punif. > > if I do > punif(1, min=1, max=10) > I get 0 as result, as if the lower extreme is not considered, which is > not consistent with the description where min <= x <= max > If the lower extreme is not considered because cdf(x) = p(X not p(X<=x)} the problem stands in p(X<11) which should be the sum of > everything. ( P(1) + P(2) + ... + P(10) ) > > What is happening here? > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From michael.weylandt at gmail.com Mon Nov 7 18:09:23 2011 From: michael.weylandt at gmail.com (R. Michael Weylandt) Date: Mon, 7 Nov 2011 12:09:23 -0500 Subject: [R] Dunif and Punif In-Reply-To: References: <4EB80C27.1010902@wayne.edu> Message-ID: A point of clarification: dDIST() sometimes gives a pmf, e.g., dpois(). But dunif() is a pdf. Sorry for the typo. Michael On Mon, Nov 7, 2011 at 12:08 PM, R. Michael Weylandt wrote: > In short, the unif() distribution corresponds to the continuous > uniform distribution, not the discrete. > > Longer: dDIST() doesn't give a pmf so summing it isn't what you are > looking for: it gives a pdf. 
For punif() consider P\{X <= 1\} when X > is distributed on [1, 10]. Clearly this has probability zero because > it can only occur for one out of uncountably many values -- though > this notion should be made more precise using a little bit of measure > theory. > > Michael > > > On Mon, Nov 7, 2011 at 11:49 AM, michele donato > wrote: >> Hi, >> I am trying to use dunif and runif >> however, I have two problems: >> if I do >> >> dunif(1:10, min=1, max=10) >> >> I get 10 values, which summed give me 1.1111 >> I understand that the probability is computed as f(x) = 1 / (max-min) >> but in this case it looks wrong: I have 10 values, each one >> equiprobable, and the probability for each one should be 0.1 and not >> 0.11111 (which is, consistently with the definition, 1/9) >> >> It looks like one of the extremes is not considered in the computation >> of the probability, but then it's assigned a probability anyway. >> >> Similar problem with punif. >> >> if I do >> punif(1, min=1, max=10) >> I get 0 as result, as if the lower extreme is not considered, which is >> not consistent with the description where min <= x <= max >> If the lower extreme is not considered because cdf(x) = p(X> not p(X<=x)} the problem stands in p(X<11) which should be the sum of >> everything. ( P(1) + P(2) + ... + P(10) ) >> >> What is happening here? >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > From SABARIC at auburn.edu Mon Nov 7 18:11:11 2011 From: SABARIC at auburn.edu (Richard Saba) Date: Mon, 7 Nov 2011 17:11:11 +0000 Subject: [R] vars impulse responce function output Message-ID: <7EBE32C544A6BD4CB055BEB5E8598ABB996131@exmb1.auburn.edu> An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From peter.langfelder at gmail.com Mon Nov 7 18:13:05 2011 From: peter.langfelder at gmail.com (Peter Langfelder) Date: Mon, 7 Nov 2011 09:13:05 -0800 Subject: [R] Dunif and Punif In-Reply-To: <4EB80C27.1010902@wayne.edu> References: <4EB80C27.1010902@wayne.edu> Message-ID: On Mon, Nov 7, 2011 at 8:49 AM, michele donato wrote: > Hi, > I am trying to use dunif and runif > however, I have two problems: > if I do > > dunif(1:10, min=1, max=10) > > I get 10 values, which summed give me 1.1111 > I understand that the probability is computed as f(x) = 1 / (max-min) > but in this case it looks wrong: I have 10 values, each one > equiprobable, and the probability for each one should be 0.1 and not > 0.11111 (which is, consistently with the definition, 1/9) > > It looks like one of the extremes is not considered in the computation > of the probability, but then it's assigned a probability anyway. > > Similar problem with punif. > > if I do > punif(1, min=1, max=10) > I get 0 as result, as if the lower extreme is not considered, which is > not consistent with the description where min <= x <= max > If the lower extreme is not considered because cdf(x) = p(X not p(X<=x)} the problem stands in p(X<11) which should be the sum of > everything. ( P(1) + P(2) + ... + P(10) ) > > What is happening here? The uniform distribution is continuous. Your interval has length 9 (10-1 = 9), so the density 1/9. Multiplied by 10 it gives you your answer. Same for the cumulative probability distribution (punif) - it is zero at x=1 because that's where your interval starts. Peter From horseatingweeds at gmail.com Mon Nov 7 17:48:11 2011 From: horseatingweeds at gmail.com (horseatingweeds) Date: Mon, 7 Nov 2011 08:48:11 -0800 (PST) Subject: [R] Error: could not find function "MLearn" In-Reply-To: References: <1320676239513-3998805.post@n4.nabble.com> Message-ID: <1320684491532-4005626.post@n4.nabble.com> Thanks Dave. 
I thought I had run library(MLInterface), but I looked back, and it was library(MLearn) I was running. Silly me... It's chugging away now. -- View this message in context: http://r.789695.n4.nabble.com/Error-could-not-find-function-MLearn-tp3998805p4005626.html Sent from the R help mailing list archive at Nabble.com. From silvano at uel.br Mon Nov 7 18:57:51 2011 From: silvano at uel.br (Silvano) Date: Mon, 7 Nov 2011 15:57:51 -0200 Subject: [R] Problema with Excel files Message-ID: <816935807AA343BD9358B44E20FCB786@ccePC> Hi, I have a Excel file with three spreadsheets: PlanA, PlanB and PlanC. I'm trying to read the three spreadsheets and then adding them together. But, when I try read the PlanA there is an error message: rm(list=ls()) setwd('C:/Test/Dados/Teste') require(RODBC) Arquivo = odbcConnectExcel('T070206_1347.xls') (Geral = sqlFetch(Arquivo, 'PlanA')) (Lactacao = sqlFetch(Arquivo, 'PlanB')) Erro em as.POSIXlt.character(x, tz, ...) : character string is not in a standard unambiguous format I don't know what's happening. Can anyone help me? Thanks a lot, -------------------------------------- Silvano Cesar da Costa Departamento de Estat?stica Universidade Estadual de Londrina Fone: 3371-4346 From C.G.G.Aitken at ed.ac.uk Mon Nov 7 18:59:39 2011 From: C.G.G.Aitken at ed.ac.uk (Colin Aitken) Date: Mon, 07 Nov 2011 17:59:39 +0000 Subject: [R] Estimate of intercept in loglinear model Message-ID: <4EB81C8B.9020809@ed.ac.uk> How does R estimate the intercept term \alpha in a loglinear model with Poisson model and log link for a contingency table of counts? (E.g., for a 2-by-2 table {n_{ij}) with \log(\mu) = \alpha + \beta_{i} + \gamma_{j}) I fitted such a model and checked the calculations by hand. I agreed with the main effect terms but not the intercept. Interestingly, I agreed with the fitted value provided by R for the first cell {11} in the table. 
If my estimate of intercept = \hat{\alpha}, my estimate of the fitted value for the first cell = exp(\hat{\alpha}) but R seems to be doing something else for the estimate of the intercept. However if I check the R $fitted_value for n_{11} it agrees with my exp(\hat{\alpha}). I would expect that with the corner-point parametrization, the estimates for a 2 x 2 table would correspond to expected frequencies exp(\alpha), exp(\alpha + \beta), exp(\alpha + \gamma), exp(\alpha + \beta + \gamma). The MLE of \alpha appears to be log(n_{.1} * n_{1.}/n_{..}), but this is not equal to the intercept given by R in the example I tried. With thanks in anticipation, Colin Aitken -- Professor Colin Aitken, Professor of Forensic Statistics, School of Mathematics, King?s Buildings, University of Edinburgh, Mayfield Road, Edinburgh, EH9 3JZ. Tel: 0131 650 4877 E-mail: c.g.g.aitken at ed.ac.uk Fax : 0131 650 6553 http://www.maths.ed.ac.uk/~cgga The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From dwinsemius at comcast.net Mon Nov 7 19:09:01 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Mon, 7 Nov 2011 13:09:01 -0500 Subject: [R] Problema with Excel files In-Reply-To: <816935807AA343BD9358B44E20FCB786@ccePC> References: <816935807AA343BD9358B44E20FCB786@ccePC> Message-ID: <59D752BC-F9A0-49AB-8F59-F8611F5F7115@comcast.net> On Nov 7, 2011, at 12:57 PM, Silvano wrote: > Hi, > > I have a Excel file with three spreadsheets: PlanA, PlanB and PlanC. > I'm trying to read the three spreadsheets and then adding them > together. > But, when I try read the PlanA there is an error message: > > > rm(list=ls()) > setwd('C:/Test/Dados/Teste') > require(RODBC) > Arquivo = odbcConnectExcel('T070206_1347.xls') > (Geral = sqlFetch(Arquivo, 'PlanA')) > (Lactacao = sqlFetch(Arquivo, 'PlanB')) > > Erro em as.POSIXlt.character(x, tz, ...) : > character string is not in a standard unambiguous format > > > I don't know what's happening. 
Can anyone help me? It's a guess without the data, but have you got a date column in the spreadsheet with a non-standard format? If so, can you change it to YYYY-MM-DD in the format/Date/Custom (or some such) panel? > > Thanks a lot, > > -------------------------------------- > Silvano Cesar da Costa > Departamento de Estat?stica > Universidade Estadual de Londrina > Fone: 3371-4346 > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT From dwinsemius at comcast.net Mon Nov 7 19:11:54 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Mon, 7 Nov 2011 13:11:54 -0500 Subject: [R] Estimate of intercept in loglinear model In-Reply-To: <4EB81C8B.9020809@ed.ac.uk> References: <4EB81C8B.9020809@ed.ac.uk> Message-ID: On Nov 7, 2011, at 12:59 PM, Colin Aitken wrote: > How does R estimate the intercept term \alpha in a loglinear > model with Poisson model and log link for a contingency table of > counts? > > (E.g., for a 2-by-2 table {n_{ij}) with \log(\mu) = \alpha + > \beta_{i} + \gamma_{j}) > > I fitted such a model and checked the calculations by hand. I > agreed with the main effect terms but not the intercept. > Interestingly, I agreed with the fitted value provided by R for the > first cell {11} in the table. > > If my estimate of intercept = \hat{\alpha}, my estimate of the > fitted value for the first cell = exp(\hat{\alpha}) but R seems to > be doing something else for the estimate of the intercept. > > However if I check the R $fitted_value for n_{11} it agrees with my > exp(\hat{\alpha}). 
> > I would expect that with the corner-point parametrization, the > estimates for a 2 x 2 table would correspond to expected frequencies > exp(\alpha), exp(\alpha + \beta), exp(\alpha + \gamma), exp(\alpha + > \beta + \gamma). The MLE of \alpha appears to be log(n_{.1} * n_{1.}/ > n_{..}), but this is not equal to the intercept given by R in the > example I tried. > > With thanks in anticipation, > > Colin Aitken > > > -- > Professor Colin Aitken, > Professor of Forensic Statistics, Do you suppose you could provide a data-corpse for us to dissect? Noting the tag line for every posting .... > and provide commented, minimal, self-contained, reproducible code. -- David Winsemius, MD West Hartford, CT From djmuser at gmail.com Mon Nov 7 19:30:39 2011 From: djmuser at gmail.com (Dennis Murphy) Date: Mon, 7 Nov 2011 10:30:39 -0800 Subject: [R] logistric regression: model revision In-Reply-To: <644C0EDF0C01413DA7203BEC4D21286A@SASlptp> References: <644C0EDF0C01413DA7203BEC4D21286A@SASlptp> Message-ID: Since you didn't provide a reproducible example, here are a couple of possibilities to check, but I have utterly no idea if they're applicable to your problem or not: * does costdis1 consist of 0's and 1's? * is costdis1 a factor? In the first model, you treat costdis1 as a pure quadratic and in the second model, it is a linear term. The two models are not nested. Modeling a term as a pure quadratic is a very strong assumption - the more usual practice is to fit both a linear and quadratic term in costdis1 to allow more flexibility in the fitted surface, but that would require costdis1 to be numeric. HTH, Dennis On Mon, Nov 7, 2011 at 7:58 AM, Sally Ann Sims wrote: > Hello, > > I am working on fitting a logistic regression model to my dataset. ?I removed the squared term in the second version of the model, but my model output is exactly the same. 
> > Model version 1: ?GRP_GLM<-glm(HB_NHB~elev+costdis1^2,data=glm_1,family=binomial(link=logit)) > summary(GRP_GLM) > > > Model version 2: ?QM_1<-glm(HB_NHB~elev+costdis1,data=glm_2,family=binomial(link=logit)) > summary(QM_1) > > > The call in version 2 has changed: > Call: > glm(formula = HB_NHB ~ elev + costdis1, family = binomial(link = logit), > ? ?data = glm_2) > But I?m getting the exact same results as I did in the model where costdis1 is squared. > > Any ideas what I might do to correct this? ?Thank you. > > Sally > ? ? ? ?[[alternative HTML version deleted]] > > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > From tlumley at uw.edu Mon Nov 7 19:48:05 2011 From: tlumley at uw.edu (Thomas Lumley) Date: Tue, 8 Nov 2011 07:48:05 +1300 Subject: [R] help with formula for clogit In-Reply-To: <1320679191169-3998967.post@n4.nabble.com> References: <1320679191169-3998967.post@n4.nabble.com> Message-ID: On Tue, Nov 8, 2011 at 4:19 AM, 1Rnwb wrote: > I would like to know if clogit function can be used as below > clogit(group~., data=dataframe) > Not usefully. That syntax does not specify a strata() term, which is why the computation is very slow and probably not what you intended. -thomas -- Thomas Lumley Professor of Biostatistics University of Auckland From amitrhelp at yahoo.co.uk Mon Nov 7 19:49:06 2011 From: amitrhelp at yahoo.co.uk (Amit Patel) Date: Mon, 7 Nov 2011 18:49:06 +0000 (GMT) Subject: [R] repeating a loop Message-ID: <1320691746.45155.YahooMailNeo@web28415.mail.ukl.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From michael.weylandt at gmail.com Mon Nov 7 20:03:16 2011 From: michael.weylandt at gmail.com (R. 
Michael Weylandt) Date: Mon, 7 Nov 2011 14:03:16 -0500 Subject: [R] repeating a loop In-Reply-To: <1320691746.45155.YahooMailNeo@web28415.mail.ukl.yahoo.com> References: <1320691746.45155.YahooMailNeo@web28415.mail.ukl.yahoo.com> Message-ID: Could you just initialize RepeatPlot = "y" and then wrap your whole script in while(RepeatPlot == "y") { ## YOUR STUFF ENDING WITH A POSSIBLE MODIFICATION OF RepeatPlot }? Michael On Mon, Nov 7, 2011 at 1:49 PM, Amit Patel wrote: > Hi > > I have implented boxplots in my script to create box plots > > BoxplotsCheck <- readline(prompt = "Would you like to create boxplots for any Feature? (y/n):") > ? if (BoxplotsCheck? == "y"){ > ??? BoxplotsFeature <- readline(prompt = "Which Feature would you like to create a Boxplot for?:") > ??? BoxplotsFeature <- as.numeric(BoxplotsFeature) > ??? BoxplotsData <- as.numeric(which(PCIList == BoxplotsFeature)) > ??? BoxplotsData <- TotalIntensityList[BoxplotsData,] > ??? BoxplotsHeading <- paste("Tukey boxplot (including outliers) for PCI ", BoxplotsFeature , sep = "") > ??? bplot(as.numeric(BoxplotsData), GroupingList, style = "tukey", outlier = TRUE, > ? col="red", main = BoxplotsHeading, > ??? xlab = "Groups", ylab = "Normalised Intensity", plot = TRUE) > ??? BoxplotsFilename <- paste(BoxplotsFeature, "_Boxplot", sep = "") > ??? savePlot(filename = "BoxplotsFilename", type = "jpeg", device = dev.cur(), restoreConsole = TRUE) > > > RepeatPlot <- readline(prompt = "Would you like to create another boxplot for any Feature? (y/n):") > } > > > If the user inputs "y" for BoxplotsCheck then a boxplot is saved and creatyed based on a user choice. > > I want to include the option to do another boxplot if needed (i.e. another user prompt after saveplot returning another y or n for the variable "RepeatPlot "? ) how can i do this. I guess some kind of loop continuing whileRepeatPlot == y > > Can anyone help > > ? ? ? 
?[[alternative HTML version deleted]] > > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > From jfox at mcmaster.ca Mon Nov 7 20:05:27 2011 From: jfox at mcmaster.ca (John Fox) Date: Mon, 7 Nov 2011 14:05:27 -0500 Subject: [R] [R-pkgs] new version 2.0-0 of the sem package Message-ID: <008001cc9d80$368e6c60$a3ab4520$@mcmaster.ca> Dear R users, Jarrett Byrnes and I would like to announce version 2.0-0 of the sem package for fitting observed- and latent-variable structural equation models. This is a general reworking of the original sem package (which is still available on R-Forge as package sem1). Some highlights of sem 2.0-0 include: o More convenient and compact model specification, including the default automatic generation of error variances for endogenous variables and more compact specification of error covariances. In the near future, we anticipate releasing an update that permits equation-style specification of structural equations. o The ability to update model specifications. o Soft-coded objective functions ("fit" functions in SEM jargon) and optimizers. Two objective functions are provided, for multinormal full-information maximum likelihood and for generalized least squares; and three optimizers are provided, based on the standard R nlm(), optim(), and nlminb() optimizers. The user can add objective functions and optimizers. o Analytic standard errors are provided by default for the FIML estimator (standard errors based on the numeric Hessian are now optional), and robust standard errors and tests are optionally available. o The ability to fit a model to a data frame, as a preferred alternative to computing a covariance or moment matrix in an intermediate step. 
The original data are required to obtain robust standard errors and tests, and are optional otherwise. o Correctly computed "modification indices" (score tests for fixed parameters). o Enhanced output, including R^2s for endogenous variables, additional information criteria, and the computation of indirect effects. o Executable examples in the help pages for the package, along with (as before) non-executable examples in which model specifications and correlation/covariance/moment matrices appear in the input stream. The new sem package is designed to be upwards compatible with the old one, so that scripts that worked previously should still work. Although we have tested the new sem package, this is a major update and it's possible that bugs will surface. Please let me know if you encounter problems. Best, John -------------------------------- John Fox Senator William McMaster Professor of Social Statistics Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox _______________________________________________ R-packages mailing list R-packages at r-project.org https://stat.ethz.ch/mailman/listinfo/r-packages From adele_thompson at cargill.com Mon Nov 7 20:16:26 2011 From: adele_thompson at cargill.com (Schatzi) Date: Mon, 7 Nov 2011 11:16:26 -0800 (PST) Subject: [R] Lower bounds on selfStart function not working In-Reply-To: <1320681505588-4001986.post@n4.nabble.com> References: <1320679798626-3999231.post@n4.nabble.com> <1320681505588-4001986.post@n4.nabble.com> Message-ID: <1320693386867-4012805.post@n4.nabble.com> I ran the code again and got an error saying that the "x" was unknown. I don't know why I hadn't seen that error before. Anyway, I made the edits to "func1" so instead of "x", it is "xy$x." 
#function to optimize func1 <- function(value) { A.s <- value[1] mu.s <- value[2] l.s <- value[3] b.s <- value[4] y1<-rep(0,length(xy$x)) # generate vector for predicted y (y1) to evaluate against observed y for(cnt in 1:length(xy$x)){ y1[cnt]<- b.s+A.s/(1+exp(4*mu.s/A.s*(l.s-xy$x[cnt])+2))} #predicting y1 for values of y evl<-sum((xy$y-y1)^2) #sum of squares is function to minimize return(evl)} There is another place where there is an "x" in the selfStart function: SSpowrDplt<-selfStart(~b+A/(1+exp(4*mu/A*(l-x)+2)),initial=powrDpltInit, parameters=c("A","mu","l","b")) I don't know why that is working fine or how it knows that my "x" is that specific one. It seems that I am not fully understanding how this is working. ----- In theory, practice and theory are the same. In practice, they are not - Albert Einstein -- View this message in context: http://r.789695.n4.nabble.com/Lower-bounds-on-selfStart-function-not-working-tp3999231p4012805.html Sent from the R help mailing list archive at Nabble.com. From kelsmith at usgs.gov Mon Nov 7 19:24:07 2011 From: kelsmith at usgs.gov (kelsmith) Date: Mon, 7 Nov 2011 10:24:07 -0800 (PST) Subject: [R] ordination in vegan: what does downweight() do? Message-ID: <1320690247221-4010352.post@n4.nabble.com> Can anyone point me in the right direction of figuring out what downweight() is doing? I am using vegan to perform CCA on diatom assemblage data. I have a lot of rare species, so I want to reduce the influence of rare species in my CCA. I have read that some authors reduce rare species by only including species with an abundance of at least 1% in at least one sample (other authors use 5% as a rule, but this removes at least half my species). If I code it as follows: cca(downweight(diatoms, fraction=5) ~ ., env) It is clearly not removing these species entirely from analysis, as some authors suggest. So I am wondering: what is downweight() doing exactly? 
I assume it is somehow ranking the species and reducing their abundance values based on their rank, but I'm not entirely sure and can't seem to figure out how to look at the code (R novice here). Nor can I find a clear description within the documentation (although I may be looking in all the wrong places). So, my inclination is to remove species that are very rare (max abundance < 1%) prior to the CCA and then use the downweight function (fraction = 5?) in my CCA (as above). This way, I can include most of my species, but overall still reduce the impact of rare species. Any advice is appreciated. Thanks! -- View this message in context: http://r.789695.n4.nabble.com/ordination-in-vegan-what-does-downweight-do-tp4010352p4010352.html Sent from the R help mailing list archive at Nabble.com. From mrberbeco at ucdavis.edu Mon Nov 7 19:11:44 2011 From: mrberbeco at ucdavis.edu (minda berbeco) Date: Mon, 7 Nov 2011 10:11:44 -0800 Subject: [R] Comparing Excel to R for Standard Curves Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From mark_difford at yahoo.co.uk Mon Nov 7 20:04:02 2011 From: mark_difford at yahoo.co.uk (Mark Difford) Date: Mon, 7 Nov 2011 11:04:02 -0800 (PST) Subject: [R] Estimate of intercept in loglinear model In-Reply-To: <4EB81C8B.9020809@ed.ac.uk> References: <4EB81C8B.9020809@ed.ac.uk> Message-ID: <1320692642821-4012346.post@n4.nabble.com> On Nov 07, 2011 at 7:59pm Colin Aitken wrote: > How does R estimate the intercept term \alpha in a loglinear > model with Poisson model and log link for a contingency table of counts? Colin, If you fitted this using a GLM then the default in R is to use so-called treatment contrasts (i.e. Dunnett contrasts). See ?contr.treatment. 
Take the first example on the ?glm help page ## Dobson (1990) Page 93: Randomized Controlled Trial : counts <- c(18,17,15,20,10,20,25,13,12) outcome <- gl(3,1,9) treatment <- gl(3,3) print(d.AD <- data.frame(treatment, outcome, counts)) glm.D93 <- glm(counts ~ outcome + treatment, family=poisson()) anova(glm.D93) summary(glm.D93) < snip > Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) 3.045e+00 1.709e-01 17.815 <2e-16 *** outcome2 -4.543e-01 2.022e-01 -2.247 0.0246 * outcome3 -2.930e-01 1.927e-01 -1.520 0.1285 treatment2 1.338e-15 2.000e-01 0.000 1.0000 treatment3 1.421e-15 2.000e-01 0.000 1.0000 < snip > > levels(outcome) [1] "1" "2" "3" > levels(treatment) [1] "1" "2" "3" So here the intercept represents the estimated counts at the first level of "outcome" (i.e. outcome = 1) and the first level of "treatment" (i.e. treatment = 1). > predict(glm.D93, newdata=data.frame(outcome="1", treatment="1")) 1 3.044522 Regards, Mark. ----- Mark Difford (Ph.D.) Research Associate Botany Department Nelson Mandela Metropolitan University Port Elizabeth, South Africa -- View this message in context: http://r.789695.n4.nabble.com/Estimate-of-intercept-in-loglinear-model-tp4009905p4012346.html Sent from the R help mailing list archive at Nabble.com. From mark_difford at yahoo.co.uk Mon Nov 7 20:12:51 2011 From: mark_difford at yahoo.co.uk (Mark Difford) Date: Mon, 7 Nov 2011 11:12:51 -0800 (PST) Subject: [R] Estimate of intercept in loglinear model In-Reply-To: <1320692642821-4012346.post@n4.nabble.com> References: <4EB81C8B.9020809@ed.ac.uk> <1320692642821-4012346.post@n4.nabble.com> Message-ID: <1320693171479-4012723.post@n4.nabble.com> On Nov 07, 2011 at 9:04pm Mark Difford wrote: > So here the intercept represents the estimated counts... Perhaps I should have added (though surely unnecessary in your case) that exponentiation gives the predicted/estimated counts, viz 21 (compared to 18 for the saturated model). ## > exp(3.044522) [1] 20.99999 Regards, Mark. 
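The point about the intercept can be checked by hand with the same Dobson data. What follows is only a sketch, assuming R's default treatment contrasts; the names fit, n_out1, n_trt1 and alpha_hand are illustrative and do not come from the thread. Under the independence model the fitted count for cell (1,1) is (row total * column total) / grand total, and the intercept should be its log.

```r
# Dobson (1990) data, as in the ?glm example quoted above
counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)
treatment <- gl(3, 3)

fit <- glm(counts ~ outcome + treatment, family = poisson())

# Hand calculation: margins for the reference levels outcome = 1, treatment = 1
n_out1 <- sum(counts[outcome == "1"])      # 18 + 20 + 25 = 63
n_trt1 <- sum(counts[treatment == "1"])    # 18 + 17 + 15 = 50
alpha_hand <- log(n_out1 * n_trt1 / sum(counts))   # log(63 * 50 / 150) = log(21)

# Compare with the intercept reported by glm()
print(c(glm = unname(coef(fit)["(Intercept)"]), by_hand = alpha_hand))
```

If a different contrast coding is in effect (for example sum-to-zero contrasts), the intercept no longer corresponds to the first cell, which may be one reason a hand calculation can disagree.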
----- Mark Difford (Ph.D.) Research Associate Botany Department Nelson Mandela Metropolitan University Port Elizabeth, South Africa -- View this message in context: http://r.789695.n4.nabble.com/Estimate-of-intercept-in-loglinear-model-tp4009905p4012723.html Sent from the R help mailing list archive at Nabble.com. From jholtman at gmail.com Mon Nov 7 20:37:35 2011 From: jholtman at gmail.com (jim holtman) Date: Mon, 7 Nov 2011 14:37:35 -0500 Subject: [R] Doing dist on separate objects in a text file In-Reply-To: <-166607997442399404@unknownmsgid> References: <1320535236339-3994515.post@n4.nabble.com> <0AD472CB-6536-4A03-83BD-1390433C8F11@comcast.net> <-166607997442399404@unknownmsgid> Message-ID: Is this what you want: > x <- read.table(textConnection('"Label" "X" "Y" "Slice" + 1 "Field_1_R3D_D3D_PRJ_w617.tif" 348 506 1 + 2 "Field_1_R3D_D3D_PRJ_w617.tif" 359 505 1 + 3 "Field_1_R3D_D3D_PRJ_w617.tif" 356 524 1 + 4 "Field_1_R3D_D3D_PRJ_w617.tif" 2 0 1 + 5 "Field_1_R3D_D3D_PRJ_w617.tif" 412 872 1 + 6 "Field_1_R3D_D3D_PRJ_w617.tif" 422 863 1 + 7 "Field_1_R3D_D3D_PRJ_w617.tif" 429 858 1 + 8 "Field_1_R3D_D3D_PRJ_w617.tif" 429 880 1 + 9 "Field_1_R3D_D3D_PRJ_w617.tif" 437 865 1 + 10 "Field_1_R3D_D3D_PRJ_w617.tif" 447 855 1 + 11 "Field_1_R3D_D3D_PRJ_w617.tif" 450 868 1 + 12 "Field_1_R3D_D3D_PRJ_w617.tif" 447 875 1 + 13 "Field_1_R3D_D3D_PRJ_w617.tif" 439 885 1 + 14 "Field_1_R3D_D3D_PRJ_w617.tif" 2 8 1'), header = TRUE, as.is = TRUE) > closeAllConnections() > # create column to segment the data > x$mark <- cumsum(x$X < 10) > # now remove separators > x <- subset(x, X >= 10) > # now split the data > x.s <- split(x, x$mark) > # now do the dist > lapply(x.s, function(a) dist(a[, c("X", "Y")])) $`0` 1 2 2 11.04536 3 19.69772 19.23538 $`1` 5 6 7 8 9 10 11 12 6 13.453624 7 22.022716 8.602325 8 18.788294 18.384776 22.000000 9 25.961510 15.132746 10.630146 17.000000 10 38.910153 26.248809 18.248288 30.805844 14.142136 11 38.209946 28.442925 23.259407 24.186773 13.341664 13.341664 12 
35.128336 27.730849 24.758837 18.681542 14.142136 20.000000 7.615773 13 29.966648 27.802878 28.792360 11.180340 20.099751 31.048349 20.248457 12.806248 On Sun, Nov 6, 2011 at 3:49 PM, ScottDaniel wrote: >> >> On Nov 5, 2011, at 7:20 PM, ScottDaniel wrote: >> >> > So I have a text file that looks like this: >> > "Label" ? ? "X" ? ? "Y" ? ? "Slice" >> > 1 ? "Field_1_R3D_D3D_PRJ_w617.tif" ?348 ? ? 506 ? ? 1 >> > 2 ? "Field_1_R3D_D3D_PRJ_w617.tif" ?359 ? ? 505 ? ? 1 >> > 3 ? "Field_1_R3D_D3D_PRJ_w617.tif" ?356 ? ? 524 ? ? 1 >> > 4 ? "Field_1_R3D_D3D_PRJ_w617.tif" ?2 ? ? ? 0 ? ? ? 1 >> > 5 ? "Field_1_R3D_D3D_PRJ_w617.tif" ?412 ? ? 872 ? ? 1 >> > 6 ? "Field_1_R3D_D3D_PRJ_w617.tif" ?422 ? ? 863 ? ? 1 >> > 7 ? "Field_1_R3D_D3D_PRJ_w617.tif" ?429 ? ? 858 ? ? 1 >> > 8 ? "Field_1_R3D_D3D_PRJ_w617.tif" ?429 ? ? 880 ? ? 1 >> > 9 ? "Field_1_R3D_D3D_PRJ_w617.tif" ?437 ? ? 865 ? ? 1 >> > 10 ?"Field_1_R3D_D3D_PRJ_w617.tif" ?447 ? ? 855 ? ? 1 >> > 11 ?"Field_1_R3D_D3D_PRJ_w617.tif" ?450 ? ? 868 ? ? 1 >> > 12 ?"Field_1_R3D_D3D_PRJ_w617.tif" ?447 ? ? 875 ? ? 1 >> > 13 ?"Field_1_R3D_D3D_PRJ_w617.tif" ?439 ? ? 885 ? ? 1 >> > 14 ?"Field_1_R3D_D3D_PRJ_w617.tif" ?2 ? ? ? 8 ? ? ? 1 >> > >> > What it represents are the locations of centromeres per nucleus in a >> > microscope image. What I need to do is do a dist() on each grouping >> > (the >> > grouping being separated by the low values of x and y's) and then >> > compute an >> > average. The part that I'm having trouble with is writing code that >> > will >> > allow R to separate these objects. >> >> I'm having trouble figuring out what you mean by "separating the >> objects". Each row is a separate reading, and I think you just want >> pairwise distances, right? > > What I mean is that rows 1-3 represent one group of centromeres and > rows 5-13 represent a second group. So I want to do a separate dist on > each group (i.e. I want a pair wise distance for rows 1 and 3 but not > 1 and 12). Does that clear thing up? 
> > > -- > View this message in context: http://r.789695.n4.nabble.com/Doing-dist-on-separate-objects-in-a-text-file-tp3994515p3996701.html > Sent from the R help mailing list archive at Nabble.com. > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it. From jbustosmelo at yahoo.es Mon Nov 7 20:56:35 2011 From: jbustosmelo at yahoo.es (Jose Bustos Melo) Date: Mon, 7 Nov 2011 19:56:35 +0000 (GMT) Subject: [R] Regression-Discontinuity Analysis Message-ID: <1320695795.1083.YahooMailNeo@web26506.mail.ukl.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From gyanendra.pokharel at gmail.com Mon Nov 7 21:10:15 2011 From: gyanendra.pokharel at gmail.com (Gyanendra Pokharel) Date: Mon, 7 Nov 2011 15:10:15 -0500 Subject: [R] Correction in error Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From rkevinburton at charter.net Mon Nov 7 21:23:09 2011 From: rkevinburton at charter.net (Kevin Burton) Date: Mon, 7 Nov 2011 14:23:09 -0600 Subject: [R] Upgrade R? Message-ID: <001501cc9d8b$1187cff0$34976fd0$@charter.net> An embedded and charset-unspecified text was scrubbed...
Name: not available URL: From sarah.goslee at gmail.com Mon Nov 7 21:26:04 2011 From: sarah.goslee at gmail.com (Sarah Goslee) Date: Mon, 7 Nov 2011 15:26:04 -0500 Subject: [R] Correction in error In-Reply-To: References: Message-ID: Hi, I see two problems right off: On Mon, Nov 7, 2011 at 3:10 PM, Gyanendra Pokharel wrote: > Hello R community, following is my code and it shows error, can some one > fix this error and explain why this occurs? > > gibbs <-function(m,n, theta = 0, lambda = 1){ > ? ?alpha <- 1.5 > ? ?beta <- 1.5 > ? ?gamma <- 1.5 > ? ?x<- array(0,c(m+1, 3)) > ? ?x[1,1] <- theta > ? ?x[1,2] <- lambda > ? ?x[1,3]<- n > ? ?for(t in 2:m+1){ > ? ? ? ?x[t,1] <- rbinom(x[t-1,3], 1, x[t-1,1]) The rbinom() command here returns a vector of values, but you're trying to assign it to a single matrix element. You might want to double-check the help for rbinom() again. > ? ? ? ?x[t,2]<-rbeta(m, x[t-1,1] + alpha, tx[t-1,3] - x[t-1,1] + beta) tx isn't defined, but is probably a typo for x > ? ? ? ?x[t,3] <- rpois(x[t-1,3] - x[t-1,1],(1 - x[t-1,2])*gamma) > ? ?} > ? ?x > } > gibbs(100, 10) With a nice short function such as this, it's easy to set your arguments to their default values and actually run the function line by line at the command prompt so you can see whether what is happening is what you expect. Sarah -- Sarah Goslee http://www.functionaldiversity.org From michael.weylandt at gmail.com Mon Nov 7 21:27:52 2011 From: michael.weylandt at gmail.com (R. Michael Weylandt) Date: Mon, 7 Nov 2011 15:27:52 -0500 Subject: [R] Correction in error In-Reply-To: References: Message-ID: The first argument to rbinom() is how many random samples you want to draw, not whatever you seem to think it is. It's not matching the size of what you mean to assign it to: in particular note that x[t-1, 3] is zero for t=3 which is where you initialize it. 
(I.e., you are also probably getting tripped up by an order of operations error) Michael On Mon, Nov 7, 2011 at 3:10 PM, Gyanendra Pokharel wrote: > Hello R community, following is my code and it shows error, can some one > fix this error and explain why this occurs? > > gibbs <-function(m,n, theta = 0, lambda = 1){ > ? ?alpha <- 1.5 > ? ?beta <- 1.5 > ? ?gamma <- 1.5 > ? ?x<- array(0,c(m+1, 3)) > ? ?x[1,1] <- theta > ? ?x[1,2] <- lambda > ? ?x[1,3]<- n > ? ?for(t in 2:m+1){ > ? ? ? ?x[t,1] <- rbinom(x[t-1,3], 1, x[t-1,1]) > ? ? ? ?x[t,2]<-rbeta(m, x[t-1,1] + alpha, tx[t-1,3] - x[t-1,1] + beta) > ? ? ? ?x[t,3] <- rpois(x[t-1,3] - x[t-1,1],(1 - x[t-1,2])*gamma) > ? ?} > ? ?x > } > gibbs(100, 10) > > Gyn > > ? ? ? ?[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From wdunlap at tibco.com Mon Nov 7 21:31:11 2011 From: wdunlap at tibco.com (William Dunlap) Date: Mon, 7 Nov 2011 20:31:11 +0000 Subject: [R] Correction in error In-Reply-To: References: Message-ID: A possible third problem is that 2:m+1 is the same as (2:m) + 1 and you probably want 2:(m+1) Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Sarah Goslee > Sent: Monday, November 07, 2011 12:26 PM > To: Gyanendra Pokharel > Cc: R-help at r-project.org > Subject: Re: [R] Correction in error > > Hi, > > I see two problems right off: > > On Mon, Nov 7, 2011 at 3:10 PM, Gyanendra Pokharel > wrote: > > Hello R community, following is my code and it shows error, can some one > > fix this error and explain why this occurs? > > > > gibbs <-function(m,n, theta = 0, lambda = 1){ > > ? ?alpha <- 1.5 > > ? ?beta <- 1.5 > > ? 
?gamma <- 1.5 > > ? ?x<- array(0,c(m+1, 3)) > > ? ?x[1,1] <- theta > > ? ?x[1,2] <- lambda > > ? ?x[1,3]<- n > > ? ?for(t in 2:m+1){ > > ? ? ? ?x[t,1] <- rbinom(x[t-1,3], 1, x[t-1,1]) > > The rbinom() command here returns a vector of values, but you're > trying to assign it to a single matrix element. You might want to > double-check the help for rbinom() again. > > > > ? ? ? ?x[t,2]<-rbeta(m, x[t-1,1] + alpha, tx[t-1,3] - x[t-1,1] + beta) > > tx isn't defined, but is probably a typo for x > > > ? ? ? ?x[t,3] <- rpois(x[t-1,3] - x[t-1,1],(1 - x[t-1,2])*gamma) > > ? ?} > > ? ?x > > } > > gibbs(100, 10) > > With a nice short function such as this, it's easy to set your > arguments to their default values and actually run the function line > by line at the command prompt so you can see whether what is happening > is what you expect. > > Sarah > > -- > Sarah Goslee > http://www.functionaldiversity.org > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From bellard.celine at gmail.com Mon Nov 7 21:50:55 2011 From: bellard.celine at gmail.com (Celine) Date: Mon, 7 Nov 2011 12:50:55 -0800 (PST) Subject: [R] Aggregate or extract function ? 
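The two points raised in this thread (operator precedence in 2:m+1, and the meaning of the first argument of rbinom()) can be seen in isolation with a few lines of base R. This is a minimal sketch, not a corrected version of the posted gibbs() function; the variable names a, b, ten_draws, one_draw and no_draws are illustrative only.

```r
m <- 5

# ':' binds more tightly than '+', so 2:m + 1 is (2:m) + 1
a <- 2:m + 1       # 3 4 5 6
b <- 2:(m + 1)     # 2 3 4 5 6
print(a)
print(b)

# The first argument of rbinom() is the NUMBER of draws;
# the second is the number of trials per draw.
set.seed(1)
ten_draws <- rbinom(10, 1, 0.5)   # ten Bernoulli draws: a vector of length 10
one_draw  <- rbinom(1, 10, 0.5)   # one Binomial(10, 0.5) draw: length 1
no_draws  <- rbinom(0, 1, 0.5)    # zero draws requested: a length-0 vector

print(length(ten_draws))
print(length(one_draw))
print(length(no_draws))
```

Assigning a length-10 or length-0 result to a single matrix element such as x[t, 1] cannot work as intended, so for a single value per iteration the count argument would need to be 1.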
Message-ID: <1320699055205-4013673.post@n4.nabble.com> Hi R user, I have two dataframe with different variables and coordinates : X Y sp bio3 bio5 bio6 bio13 bio14 1 -70.91667 -45.08333 0 47 194 -27 47 12 2 -86.58333 66.25000 0 16 119 -345 42 3 3 -62.58333 -17.91667 0 68 334 152 144 28 4 -68.91667 -31.25000 0 54 235 -45 25 7 5 55.58333 48.41667 0 23 319 -172 23 14 6 66.25000 37.75000 0 34 363 -18 49 0 and this one : X Y LU1 LU2 LU3 LU4 LU5 LU6 LU7 LU8 LU9 LU10 LU11 LU12 LU13 LU14 LU15 LU16 LU17 LU18 1 -36.5 84 0 0 0 0 0 0 0 0 0 0 0 0 0 0.000000 0 0 0 0 2 -36.0 84 0 0 0 0 0 0 0 0 0 0 0 0 0 0.000000 0 0 0 0 3 -35.5 84 0 0 0 0 0 0 0 0 0 0 0 0 0 26.085468 0 0 0 0 4 -35.0 84 0 0 0 0 0 0 0 0 0 0 0 0 0 0.000000 0 0 0 0 5 -34.5 84 0 0 0 0 0 0 0 0 0 0 0 0 0 5.267761 0 0 0 0 6 -34.0 84 0 0 0 0 0 0 0 0 0 0 0 0 0 105.371069 0 0 0 0 I wouldlike to add to my first dataframe the value of the LU variables at the coordinates of the first dataframe. Of course, the coordinates are not at the same resolution and are different, this is the problem. I wouldlike to decrease the resolution of the first one because the second dataframe have a coarser resolution and obtain something like that : X Y sp bio3 bio5 bio6 bio13 bio14 LU1 LU2 LU3 LU4 ... 1 -70.91667 -45.08333 0 47 194 -27 47 12 0 22.08 76.9 2 -86.58333 66.25000 0 16 119 -345 42 3 0 22.08 76.9 3 -62.58333 -17.91667 0 68 334 152 144 28 0 22.08 76.9 4 -68.91667 -31.25000 0 54 235 -45 25 7 0 22.08 76.9 5 55.58333 48.41667 0 23 319 -172 23 14 0 22.08 76.9 6 66.25000 37.75000 0 34 363 -18 49 0 0 22.08 76.9 Do someone know a function or a way to do obtain that ? Thanks in advance for the help, C?line -- View this message in context: http://r.789695.n4.nabble.com/Aggregate-or-extract-function-tp4013673p4013673.html Sent from the R help mailing list archive at Nabble.com. 
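One possible way to attach the coarser LU values to the finer points is to snap each point to the nearest node of the coarse grid and then merge on the snapped coordinates. This is only a rough sketch, assuming the LU coordinates lie on a regular 0.5-degree grid aligned to multiples of 0.5 (the post suggests this but does not confirm it). The small data frames pts and lu, the helper columns gx and gy, and the LU1 values are made up for illustration.

```r
# Toy stand-ins for the two data frames in the question (values are invented)
pts <- data.frame(X = c(-70.91667, -62.58333),
                  Y = c(-45.08333, -17.91667),
                  bio3 = c(47, 68))
lu  <- data.frame(X = c(-71.0, -62.5),
                  Y = c(-45.0, -18.0),
                  LU1 = c(0, 22.08))

res <- 0.5   # assumed resolution of the coarse grid, in degrees

# Snap each fine-resolution point to the nearest coarse grid node
pts$gx <- round(pts$X / res) * res
pts$gy <- round(pts$Y / res) * res

# Left join: keep every point, add LU columns where a grid node matches
out <- merge(pts, lu, by.x = c("gx", "gy"), by.y = c("X", "Y"), all.x = TRUE)
print(out)
```

Points with no matching grid node get NA in the LU columns because of all.x = TRUE. If the coarse grid is offset (for example cell centres at x.25 and x.75), the snapping line would need to be adjusted accordingly.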
From djmuser at gmail.com Mon Nov 7 23:19:04 2011 From: djmuser at gmail.com (Dennis Murphy) Date: Mon, 7 Nov 2011 14:19:04 -0800 Subject: [R] Correction in error In-Reply-To: References: Message-ID: Hi: In your function call, x[1, 1] = theta = 0. In the first line of the loop, your rbinom() call works out to be x[2, 1] <- rbinom(x[1, 3], 1, x[1, 1]) <=> rbinom(10, 1, 0) That likely accounts for the error message: Error in x[t, 1] <- rbinom(x[t - 1, 3], 1, x[t - 1, 1]) : replacement has length zero HTH, Dennis On Mon, Nov 7, 2011 at 12:10 PM, Gyanendra Pokharel wrote: > Hello R community, following is my code and it shows error, can some one > fix this error and explain why this occurs? > > gibbs <-function(m,n, theta = 0, lambda = 1){ > alpha <- 1.5 > beta <- 1.5 > gamma <- 1.5 > x<- array(0,c(m+1, 3)) > x[1,1] <- theta > x[1,2] <- lambda > x[1,3]<- n > for(t in 2:m+1){ > x[t,1] <- rbinom(x[t-1,3], 1, x[t-1,1]) > x[t,2]<-rbeta(m, x[t-1,1] + alpha, tx[t-1,3] - x[t-1,1] + beta) > x[t,3] <- rpois(x[t-1,3] - x[t-1,1],(1 - x[t-1,2])*gamma) > } > x > } > gibbs(100, 10) > > Gyn > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From sabaric at charter.net Mon Nov 7 23:32:19 2011 From: sabaric at charter.net (Richard Saba) Date: Mon, 7 Nov 2011 16:32:19 -0600 Subject: [R] vars impulse response function output Message-ID: <003301cc9d9d$1c270f90$54752eb0$@charter.net> Sorry about first post. This is in plain text. Does anyone know if the bootstrap CI intervals generated by the irf() function (impulse response function) in the "vars" package are bias corrected?
Thanks, Richard Saba From ccquant at gmail.com Mon Nov 7 23:34:55 2011 From: ccquant at gmail.com (Ben quant) Date: Mon, 7 Nov 2011 15:34:55 -0700 Subject: [R] RpgSQL row names Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From bbolker at gmail.com Mon Nov 7 23:47:24 2011 From: bbolker at gmail.com (Ben Bolker) Date: Mon, 7 Nov 2011 22:47:24 +0000 Subject: [R] Comparing Excel to R for Standard Curves References: Message-ID: minda berbeco ucdavis.edu> writes: > > Hello, > > In the past I have used Excel for creating standard curves. I recently > switched to R using the lm function to create my curves instead, but find > the results are slightly different even though my y-intercept and slope are > the same. A friend told me that the difference was due to R and Excel > using a different algorithm to calculate the line. Has anyone run into > this problem before? Can you please give us a reproducible example? e.g. read the posting guide or http://tinyurl.com/reproducible-000 ... It's also not clear what you mean by "the results are slightly different even though my y-intercept and slope are the same" ... what do you mean by "results"? R^2 value? Ben Bolker From flokke at live.de Mon Nov 7 22:09:13 2011 From: flokke at live.de (flokke) Date: Mon, 7 Nov 2011 13:09:13 -0800 (PST) Subject: [R] rearrange set of items randomly Message-ID: <1320700153654-4013723.post@n4.nabble.com> Dear all, I hope that this question is not too weird, I will try to explain it as good as I can. 
I have to write a function for a school project and one limitation is that I may not use the in built function sample() At one point in the function I would like to resample/rearrange the items of my sample (so I would want to use sample, but I am not allowed to do so), so I have to come up with sth else that does the same as the in built function sample() The only thing that sample() does is rearranging the items of a sample, so I searched the internet for a function that does that to be able to use it, but I cannot find anything that could help me. Can maybe someone help me with this? I would be very grateful, Cheers, Maria -- View this message in context: http://r.789695.n4.nabble.com/rearrange-set-of-items-randomly-tp4013723p4013723.html Sent from the R help mailing list archive at Nabble.com. From katiaborgia at hotmail.it Mon Nov 7 22:14:09 2011 From: katiaborgia at hotmail.it (katiab81) Date: Mon, 7 Nov 2011 13:14:09 -0800 (PST) Subject: [R] geoR, variofit/likfit Message-ID: <1320700449200-4013734.post@n4.nabble.com> hi, I have a problem with the functions variofit ant likfit. I have to chose the more appropriate variogram model for my specific problem and data. I write the alghoritm has follows: library(geoR) data(wolfcamp) v.e<-variog(wolfcamp) exp.vario<-variofit(v.e,ini.cov.pars=c(175226.94,139.81),cov.model="exponential",nugget=21903.37) lines(exp.vario,col=1) sph.vario<-variofit(v.e,ini.cov.pars=c(175226.94,139.81),cov.model="spherical",nugget=21903.37) lines(sph.vario,col=2) ..... exp.lik<-likfit(wolfcamp,coords=wolfcamp$coords,data=wolfcamp$data,ini.cov.pars=c(175226.94,139.81),cov.model="exponential",trend="cte",fix.kappa=FALSE,nugget=21903,37) lines(exp.lik,col=5) sph.lik<-likfit(wolfcamp,coords=wolfcamp$coords,data=wolfcamp$data,ini.cov.pars=c(175226.94,139.81),cov.model="spherical",trend="cte",fix.kappa=FALSE,nugget=21903,37) lines(sph.lik,col=6) .... 
the lines of the resultant variograms are different using variofit or likfit, but when I change the cov.model (from exp to sph or lin....) the program gives identical result. I think this is incorrect.... can you tell me if I made any mistake in the implementations of the algorithm. thank you. P.S. sorry for my english. katia -- View this message in context: http://r.789695.n4.nabble.com/geoR-variofit-likfit-tp4013734p4013734.html Sent from the R help mailing list archive at Nabble.com. From v_candida at hotmail.com Mon Nov 7 22:24:28 2011 From: v_candida at hotmail.com (Eric) Date: Mon, 7 Nov 2011 22:24:28 +0100 Subject: [R] DESeq Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From sjoyes at uoguelph.ca Mon Nov 7 23:22:54 2011 From: sjoyes at uoguelph.ca (SarahJoyes) Date: Mon, 7 Nov 2011 14:22:54 -0800 (PST) Subject: [R] Sampling with conditions Message-ID: <1320704574008-4014036.post@n4.nabble.com> Hey everyone, I am at best, an amateur user of R, but I am stuck on how to set-up the following situation. I am trying to select a random sample of numbers from 0 to 10 and insert them into the first column of a matrix (which will used later in a loop). However, I need to have those numbers add up to 10. How can I set those conditions? So far I have: n<-matrix(0,nr=5,ncol=10) for(i in 1:10){n[i,1]<-sample(0:10,1)} How do I set-up the "BUT sum(n[i,1])=10"? Thanks SarahJ -- View this message in context: http://r.789695.n4.nabble.com/Sampling-with-conditions-tp4014036p4014036.html Sent from the R help mailing list archive at Nabble.com. From jvadams at usgs.gov Mon Nov 7 23:31:51 2011 From: jvadams at usgs.gov (Jean V Adams) Date: Mon, 7 Nov 2011 16:31:51 -0600 Subject: [R] Aggregate or extract function ? In-Reply-To: <1320699055205-4013673.post@n4.nabble.com> References: <1320699055205-4013673.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From markm0705 at gmail.com Mon Nov 7 23:46:22 2011 From: markm0705 at gmail.com (markm0705) Date: Mon, 7 Nov 2011 14:46:22 -0800 (PST) Subject: [R] Adding lines to scatterplot odd result when creating multiple plots Message-ID: <1320705982831-4014140.post@n4.nabble.com> Dear R helpers I'm attempting to create a matrix of scatterplots with X-Y mean lines and regression lines added to each plot in the matrix. I have managed to create the first plot I would like using scatterplot but have run into an odd result when I use par() to set up the page to take multiple plot. Specifically, the mean and regression lines appear to plot in the second plot window, not on top of the scatterplot (when I include the par(mfcol) command). How do I get the scatterplot and added lines to stay together on this plot and subsequent plots I would like to include in the scatterplot matrix Data attached Thanks in advance MarkM library("car") par(mfcol=c(1,2)) # Parameters to change Infile<-"kt3d_Thk1.dat" X<-"Estimated thickness (mE)" Y<-"True thickness (mE)" #load data #read the data skip then read header crossval <- read.table(Infile,skip = 9,sep = "") head<-readLines(Infile,9) #decode the header lines wanted head2<-head[3:9] colnames(crossval)=head2 #Filter out non-estimated crossval2<-crossval[crossval$Estimate>0,] # Compute the means AveEst<-mean(crossval2$Estimate) AveTru<-mean(crossval2$True) #Fit the regression line Fit<-lm(crossval2$True~crossval2$Estimate ) #create plots scatterplot(True ~ Estimate, data=crossval2, xlab= X, ylab= Y, main= "Min 4 Max 8", grid=FALSE, xlim=c(0,8), ylim=c(0,8), pch=21, cex=1.2, smooth=FALSE, reg.line=FALSE ) #Plot mean lines and regression abline(h=AveTru,col="grey70") abline(v=AveEst,col="grey70") abline(0,1,col="red") abline(lm(crossval2$True~crossval2$Estimate), col="blue", lty=1, lwd=2) -- View this message in context: 
http://r.789695.n4.nabble.com/Adding-lines-to-scatterplot-odd-result-when-creating-multiple-plots-tp4014140p4014140.html Sent from the R help mailing list archive at Nabble.com. From dwinsemius at comcast.net Tue Nov 8 00:12:20 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Mon, 7 Nov 2011 18:12:20 -0500 Subject: [R] rearrange set of items randomly In-Reply-To: <1320700153654-4013723.post@n4.nabble.com> References: <1320700153654-4013723.post@n4.nabble.com> Message-ID: <20ECAF13-6D7A-4E5F-B68C-59B9B870E353@comcast.net> On Nov 7, 2011, at 4:09 PM, flokke wrote: > Dear all, > I hope that this question is not too weird, I will try to explain it > as good > as I can. > > I have to write a function for a school project and one limitation > is that I > may not use the in built function sample() > > At one point in the function I would like to resample/rearrange the > items of > my sample (so I would want to use sample, but I am not allowed to do > so), so > I have to come up with sth else that does the same as the in built > function > sample() > > The only thing that sample() does is rearranging the items of a > sample, so I > searched the internet for a function that does that to be able to > use it, > but I cannot find anything that could help me. > > Can maybe someone help me with this? > I would be very grateful, May I suggests thinking about the ordering of random variables? -- David Winsemius, MD West Hartford, CT From ggrothendieck at gmail.com Tue Nov 8 00:16:08 2011 From: ggrothendieck at gmail.com (Gabor Grothendieck) Date: Mon, 7 Nov 2011 18:16:08 -0500 Subject: [R] RpgSQL row names In-Reply-To: References: Message-ID: On Mon, Nov 7, 2011 at 5:34 PM, Ben quant wrote: > Hello, > > Using the RpgSQL package, there must be a way to get the row names into the > table automatically. In the example below, I'm trying to get rid of the > cbind line, yet have the row names of the data frame populate a column. 
>
>> bentest = matrix(1:4,2,2)
>> dimnames(bentest) = list(c('ra','rb'),c('ca','cb'))
>> bentest
>    ca cb
> ra  1  3
> rb  2  4
>> bentest = cbind(item_name=rownames(bentest),bentest)
>> dbWriteTable(con, "r.bentest", bentest)
> [1] TRUE
>> dbGetQuery(con, "SELECT * FROM r.bentest")
>   item_name ca cb
> 1        ra  1  3
> 2        rb  2  4
>
>

The RJDBC based drivers currently don't support that. You can create a higher level function that does it.

dbGetQuery2 <- function(...) {
  out <- dbGetQuery(...)
  # position of the "row_names" column; 0 if it is absent
  i <- match("row_names", names(out), nomatch = 0)
  if (i > 0) {
    rownames(out) <- out[[i]]
    out <- out[-i]  # drop the row_names column wherever it sits
  }
  out
}

rownames(BOD) <- letters[1:nrow(BOD)]
dbWriteTable(con, "BOD", cbind(row_names = rownames(BOD), BOD))
dbGetQuery2(con, "select * from BOD")

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

From michael.weylandt at gmail.com Tue Nov 8 00:24:26 2011
From: michael.weylandt at gmail.com (R. Michael Weylandt)
Date: Mon, 7 Nov 2011 18:24:26 -0500
Subject: [R] DESeq
In-Reply-To:
References:
Message-ID:

This is one of those quirky R moments, try something like as.integer(as.character(DATA)).

Michael

On Mon, Nov 7, 2011 at 4:24 PM, Eric wrote:
> Hello,
>
> I have RNAseq data, which I am trying to analyze with DESeq. My file (tab delimited .txt) appears to be correct:
>
>>head(myfile)
>                        VZ_w13 VZ_w14a VZ_w14b VZ_w15a VZ_w15b VZ_w16a
>        ENSG00000253101      0       0       0       0       0       0
>        ENSG00000223972      0       0       0       0       0       0...
>
> However, when I try to analyze the data with
>
>>cds <- newCountDataSet(myfile,conds)
>
> I get the following message:
>
> "Error in newCountDataSet(myfile,conds) : The countData is not integer.
>
> The problem, as far as I can tell, is that my data are numerical, not integer, because when I run
>
>>str(myfile)
>        'data.frame':   53433 obs. of  14 variables:
>         $ VZ_w13   : num  0 0 0 0 8 0 0 0 0 0 ...
>
> Does anyone have a way to convert my file from numerical to integer? As you can see, the data are in fact integers already, so I'm a bit confused.
>
> Thanks,
> Eric
>
>
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

From anopheles123 at gmail.com Tue Nov 8 00:31:29 2011
From: anopheles123 at gmail.com (Weidong Gu)
Date: Mon, 7 Nov 2011 18:31:29 -0500
Subject: [R] Sampling with conditions
In-Reply-To: <1320704574008-4014036.post@n4.nabble.com>
References: <1320704574008-4014036.post@n4.nabble.com>
Message-ID:

Not sure this is valid, but you could draw 9 of the 10 values at random and then fix the last one to meet the constraint, sum=10.

Weidong

On Mon, Nov 7, 2011 at 5:22 PM, SarahJoyes wrote:
> Hey everyone,
> I am at best, an amateur user of R, but I am stuck on how to set-up the
> following situation.
> I am trying to select a random sample of numbers from 0 to 10 and insert
> them into the first column of a matrix (which will used later in a loop).
> However, I need to have those numbers add up to 10. How can I set those
> conditions?
> So far I have:
> n<-matrix(0,nr=5,ncol=10)
> for(i in 1:10){n[i,1]<-sample(0:10,1)}
> How do I set-up the "BUT sum(n[i,1])=10"?
> Thanks
> SarahJ
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Sampling-with-conditions-tp4014036p4014036.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> From frainj at gmail.com Tue Nov 8 00:33:56 2011 From: frainj at gmail.com (John C Frain) Date: Mon, 7 Nov 2011 23:33:56 +0000 Subject: [R] False Virus detection with colorspace package? Message-ID: After updating to version 2.14 and copying packages from my Version 2.13.2 library I ran update.packages(checkBuilt=TRUE,ask=FALSE) to update these packages. This failed because AVG reported a virus in the "temporary" copy of colorspace.dll created during the install and the update then failed because it was unable to open this temporary file. To continue I deleted the colorspace packages and its reverse dependences and the reverse dependencies of the reverse dependencies. After deleting these packages the update process then finished. I would presume that the virus is probably a false detection. However when I virus check the version 2.13.2 library AVG does not find a virus. As far as I can determine the only difference between the two packages is that they are built with different versions of R. I would intend to reinstall these packages when the problem has been solved. I am sending a report to AVG. For the moment I can fall back to the earlier version if necessary. Has any one else detected a similar problem. An extract of the diagnosis sent to AVG is below. Best regards John AVG Free Version 2012.0.1869 Virus database 2092/4602 detects a virus in the colorspace package in the R statistical system. The zip file containing the offending file can be downloaded from http://ftp.heanet.ie/mirrors/cran.r-project.org/bin/windows/contrib/r-release/colorspace_1.1-0.zip or from any of the CRAN mirrors. The message produced by AVG is ***************************************************** File name: c:\.....\colourspace\libs\i386\colorspace.dll Threat name: Virus found Win32/Heur Detected on open ***************************************************** Is this a false positive? 
-- John C Frain Economics Department Trinity College Dublin Dublin 2 Ireland www.tcd.ie/Economics/staff/frainj/home.html mailto:frainj at tcd.ie mailto:frainj at gmail.com From jdnewmil at dcn.davis.ca.us Tue Nov 8 00:44:37 2011 From: jdnewmil at dcn.davis.ca.us (Jeff Newmiller) Date: Mon, 07 Nov 2011 15:44:37 -0800 Subject: [R] False Virus detection with colorspace package? In-Reply-To: References: Message-ID: This kind of thing is all too common. You can download the source, review the code, and build the package yourself, and compare the supplied binary with your home-built one. Chances are AVG will complain about that one too, and you can confirm the false positive. Anti virus software is not too friendly with non-mainstream software. --------------------------------------------------------------------------- Jeff Newmiller The ..... ..... Go Live... DCN: Basics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/Batteries O.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k --------------------------------------------------------------------------- Sent from my phone. Please excuse my brevity. John C Frain wrote: >After updating to version 2.14 and copying packages from my Version >2.13.2 library I ran > >update.packages(checkBuilt=TRUE,ask=FALSE) to update these packages. >This failed because AVG reported a virus in the "temporary" copy of >colorspace.dll created during the install and the update then failed >because it was unable to open this temporary file. To continue I >deleted the colorspace packages and its reverse dependences and the >reverse dependencies of the reverse dependencies. After deleting >these packages the update process then finished. > >I would presume that the virus is probably a false detection. However >when I virus check the version 2.13.2 library AVG does not find a >virus. 
As far as I can determine the only difference between the two >packages is that they are built with different versions of R. I would >intend to reinstall these packages when the problem has been solved. >I am sending a report to AVG. For the moment I can fall back to the >earlier version if necessary. > >Has any one else detected a similar problem. > >An extract of the diagnosis sent to AVG is below. > >Best regards > >John > >AVG Free Version 2012.0.1869 Virus database 2092/4602 detects a virus >in the colorspace package in the R statistical system. The zip file >containing the offending file can be downloaded from > >http://ftp.heanet.ie/mirrors/cran.r-project.org/bin/windows/contrib/r-release/colorspace_1.1-0.zip >or from any of the CRAN mirrors. > >The message produced by AVG is > >***************************************************** > >File name: c:\.....\colourspace\libs\i386\colorspace.dll > >Threat name: Virus found Win32/Heur > >Detected on open > >***************************************************** > >Is this a false positive? > >-- >John C Frain >Economics Department >Trinity College Dublin >Dublin 2 >Ireland >www.tcd.ie/Economics/staff/frainj/home.html >mailto:frainj at tcd.ie >mailto:frainj at gmail.com > >______________________________________________ >R-help at r-project.org mailing list >https://stat.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide >http://www.R-project.org/posting-guide.html >and provide commented, minimal, self-contained, reproducible code. From frainj at gmail.com Tue Nov 8 00:55:46 2011 From: frainj at gmail.com (John C Frain) Date: Mon, 7 Nov 2011 23:55:46 +0000 Subject: [R] False Virus detection with colorspace package? In-Reply-To: References: Message-ID: Thanks. This is exactly what I thought. The idea was to check if anyone else had detected a problem. I have since discovered http://virusscan.jotti.org which allows one to check the files using 20 different anti-virus programs. Only AVG detected a virus. 
The other 19 reported no problems.

Best Regards

John

On 7 November 2011 23:44, Jeff Newmiller wrote:
> This kind of thing is all too common. You can download the source, review the code, and build the package yourself, and compare the supplied binary with your home-built one. Chances are AVG will complain about that one too, and you can confirm the false positive. Anti virus software is not too friendly with non-mainstream software.
> ---------------------------------------------------------------------------
> Jeff Newmiller                        The     .....       .....  Go Live...
> DCN:        Basics: ##.#.       ##.#.  Live Go...
>                                      Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
> ---------------------------------------------------------------------------
> Sent from my phone. Please excuse my brevity.
>
> John C Frain wrote:
>
>>After updating to version 2.14 and copying packages from my Version
>>2.13.2 library I ran
>>
>>update.packages(checkBuilt=TRUE,ask=FALSE) to update these packages.
>>This failed because AVG reported a virus in the "temporary" copy of
>>colorspace.dll created during the install and the update then failed
>>because it was unable to open this temporary file. To continue I
>>deleted the colorspace packages and its reverse dependences and the
>>reverse dependencies of the reverse dependencies. After deleting
>>these packages the update process then finished.
>>
>>I would presume that the virus is probably a false detection. However
>>when I virus check the version 2.13.2 library AVG does not find a
>>virus. As far as I can determine the only difference between the two
>>packages is that they are built with different versions of R. I would
>>intend to reinstall these packages when the problem has been solved.
>>I am sending a report to AVG. For the moment I can fall back to the
>>earlier version if necessary.
>>
>>Has any one else detected a similar problem.
>>
>>An extract of the diagnosis sent to AVG is below.
>>
>>Best regards
>>
>>John
>>
>>AVG Free Version 2012.0.1869 Virus database 2092/4602 detects a virus
>>in the colorspace package in the R statistical system. The zip file
>>containing the offending file can be downloaded from
>>
>>http://ftp.heanet.ie/mirrors/cran.r-project.org/bin/windows/contrib/r-release/colorspace_1.1-0.zip
>>or from any of the CRAN mirrors.
>>
>>The message produced by AVG is
>>
>>*****************************************************
>>
>>File name:   c:\.....\colourspace\libs\i386\colorspace.dll
>>
>>Threat name: Virus found Win32/Heur
>>
>>Detected on open
>>
>>*****************************************************
>>
>>Is this a false positive?
>>
>>--
>>John C Frain
>>Economics Department
>>Trinity College Dublin
>>Dublin 2
>>Ireland
>>www.tcd.ie/Economics/staff/frainj/home.html
>>mailto:frainj at tcd.ie
>>mailto:frainj at gmail.com
>>
>>______________________________________________
>>R-help at r-project.org mailing list
>>https://stat.ethz.ch/mailman/listinfo/r-help
>>PLEASE do read the posting guide
>>http://www.R-project.org/posting-guide.html
>>and provide commented, minimal, self-contained, reproducible code.
> > -- John C Frain Economics Department Trinity College Dublin Dublin 2 Ireland www.tcd.ie/Economics/staff/frainj/home.html mailto:frainj at tcd.ie mailto:frainj at gmail.com From djnordlund at frontier.com Tue Nov 8 00:56:50 2011 From: djnordlund at frontier.com (Daniel Nordlund) Date: Mon, 7 Nov 2011 15:56:50 -0800 Subject: [R] Sampling with conditions In-Reply-To: <1320704574008-4014036.post@n4.nabble.com> References: <1320704574008-4014036.post@n4.nabble.com> Message-ID: <17FEF370037E4CB58D1DE0C954236861@Gandalf> > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] > On Behalf Of SarahJoyes > Sent: Monday, November 07, 2011 2:23 PM > To: r-help at r-project.org > Subject: [R] Sampling with conditions > > Hey everyone, > I am at best, an amateur user of R, but I am stuck on how to set-up the > following situation. > I am trying to select a random sample of numbers from 0 to 10 and insert > them into the first column of a matrix (which will used later in a loop). > However, I need to have those numbers add up to 10. How can I set those > conditions? > So far I have: > n<-matrix(0,nr=5,ncol=10) > for(i in 1:10){n[i,1]<-sample(0:10,1)} > How do I set-up the "BUT sum(n[i,1])=10"? > Thanks > SarahJ > Sarah, Does something like this do what you want? n <- matrix(0,nrow=5, ncol=10) repeat{ c1 <- sample(0:10, 4, replace=TRUE) if(sum(c1) <= 10) break } n[,1] <- c(c1,10-sum(c1)) n Hope this is helpful, Dan Daniel Nordlund Bothell, WA USA From anopheles123 at gmail.com Tue Nov 8 01:03:12 2011 From: anopheles123 at gmail.com (Weidong Gu) Date: Mon, 7 Nov 2011 19:03:12 -0500 Subject: [R] help with formula for clogit In-Reply-To: <1320679191169-3998967.post@n4.nabble.com> References: <1320679191169-3998967.post@n4.nabble.com> Message-ID: clogit needs to spell out formula plus strata, you can try modify the code ... 
d2=d[,c(mols,'group','Age','strata')]
fo<-as.formula(paste(paste('group~','Age',sep=''),'strata(strata)',sep='+'))
log.reg<-clogit(fo,data=d2)
...

Weidong

On Mon, Nov 7, 2011 at 10:19 AM, 1Rnwb wrote:
> I would like to know if clogit function can be used as below
> clogit(group~., data=dataframe)
>
> When I try to use in above format it takes a long time, I would appreciate
> some pointers to get multiple combinations tested.
>
> set.seed(100)
>  d=data.frame(x=rnorm(20)+5,
>  x1=rnorm(20)+5,
>  x2=rnorm(20)+5,
>  x3=rnorm(20)+5,
>  x4=rnorm(20)+5,
>  x5=rnorm(20)+5,
>  x6=rnorm(20)+5,
>  x7=rnorm(20)+5,
>  x8=rnorm(20)+5,
> group=rep(c(1,2),10), Age=rnorm(20)+35,strata=c(rep(1,10), rep(2,10)))
>
> nam=names(d)[1:9]
> results <- c("Protein", "OR", "p-val")
> pc3=combinations(n=length(nam),r=2)
>
> for (len in 1:dim(pc3)[2])
>  {
>  prs=pc3[len,]
>  mols=nam[prs]
>  d2=d[,c(mols,'group','Age','strata')]
>  log.reg<-clogit(group~.,data=d2)
>  a = summary(log.reg)$conf.int
>  z= summary(log.reg)$coefficients[1,4]  #ncol in coefficients is 3 * number
> of parameters
>  pval  = 2*pnorm(-abs(z))
>  res2 = c(paste('IL8+',molecule,sep=''), paste (round(a[1,1],2), " (" ,
> round(a[1,3],2), " - " , round(a[1,4],2), ")" , sep=""), pval)
>  results  = rbind (results ,res2 )
> }
>
> Thanks
> Sharad
>
> --
> View this message in context: http://r.789695.n4.nabble.com/help-with-formula-for-clogit-tp3998967p3998967.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
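The formula suggested in the reply above can also be assembled without pasting strings together. What follows is only a minimal sketch in base R, assuming the column names from the quoted example (x, x1, Age, strata, group). The helper make_clogit_formula is hypothetical, and adding the two molecule columns to the formula is an assumption, not something the reply states.

```r
# Hypothetical helper (not part of survival or any other package):
# builds group ~ <mols> + Age + strata(strata) from a character vector
make_clogit_formula <- function(mols) {
  reformulate(c(mols, "Age", "strata(strata)"), response = "group")
}

fo <- make_clogit_formula(c("x", "x1"))
print(fo)

# With the survival package loaded, the result could then be passed on
# in the same way as in the reply, e.g. clogit(fo, data = d2)
```

Inside the loop, mols would take the place of c("x", "x1"), so each pair of columns gets its own formula while Age and the strata term stay fixed.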
> From ted.harding at wlandres.net Tue Nov 8 01:25:57 2011 From: ted.harding at wlandres.net ( (Ted Harding)) Date: Tue, 08 Nov 2011 00:25:57 -0000 (GMT) Subject: [R] Sampling with conditions In-Reply-To: <1320704574008-4014036.post@n4.nabble.com> Message-ID: On 07-Nov-11 22:22:54, SarahJoyes wrote: > Hey everyone, > I am at best, an amateur user of R, but I am stuck on how > to set-up the following situation. > I am trying to select a random sample of numbers from 0 to 10 > and insert them into the first column of a matrix (which will > used later in a loop). > However, I need to have those numbers add up to 10. How can > I set those conditions? > So far I have: > n<-matrix(0,nr=5,ncol=10) > for(i in 1:10){n[i,1]<-sample(0:10,1)} > How do I set-up the "BUT sum(n[i,1])=10"? > Thanks > SarahJ Sarah, your example is confusing because you have set up a matrix 'n' with 5 rows and 10 columns. But your loop cycles through 10 rows! However, assuming that your basic requirement is to sample 10 integers which add up to 10, consider rmultinom(): rmultinom(n=1,size=10,prob=(1:10)/10) # [,1] # [1,] 1 # [2,] 0 # [3,] 2 # [4,] 0 # [5,] 1 # [6,] 1 # [7,] 2 # [8,] 0 # [9,] 1 #[10,] 2 rmultinom(n=1,size=10,prob=(1:10)/10) # [,1] # [1,] 0 # [2,] 0 # [3,] 0 # [4,] 0 # [5,] 1 # [6,] 1 # [7,] 2 # [8,] 1 # [9,] 2 #[10,] 3 This gives each integer in (0:10) equal chances of being in the sample. For unequal chances, vary 'prob'. Hoping this helps, Ted. 
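To tie the rmultinom() suggestion back to the 5-row matrix in the original question, here is a minimal sketch. It assumes that five counts are wanted (one per row) and that equal cell probabilities are acceptable; both are assumptions, and prob can be changed for unequal chances.

```r
set.seed(1)  # arbitrary seed, only so the sketch is repeatable

n <- matrix(0, nrow = 5, ncol = 10)

# One multinomial draw: 10 units spread over 5 cells, so the five
# counts are non-negative integers that always sum to 10
n[, 1] <- rmultinom(n = 1, size = 10, prob = rep(1, 5))

n[, 1]
sum(n[, 1])  # 10 by construction
```

Unlike the rejection loop, this needs no retries, at the cost of fixing the distribution of the counts to a multinomial.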
--------------------------------------------------------------------
E-Mail: (Ted Harding)
Fax-to-email: +44 (0)870 094 0861
Date: 08-Nov-11 Time: 00:25:54
------------------------------ XFMail ------------------------------

From arrayprofile at yahoo.com Tue Nov 8 01:33:14 2011
From: arrayprofile at yahoo.com (array chip)
Date: Mon, 7 Nov 2011 16:33:14 -0800 (PST)
Subject: [R] why NA coefficients
Message-ID: <1320712394.45057.YahooMailNeo@web125809.mail.ne1.yahoo.com>

Hi, I am trying to run ANOVA with an interaction term on 2 factors (treat has 7 levels, group has 2 levels). I found the coefficient for the last interaction term is always NA, see attached dataset and the code below:

> test<-read.table("test.txt",sep='\t',header=T,row.names=NULL)
> lm(y~factor(treat)*factor(group),test)

Call:
lm(formula = y ~ factor(treat) * factor(group), data = test)

Coefficients:
                  (Intercept)                 factor(treat)2                 factor(treat)3
                     0.429244                       0.499982                       0.352971
               factor(treat)4                 factor(treat)5                 factor(treat)6
                    -0.204752                       0.142042                       0.044155
               factor(treat)7                 factor(group)2  factor(treat)2:factor(group)2
                    -0.007775                      -0.337907                      -0.208734
factor(treat)3:factor(group)2  factor(treat)4:factor(group)2  factor(treat)5:factor(group)2
                    -0.195138                       0.800029                       0.227514
factor(treat)6:factor(group)2  factor(treat)7:factor(group)2
                     0.331548                             NA

I guess this is due to the model matrix being singular, or to collinearity among the matrix columns? But I can't figure out how the matrix is singular in this case. Can someone show me why this is the case?
Thanks

John
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: test.txt
URL:

From dwinsemius at comcast.net Tue Nov 8 02:13:16 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Mon, 7 Nov 2011 20:13:16 -0500
Subject: [R] why NA coefficients
In-Reply-To: <1320712394.45057.YahooMailNeo@web125809.mail.ne1.yahoo.com>
References: <1320712394.45057.YahooMailNeo@web125809.mail.ne1.yahoo.com>
Message-ID:

On Nov 7, 2011, at 7:33 PM, array chip wrote:

> Hi, I am trying to run ANOVA with an interaction term on 2 factors
> (treat has 7 levels, group has 2 levels). I found the coefficient
> for the last interaction term is always NA, see attached dataset and
> the code below:
>
>> test<-read.table("test.txt",sep='\t',header=T,row.names=NULL)
>> lm(y~factor(treat)*factor(group),test)
>
> Call:
> lm(formula = y ~ factor(treat) * factor(group), data = test)
>
> Coefficients:
>                   (Intercept)                 factor(treat)2                 factor(treat)3
>                      0.429244                       0.499982                       0.352971
>                factor(treat)4                 factor(treat)5                 factor(treat)6
>                     -0.204752                       0.142042                       0.044155
>                factor(treat)7                 factor(group)2  factor(treat)2:factor(group)2
>                     -0.007775                      -0.337907                      -0.208734
> factor(treat)3:factor(group)2  factor(treat)4:factor(group)2  factor(treat)5:factor(group)2
>                     -0.195138                       0.800029                       0.227514
> factor(treat)6:factor(group)2  factor(treat)7:factor(group)2
>                      0.331548                             NA
>
>
> I guess this is due to model matrix being singular or collinearity
> among the matrix columns? But I can't figure out how the matrix is
> singular in this case? Can someone show me why this is the case?

Because you have no cases in one of the crossed categories.

--
David Winsemius, MD
West Hartford, CT

From ericstrom at aol.com Tue Nov 8 02:08:02 2011
From: ericstrom at aol.com (eric)
Date: Mon, 7 Nov 2011 17:08:02 -0800 (PST)
Subject: [R] Warning message interpretation
Message-ID: <1320714482267-4014483.post@n4.nabble.com>

Using the rdatamarket package and getting a warning message.
What does this warning message tell me? What could I do to eliminate or address it?

require(rdatamarket)
Loading required package: rdatamarket
Loading required package: zoo
Warning message:
In assignInNamespace("as.Date.numeric", function(x, origin, ...) { :
  binding of 'as.Date.numeric' is locked and will not be changed

--
View this message in context: http://r.789695.n4.nabble.com/Warning-message-interpretation-tp4014483p4014483.html
Sent from the R help mailing list archive at Nabble.com.

From juliosergio at gmail.com Tue Nov 8 01:48:47 2011
From: juliosergio at gmail.com (JulioSergio)
Date: Tue, 8 Nov 2011 00:48:47 +0000
Subject: [R] Intervals in function cut
Message-ID:

When I was studying the function cut I found this example:

> x <- rep(0:8, tx0)
> x
 [1] 0 0 0 0 0 0 0 0 0 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 4 4 4 5 5 5 5 5 5 5 5 5 5 6
[39] 6 6 6 6 7 7 7 8 8 8 8 8
> cut(x, b = 8)
 [1] (-0.008,0.994] (-0.008,0.994] (-0.008,0.994] (-0.008,0.994] (-0.008,0.994]
 [6] (-0.008,0.994] (-0.008,0.994] (-0.008,0.994] (-0.008,0.994] (0.994,2]
[11] (0.994,2]      (0.994,2]      (0.994,2]      (2,3]          (2,3]
[16] (2,3]          (2,3]          (2,3]          (2,3]          (3,4]
[21] (3,4]          (3,4]          (3,4]          (3,4]          (4,5]
[26] (4,5]