From Wittner.Ben at mgh.harvard.edu Tue Mar 1 00:04:18 2011
From: Wittner.Ben at mgh.harvard.edu (Wittner, Ben, Ph.D.)
Date: Mon, 28 Feb 2011 18:04:18 -0500
Subject: [R] Gamma mixture models with flexmix
Message-ID: <2A462CD218D03249B384CABF3E6F6DE50719CADC@PHSXMB14.partners.org>

I've been trying with no success to model mixtures of Gamma distributions
using the package flexmix (see examples below). Can anyone help me get it
to model better?

Thanks very much.

-Ben

##
## Please help me get flexmix to correctly model mixtures of
## Gamma distributions. See examples below.
##

library('flexmix')

##
## Plot a histogram of dat and the Gamma mixture model given by
## shapes, rates and pis that is intended as a model of the
## distribution from which dat was drawn.
##
plotGammaMixture <- function(dat, shapes, rates, pis) {
  KK <- length(pis)
  stopifnot(KK == length(shapes))
  stopifnot(KK == length(rates))
  ho <- hist(dat, plot=FALSE)
  x <- seq(ho$breaks[1], ho$breaks[length(ho$breaks)], length.out=1000)
  y <- list()
  for (ii in 1:KK) {
    y[[ii]] <- pis[ii]*dgamma(x, shape=shapes[ii], rate=rates[ii])
  }
  uy <- unlist(y)
  ylim <- if (any(uy == Inf)) {
    c(0, 2*max(ho$intensities))
  } else {
    c(0, max(c(ho$intensities, uy)))
  }
  plot(ho, col='lightgray', freq=FALSE, ylim=ylim,
       main=paste(KK, 'component(s) in model\nmodel prior(s) =',
                  paste(round(pis, 2), collapse=', ')))
  cols <- rainbow(KK)
  for (ii in 1:KK) {
    lines(x, y[[ii]], col=cols[ii])
  }
}

##
## Model dat as a mixture of Gammas then plot.
##
modelGammas <- function(dat, which='BIC') {
  set.seed(939458)
  fmo <- stepFlexmix(dat ~ 1, k=1:3, model=FLXMRglm(family='Gamma'))
  mdl <- getModel(fmo, which=which)
  print(smry <- summary(mdl))
  print(prm <- parameters(mdl))
  plotGammaMixture(dat, prm['shape', ],
                   prm['shape', ]*prm['coef.(Intercept)', ],
                   smry@comptab[, 'prior'])
}

##
## Works well for a single Gamma distribution.
##
set.seed(78483)
dat1 <- rgamma(6000, shape=2, rate=0.5)
modelGammas(dat1)

set.seed(78483)
dat2 <- rgamma(6000, shape=5, rate=0.3)
modelGammas(dat2)

##
## Please help me get it to work for mixtures of
## two Gamma distributions.
##
set.seed(78483)
dat3 <- c(rgamma(6000, shape=3, rate=.5), rgamma(4000, shape=5, rate=.1))
modelGammas(dat3)

## Even telling it that there are two components
## does not help. I get two nearly identical distributions,
modelGammas(dat3, which=2)

## whereas what I want to see is something like this.
plotGammaMixture(dat3, shapes=c(3, 5), rates=c(.5, .1), pis=c(.6, .4))

###############################
Here's the output of sessionInfo()

R version 2.12.1 (2010-12-16)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

The information in this e-mail is intended only for the ...{{dropped:11}}

From carson.farmer at gmail.com Tue Mar 1 00:15:09 2011
From: carson.farmer at gmail.com (Carson Farmer)
Date: Mon, 28 Feb 2011 23:15:09 +0000
Subject: [R] mixture models/latent class regression comparison
In-Reply-To:
References:
Message-ID:

Thanks for the reply Christian,

> I have never used mmlcr for this, but quite generally when fitting such
> models, the likelihood has often very many local optima. This means that the
> result of the EM (or a similar) algorithm depends on the initialisation,
> which in flexmix (and perhaps also in mmlcr) is done in a random fashion.
> This means that results may differ even if the same method is applied twice,
> and unfortunately, depending on the dataset, the result may be quite
> unstable.
> This may explain that the two functions give you strongly
> different results, not of course implying that one of them is generally
> better.

I thought this might be the issue here. So is the solution to simply run it
many times and look for the best likelihood? I am still confused as to
why the mmlcr function consistently produces 'poorer' results. Perhaps
the flexmix package is being a bit more 'clever' in terms of avoiding
local optima? I will have to dig a bit deeper.

Cheers,

Carson

>> Dear list,
>>
>> I have been comparing the outputs of two packages for latent class
>> regression, namely 'flexmix', and 'mmlcr'. What I have noticed is that
>> the flexmix package appears to come up with a much better fit than the
>> mmlcr package (based on logLik, AIC, BIC, and visual inspection). Has
>> anyone else observed such behaviour? Has anyone else been successful
>> in using the mmlcr package? I ask because I am interested in latent
>> class negative binomial regression, which the mmlcr package appears to
>> support, however, the results for basic Poisson latent class
>> regression appear to be inferior to the results from flexmix. Below is
>> a simple reproducible example to illustrate the comparison:
>>
>> library(flexmix)
>> library(mmlcr)
>> data(NPreg) # from package flexmix
>> m1 <- flexmix(yp ~ x, k=2, data=NPreg, model=FLXMRglm(family='poisson'))
>> NPreg$id <- 1:200 # mmlcr requires an id column
>> m2 <- mmlcr(outer=~1|id, components=list(list(formula=yp~x,
>> class="poisonce")), data=NPreg, n.groups=2)
>>
>> # summary and coefficients for flexmix model
>> summary(m1)
>> summary(refit(m1))
>>
>> # summary and coefficients for mmlcr model
>> summary(m2)
>> m2
>>
>> Regards,
>>
>> Carson
>>
>> P.S. I have attached a copy of the mmlcr package with a modified
>> mmlcr.poisonce function due to errors in the version available here:
>> http://cran.r-project.org/src/contrib/Archive/mmlcr/.
>> See also
>> http://jeldi.com/Members/jthacher/tips-and-tricks/programs/r/mmlcr
>> section "Bugs?" subsection "Poisson".
>>
>> --
>> Carson J. Q. Farmer
>> ISSP Doctoral Fellow
>> National Centre for Geocomputation
>> National University of Ireland, Maynooth,
>> http://www.carsonfarmer.com/
>>
>
> *** --- ***
> Christian Hennig
> University College London, Department of Statistical Science
> Gower St., London WC1E 6BT, phone +44 207 679 1698
> chrish at stats.ucl.ac.uk, www.homepages.ucl.ac.uk/~ucakche
>

--
Carson J. Q. Farmer
ISSP Doctoral Fellow
National Centre for Geocomputation
National University of Ireland, Maynooth,
http://www.carsonfarmer.com/

From bbolker at gmail.com Tue Mar 1 00:23:55 2011
From: bbolker at gmail.com (Ben Bolker)
Date: Mon, 28 Feb 2011 16:23:55 -0700
Subject: [R] Measuring correlations in repeated measures data
In-Reply-To: <755bd4eb-5bb3-4792-e247-ecbe876f2459@me.com>
References: <755bd4eb-5bb3-4792-e247-ecbe876f2459@me.com>
Message-ID: <4D6C2E8B.5080503@gmail.com>

On 11-02-28 11:59 AM, Brant Inman wrote:
> Ben,
>
> Thanks for the response. Your method generates an answer that is
> slightly different than what I was looking for. In the Orthodont
> dataset there are 4 age groups (8, 10, 12, 14). I would like to
> calculate the correlation of "distance" for all combinations of the
> categorical variable "age". The anticipated output would therefore be a
> matrix with 4 columns and 4 rows and a diagonal of ones.
>
> For example, in such a table I would be able to look at the mean within
> individual correlation coefficient for distance b/t ages 8 and 10 or,
> alternatively, ages 10 and 14. Is there a function in nlme or lme4 that
> does this?

Given the model below,

  fit2$modelStruct$corStruct

produces

Correlation structure of class corSymm representing
 Correlation:
  1      2      3
2 -0.099
3  0.021 -0.242
4 -0.298  0.184  0.262

(this is also shown at the end of summary(fit2))

This is the lower triangle of the (symmetric) correlation matrix; the
diagonal is 1 by definition.

Isn't that what you're looking for? (Sorry if I'm misunderstanding.)

  Ben

>
> Brant
>
> On Feb 28, 2011, at 02:24 AM, Ben Bolker wrote:
>
>> Brant Inman mac.com > writes:
>>
>> >
>> > R-helpers:
>> >
>> > I would like to measure the correlation coefficient between the repeated
>> > measures of a single variable that is measured over time and is
>> > unbalanced. As an example, consider the Orthodont dataset from package
>> > nlme, where the model is:
>> >
>> > fit <- lmer(distance ~ age + (1 | Subject), data=Orthodont)
>> >
>> > I would like to measure the correlation b/t the variable "distance" at
>> > different ages such that I would have a matrix of correlation
>> > coefficients like the following:
>> >
>> >        age08 age09 age10 age11 age12 age13 age14
>> > age08    1
>> > age09          1
>> > age10                1
>> > age11                      1
>> > age12                            1
>> > age13                                  1
>> > age14                                        1
>> >
>> > The idea would be to demonstrate that the correlations b/t
>> > repeated measures of the variable "distance"
>> > decrease as the time b/t measures increases. For example,
>> > one might expect the correlation
>> > coefficient b/t age08 and age09 to be higher than that
>> > between age08 and age14.
>> >
>>
>> This stuff is not currently possible in lmer/lme4 but is
>> easy in nlme:
>>
>> library(nlme)
>> Orthodont$age0 <- Orthodont$age/2-3
>> ## later code requires a time index of consecutive integers
>> ## (which apparently must also start at 1, although not stated)
>>
>> fit <- lme(distance~age,random=~1|Subject,data=Orthodont)
>>
>> ## compute autocorrelation on the basis of lag only, plot
>> a <- ACF(fit)
>> plot(a,alpha=0.05)
>>
>>
>> fit2 <- update(fit, correlation=corSymm(form=~age0|Subject))
>> fit3 <- update(fit, correlation=corAR1(form=~age0|Subject))
>>
>> AIC(fit,fit2,fit3)
>> ## at least on the basis of AIC, this extra complexity is
>> ## not warranted
>>
>> anova(fit,fit2) ## likelihood ratio test
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

From jsorkin at grecc.umaryland.edu Tue Mar 1 00:44:48 2011
From: jsorkin at grecc.umaryland.edu (John Sorkin)
Date: Mon, 28 Feb 2011 18:44:48 -0500
Subject: [R] lme error: Error in getGroups.data.frame(dataMix, groups)
In-Reply-To:
References: <34de8d3c-b6fb-c257-90e5-a06fbe1e8e5e@me.com> <4D6B8A35020000CB000817CE@medicine.umaryland.edu>
Message-ID: <4D6BED20020000CB00081895@medicine.umaryland.edu>

Dennis,
Thank you,
John

John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

>>> Dennis Murphy 2/28/2011 4:08 PM >>>
Hi:

On Mon, Feb 28, 2011 at 8:42 AM, John Sorkin wrote:

> R 2.10.0
> Windows XP
>
> I am trying to run lme.
> I receive the following error message:
> My lme code is:
> fitRandom <- lme(values ~ factor(subject),
>                  data=withindata)
>

Where's the random factor? Perhaps you mean

  lme(values ~ 1, random = ~ 1 | subject, data = withindata)

or in lme4,

  lmer(values ~ 1 + (1 | subject), data = withindata)

HTH,
Dennis

PS: Use of dput() output is more helpful than copying from the console, as
it retains the same classes you have on your end.

> dput(withindata)
structure(list(subject = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L),
values = c(2.3199639, -8.5795802, -4.1901241, 0.4588128, 16.9128232,
8.9856358, 1.9303254, -1.4320313, -15.4225123, 5.9293529, -29.2014153,
-8.9684986, -11.906217, 13.2133887, 1.2491941, -8.0613768, -5.6340179,
3.1916857, -7.7447932, 2.2316354, 0.6444938, 4.6912677, 20.9135073,
2.1433533, -0.8057022, -13.0187979, 8.9634065, 13.4815344, 4.6148061,
-18.4781373, 15.5263564, -2.1993412, 5.183026, 16.2311097, -2.5781897,
-3.016729, -0.1119353, 1.1983126, -8.8212143, 3.8895263)),
.Names = c("subject", "values"), class = "data.frame",
row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10",
"11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21",
"22", "23", "24", "25", "26", "27", "28", "29", "30", "31", "32",
"33", "34", "35", "36", "37", "38", "39", "40"))

> Below I have printed the console output, and at the bottom of this message,
> I have printed my code.
> I hope someone can tell me what I am doing wrong.
> Thank you,
> John
>
> > print(withindata)
>    subject      values
> 1        1   2.3199639
> 2        1  -8.5795802
> 3        1  -4.1901241
> 4        1   0.4588128
> 5        1  16.9128232
> 6        1   8.9856358
> 7        1   1.9303254
> 8        1  -1.4320313
> 9        1 -15.4225123
> 10       1   5.9293529
> 11       2 -29.2014153
> 12       2  -8.9684986
> 13       2 -11.9062170
> 14       2  13.2133887
> 15       2   1.2491941
> 16       2  -8.0613768
> 17       2  -5.6340179
> 18       2   3.1916857
> 19       2  -7.7447932
> 20       2   2.2316354
> 21       3   0.6444938
> 22       3   4.6912677
> 23       3  20.9135073
> 24       3   2.1433533
> 25       3  -0.8057022
> 26       3 -13.0187979
> 27       3   8.9634065
> 28       3  13.4815344
> 29       3   4.6148061
> 30       3 -18.4781373
> 31       4  15.5263564
> 32       4  -2.1993412
> 33       4   5.1830260
> 34       4  16.2311097
> 35       4  -2.5781897
> 36       4  -3.0167290
> 37       4  -0.1119353
> 38       4   1.1983126
> 39       4  -8.8212143
> 40       4   3.8895263
> > fitRandom <- lme(values ~ factor(subject),
> +                  data=withindata)
> Error in getGroups.data.frame(dataMix, groups) :
>   Invalid formula for groups
> > summary(fitRandom)
> Error in summary(fitRandom) : object 'fitRandom' not found
> >
>
>
> My code:
>
> library(nlme)
>
> # Define essential constants.
> # Number of subjects studied.
> NSubs <- 4
> # Number of observations per subject.
> NObs <- 10
> # Between study SD
> tau <- 4
> # Within study SD.
> sigma <- 8
> # END Define essential constants.
>
> # Define between subject variation
> between <- matrix(nrow=10,ncol=1)
> between <- rnorm(NSubs,0,tau)
> between
> # END Define between subject variation.
>
> # Define within subject variation.
> within <- matrix(nrow=NObs*NSubs,ncol=2)
> for (subject in 1:NSubs) {
>   # Create a variable defining subject.
>   within[c(1:NObs)+((subject-1)*NObs),1] <- subject
>   # Create within subject variation.
>   within[c(1:NObs)+((subject-1)*NObs),2] <-
>     rnorm(NObs,between[subject],sigma)
> }
> # END Define within subject variation.
>
> # Create a dataframe to hold values.
> withindata <- data.frame(subject=within[,1],values=within[,2])
> print(withindata[1:4,])
>
> print(withindata)
> fitRandom <- lme(values ~ subject,
>                  data=withindata)
> summary(fitRandom)
>
>
>
> John David Sorkin M.D., Ph.D.
> Chief, Biostatistics and Informatics
> University of Maryland School of Medicine Division of Gerontology
> Baltimore VA Medical Center
> 10 North Greene Street
> GRECC (BT/18/GR)
> Baltimore, MD 21201-1524
> (Phone) 410-605-7119
> (Fax) 410-605-7913 (Please call phone number above prior to faxing)
>
> Confidentiality Statement:
> This email message, including any attachments, is for ...{{dropped:17}}

From seth at userprimary.net Tue Mar 1 00:48:40 2011
From: seth at userprimary.net (Seth Falcon)
Date: Mon, 28 Feb 2011 15:48:40 -0800
Subject: [R] Data type problem when extract data from SQLite to R by using RSQLite
In-Reply-To:
References:
Message-ID:

Hi Jia,

On Mon, Feb 28, 2011 at 12:37 PM, chen jia wrote:
> When I extract data from SQLite to R, the data types (or modes) of the
> extracted data seems to be determined by the value of the first row.
> Please see the following example.

It would help to provide the output of sessionInfo() as well as the
schema definition for the table in SQLite (or at least description of
how it was created).

Here's an example that works as you'd like:

    > library(RSQLite)
    > db = dbConnect(SQLite(), dbname = ":memory:")
    > dbGetQuery(db, "create table t (a int, b real, c text)")
    > df = data.frame(a=c(NA, 1L, 2L), b=c(NA, 1.1, 2.2), c=c(NA, "x", "y"),stringsAsFactors=FALSE)
    > df
       a   b    c
    1 NA  NA <NA>
    2  1 1.1    x
    3  2 2.2    y
    > dbGetPreparedQuery(db, "insert into t values (?, ?, ?)", df)
    > dbGetQuery(db, "select * from t")
       a   b    c
    1 NA  NA <NA>
    2  1 1.1    x
    3  2 2.2    y
    > sapply(dbGetQuery(db, "select * from t"), typeof)
              a           b           c
      "integer"    "double" "character"
    > sapply(dbGetQuery(db, "select * from t limit 1"), typeof)
              a           b           c
      "integer"    "double" "character"
    > sapply(dbGetQuery(db, "select a from t limit 1"), typeof)
            a
    "integer"
    > sapply(dbGetQuery(db, "select a from t limit 2"), typeof)
            a
    "integer"
    > sapply(dbGetQuery(db, "select a from t limit 1"), typeof)
            a
    "integer"

> sessionInfo()
R version 2.11.1 (2010-05-31)
x86_64-apple-darwin9.8.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base

other attached packages:
[1] RSQLite_0.9-4 DBI_0.2-5

loaded via a namespace (and not attached):
[1] tools_2.11.1

--
Seth Falcon | @sfalcon | http://userprimary.net/

From shawjw at gmail.com Tue Mar 1 00:50:53 2011
From: shawjw at gmail.com (James Shaw)
Date: Mon, 28 Feb 2011 17:50:53 -0600
Subject: [R] Robust variance estimation with rq (failure of the bootstrap?)
Message-ID:

I am fitting quantile regression models using data collected from a
sample of 124 patients. When modeling cross-sectional associations, I
have noticed that nonparametric bootstrap estimates of the variances
of parameter estimates are much greater in magnitude than the
empirical Huber estimates derived using summary.rq's "nid" option.
The outcome variable is severely skewed, and I am afraid that this may
be affecting the consistency of the bootstrap variance estimates. I
have read that the m out of n bootstrap can be used to overcome this
problem.
However, this procedure requires both the original sample
(n) and the subsample (m) sizes to be large. The version implemented
in rq.boot does not appear to provide any improvement over the naive
bootstrap. Ultimately, I am interested in using median regression to
model changes in the outcome variable over time. Summary.rq's robust
variance estimator is not applicable to repeated-measures data. I
question whether the block (cluster) bootstrap variance estimator,
which can accommodate intraclass correlation, would perform well. Can
anyone suggest alternatives for variance estimation in this situation?

Regards,

Jim


James W. Shaw, Ph.D., Pharm.D., M.P.H.
Assistant Professor
Department of Pharmacy Administration
College of Pharmacy
University of Illinois at Chicago
833 South Wood Street, M/C 871, Room 266
Chicago, IL 60612
Tel.: 312-355-5666
Fax: 312-996-0868
Mobile Tel.: 215-852-3045

From jonathan.m.dubois at gmail.com Tue Mar 1 01:14:32 2011
From: jonathan.m.dubois at gmail.com (Jonathan DuBois)
Date: Mon, 28 Feb 2011 19:14:32 -0500
Subject: [R] regression with categorical nuisance variable
Message-ID:

Hi,

I am new to R, so I am unsure of the formula to set up this analysis.
I would like to run a linear model with a continuous dependent
variable (brain volume) and a continuous independent variable (age)
while controlling for a categorical nuisance variable (gender).

Age and brain volume are correlated.
There are no gender differences in age but there are significant
gender differences in brain volume.
Therefore, I would like to control for gender when assessing the
association between brain volume and age.

Any help would be very much appreciated.

Jon

From izahn at psych.rochester.edu Tue Mar 1 01:31:17 2011
From: izahn at psych.rochester.edu (Ista Zahn)
Date: Mon, 28 Feb 2011 19:31:17 -0500
Subject: [R] regression with categorical nuisance variable
In-Reply-To:
References:
Message-ID:

Hi Jon,
Just enter it as a predictor in the model.
You almost can't go wrong
with this one. Usually I would caution you to convert your categorical
variables to factors and make sure the contrasts are set how you want
them, but in this case it doesn't matter because there are (I assume)
only two levels of gender, and you don't really care about
interpreting the coefficient anyway.

Best,
Ista

On Mon, Feb 28, 2011 at 7:14 PM, Jonathan DuBois wrote:
> Hi,
>
> I am new to R, so I am unsure of the formula to set up this analysis.
> I would like to run a linear model with a continuous dependent
> variable (brain volume) and a continuous independent variable (age)
> while controlling for a categorical nuisance variable (gender).
>
> Age and brain volume are correlated.
> There are no gender differences in age but there are significant
> gender differences in brain volume.
> Therefore, I would like to control for gender when assessing the
> association between brain volume and age.
>
> Any help would be very much appreciated.
>
> Jon
>
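
A minimal sketch of what "enter it as a predictor" might look like. The
data are simulated and the names brain, volume, age and gender are made up
for illustration only; they are not from Jon's data.

```r
## Hypothetical data: all names and numbers below are illustrative only
set.seed(1)
n <- 100
brain <- data.frame(age    = runif(n, 20, 80),
                    gender = factor(sample(c("F", "M"), n, replace = TRUE)))
brain$volume <- 1300 - 2 * brain$age + 100 * (brain$gender == "M") +
  rnorm(n, sd = 30)

## gender enters as an additional predictor, so the age coefficient is
## the age/volume association adjusted for gender
fit <- lm(volume ~ age + gender, data = brain)
summary(fit)$coefficients
```

The row labelled age in the coefficient table is then the slope of interest;
the gender row can be ignored if gender is purely a nuisance variable.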
>

--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org

From ehlers at ucalgary.ca Tue Mar 1 01:31:43 2011
From: ehlers at ucalgary.ca (Peter Ehlers)
Date: Mon, 28 Feb 2011 16:31:43 -0800
Subject: [R] nls not solving
In-Reply-To: <1298931261660-3328862.post@n4.nabble.com>
References: <1298924582492-3328647.post@n4.nabble.com> <1298931261660-3328862.post@n4.nabble.com>
Message-ID: <4D6C3E6F.8030101@ucalgary.ca>

On 2011-02-28 14:14, Schatzi wrote:
> I am not sure how you simplified the model to:
> y = a + b(1 - exp(kl)) - b exp(-kx)
>
> I tried simplifying it but only got to:
> y = a + b - b * exp(kl) * exp(-kx)
>
> I agree that the model must not be identifiable. That makes sense,
> especially given that removing either a or l makes the model work. Can you
> please further explain the math though as I am not understanding it? I do
> not see how you obtained your equation and when I tried to solve using your
> equation I got quite different numbers. Thank you.

You can obviously write your function as

  f <- function(x, A, B, K) {A - B * exp(-K * x)}

i.e. in terms of *3* parameters. In that form, it's
apple pie for nls().

  fm <- nls(y ~ f(x, A, B, K),
            start = list(A = 50, B = 60, K = 1))
  coef(fm)
  xx <- seq(0, 72, length = 101)
  yy <- predict(fm, newdata = list(x = xx))
  plot(x, y)
  lines(xx, yy, col = "red")

Peter Ehlers

From laomeng.3 at gmail.com Tue Mar 1 01:52:06 2011
From: laomeng.3 at gmail.com (Lao Meng)
Date: Tue, 1 Mar 2011 08:52:06 +0800
Subject: [R] regression with categorical nuisance variable
In-Reply-To:
References:
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From arrayprofile at yahoo.com Tue Mar 1 02:02:54 2011
From: arrayprofile at yahoo.com (array chip)
Date: Mon, 28 Feb 2011 17:02:54 -0800 (PST)
Subject: [R] nested case-control study
In-Reply-To: <1298905163.15393.13.camel@punchbuggy>
References: <1298905163.15393.13.camel@punchbuggy>
Message-ID: <253166.80532.qm@web56305.mail.re3.yahoo.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From rosyaraur at gmail.com Tue Mar 1 01:58:13 2011
From: rosyaraur at gmail.com (Umesh Rosyara)
Date: Mon, 28 Feb 2011 19:58:13 -0500
Subject: [R] stuk at another point: simple question
References:
Message-ID: <357B5A86D35642DF950B091F25CE333F@OwnerPC>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From ufukbeyaztas at gmail.com Tue Mar 1 01:39:13 2011
From: ufukbeyaztas at gmail.com (ufuk beyaztas)
Date: Mon, 28 Feb 2011 16:39:13 -0800 (PST)
Subject: [R] selection of a subset from a loop
Message-ID: <1298939953082-3329057.post@n4.nabble.com>

Hi dear all,

My code is like this:

e <- rnorm(n=50, mean=0, sd=sqrt(0.5625))
x0 <- c(rep(1,50))
x1 <- rnorm(n=50,mean=2,sd=1)
x2 <- rnorm(n=50,mean=2,sd=1)
x3 <- rnorm(n=50,mean=2,sd=1)
x4 <- rnorm(n=50,mean=2,sd=1)
y <- 1+ 2*x1+4*x2+3*x3+2*x4+e
x2[1] = 10 #influential observation
y[1] = 10 #influential observation
X <- matrix(c(x0,x1,x2,x3,x4),ncol=5)
Y <- matrix(y,ncol=1)
Design.data <- cbind(X, Y)

result <- list ()
for( i in 1: 3100) {
  data <- Design.data[sample(50,50,replace=TRUE),]
  dataX <- data[,1:5]
  dataY <- data[,6]
  B.cap.simulation <- solve(crossprod(dataX)) %*% crossprod(dataX, dataY)
  P.simulation <- dataX %*% solve(crossprod(dataX)) %*% t(dataX)
  Y.cap.simulation <- P.simulation %*% dataY
  e.simulation <- dataY - Y.cap.simulation
  dX.simulation <- nrow(dataX) - ncol(dataX)
  var.cap.simulation <- crossprod(e.simulation) / (dX.simulation)
  ei.simulation <- as.vector(dataY - dataX %*% B.cap.simulation)
  pi.simulation <- diag(P.simulation)
  var.cap.i.simulation <- (((dX.simulation) * var.cap.simulation)/(dX.simulation - 1)) -
    (ei.simulation^2/((dX.simulation - 1) * (1 - pi.simulation)))
  ti.simulation <- ei.simulation / sqrt(var.cap.simulation * (1 - pi.simulation))
  ti.star.simulation <- ei.simulation / sqrt(var.cap.i.simulation * (1 - pi.simulation))
  pi.star.simulation <- pi.simulation + ei.simulation^2 / crossprod(e.simulation)
  WKi.simulation <- (ti.star.simulation)*sqrt(pi.simulation/(1-pi.simulation))
  result<- c(result,list(WKi.simulation))
}

Finally I get the result, which contains 3100 WKi.simulation vectors. I am
trying to pick out the subset of these whose bootstrap samples do not contain
Y[1,], that is, the point with value 10. Can anyone help me with how to do
this?

Thanks for any idea...

--
View this message in context: http://r.789695.n4.nabble.com/selection-of-a-subset-from-a-loop-tp3329057p3329057.html
Sent from the R help mailing list archive at Nabble.com.

From matt at biostatmatt.com Tue Mar 1 03:59:45 2011
From: matt at biostatmatt.com (Matt Shotwell)
Date: Mon, 28 Feb 2011 21:59:45 -0500
Subject: [R] Robust variance estimation with rq (failure of the bootstrap?)
In-Reply-To:
References:
Message-ID: <1298948385.1668.106.camel@matt-laptop>

Jim,

If repeated measurements on patients are correlated, then resampling all
measurements independently induces an incorrect sampling distribution
(=> incorrect variance) on a statistic of these data. One solution, as
you mention, is the block or cluster bootstrap, which preserves the
correlation among repeated observations in resamples. I don't
immediately see why the cluster bootstrap is unsuitable.

Beyond this, I would be concerned about *any* variance estimates that
are blind to correlated observations.

The bootstrap variance estimate may be larger than the asymptotic
variance estimate, but that alone isn't evidence to favor one over the
other.

Also, I can't justify (to myself) why skew would hamper the quality of
bootstrap variance estimates.
I wonder how it affects the sandwich
variance estimate...

Best,
Matt

On Mon, 2011-02-28 at 17:50 -0600, James Shaw wrote:
> I am fitting quantile regression models using data collected from a
> sample of 124 patients. When modeling cross-sectional associations, I
> have noticed that nonparametric bootstrap estimates of the variances
> of parameter estimates are much greater in magnitude than the
> empirical Huber estimates derived using summary.rq's "nid" option.
> The outcome variable is severely skewed, and I am afraid that this may
> be affecting the consistency of the bootstrap variance estimates. I
> have read that the m out of n bootstrap can be used to overcome this
> problem. However, this procedure requires both the original sample
> (n) and the subsample (m) sizes to be large. The version implemented
> in rq.boot does not appear to provide any improvement over the naive
> bootstrap. Ultimately, I am interested in using median regression to
> model changes in the outcome variable over time. Summary.rq's robust
> variance estimator is not applicable to repeated-measures data. I
> question whether the block (cluster) bootstrap variance estimator,
> which can accommodate intraclass correlation, would perform well. Can
> anyone suggest alternatives for variance estimation in this situation?
> Regards,
>
> Jim
>
>
> James W. Shaw, Ph.D., Pharm.D., M.P.H.
> Assistant Professor
> Department of Pharmacy Administration
> College of Pharmacy
> University of Illinois at Chicago
> 833 South Wood Street, M/C 871, Room 266
> Chicago, IL 60612
> Tel.: 312-355-5666
> Fax: 312-996-0868
> Mobile Tel.: 215-852-3045
>
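
For what it's worth, the cluster bootstrap mentioned in the reply above
could be sketched roughly as follows. This is only an illustration under
assumptions: the data are simulated, and cluster_boot_se, fit_fun and the
column names id, time and y are made up here rather than taken from Jim's
analysis. Whole patients are resampled with replacement, so each patient's
repeated measurements stay together in every resample.

```r
## Cluster (block) bootstrap sketch: resample patients, not single rows.
## 'cluster_boot_se' and its arguments are hypothetical helper names.
cluster_boot_se <- function(data, id, fit_fun, R = 200) {
  rows_by_id <- split(seq_len(nrow(data)), data[[id]])
  est <- lapply(seq_len(R), function(r) {
    ## draw patient ids with replacement, then collect all of their rows
    draw <- sample(names(rows_by_id), length(rows_by_id), replace = TRUE)
    idx  <- unlist(rows_by_id[draw], use.names = FALSE)
    fit_fun(data[idx, , drop = FALSE])
  })
  est <- do.call(rbind, est)   # R x p matrix of resampled estimates
  apply(est, 2, sd)            # bootstrap standard errors
}

## Simulated repeated measures: 30 patients, 4 visits, skewed outcome
set.seed(42)
d <- data.frame(id = rep(1:30, each = 4), time = rep(0:3, times = 30))
d$y <- exp(0.2 * d$time + rep(rnorm(30, sd = 0.5), each = 4) +
           rnorm(120, sd = 0.3))

## Median regression via quantreg::rq if the package is installed;
## lm() is used only as a stand-in so that the sketch still runs without it
fit_fun <- if (requireNamespace("quantreg", quietly = TRUE)) {
  function(dd) coef(quantreg::rq(y ~ time, tau = 0.5, data = dd))
} else {
  function(dd) coef(lm(y ~ time, data = dd))
}

se <- cluster_boot_se(d, "id", fit_fun, R = 200)
se
```

Comparing se with the result of resampling rows independently gives a rough
idea of how much the within-patient correlation matters for the variance.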

From chen_1002 at fisher.osu.edu Tue Mar 1 03:57:13 2011
From: chen_1002 at fisher.osu.edu (chen jia)
Date: Mon, 28 Feb 2011 21:57:13 -0500
Subject: [R] Data type problem when extract data from SQLite to R by using RSQLite
In-Reply-To:
References:
Message-ID:

Hi Seth,

Thanks for the reply. Below are the sessionInfo() output and the schema
information that you asked for. Please take a look.

The output from sessionInfo() is

> sessionInfo()
R version 2.12.2 (2011-02-25)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] plyr_1.2.1     RSQLite_0.9-2  DBI_0.2-5      filehash_2.1-1

The .schema of table annual_data3 is

sqlite> .schema annual_data3
CREATE TABLE "annual_data3"(
  PERMNO INT,
  DATE INT,
  CUSIP TEXT,
  EXCHCD INT,
  SICCD INT,
  SHROUT INT,
  PRC REAL,
  RET REAL,
  ...
  pret_var,
  pRET_sd,
  nmret,
  pya_var,
  pya_sd,
  nya,
  pya_var_ebi,
  pya_sd_ebi,
  pya_var_ebit,
  pya_sd_ebit,
  pya_var_ebitda,
  pya_sd_ebitda,
  logage REAL,
  logasset REAL,
  ...
  loglead1stdaret,
  loglead2stdaret)

Table annual_data3 is created by joining tables annual_data2 and ya_vol.
The column pya_var is initially in ya_vol.

dbGetQuery(sql.industry,
           "create table annual_data3 as
            select a.*, b.pya_var, b.pya_sd, b.nya,
                   b.pya_var_ebi, b.pya_sd_ebi,
                   b.pya_var_ebit, b.pya_sd_ebit,
                   b.pya_var_ebitda, b.pya_sd_ebitda
            from annual_data2 as a
            left join ya_vol as b
            on a.permno = b.permno and a.year = b.year
            order by permno, year")

Table ya_vol is created by

dbGetQuery(sql.industry,
           "create table ya_vol as
            select PERMNO, year,
                   variance(ya) as pya_var, stdev(ya) as pya_sd,
                   count(*) as nya,
                   variance(ya_ebi) as pya_var_ebi,
                   stdev(ya_ebi) as pya_sd_ebi,
                   variance(ya_ebit) as pya_var_ebit,
                   stdev(ya_ebit) as pya_sd_ebit,
                   variance(ya_ebitda) as pya_var_ebitda,
                   stdev(ya_ebitda) as pya_sd_ebitda
            from past_ya
            where ya is not null
            group by PERMNO, year
            order by PERMNO, year")

The schema info of ya_vol is

sqlite> .schema ya_vol
CREATE TABLE ya_vol(
  PERMNO INT,
  year INT,
  pya_var,
  pya_sd,
  nya,
  pya_var_ebi,
  pya_sd_ebi,
  pya_var_ebit,
  pya_sd_ebit,
  pya_var_ebitda,
  pya_sd_ebitda
);
CREATE INDEX ya_vol_permno_year_idx on ya_vol (permno,year);

Interestingly, I find that the problem I reported does not occur for
columns declared REAL in the schema info. For example, the type of
column RET never changes no matter what the first observation is.

> str(dbGetQuery(sql.industry,
+                "select RET from annual_data3
+                 where RET is not null limit 5"))
'data.frame':   5 obs. of  1 variable:
 $ RET: num  -0.03354 -0.02113 0.03797 0.0013 -0.00678
>
> str(dbGetQuery(sql.industry,
+                "select RET from annual_data3
+                 where RET is null limit 5"))
'data.frame':   5 obs. of  1 variable:
 $ RET: num  NA NA NA NA NA
> sapply(dbGetQuery(sql.industry,
+                   "select RET from annual_data3
+                    where RET is null limit 5"),
+        typeof)
     RET
"double"
> sapply(dbGetQuery(sql.industry,
+                   "select RET from annual_data3
+                    where RET is not null limit 5"),
+        typeof)
     RET
"double"

I still don't know how to solve this problem for the variable pya_var;
please help. Thanks.
Best,
Jia

On Mon, Feb 28, 2011 at 6:48 PM, Seth Falcon wrote:
> Hi Jia,
>
> On Mon, Feb 28, 2011 at 12:37 PM, chen jia wrote:
>> When I extract data from SQLite to R, the data types (or modes) of the
>> extracted data seems to be determined by the value of the first row.
>> Please see the following example.
>
> It would help to provide the output of sessionInfo() as well as the
> schema definition for the table in SQLite (or at least description of
> how it was created).
>
> Here's an example that works as you'd like:
>
>    > library(RSQLite)
>    > db = dbConnect(SQLite(), dbname = ":memory:")
>    > dbGetQuery(db, "create table t (a int, b real, c text)")
>    > df = data.frame(a=c(NA, 1L, 2L), b=c(NA, 1.1, 2.2), c=c(NA, "x", "y"), stringsAsFactors=FALSE)
>    > df
>       a   b    c
>    1 NA  NA <NA>
>    2  1 1.1    x
>    3  2 2.2    y
>    > dbGetPreparedQuery(db, "insert into t values (?, ?, ?)", df)
>    > dbGetQuery(db, "select * from t")
>       a   b    c
>    1 NA  NA <NA>
>    2  1 1.1    x
>    3  2 2.2    y
>    > sapply(dbGetQuery(db, "select * from t"), typeof)
>              a           b           c
>      "integer"    "double" "character"
>    > sapply(dbGetQuery(db, "select * from t limit 1"), typeof)
>              a           b           c
>      "integer"    "double" "character"
>    > sapply(dbGetQuery(db, "select a from t limit 1"), typeof)
>            a
>    "integer"
>    > sapply(dbGetQuery(db, "select a from t limit 2"), typeof)
>            a
>    "integer"
>    > sapply(dbGetQuery(db, "select a from t limit 1"), typeof)
>            a
>    "integer"
>
>
> > sessionInfo()
> R version 2.11.1 (2010-05-31)
> x86_64-apple-darwin9.8.0
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats     graphics  grDevices datasets  utils     methods
base > > other attached packages: > [1] RSQLite_0.9-4 DBI_0.2-5 > > loaded via a namespace (and not attached): > [1] tools_2.11.1 > > > > > -- > Seth Falcon | @sfalcon | http://userprimary.net/ > -- 700 Fisher Hall 2100 Neil Ave. Columbus, Ohio? 43210 http://www.fisher.osu.edu/~chen_1002/ From violagirl470 at msn.com Tue Mar 1 03:17:15 2011 From: violagirl470 at msn.com (Laura Clasemann) Date: Tue, 1 Mar 2011 02:17:15 +0000 Subject: [R] Entering table with multiple columns & rows Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From izahn at psych.rochester.edu Tue Mar 1 04:22:01 2011 From: izahn at psych.rochester.edu (Ista Zahn) Date: Tue, 1 Mar 2011 03:22:01 +0000 Subject: [R] Entering table with multiple columns & rows In-Reply-To: References: Message-ID: Hi Laura, If you type diet you will see that it has 4 columns and 2 rows. Since it has only 2 rows you cannot give it rownames of length 4. Best, Ista On Tue, Mar 1, 2011 at 2:17 AM, Laura Clasemann wrote: > > Hi, > > I'm having difficulty with getting a table to show with > multiple rows and columns. Below is the commands that I've typed in and > errors that I am getting. Thank you. > > Laura > > > Table trying to enter: > > Diet: ? ? ? ? ? ? ? ? Binger-yes: ? ? ? ? ? Binger-No: ? ? ? ? ? ? 
?Total: > None 24 134 158 > Healthy 9 52 61 > Unhealthy 23 72 95 > Dangerous 12 15 27 > > > > > >> diet=matrix(c(24,134,9,52,23,72,12,15),ncol=4,byrow=TRUE) >> rownames(diet)=c("none", "healthy", "unhealthy", "dangerous") > Error in dimnames(x) <- dn : > ?length of 'dimnames' [1] not equal to array extent >> diet=matrix(c(24,134,9,52,23,72,12,15), ncol=4, byrow=4) >> rownanes(diet)=c("none", "healthy", "unhealthy", "dangerous") > Error in rownanes(diet) = c("none", "healthy", "unhealthy", "dangerous") : > ?could not find function "rownanes<-" >> rownames(diet)=c("none", "healthy", "unhealthy", "dangerous") > Error in dimnames(x) <- dn : > ?length of 'dimnames' [1] not equal to array extent >> diet=matrix(c(24,134,9,52,23,72,12,15), ncol=4, byrow=4) >> rownames(diet)=c("none", "healthy", "unhealthy", "dangerous") > Error in dimnames(x) <- dn : > ?length of 'dimnames' [1] not equal to array extent >> > > > ? ? ? ?[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Ista Zahn Graduate student University of Rochester Department of Clinical and Social Psychology http://yourpsyche.org From jorgeivanvelez at gmail.com Tue Mar 1 06:23:31 2011 From: jorgeivanvelez at gmail.com (Jorge Ivan Velez) Date: Tue, 1 Mar 2011 00:23:31 -0500 Subject: [R] Entering table with multiple columns & rows In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available
URL:

From seth at userprimary.net Tue Mar 1 07:16:42 2011
From: seth at userprimary.net (Seth Falcon)
Date: Mon, 28 Feb 2011 22:16:42 -0800
Subject: [R] Data type problem when extract data from SQLite to R by using RSQLite
In-Reply-To:
References:
Message-ID:

Hi Jia,

On Mon, Feb 28, 2011 at 6:57 PM, chen jia wrote:
> The .schema of table annual_data3 is
> sqlite> .schema annual_data3
> CREATE TABLE "annual_data3"(
>   PERMNO INT,
>   DATE INT,
>   CUSIP TEXT,
>   EXCHCD INT,
>   SICCD INT,
>   SHROUT INT,
>   PRC REAL,
>   RET REAL,
>   ...
>   pret_var,
>   pRET_sd,
>   nmret,
>   pya_var,
[snip]

Is there a reason that you've told SQLite the expected data type for only some of the columns?

> Interestingly, I find that the problem I reported does not occur for columns
> labeled real in the schema info. For example, the type of column RET
> never changes no matter what the first observation is.

Yes, that is expected and I think it is the solution to your problem: set up your schema so that all columns have a declared type. For some details on SQLite's type system see http://www.sqlite.org/datatype3.html.

RSQLite currently maps NA values to NULL in the database. When pulling data out of a SELECT query, RSQLite uses the sqlite3_column_type SQLite API to determine the data type and map it to an R type. If NULL is encountered, then the schema is inspected using sqlite3_column_decltype to attempt to obtain a type. If that fails, the data is mapped to a character vector at the R level. The type selection is done once, after the first row has been fetched.

To work around this you can:

- make sure your schema has well-defined types (which will help SQLite perform its operations);

- check whether the returned column has the expected type and convert if needed at the R level;

- remove NA/NULL values from the db or decide on a different way of encoding them (e.g., you might be able to use -1 in the db in some situations to indicate missing).
Your R code would then need to map these to proper NA. Hope that helps. + seth -- Seth Falcon | @sfalcon | http://userprimary.net/ From jeroenooms at gmail.com Tue Mar 1 05:17:28 2011 From: jeroenooms at gmail.com (Jeroen Ooms) Date: Mon, 28 Feb 2011 20:17:28 -0800 (PST) Subject: [R] getting attributes of list without the "names". Message-ID: <1298953048526-3329209.post@n4.nabble.com> I am trying to encode arbitrary S3 objects by recursively looping over the object and all its attributes. However, there is an unfortunate feature of the attributes() function that is causing trouble. From the manual for ?attributes: The names of a pairlist are not stored as attributes, but are reported as if they were (and can be set by the replacement method for attributes). Now because of this, my program ends up in infinite recursion, because it will try to encode attributes(attributes(attributes(attributes(list(foo=123)))) etc. I can't remove the 'names' attribute, because this will actually affect the list structure. And even when I do: attributes(attributes(obj)[names(attributes(obj)) != "names"]) This will keep giving me a named list. Is there any way I can get the attributes() of a list without it reporting the names of a list as attributes? I.e it should hold that: atr1 <- attributes(list(foo="bar")); atr2 <- attributes(list()); identical(atr1,atr2); -- View this message in context: http://r.789695.n4.nabble.com/getting-attributes-of-list-without-the-names-tp3329209p3329209.html Sent from the R help mailing list archive at Nabble.com. From kadodamball at hotmail.com Tue Mar 1 04:18:18 2011 From: kadodamball at hotmail.com (bwaxxlo) Date: Mon, 28 Feb 2011 19:18:18 -0800 (PST) Subject: [R] Simulation Message-ID: <1298949498526-3329173.post@n4.nabble.com> I tried looking for help but I couldn't locate the exact solution. I have data that has several variables. 
I want to do several sample simulations using only two of the variables (eg: say you have data between people and properties owned. You only want to check how many in the samples will come up with bicycles) to estimate probabilities and that sort of thing. Now, I can only do a simulation in terms of this code: sample(1:10, size = 15, replace = TRUE). I do not know how select specific variables only. I'll appreciate the help -- View this message in context: http://r.789695.n4.nabble.com/Simulation-tp3329173p3329173.html Sent from the R help mailing list archive at Nabble.com. From violagirl470 at msn.com Tue Mar 1 05:05:28 2011 From: violagirl470 at msn.com (Laura Clasemann) Date: Tue, 1 Mar 2011 04:05:28 +0000 Subject: [R] Help Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dao4free at gmail.com Tue Mar 1 07:24:09 2011 From: dao4free at gmail.com (Pavel Goldstein) Date: Tue, 1 Mar 2011 08:24:09 +0200 Subject: [R] Explained variance for ICA Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From ivan.calandra at uni-hamburg.de Tue Mar 1 09:45:20 2011 From: ivan.calandra at uni-hamburg.de (Ivan Calandra) Date: Tue, 01 Mar 2011 09:45:20 +0100 Subject: [R] Help In-Reply-To: References: Message-ID: <4D6CB220.9000105@uni-hamburg.de> Hi Laura, Have you read this documentation http://cran.r-project.org/doc/manuals/R-data.pdf ? If not, you should. Specifically, see read.table() and read.csv(). When you'll have this, you can look for functions that can import xls or xlsx spreadsheets directly, but you're not that far yet. There are also plenty of documentation for beginners that you should read, see for example http://www.burns-stat.com/pages/Tutor/hints_R_begin.html This one is always good too http://www.burns-stat.com/pages/Tutor/hints_R_begin.html Concerning your data, it already looks good (except for the ":"). R can work with Excel spreadsheets without problems. 
I think that if you read that, you will know how to do barplots too. If you then have problems using specific functions or doing a specific task, you can ask the list again, with a reproducible example, the code you tried, the error/warning you got, and the desired output if relevant. For this, read the posting guide if you haven't yet.

Good luck with the reading,
Ivan

On 3/1/2011 05:05, Laura Clasemann wrote:
> Hi,
>
> I've been working with R the past few days in trying to get a proper table set up but have not had any luck at all and the manuals I've read have not worked for me either. I was wondering if anyone here would be able to help me in setting this data up in R both as a table and as a bargraph as I have not been able to do it myself. If somebody could help me with doing this and send me an overview in a document on how it should look, I would really appreciate it. Thank you.
>
> Laura
>
>
> Diet:        Binger-yes:   Binger-No:   Total:
> None              24          134         158
> Healthy            9           52          61
> Unhealthy         23           72          95
> Dangerous         12           15          27
>
>
>
>
>    [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calandra at uni-hamburg.de

**********
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php

From ivan.calandra at uni-hamburg.de Tue Mar 1 09:48:10 2011
From: ivan.calandra at uni-hamburg.de (Ivan Calandra)
Date: Tue, 01 Mar 2011 09:48:10 +0100
Subject: [R] Simulation
In-Reply-To: <1298949498526-3329173.post@n4.nabble.com>
References: <1298949498526-3329173.post@n4.nabble.com>
Message-ID: <4D6CB2CA.2070304@uni-hamburg.de>

Well, knowing what your data looks like would definitely help! Say your data object is called "mydata": just paste the output of dput(mydata) into the email you send to the list.

Ivan

On 3/1/2011 04:18, bwaxxlo wrote:
> I tried looking for help but I couldn't locate the exact solution.
> I have data that has several variables. I want to do several sample
> simulations using only two of the variables (eg: say you have data between
> people and properties owned. You only want to check how many in the samples
> will come up with bicycles) to estimate probabilities and that sort of
> thing.
> Now, I can only do a simulation in terms of this code: sample(1:10, size =
> 15, replace = TRUE).
> I do not know how select specific variables only.
> I'll appreciate the help
>

-- 
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calandra at uni-hamburg.de

**********
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php

From djmuser at gmail.com Tue Mar 1 10:04:32 2011
From: djmuser at gmail.com (Dennis Murphy)
Date: Tue, 1 Mar 2011 01:04:32 -0800
Subject: [R] stuk at another point: simple question
In-Reply-To:
References:
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available URL: From grant.j.gillis at gmail.com Tue Mar 1 10:06:40 2011 From: grant.j.gillis at gmail.com (Grant Gillis) Date: Tue, 1 Mar 2011 09:06:40 +0000 Subject: [R] tricky (for me) merging of data...more clarity Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From joongi at hanmail.net Tue Mar 1 10:27:54 2011 From: joongi at hanmail.net (JoonGi) Date: Tue, 1 Mar 2011 01:27:54 -0800 (PST) Subject: [R] How to Save R library data into xls or dta format Message-ID: <1298971674645-3329489.post@n4.nabble.com> Thanks in advance. I'm having a trouble with data saving. I want to run the same data which is in Ecdat library at different statistic programs(excel, stata and matlab) The data I want to use is library(Ecdat) data(Housing) and I want to extract this data our of R as *.dta *.xls formats. So, my first try was to open the data in R window and drag and paste to excel or notepad. BUT, it didn't work. Do you have any decent skills to extract this library data? Please, share to me. -- View this message in context: http://r.789695.n4.nabble.com/How-to-Save-R-library-data-into-xls-or-dta-format-tp3329489p3329489.html Sent from the R help mailing list archive at Nabble.com. From santosh.srinivas at gmail.com Tue Mar 1 11:13:25 2011 From: santosh.srinivas at gmail.com (Santosh Srinivas) Date: Tue, 1 Mar 2011 15:43:25 +0530 Subject: [R] How to Save R library data into xls or dta format In-Reply-To: <1298971674645-3329489.post@n4.nabble.com> References: <1298971674645-3329489.post@n4.nabble.com> Message-ID: for excel .. see library(xlsx) On Tue, Mar 1, 2011 at 2:57 PM, JoonGi wrote: > > Thanks in advance. > > I'm having a trouble with data saving. > > I want to run the same data which is in Ecdat library at different statistic > programs(excel, stata and matlab) > > The data I want to use is > > library(Ecdat) > data(Housing) > > and I want to extract this data our of R as *.dta *.xls formats. 
> So, my first try was to open the data in R window and drag and paste to > excel or notepad. > > BUT, it didn't work. > > Do you have any decent skills to extract this library data? > > Please, share to me. > > > > -- > View this message in context: http://r.789695.n4.nabble.com/How-to-Save-R-library-data-into-xls-or-dta-format-tp3329489p3329489.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From john.seers at googlemail.com Tue Mar 1 11:01:38 2011 From: john.seers at googlemail.com (John Seers) Date: Tue, 1 Mar 2011 10:01:38 +0000 Subject: [R] RWinEdt difficulties Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From Martyn.Byng at nag.co.uk Tue Mar 1 11:22:23 2011 From: Martyn.Byng at nag.co.uk (Martyn Byng) Date: Tue, 1 Mar 2011 10:22:23 -0000 Subject: [R] Help References: Message-ID: <49E76DF37649DC48A4CE882BC8CE51C9018B0ACC@nagmail2.nag.co.uk> Hi, It is a bit unclear what it is you are trying to do, as mentioned in replies by a variety of people previously, if you are just trying to get your data into R and label rows / columns, then tt = matrix(c(24,134,158,9,52,61,23,72,95,12,15,27),ncol=3,byrow=T) rownames(tt) = c("None", "Healthy","Unhealthy","Dangerous") colnames(tt) = c("Binger-yes:", "Binger-no:", "Total") is sufficient, and bar charts can be created using the barplot command, for example: barplot(t(tt[,1:2]),legend.text=colnames(tt[,1:2]),xlab="Diet") If you are trying to do something other than this then you will need to supply some additional information Martyn -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Laura Clasemann Sent: 01 
March 2011 04:05 To: r-help at r-project.org Subject: [R] Help Hi, I've been working with R the past few days in trying to get a proper table set up but have not had any luck at all and the manuals I've read have not worked for me either. I was wondering if anyone here would be able to help me in setting this data up in R both as a table and as a bargraph as I have not been able to do it myself. If somebody could help me with doing this and send me an overview in a document on how it should look, I would really appreciate it. Thank you. Laura Diet: Binger-yes: Binger-No: Total: None 24 134 158 Healthy 9 52 61 Unhealthy 23 72 95 Dangerous 12 15 27 [[alternative HTML version deleted]] ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. ________________________________________________________________________ This e-mail has been scanned for all viruses by Star.\ _...{{dropped:12}} From murdoch.duncan at gmail.com Tue Mar 1 11:30:27 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Tue, 01 Mar 2011 05:30:27 -0500 Subject: [R] getting attributes of list without the "names". In-Reply-To: <1298953048526-3329209.post@n4.nabble.com> References: <1298953048526-3329209.post@n4.nabble.com> Message-ID: <4D6CCAC3.1080504@gmail.com> On 11-02-28 11:17 PM, Jeroen Ooms wrote: > I am trying to encode arbitrary S3 objects by recursively looping over the > object and all its attributes. However, there is an unfortunate feature of > the attributes() function that is causing trouble. From the manual for > ?attributes: > > The names of a pairlist are not stored as attributes, but are reported as if > they were (and can be set by the replacement method for attributes). 
> > Now because of this, my program ends up in infinite recursion, because it > will try to encode > attributes(attributes(attributes(attributes(list(foo=123)))) etc. I can't > remove the 'names' attribute, because this will actually affect the list > structure. And even when I do: > > attributes(attributes(obj)[names(attributes(obj)) != "names"]) > > This will keep giving me a named list. Is there any way I can get the > attributes() of a list without it reporting the names of a list as > attributes? I.e it should hold that: > > atr1<- attributes(list(foo="bar")); > atr2<- attributes(list()); > identical(atr1,atr2); The names of a list (a generic vector) are attributes, just like the names of other vectors. The documentation is talking about pairlists, a mostly internal structure, used for example to store parts of expressions. So your premise might be wrong about the cause of the recursion... But assuming you really want to see all attributes except names. Then just write your own version: nonameattributes <- function(obj) { result <- attributes(obj) if (!is.null(result$names)) result$names <- NULL # This removes the empty names of the result if there were no other # attributes. It's optional, but you said you wanted # identical(atr1, atr2) if (!length(result)) names(result) <- NULL result } You can make the conditional more complicated, only making the change for pairlists, etc., using tests on typeof(obj) or other tests. Duncan Murdoch From ripley at stats.ox.ac.uk Tue Mar 1 11:33:50 2011 From: ripley at stats.ox.ac.uk (Prof Brian Ripley) Date: Tue, 1 Mar 2011 10:33:50 +0000 (GMT) Subject: [R] How to Save R library data into xls or dta format In-Reply-To: References: <1298971674645-3329489.post@n4.nabble.com> Message-ID: On Tue, 1 Mar 2011, Santosh Srinivas wrote: > for excel .. see library(xlsx) (That's for Excel >= 2007 only, .xlsx not .xls are requested.) Simply consult the relevant manual, 'R Data Import/Export': all of this is covered there. 
There is a very new package XLConnect that is only covered in the latest version (post R 2.12.2).

>
> On Tue, Mar 1, 2011 at 2:57 PM, JoonGi wrote:
>>
>> Thanks in advance.
>>
>> I'm having a trouble with data saving.
>>
>> I want to run the same data which is in Ecdat library at different statistic
>> programs(excel, stata and matlab)
>>
>> The data I want to use is
>>
>> library(Ecdat)
>> data(Housing)
>>
>> and I want to extract this data our of R as *.dta *.xls formats.
>> So, my first try was to open the data in R window and drag and paste to
>> excel or notepad.
>>
>> BUT, it didn't work.
>>
>> Do you have any decent skills to extract this library data?
>>
>> Please, share to me.

We already did: we wrote a manual for you!

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

From jrheinlaender at gmx.de Tue Mar 1 11:58:06 2011
From: jrheinlaender at gmx.de (Jan)
Date: Tue, 01 Mar 2011 11:58:06 +0100
Subject: [R] df.residual for rlm()
Message-ID: <1298977086.15672.63.camel@jan-laptop>

Hello,

for testing coefficients of lm(), I wrote the following function (with the kind support of this mailing list):

# See Verzani, simpleR (pdf), p. 80
coeff.test <- function(lm.result, idx, value) {
  # idx = 1 is the intercept, idx>1 the other coefficients
  # null hypothesis: coeff = value
  # alternative hypothesis: coeff != value
  coeff <- coefficients(lm.result)[idx]
  SE <- coefficients(summary(lm.result))[idx,"Std. Error"]
  n <- df.residual(lm.result)
  t <- (coeff - value )/SE
  2 * pt(-abs(t),n) # times two because problem is two-sided
}

This works fine for lm() objects, but fails for rlm() because df.residual() is NA. Can I get the degrees of freedom by calculating

n = length(lm.result) - length(coefficients(lm.result))

Thanks for any help!
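
On the degrees-of-freedom question: length() applied to a fitted model object counts the components of the underlying list, not the observations, so the formula proposed above would not give residual degrees of freedom. The sketch below counts observations minus coefficients instead and checks the count against df.residual() for an lm() fit, where that value is available. `resid_df` is a hypothetical helper name, the built-in stackloss data is only a stand-in, and whether a t reference distribution with these degrees of freedom is adequate for an rlm() fit is an assumption, since it is at best an approximation for M-estimators.

```r
# Residual degrees of freedom counted by hand: observations minus coefficients.
# (Hypothetical helper; assumes no coefficients were dropped as NA.)
resid_df <- function(fit) {
  length(residuals(fit)) - length(coefficients(fit))
}

lm_fit <- lm(stack.loss ~ ., data = stackloss)

length(lm_fit)        # number of list components in the fit object, not n
resid_df(lm_fit)      # observations minus estimated coefficients
df.residual(lm_fit)   # agrees with resid_df() for an lm fit

# The same count for a robust fit, if MASS is installed.
if (requireNamespace("MASS", quietly = TRUE)) {
  rlm_fit <- MASS::rlm(stack.loss ~ ., data = stackloss)
  print(resid_df(rlm_fit))
}
```

In coeff.test() this would amount to replacing the df.residual() call with resid_df(), under the caveat above.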
Jan From Laszlo.Bodnar at erstebank.hu Tue Mar 1 11:30:33 2011 From: Laszlo.Bodnar at erstebank.hu (Bodnar Laszlo EB_HU) Date: Tue, 1 Mar 2011 11:30:33 +0100 Subject: [R] bootstrap resampling question Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From joongi at hanmail.net Tue Mar 1 11:41:00 2011 From: joongi at hanmail.net (JoonGi) Date: Tue, 1 Mar 2011 02:41:00 -0800 (PST) Subject: [R] Is there any Command showing correlation of all variables in a dataset? Message-ID: <1298976060344-3329599.post@n4.nabble.com> Thanks in advance. I want to derive correlations of variables in a dataset Specifically library(Ecdat) data(Housing) attach(Housing) cor(lotsize, bathrooms) this code results only the correlationship between two variables. But I want to examine all the combinations of variables in this dataset. And I will finally make a table in Latex. How can I test correlations for all combinations of variables? with one simple command? -- View this message in context: http://r.789695.n4.nabble.com/Is-there-any-Command-showing-correlation-of-all-variables-in-a-dataset-tp3329599p3329599.html Sent from the R help mailing list archive at Nabble.com. From s067835 at alumni.cuhk.net Tue Mar 1 11:49:19 2011 From: s067835 at alumni.cuhk.net (vikkiyft) Date: Tue, 1 Mar 2011 02:49:19 -0800 (PST) Subject: [R] which does the "S.D." returned by {Hmisc} rcorr.cens measure? Message-ID: <1298976559588-3329609.post@n4.nabble.com> Dear R-help, This is an example in the {Hmisc} manual under rcorr.cens function: > set.seed(1) > x <- round(rnorm(200)) > y <- rnorm(200) > round(rcorr.cens(x, y, outx=F),4) C Index Dxy S.D. n missing uncensored Relevant Pairs Concordant Uncertain 0.4831 -0.0338 0.0462 200.0000 0.0000 200.0000 39800.0000 19228.0000 0.0000 That S.D. confuses me!! It is obviously not the standard deviation of x or y.. 
but there is only one realization of the c-index or Dxy for this sample dataset, where does the variation come from..?? if I use the conventional formula for calculating the standard deviation of proportions: sqrt((C Index)*(1-C Index)/n), I get 0.0353 instead of 0.0462.. Any advice is appreciated. Vikki -- View this message in context: http://r.789695.n4.nabble.com/which-does-the-S-D-returned-by-Hmisc-rcorr-cens-measure-tp3329609p3329609.html Sent from the R help mailing list archive at Nabble.com. From mantino84 at libero.it Tue Mar 1 12:00:47 2011 From: mantino84 at libero.it (Manta) Date: Tue, 1 Mar 2011 03:00:47 -0800 (PST) Subject: [R] SetInternet2, RCurl and proxy In-Reply-To: References: Message-ID: <1298977247406-3329624.post@n4.nabble.com> Dear all, I am facing a problem. I am trying to install packages using a proxy, but I am not able to call the setInternet2 function, either with the small or capital s. What package do I have to call then? And, could there be a reason why this does not function? Thanks, Marco -- View this message in context: http://r.789695.n4.nabble.com/SetInternet2-RCurl-and-proxy-tp3248576p3329624.html Sent from the R help mailing list archive at Nabble.com. From shawjw at gmail.com Tue Mar 1 12:35:41 2011 From: shawjw at gmail.com (James Shaw) Date: Tue, 1 Mar 2011 05:35:41 -0600 Subject: [R] Robust variance estimation with rq (failure of the bootstrap?) In-Reply-To: <1298948385.1668.106.camel@matt-laptop> References: <1298948385.1668.106.camel@matt-laptop> Message-ID: Matt: Thanks for your prompt reply. The disparity between the bootstrap and sandwich variance estimates derived when modeling the highly skewed outcome suggest that either (A) the empirical robust variance estimator is underestimating the variance or (B) the bootstrap is breaking down. The bootstrap variance estimate of a robust location estimate is not necessarily robust, see Statistics & Probability Letters 50 (2000) 49-53. 
Since submitting my earlier post, I have noticed that the the robust kernel variance estimate is similar to the bootstrap estimate. Under what conditions would one expect Koenker and Machado's sandwich variance estimator, which uses a local estimate of the sparsity, to fail? -- Jim On Mon, Feb 28, 2011 at 8:59 PM, Matt Shotwell wrote: > Jim, > > If repeated measurements on patients are correlated, then resampling all > measurements independently induces an incorrect sampling distribution > (=> incorrect variance) on a statistic of these data. One solution, as > you mention, is the block or cluster bootstrap, which preserves the > correlation among repeated observations in resamples. I don't > immediately see why the cluster bootstrap is unsuitable. > > Beyond this, I would be concerned about *any* variance estimates that > are blind to correlated observations. > > The bootstrap variance estimate may be larger than the asymptotic > variance estimate, but that alone isn't evidence to favor one over the > other. > > Also, I can't justify (to myself) why skew would hamper the quality of > bootstrap variance estimates. I wonder how it affects the sandwich > variance estimate... > > Best, > Matt > > On Mon, 2011-02-28 at 17:50 -0600, James Shaw wrote: >> I am fitting quantile regression models using data collected from a >> sample of 124 patients. ?When modeling cross-sectional associations, I >> have noticed that nonparametric bootstrap estimates of the variances >> of parameter estimates are much greater in magnitude than the >> empirical Huber estimates derived using summary.rq's "nid" option. >> The outcome variable is severely skewed, and I am afraid that this may >> be affecting the consistency of the bootstrap variance estimates. ?I >> have read that the m out of n bootstrap can be used to overcome this >> problem. ?However, this procedure requires both the original sample >> (n) and the subsample (m) sizes to be large. 
?The version implemented >> in rq.boot does not appear to provide any improvement over the naive >> bootstrap. ?Ultimately, I am interested in using median regression to >> model changes in the outcome variable over time. ?Summary.rq's robust >> variance estimator is not applicable to repeated-measures data. ?I >> question whether the block (cluster) bootstrap variance estimator, >> which can accommodate intraclass correlation, would perform well. ?Can >> anyone suggest alternatives for variance estimation in this situation? >> Regards, >> >> Jim >> >> >> James W. Shaw, Ph.D., Pharm.D., M.P.H. >> Assistant Professor >> Department of Pharmacy Administration >> College of Pharmacy >> University of Illinois at Chicago >> 833 South Wood Street, M/C 871, Room 266 >> Chicago, IL 60612 >> Tel.: 312-355-5666 >> Fax: 312-996-0868 >> Mobile Tel.: 215-852-3045 >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > > -- James W. Shaw, Ph.D., Pharm.D., M.P.H. Assistant Professor Department of Pharmacy Administration College of Pharmacy University of Illinois at Chicago 833 South Wood Street, M/C 871, Room 266 Chicago, IL 60612 Tel.: 312-355-5666 Fax: 312-996-0868 Mobile Tel.: 215-852-3045 From marchywka at hotmail.com Tue Mar 1 12:47:13 2011 From: marchywka at hotmail.com (Mike Marchywka) Date: Tue, 1 Mar 2011 06:47:13 -0500 Subject: [R] Is there any Command showing correlation of all variables in a dataset? 
In-Reply-To: <1298976060344-3329599.post@n4.nabble.com>
References: <1298976060344-3329599.post@n4.nabble.com>
Message-ID:

----------------------------------------
> Date: Tue, 1 Mar 2011 02:41:00 -0800
> From: joongi at hanmail.net
> To: r-help at r-project.org
> Subject: [R] Is there any Command showing correlation of all variables in a dataset?
>
>
> Thanks in advance.
>
> I want to derive correlations of variables in a dataset
>
> Specifically
>
> library(Ecdat)
> data(Housing)
> attach(Housing)
> cor(lotsize, bathrooms)
>
> this code results only the correlationship between two variables.
> But I want to examine all the combinations of variables in this dataset.
> And I will finally make a table in Latex.
>
> How can I test correlations for all combinations of variables?
> with one simple command?
>

This appears to work as expected, although I don't have enough intuition to know if these numbers are plausible beyond a rough guess (the diagonal of course is one LOL),

> df<-data.frame(a=rnorm(100), b=runif(100),c=runif(100)+rnorm(100))
> str(df)
'data.frame':   100 obs. of  3 variables:
 $ a: num  0.1841 0.2296 -1.2251 0.0898 -0.961 ...
 $ b: num  0.586 0.343 0.821 0.41 0.352 ...
 $ c: num  -0.373 2.225 0.102 1.186 -0.737 ...
> cor(df)
           a           b           c
a 1.00000000  0.07710107  0.11088579
b 0.07710107  1.00000000 -0.02424471
c 0.11088579 -0.02424471  1.00000000
> ?cor
> df<-data.frame(a=.1*rnorm(100), b=(1:100)/100,c=(1:100)/100+.1*rnorm(100))
> cor(df)
             a           b            c
a  1.000000000 -0.01970874 -0.003665239
b -0.019708737  1.00000000  0.950375445
c -0.003665239  0.95037544  1.000000000
>

From haipreeja at gmail.com Tue Mar 1 12:30:54 2011
From: haipreeja at gmail.com (preeja)
Date: Tue, 1 Mar 2011 03:30:54 -0800 (PST)
Subject: [R] Hi
Message-ID: <1298979054909-3329650.post@n4.nabble.com>

Hi,

I am facing a problem with classification based on graph kernels.
we calculated the kernel between two molecule data set.Then I am confused about classification -- View this message in context: http://r.789695.n4.nabble.com/Hi-tp3329650p3329650.html Sent from the R help mailing list archive at Nabble.com. From marchywka at hotmail.com Tue Mar 1 13:19:42 2011 From: marchywka at hotmail.com (Mike Marchywka) Date: Tue, 1 Mar 2011 07:19:42 -0500 Subject: [R] Simulation In-Reply-To: <1298949498526-3329173.post@n4.nabble.com> References: <1298949498526-3329173.post@n4.nabble.com> Message-ID: ---------------------------------------- > Date: Mon, 28 Feb 2011 19:18:18 -0800 > From: kadodamball at hotmail.com > To: r-help at r-project.org > Subject: [R] Simulation > > I tried looking for help but I couldn't locate the exact solution. > I have data that has several variables. I want to do several sample > simulations using only two of the variables (eg: say you have data between > people and properties owned. You only want to check how many in the samples > will come up with bicycles) to estimate probabilities and that sort of > thing. > Now, I can only do a simulation in terms of this code: sample(1:10, size = > 15, replace = TRUE). > I do not know how select specific variables only. > I'll appreciate the help This is probably not the best R but you can do something like either of these. Note that this is just the easiest derivative of stuff I already had and can be fixed to your needs, I usually use runif instead of sample for example. The first example probably being much less efficient than the second, df<-data.frame(a=.1*rnorm(100), b=(1:100)/100,c=(1:100)/100+.1*rnorm(100)) res=1:100; for ( i in 1:100) {res[i]=cor(df[which(runif(100)>.9),])[1,3] } res hist(res) res=1:100; for ( i in 1:100) {wh=which(runif(100)>.9); res[i]=cor(df$a[wh],df$c[wh]); } res > > -- > View this message in context: http://r.789695.n4.nabble.com/Simulation-tp3329173p3329173.html > Sent from the R help mailing list archive at Nabble.com. 
> > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From jsorkin at grecc.umaryland.edu Tue Mar 1 13:55:00 2011 From: jsorkin at grecc.umaryland.edu (John Sorkin) Date: Tue, 01 Mar 2011 07:55:00 -0500 Subject: [R] Components of variance with lme Message-ID: <4D6CA654020000CB00081963@medicine.umaryland.edu> R 2.10 Windows Vista Is it possible to run a variance-components analysis using lme? I looked at Pinheiro and Bates' book and don't see code that will perform these analyses. If the analyses can not be done using lme, what package might I try? Thanks, John John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) Confidentiality Statement: This email message, including any attachments, is for th...{{dropped:6}} From therneau at mayo.edu Tue Mar 1 14:13:04 2011 From: therneau at mayo.edu (Terry Therneau) Date: Tue, 01 Mar 2011 07:13:04 -0600 Subject: [R] nested case-control study In-Reply-To: <253166.80532.qm@web56305.mail.re3.yahoo.com> References: <1298905163.15393.13.camel@punchbuggy> <253166.80532.qm@web56305.mail.re3.yahoo.com> Message-ID: <1298985184.13202.7.camel@punchbuggy> 1. Using offset(logweight) in coxph is the same as using an "offset logweight;" statement in SAS, and neither is the same as case weights. 2. For a nested case control, which is what you said you have, the "strata" controls who is in what risk set. No trickery with start,stop times is needed. It does no harm, but is not needed. 
In a case-cohort design the start= stop-epsilon trick is one way to set risk sets up properly. But that's not the design you gave for your study. Terry T. On Mon, 2011-02-28 at 17:02 -0800, array chip wrote: > Terry, thanks very much! > > Professor Langholz used a SAS software trick to estimate absolute risk > by creating a fake variable "entry_time" that is 0.001 less than the > variable "exit_time" (i.e. time to event), and then use both variables > in Phreg. Is this equivalent to your creating a dummy survival with > time=1? > > Another question is, is using offset(logweight) inside the formula of > coxph() the same as using weight=logweight argument in coxph(), > because my understanding of Professor Langholz's approach for nested > case-control study is weighted regression? > > Thank you very much for the help. > > John From scttchamberlain4 at gmail.com Tue Mar 1 14:12:58 2011 From: scttchamberlain4 at gmail.com (Scott Chamberlain) Date: Tue, 1 Mar 2011 07:12:58 -0600 Subject: [R] Components of variance with lme In-Reply-To: <4D6CA654020000CB00081963@medicine.umaryland.edu> References: <4D6CA654020000CB00081963@medicine.umaryland.edu> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From gregmacfarlane at gmail.com Tue Mar 1 14:32:54 2011 From: gregmacfarlane at gmail.com (gmacfarlane) Date: Tue, 1 Mar 2011 05:32:54 -0800 (PST) Subject: [R] mlogit.data In-Reply-To: References: Message-ID: <1298986374092-3329821.post@n4.nabble.com> http://r.789695.n4.nabble.com/file/n3329821/workdata.csv workdata.csv The code I posted is exactly what I am running. What you need is this data. Here is the code again. 
> hbwmode<-mlogit.data("worktrips.csv", shape="long", choice="CHOSEN", > alt.var="ALTNUM") > hbwmode<-mlogit.data(hbwtrips, shape="long", choice="CHOSEN", > alt.var="ALTNUM") -- View this message in context: http://r.789695.n4.nabble.com/mlogit-data-tp3328739p3329821.html Sent from the R help mailing list archive at Nabble.com. From lists at remoteinformation.com.au Tue Mar 1 14:50:29 2011 From: lists at remoteinformation.com.au (Ben Madin) Date: Tue, 1 Mar 2011 21:50:29 +0800 Subject: [R] unicode&pdf font problem RESOLVED In-Reply-To: References: Message-ID: <8B3A681F-9EAF-44DD-89BC-741F7F351668@remoteinformation.com.au> Just to add to this (I've been looking through the archive) problem with display unicode fonts in pdf document in R If you can use the Cairo package to create pdf on Mac, it seems quite happy with pushing unicode characters through (probably still font family dependant whether it will display) probstring <- c(' \u2264 0.2',' \u2268 0.4',' \u00FC 0.6',' \u2264 0.8',' \u2264 1.0') Cairo(type='pdf', file='outputs/demo.pdf', width=9,height=12, units='in', bg='transparent') plot(1:5,1:5, type='n') text(1:5,1:5,probstring) dev.off() ?Cairo suggests encoding is ignored if you do try to set it. cheers Ben On 14/01/2011, at 7:00 PM, r-help-request at r-project.org wrote: > Date: Thu, 13 Jan 2011 10:47:09 -0500 > From: David Winsemius > To: Sascha Vieweg > Cc: r-help at r-project.org > Subject: Re: [R] unicode&pdf font problem RESOLVED > Message-ID: <74FA099F-4CE5-45C7-A05A-4A1DE6C87EC8 at comcast.net> > Content-Type: text/plain; charset=ISO-8859-1; format=flowed; delsp=yes > > > On Jan 13, 2011, at 10:41 AM, Sascha Vieweg wrote: > >> I have many German umlauts in my data sets and code them UTF-8. When >> it comes to plotting on pdf, I figured out that "CP1257" is a good >> choice to output Umlauts. 
I have no experiences with "CP1250", but >> maybe this small hint helps: >> >> pdf(file=paste(sharepath, "/filename.pdf", sep=""), 9, 6, pointsize >> = 11, family = "Helvetica", encoding = "CP1257") > > Just an FYI for the archives, that encoding fails with > pdf(encoding="CP1257") on a Mac when printing that target umlaut. > > David. >> >> *S* >> >> On 11-01-13 16:17, tdenes at cogpsyphy.hu wrote: >> >>> Date: Thu, 13 Jan 2011 16:17:04 +0100 (CET) >>> From: tdenes at cogpsyphy.hu >>> To: David Winsemius >>> Cc: r-help at r-project.org >>> Subject: Re: [R] unicode&pdf font problem RESOLVED >>> >>> Dear David, >>> >>> Thank you for your efforts. Inspired by your remarks, I started a new >>> google-search and found this: >>> http://stackoverflow.com/questions/3434349/sweave-not-printing-localized-characters >>> >>> SO HERE COMES THE SOLUTION (it works on both OSs): >>> >>> pdf.options(encoding = "CP1250") >>> pdf() >>> plot(1,type="n") >>> text(1,1,"\U0171") >>> dev.off() >>> >>> CP1250 should work for all Central-European languages: >>> http://en.wikipedia.org/wiki/Windows-1250 >>> >>> >>> Thank you again, >>> Denes >>> >>> >>> >>>> >>>> On Jan 13, 2011, at 7:01 AM, tdenes at cogpsyphy.hu wrote: >>>> >>>>> >>>>> Hi! >>>>> >>>>> Sorry for the missing specs, here they are: >>>>>> version >>>>> _ >>>>> platform i386-pc-mingw32 >>>>> arch i386 >>>>> os mingw32 >>>>> system i386, mingw32 >>>>> status >>>>> major 2 >>>>> minor 12.1 >>>>> year 2010 >>>>> month 12 >>>>> day 16 >>>>> svn rev 53855 >>>>> language R >>>>> version.string R version 2.12.1 (2010-12-16) >>>>> >>>>> OS: Windows 7 (English version, 32 bit) >>>>> >>>>> >>>> >>>> You are after what Adobe calls: udblacute; 0171. It is recognized >>>> in >>>> the list of adobe glyphs: >>>>> str(tools::Adobe_glyphs[371, ]) >>>> 'data.frame': 1 obs. 
of 2 variables: >>>> $ adobe : chr "udblacute" >>>> $ unicode: chr "0171" >>>> >>>> Consulted the help pages >>>> points {graphics} >>>> postscript {grDevices} >>>> pdf {grDevices} >>>> charsets {tools} >>>> postscriptFonts {grDevices} >>>> >>>> I have tried a variety of the pdfFonts installed on my Mac without >>>> success. You can perhaps make a list of fonts on your machines with >>>> names(pdfFonts()). Perhaps the range of fonts and the glyphs they >>>> contain is different on your machines. I get consistently warning >>>> messages saying there is a conversion failure: >>>> >>>>> pdf("trial.pdf", family="Helvetica") >>>> # also tried with font="Helvetica" but I think that is erroneous >>>>> plot(1,type="n") >>>>> text(1,1,"print \U0170\U0171") >>>> Warning messages: >>>> 1: In text.default(1, 1, "print ????") : >>>> conversion failure on 'print ????' in 'mbcsToSbcs': dot >>>> substituted >>>> for >>>> 2: In text.default(1, 1, "print ????") : >>>> conversion failure on 'print ????' in 'mbcsToSbcs': dot >>>> substituted >>>> for >>>> 3: In text.default(1, 1, "print ????") : >>>> conversion failure on 'print ????' in 'mbcsToSbcs': dot >>>> substituted >>>> for >>>> 4: In text.default(1, 1, "print ????") : >>>> conversion failure on 'print ????' in 'mbcsToSbcs': dot >>>> substituted >>>> for >>>> 5: In text.default(1, 1, "print ????") : >>>> font metrics unknown for Unicode character U+0170 >>>> 6: In text.default(1, 1, "print ????") : >>>> font metrics unknown for Unicode character U+0171 >>>> 7: In text.default(1, 1, "print ????") : >>>> conversion failure on 'print ????' in 'mbcsToSbcs': dot >>>> substituted >>>> for >>>> 8: In text.default(1, 1, "print ????") : >>>> conversion failure on 'print ????' in 'mbcsToSbcs': dot >>>> substituted >>>> for >>>> 9: In text.default(1, 1, "print ????") : >>>> conversion failure on 'print ????' 
in 'mbcsToSbcs': dot >>>> substituted >>>> for >>>> 10: In text.default(1, 1, "print ????") : >>>> conversion failure on 'print ????' in 'mbcsToSbcs': dot >>>> substituted >>>> for >>>> >>>> And this is despite my system saying the \U0170 and \U0171 are >>>> present >>>> in the Helvetica font. Also tried family=URWHelvetica and >>>> family=NimbusSanand and a bunch of others without success, but my >>>> last >>>> best hope after reading the material in help(postscript) in the >>>> "Families" section had been NimbusSan. There is also information on >>>> that page regarding encodings that appears to be very machine >>>> specific. >>>> >>>>> >>>>> Note that \U0171 != ??. See >>>>> http://www.fileformat.info/info/unicode/char/171/index.htm >>>>> Anyway, I have no problem with ű (~u") and other special >>>>> Hungarian >>>>> characters in my R-Gui. It is correctly displayed in the console, >>>>> in >>>>> plots, etc. The problem is with the pdf conversion. >>>>> >>>>> The same holds for my Ubuntu Hardy Heron system*, with exactly the >>>>> same >>>>> error messages as reported in an earlier thread >>>>> http://www.mail-archive.com/r-help at r-project.org/msg89792.html >>>>> As far as I know, Hershey fonts do not contain \U0171. >>>>> >>>>> >>>>> Regards, >>>>> Denes >>>>> >>>>> * The specs of Ubuntu: >>>>>> version >>>>> _ >>>>> platform x86_64-pc-linux-gnu >>>>> arch x86_64 >>>>> os linux-gnu >>>>> system x86_64, linux-gnu >>>>> status >>>>> major 2 >>>>> minor 12.0 >>>>> year 2010 >>>>> month 10 >>>>> day 15 >>>>> svn rev 53317 >>>>> language R >>>>> version.string R version 2.12.0 (2010-10-15) >>>>> >>>>> >>>>>> >>>>>> On Jan 12, 2011, at 11:11 PM, tdenes at cogpsyphy.hu wrote: >>>>>> >>>>>>> >>>>>>> Dear List, >>>>>>> >>>>>>> I would like to print a plot into pdf. The problem is that the >>>>>>> character >>>>>>> \U0171 is replaced by a simple 'u' (i.e. without accents) in >>>>>>> the pdf >>>>>>> file. 
>>>>>>> >>>>>>> Example: >>>>>>> # this works fine >>>>>>> plot(1,type="n") >>>>>>> text(1,1,"print \U0171") >>>>>>> >>>>>>> # this fails >>>>>>> pdf("trial.pdf") >>>>>>> plot(1,type="n") >>>>>>> text(1,1,"print \U0171") >>>>>>> dev.off() >>>>>> >>>>>> Have you tried: >>>>>> >>>>>> pdf("trial.pdf") >>>>>> plot(1,type="n") >>>>>> text(1,1,"print ??") >>>>>> dev.off() >>>>>> >>>>>> Your default screen fonts may not be the same as your default pdf >>>>>> fonts. A lot depends on system specifics, none of which have you >>>>>> provided. >>>>>> >>>>>> >>>>>>> >>>>>>> I found an earlier post at >>>>>>> http://www.mail-archive.com/r-help at r-project.org/msg65541.html, >>>>>>> but >>>>>>> it is >>>>>>> too hard to understand at my R-level. Any help is appreciated. >>>>>> >>>>>> >>>>>> >>>>>> David Winsemius, MD >>>>>> West Hartford, CT >>>>>> >>>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>> >>>> David Winsemius, MD >>>> West Hartford, CT >>>> >>>> >>> >>> ______________________________________________ >>> R-help at r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >>> >> >> -- >> Sascha Vieweg, saschaview at gmail.com > > David Winsemius, MD > West Hartford, CT From rex.dwyer at syngenta.com Tue Mar 1 15:01:58 2011 From: rex.dwyer at syngenta.com (rex.dwyer at syngenta.com) Date: Tue, 1 Mar 2011 09:01:58 -0500 Subject: [R] Explained variance for ICA In-Reply-To: References: Message-ID: <36180405F8418449918AD20618D110FC095BF0B913@USETCMSXMB02.NAFTA.SYNGENTA.ORG> You determine the variance explained by *any* unit vector by taking its inner product with the data points, then finding the variance of the results. In the case of FastICA, the variance explained by the ICs collectively is exactly the same as the variance explained by the principal components (collectively) from which they are derived. 
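The projection idea described above can be sketched in base R. This is only an illustration under stated assumptions: the data are simulated, the helper name explained_var is hypothetical, and the principal directions from prcomp stand in as example unit vectors (any other unit vector, such as a normalised IC direction, could be passed the same way).

```r
# Variance "explained" by a unit vector: project the centred data onto
# the vector, then take the variance of the projected values.
set.seed(1)
mix <- matrix(c(2, 0, 0,
                0.5, 1, 0,
                0, 0.3, 0.2), nrow = 3)          # arbitrary mixing matrix (assumption)
X  <- matrix(rnorm(200 * 3), ncol = 3) %*% mix   # simulated data, 200 x 3
Xc <- scale(X, center = TRUE, scale = FALSE)     # centre each column

explained_var <- function(Xc, v) {
  v <- v / sqrt(sum(v^2))        # normalise, in case v is not unit length
  var(as.vector(Xc %*% v))       # variance of the inner products
}

pc  <- prcomp(Xc)
ev1 <- explained_var(Xc, pc$rotation[, 1])       # first principal direction
total_var <- sum(apply(Xc, 2, var))              # total variance of the data
share1 <- ev1 / total_var                        # proportion explained

# Summed over a full orthonormal basis, the explained variances
# add up to the total variance.
ev_all <- sum(sapply(1:3, function(j) explained_var(Xc, pc$rotation[, j])))
```

For a principal direction the result should agree with the squared sdev reported by prcomp; for directions that are not mutually orthogonal the individual values need not add up to the total.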
HTH Rex -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Pavel Goldstein Sent: Tuesday, March 01, 2011 1:24 AM To: r-help at r-project.org Subject: [R] Explained variance for ICA Hello, I think to use FastICA package for microarray data clusterization, but one question stops me: can I know how much variance explain each component (or all components together) ? I will be very thankful for the help. Thanks, Pavel [[alternative HTML version deleted]] ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. message may contain confidential information. If you are not the designated recipient, please notify the sender immediately, and delete the original and any copies. Any use of the message by you is prohibited. From f.harrell at vanderbilt.edu Tue Mar 1 15:05:44 2011 From: f.harrell at vanderbilt.edu (Frank Harrell) Date: Tue, 1 Mar 2011 06:05:44 -0800 (PST) Subject: [R] what does the "S.D." returned by {Hmisc} rcorr.cens measure? In-Reply-To: <1298976559588-3329609.post@n4.nabble.com> References: <1298976559588-3329609.post@n4.nabble.com> Message-ID: <1298988344382-3329899.post@n4.nabble.com> Vikki, The formula you used for std. error of C is not correct. C is not a simple per-observation proportion. SD in the output is the standard error of Dxy. Dxy = 2(C - .5). Backsolve for std err of C. Variation in Dxy or C comes from the usual source: sampling variability. You can also see this by sampling from the original dataset (a la bootstrap). Frank vikkiyft wrote: > > Dear R-help, > > This is an example in the {Hmisc} help manual in the section of > rcorr.cens: > >> set.seed(1) >> x <- round(rnorm(200)) >> y <- rnorm(200) >> round(rcorr.cens(x, y, outx=F),4) > C Index Dxy S.D. 
n missing > uncensored Relevant Pairs Concordant Uncertain > 0.4831 -0.0338 0.0462 200.0000 0.0000 > 200.0000 39800.0000 19228.0000 0.0000 > > That S.D. confuses me!! > It is obviously not the standard deviation of x or y.. but there is only > one realization of the c-index or Dxy for this sample dataset, where does > the variation come from..?? if I use the conventional formula for > calculating the standard deviation of proportions: sqrt((C Index)*(1-C > Index)/n), I get 0.0353 instead of 0.0462.. > > Any advice is appreciated. > > > Vikki > ----- Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/what-does-the-S-D-returned-by-Hmisc-rcorr-cens-measure-tp3329609p3329899.html Sent from the R help mailing list archive at Nabble.com. From ggrothendieck at gmail.com Tue Mar 1 15:12:45 2011 From: ggrothendieck at gmail.com (Gabor Grothendieck) Date: Tue, 1 Mar 2011 09:12:45 -0500 Subject: [R] How to Save R library data into xls or dta format In-Reply-To: <1298971674645-3329489.post@n4.nabble.com> References: <1298971674645-3329489.post@n4.nabble.com> Message-ID: On Tue, Mar 1, 2011 at 4:27 AM, JoonGi wrote: > > Thanks in advance. > > I'm having a trouble with data saving. > > I want to run the same data which is in Ecdat library at different statistic > programs(excel, stata and matlab) > > The data I want to use is > > library(Ecdat) > data(Housing) > > and I want to extract this data our of R as *.dta *.xls formats. > So, my first try was to open the data in R window and drag and paste to > excel or notepad. > For excel see: http://rwiki.sciviews.org/doku.php?id=tips:data-io:ms_windows&s=excel -- Statistics & Software Consulting GKX Group, GKX Associates Inc. 
tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com
From danzur at hotmail.it Tue Mar 1 15:07:45 2011 From: danzur at hotmail.it (danielepippo) Date: Tue, 1 Mar 2011 06:07:45 -0800 (PST) Subject: [R] High standard error Message-ID: <1298988465882-3329903.post@n4.nabble.com> Hi to everyone, if the estimate of the parameter results in 0.196 and its standard error is 0.426, can I say that this parameter is not significant for the model? Thank you very much Pippo -- View this message in context: http://r.789695.n4.nabble.com/High-standard-error-tp3329903p3329903.html Sent from the R help mailing list archive at Nabble.com.
From jrkrideau at yahoo.ca Tue Mar 1 15:25:56 2011 From: jrkrideau at yahoo.ca (John Kane) Date: Tue, 1 Mar 2011 06:25:56 -0800 (PST) Subject: [R] High standard error In-Reply-To: <1298988465882-3329903.post@n4.nabble.com> Message-ID: <710355.45133.qm@web38408.mail.mud.yahoo.com> Sure, why not? You do realize, do you not, that no one has the slightest idea of what you are doing? PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. --- On Tue, 3/1/11, danielepippo wrote: > From: danielepippo > Subject: [R] High standard error > To: r-help at r-project.org > Received: Tuesday, March 1, 2011, 9:07 AM > Hi to everyone, > > if the estimate of the parameter > results in 0.196 and his standard > error is 0.426, can I say that this parameter is not > significant for the > model? > > > Thank you very much > > Pippo > > -- > View this message in context: http://r.789695.n4.nabble.com/High-standard-error-tp3329903p3329903.html > Sent from the R help mailing list archive at Nabble.com.
From matt at biostatmatt.com Tue Mar 1 15:35:10 2011 From: matt at biostatmatt.com (Matt Shotwell) Date: Tue, 01 Mar 2011 09:35:10 -0500 Subject: [R] Robust variance estimation with rq (failure of the bootstrap?)
In-Reply-To: References: <1298948385.1668.106.camel@matt-laptop> Message-ID: <1298990110.2018.0.camel@matt-laptop> Jim, Thanks for pointing me to this article. The authors argue that the bootstrap intervals for a robust estimator may not be as robust as the estimator. In this context, robustness is measured by the breakdown point, which is supposed to measure robustness to outliers. Even so, the authors found that the upper bound of a quantile bootstrap interval for the sample median was nearly as robust as the sample median. That brings some comfort in using quantile bootstrap intervals in quantile regression. Does the sandwich estimator assume that errors are independent? And a related question: Does the rq function allow the user to specify clusters/grouping among the observations? Best, Matt On Tue, 2011-03-01 at 05:35 -0600, James Shaw wrote: > Matt: > > Thanks for your prompt reply. > > The disparity between the bootstrap and sandwich variance estimates > derived when modeling the highly skewed outcome suggest that either > (A) the empirical robust variance estimator is underestimating the > variance or (B) the bootstrap is breaking down. The bootstrap > variance estimate of a robust location estimate is not necessarily > robust, see Statistics & Probability Letters 50 (2000) 49-53. Since > submitting my earlier post, I have noticed that the the robust kernel > variance estimate is similar to the bootstrap estimate. Under what > conditions would one expect Koenker and Machado's sandwich variance > estimator, which uses a local estimate of the sparsity, to fail? > > -- > Jim > > > > On Mon, Feb 28, 2011 at 8:59 PM, Matt Shotwell wrote: > > Jim, > > > > If repeated measurements on patients are correlated, then resampling all > > measurements independently induces an incorrect sampling distribution > > (=> incorrect variance) on a statistic of these data. 
One solution, as > > you mention, is the block or cluster bootstrap, which preserves the > > correlation among repeated observations in resamples. I don't > > immediately see why the cluster bootstrap is unsuitable. > > > > Beyond this, I would be concerned about *any* variance estimates that > > are blind to correlated observations. > > > > The bootstrap variance estimate may be larger than the asymptotic > > variance estimate, but that alone isn't evidence to favor one over the > > other. > > > > Also, I can't justify (to myself) why skew would hamper the quality of > > bootstrap variance estimates. I wonder how it affects the sandwich > > variance estimate... > > > > Best, > > Matt > > > > On Mon, 2011-02-28 at 17:50 -0600, James Shaw wrote: > >> I am fitting quantile regression models using data collected from a > >> sample of 124 patients. When modeling cross-sectional associations, I > >> have noticed that nonparametric bootstrap estimates of the variances > >> of parameter estimates are much greater in magnitude than the > >> empirical Huber estimates derived using summary.rq's "nid" option. > >> The outcome variable is severely skewed, and I am afraid that this may > >> be affecting the consistency of the bootstrap variance estimates. I > >> have read that the m out of n bootstrap can be used to overcome this > >> problem. However, this procedure requires both the original sample > >> (n) and the subsample (m) sizes to be large. The version implemented > >> in rq.boot does not appear to provide any improvement over the naive > >> bootstrap. Ultimately, I am interested in using median regression to > >> model changes in the outcome variable over time. Summary.rq's robust > >> variance estimator is not applicable to repeated-measures data. I > >> question whether the block (cluster) bootstrap variance estimator, > >> which can accommodate intraclass correlation, would perform well. 
Can > >> anyone suggest alternatives for variance estimation in this situation? > >> Regards, > >> > >> Jim > >> > >> > >> James W. Shaw, Ph.D., Pharm.D., M.P.H. > >> Assistant Professor > >> Department of Pharmacy Administration > >> College of Pharmacy > >> University of Illinois at Chicago > >> 833 South Wood Street, M/C 871, Room 266 > >> Chicago, IL 60612 > >> Tel.: 312-355-5666 > >> Fax: 312-996-0868 > >> Mobile Tel.: 215-852-3045 > >> > >> ______________________________________________ > >> R-help at r-project.org mailing list > >> https://stat.ethz.ch/mailman/listinfo/r-help > >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > >> and provide commented, minimal, self-contained, reproducible code. > > > > > > > > > > -- > James W. Shaw, Ph.D., Pharm.D., M.P.H. > Assistant Professor > Department of Pharmacy Administration > College of Pharmacy > University of Illinois at Chicago > 833 South Wood Street, M/C 871, Room 266 > Chicago, IL 60612 > Tel.: 312-355-5666 > Fax: 312-996-0868 > Mobile Tel.: 215-852-3045 > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From adele_thompson at cargill.com Tue Mar 1 15:38:50 2011 From: adele_thompson at cargill.com (Schatzi) Date: Tue, 1 Mar 2011 06:38:50 -0800 (PST) Subject: [R] nls not solving In-Reply-To: <4D6C3E6F.8030101@ucalgary.ca> References: <1298924582492-3328647.post@n4.nabble.com> <1298931261660-3328862.post@n4.nabble.com> <4D6C3E6F.8030101@ucalgary.ca> Message-ID: <1298990330157-3329936.post@n4.nabble.com> Here is a reply by Bart: Yes you're right (I should have taken off my glasses and looked closer). However, the argument is essentially the same: Suppose you have a solution with a,b,k,l. 
Then for any positive c, [a+b-bc] + [bc] + (bc) *exp(kl')exp(-kx) is also a solution, where l' = l - log(c)/k . Cheers, Bert (Feel free to post this correction if you like) This is from me: The problem with dropping the "l" parameter is that it is supposed to account for the lag component. This equation was published in the literature and has been being solved in SAS. When I put it in excel, it solves, but not very well as it comes to a different solution for each time that I change the starting values. As such, I'm not sure how SAS solves for it and I'm not sure what I should do about the equation. Maybe I should just drop the parameter "a." Thanks for the help. -- View this message in context: http://r.789695.n4.nabble.com/nls-not-solving-tp3328647p3329936.html Sent from the R help mailing list archive at Nabble.com. From kitty.a1000 at gmail.com Tue Mar 1 16:18:16 2011 From: kitty.a1000 at gmail.com (sadz a) Date: Tue, 1 Mar 2011 15:18:16 +0000 Subject: [R] Quantreg model error and goodness of fit Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From rex.dwyer at syngenta.com Tue Mar 1 16:54:32 2011 From: rex.dwyer at syngenta.com (rex.dwyer at syngenta.com) Date: Tue, 1 Mar 2011 10:54:32 -0500 Subject: [R] Is there any Command showing correlation of all variables in a dataset? In-Reply-To: <1298976060344-3329599.post@n4.nabble.com> References: <1298976060344-3329599.post@n4.nabble.com> Message-ID: <36180405F8418449918AD20618D110FC095BF0BB43@USETCMSXMB02.NAFTA.SYNGENTA.ORG> ?cor answers that question. If Housing is a dataframe, cor(Housing) should do it. Surprisingly, ??correlation doesn't point you to ?cor. -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of JoonGi Sent: Tuesday, March 01, 2011 5:41 AM To: r-help at r-project.org Subject: [R] Is there any Command showing correlation of all variables in a dataset? Thanks in advance. 
I want to derive correlations of variables in a dataset Specifically library(Ecdat) data(Housing) attach(Housing) cor(lotsize, bathrooms) this code results only the correlationship between two variables. But I want to examine all the combinations of variables in this dataset. And I will finally make a table in Latex. How can I test correlations for all combinations of variables? with one simple command? -- View this message in context: http://r.789695.n4.nabble.com/Is-there-any-Command-showing-correlation-of-all-variables-in-a-dataset-tp3329599p3329599.html Sent from the R help mailing list archive at Nabble.com. ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. message may contain confidential information. If you are not the designated recipient, please notify the sender immediately, and delete the original and any copies. Any use of the message by you is prohibited. From Laszlo.Bodnar at erstebank.hu Tue Mar 1 17:22:42 2011 From: Laszlo.Bodnar at erstebank.hu (Bodnar Laszlo EB_HU) Date: Tue, 1 Mar 2011 17:22:42 +0100 Subject: [R] bootstrap resampling - simplified Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From tintin_jb at hotmail.com Tue Mar 1 17:34:11 2011 From: tintin_jb at hotmail.com (Jon Toledo) Date: Tue, 1 Mar 2011 17:34:11 +0100 Subject: [R] Problem on flexmix when trying to apply signature developed in one model to a new sample Message-ID: An embedded and charset-unspecified text was scrubbed... Name: no disponible URL: From Greg.Snow at imail.org Tue Mar 1 17:42:06 2011 From: Greg.Snow at imail.org (Greg Snow) Date: Tue, 1 Mar 2011 09:42:06 -0700 Subject: [R] bootstrap resampling question In-Reply-To: References: Message-ID: Here are a couple of thoughts. 
If you want to use the boot package then the statistic function you give it just receives the bootstrapped indexes, you could test the indexes for your condition of not more than 5 of each and if it fails return an NA instead of computing the statistic. Then in your output just remove the NAs (you should increase the total number of samples tried so that you have a reasonable number after deletion). If you want to do it by hand, just use the rep function to create 5 replicates of your data, then sample from that without replacement. You can get up to 5 copies of each value (from the hand replication), but no more since it samples without replacement. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.snow at imail.org 801.408.8111 > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r- > project.org] On Behalf Of Bodnar Laszlo EB_HU > Sent: Tuesday, March 01, 2011 3:31 AM > To: 'r-help at r-project.org' > Subject: [R] bootstrap resampling question > > Hello there, > > > > I have a problem concerning bootstrapping in R - especially focusing on > the resampling part of it. I try to sum it up in a simplified way so > that I would not confuse anybody. > > > > I have a small database consisting of 20 observations (basically > numbers from 1 to 20, I mean: 1, 2, 3, 4, 5, ... 18, 19, 20). > > > > I would like to resample this database many times for the bootstrap > process with the following two conditions. The resampled databases > should also have 20 observations and you can select each of the > previously mentioned 20 numbers with replacement. I guess it is obvious > so far. Now the more difficult second condition is that one number can > be selected only maximum 5 times. In order to make this clear I try to > show you an example. 
> So there can be resampled databases like the
> following ones:
>
> (1st database) 1,2,1,2,1,2,1,2,1,2,3,3,3,3,3,4,4,4,4,4
> (4 different numbers are chosen, each selected 5 times)
>
> (2nd database) 1,8,8,6,8,8,8,2,3,4,5,6,6,6,6,7,19,1,1,1
> (Two numbers - 8 and 6 - selected 5 times, number "1" selected four
> times, the others selected less than 4 times)
>
> My very first guess that came to my mind whilst thinking about the
> problem was the sample function where there are settings like
> replace=TRUE and prob=... where you can create a probability vector
> i.e. how much should be the probability of selecting a number. So I
> tried to calculate probabilities first. I thought the problem can
> basically described as a k-combination with repetitions. Unfortunately
> the only thing I could calculate so far is the total number of all
> possible selections which amounts to 137 846 527 049.
>
> Anybody knows how to implement my second "tricky" condition into one of
> the R functions? Are 'boot' and 'bootstrap' packages capable of
> managing this? I guess they are, I just couldn't figure it out yet...
>
> Thanks very much! Best regards,
> Laszlo Bodnar
>
> This e-mail and any attached files are confidential
> and/...{{dropped:19}}

From gpetris at uark.edu Tue Mar 1 15:37:31 2011
From: gpetris at uark.edu (Giovanni Petris)
Date: Tue, 01 Mar 2011 08:37:31 -0600
Subject: [R] bootstrap resampling question
In-Reply-To:
References:
Message-ID: <1298990251.1675.1757.camel@definetti>

A simple way of sampling with replacement from 1:20, with the additional
constraint that each number can be selected at most five times is

> sample(rep(1:20, 5), 20)

HTH,
Giovanni

On Tue, 2011-03-01 at 11:30 +0100, Bodnar Laszlo EB_HU wrote:
> Hello there,
>
> I have a problem concerning bootstrapping in R - especially focusing on the resampling part of it. I try to sum it up in a simplified way so that I would not confuse anybody.
>
> I have a small database consisting of 20 observations (basically numbers from 1 to 20, I mean: 1, 2, 3, 4, 5, ... 18, 19, 20).
>
> I would like to resample this database many times for the bootstrap process with the following two conditions. The resampled databases should also have 20 observations and you can select each of the previously mentioned 20 numbers with replacement. I guess it is obvious so far. Now the more difficult second condition is that one number can be selected only maximum 5 times. In order to make this clear I try to show you an example.
> So there can be resampled databases like the following ones:
>
> (1st database) 1,2,1,2,1,2,1,2,1,2,3,3,3,3,3,4,4,4,4,4
> (4 different numbers are chosen, each selected 5 times)
>
> (2nd database) 1,8,8,6,8,8,8,2,3,4,5,6,6,6,6,7,19,1,1,1
> (Two numbers - 8 and 6 - selected 5 times, number "1" selected four times, the others selected less than 4 times)
>
> My very first guess that came to my mind whilst thinking about the problem was the sample function where there are settings like replace=TRUE and prob=... where you can create a probability vector i.e. how much should be the probability of selecting a number. So I tried to calculate probabilities first. I thought the problem can basically described as a k-combination with repetitions. Unfortunately the only thing I could calculate so far is the total number of all possible selections which amounts to 137 846 527 049.
>
> Anybody knows how to implement my second "tricky" condition into one of the R functions? Are 'boot' and 'bootstrap' packages capable of managing this? I guess they are, I just couldn't figure it out yet...
>
> Thanks very much! Best regards,
> Laszlo Bodnar
>
> This e-mail and any attached files are confidential and/...{{dropped:19}}
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Giovanni Petris
Associate Professor
Department of Mathematical Sciences
University of Arkansas - Fayetteville, AR 72701
Ph: (479) 575-6324, 575-8630 (fax)
http://definetti.uark.edu/~gpetris/

From heberto.ghezzo at mcgill.ca Tue Mar 1 17:58:02 2011
From: heberto.ghezzo at mcgill.ca (R Heberto Ghezzo, Dr)
Date: Tue, 1 Mar 2011 11:58:02 -0500
Subject: [R] problems with playwith
Message-ID:

hello, i tried to run playwith but :

> library(playwith)
Loading required package: lattice
Loading required package: cairoDevice
Loading required package: gWidgetsRGtk2
Loading required package: gWidgets
Error in inDL(x, as.logical(local), as.logical(now), ...) :
  unable to load shared object 'H:/R/cran/RGtk2/libs/i386/RGtk2.dll':
  LoadLibrary failure:  The specified procedure could not be found.
Failed to load RGtk2 dynamic library, attempting to install it.
Learn more about GTK+ at http://www.gtk.org If the package still does not load, please ensure that GTK+ is installed and that it is on your PATH environment variable IN ANY CASE, RESTART R BEFORE TRYING TO LOAD THE PACKAGE AGAIN Error : .onAttach failed in attachNamespace() for 'gWidgetsRGtk2', details: call: .Call(name, ..., PACKAGE = PACKAGE) error: C symbol name "S_gtk_icon_factory_new" not in DLL for package "RGtk2" Error: package 'gWidgetsRGtk2' could not be loaded > > Sys.getenv("PATH") PATH "H:\\R/GTK/bin;H:\\R/GTK/lib;H:\\R/ImageMagick;C:\\windows\\system32;C:\\windows;C:\\windows\\System32\\Wbem;C:\\windows\\System32\\WindowsPowerShell\\v1.0\\;C:\\Program Files\\Common Files\\Ulead Systems\\MPEG;C:\\Program Files\\QuickTime\\QTSystem\\;H:\\R\\GTK\\GTK2-Runtime\\bin;H:\\PortableUSB/PortableApps/MikeTex/miktex/bin" > packages(lattice, cairoDevice, gWidgetsRGtk2, gWidgets, RGtk2, playwith) were reinstalled program GTK was reinstalled. using R-2-12-2 on Windows 7 Can anybody suggest a solution? thanks R.Heberto Ghezzo Ph.D. Montreal - Canada From sbigelow at fs.fed.us Tue Mar 1 18:01:35 2011 From: sbigelow at fs.fed.us (Seth W Bigelow) Date: Tue, 1 Mar 2011 09:01:35 -0800 Subject: [R] Does POSIXlt extract date components properly? Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From ajaytalati at googlemail.com Tue Mar 1 17:53:50 2011 From: ajaytalati at googlemail.com (AjayT) Date: Tue, 1 Mar 2011 08:53:50 -0800 (PST) Subject: [R] Speed up sum of outer products? Message-ID: <1298998430979-3330160.post@n4.nabble.com> Hi, I'm new to R and stats, and I'm trying to speed up the following sum, for (i in 1:n){ C = C + (X[i,] %o% X[i,]) # the sum of outer products - this is very slow according to Rprof() } where X is a data matrix (nrows=1000 X ncols=50), and n=1000. The sum has to be calculated over 10,000 times for different X. I think it is similar to estimating a co-variance matrix for demeaned data X. 
I tried using cov, but got different answers, and it was'nt much quicker? Any help gratefully appreciated, -- View this message in context: http://r.789695.n4.nabble.com/Speed-up-sum-of-outer-products-tp3330160p3330160.html Sent from the R help mailing list archive at Nabble.com. From scttchamberlain4 at gmail.com Tue Mar 1 18:02:05 2011 From: scttchamberlain4 at gmail.com (Scott Chamberlain) Date: Tue, 1 Mar 2011 11:02:05 -0600 Subject: [R] Is there any Command showing correlation of all variables in a dataset? In-Reply-To: <36180405F8418449918AD20618D110FC095BF0BB43@USETCMSXMB02.NAFTA.SYNGENTA.ORG> References: <1298976060344-3329599.post@n4.nabble.com> <36180405F8418449918AD20618D110FC095BF0BB43@USETCMSXMB02.NAFTA.SYNGENTA.ORG> Message-ID: <0D4219830ACD47988E3809B9F0260D59@gmail.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From mark_difford at yahoo.co.uk Tue Mar 1 17:45:45 2011 From: mark_difford at yahoo.co.uk (Mark Difford) Date: Tue, 1 Mar 2011 08:45:45 -0800 (PST) Subject: [R] mlogit.data In-Reply-To: <1298997668219-3330139.post@n4.nabble.com> References: <1298986374092-3329821.post@n4.nabble.com> <1298997668219-3330139.post@n4.nabble.com> Message-ID: <1298997945135-3330148.post@n4.nabble.com> My previous posting seems to have got mangled. This reposts it. On Mar 01, 2011; 03:32pm gmacfarlane wrote: >> workdata.csv >> The code I posted is exactly what I am running. What you need is this >> data. Here is the code again. > hbwmode<-mlogit.data("worktrips.csv", shape="long", choice="CHOSEN", > alt.var="ALTNUM") > hbwmode<-mlogit.data(hbwtrips, shape="long", choice="CHOSEN", > alt.var="ALTNUM") You still have not done what the posting guide asks for but have expected me (or someone else) to scrutinize a large unknown data set (22003 rows). Fortunately there are other routes. 
Had you studied Yves Croissant's examples (?mlogit.data), which do work, you would have seen that your input or "raw" data have to have a particular format for mlogit.data to work. In particular, the "alt.var" ("mode" in the TravelMode data set and "ALTNUM" in your data set) has to go through all its levels in sequence. Yours don't (your variable has 6 levels but sometimes runs from 1 to 5, sometimes from 2 to 6, and so on). Within each run there must be only one choice.

##
> library(mlogit)
> data("TravelMode", package = "AER")
> head(TravelMode, n= 20)
   individual  mode choice wait vcost travel gcost income size
1           1   air     no   69    59    100    70     35    1
2           1 train     no   34    31    372    71     35    1
3           1   bus     no   35    25    417    70     35    1
4           1   car    yes    0    10    180    30     35    1
5           2   air     no   64    58     68    68     30    2
6           2 train     no   44    31    354    84     30    2
7           2   bus     no   53    25    399    85     30    2
8           2   car    yes    0    11    255    50     30    2
9           3   air     no   69   115    125   129     40    1
10          3 train     no   34    98    892   195     40    1
11          3   bus     no   35    53    882   149     40    1
12          3   car    yes    0    23    720   101     40    1
13          4   air     no   64    49     68    59     70    3
14          4 train     no   44    26    354    79     70    3
15          4   bus     no   53    21    399    81     70    3
16          4   car    yes    0     5    180    32     70    3
17          5   air     no   64    60    144    82     45    2
18          5 train     no   44    32    404    93     45    2
19          5   bus     no   53    26    449    94     45    2
20          5   car    yes    0     8    600    99     45    2

When we look at just the relevant part of your data we have the following:

> hbwtrips<-read.csv("E:/Downloads/workdata.csv", header=TRUE, sep=",",
> dec=".", row.names=NULL)
> head(hbwtrips[, c(2:11)], n=25)
   HHID PERID CASE ALTNUM NUMALTS CHOSEN  IVTT  OVTT  TVTT   COST
1     2     1    1      1       5      1 13.38  2.00 15.38  70.63
2     2     1    1      2       5      0 18.38  2.00 20.38  35.32
3     2     1    1      3       5      0 20.38  2.00 22.38  20.18
4     2     1    1      4       5      0 25.90 15.20 41.10 115.64
5     2     1    1      5       5      0 40.50  2.00 42.50   0.00
6     3     1    2      1       5      0 29.92 10.00 39.92 390.81
7     3     1    2      2       5      0 34.92 10.00 44.92 195.40
8     3     1    2      3       5      0 21.92 10.00 31.92  97.97
9     3     1    2      4       5      1 22.96 14.20 37.16 185.00
10    3     1    2      5       5      0 58.95 10.00 68.95   0.00
11    5     1    3      1       4      1  8.60  6.00 14.60  37.76
12    5     1    3      2       4      0 13.60  6.00 19.60  18.88
13    5     1    3      3       4      0 15.60  6.00 21.60  10.79
14    5     1    3      4       4      0 16.87 21.40 38.27 105.00
15    6     1    4      1       4      0 30.60  8.50 39.10 417.32
16    6     1    4      2       4      0 35.70  8.50 44.20 208.66
17    6     1    4      3       4      0 22.70  8.50 31.20 105.54
18    6     1    4      4       4      1 24.27  9.00 33.27 193.49
19    8     2    5      2       4      1 23.04  3.00 26.04  29.95
20    8     2    5      3       4      0 25.04  3.00 28.04  17.12
21    8     2    5      4       4      0 25.04 23.50 48.54 100.00
22    8     2    5      5       4      0 34.35  3.00 37.35   0.00
23    8     3    6      2       5      0 11.14  3.50 14.64  14.00
24    8     3    6      3       5      0 13.14  3.50 16.64   8.00
25    8     3    6      4       5      1  3.95 16.24 20.19 100.00

To show you that this is so we will mock up two variables that have the characteristics described above and use them to execute the function.

##
hbwtrips$CHOICEN <- rep(c(rep(0,10),1), 2003)
hbwtrips$ALTNUMTest <- gl(11,1,22033, labels=LETTERS[1:11])
hbwtrips[1:30, c(1:11,44,45)]
hbwmode <- mlogit.data(hbwtrips, varying=c(8:11), shape="long", choice="CHOICEN", alt.var="ALTNUMTest")

Hope that helps,

Regards, Mark.

--
View this message in context: http://r.789695.n4.nabble.com/mlogit-data-tp3328739p3330148.html
Sent from the R help mailing list archive at Nabble.com.

From landronimirc at gmail.com Tue Mar 1 18:08:18 2011
From: landronimirc at gmail.com (Liviu Andronic)
Date: Tue, 1 Mar 2011 18:08:18 +0100
Subject: [R] problems with playwith
In-Reply-To:
References:
Message-ID:

On Tue, Mar 1, 2011 at 5:58 PM, R Heberto Ghezzo, Dr wrote:
> hello, i tried to run playwith but :
>
>> library(playwith)
> Loading required package: lattice
> Loading required package: cairoDevice
> Loading required package: gWidgetsRGtk2
> Loading required package: gWidgets
> Error in inDL(x, as.logical(local), as.logical(now), ...) :
>   unable to load shared object 'H:/R/cran/RGtk2/libs/i386/RGtk2.dll':
>   LoadLibrary failure:  The specified procedure could not be found.
> Failed to load RGtk2 dynamic library, attempting to install it.
>
Did you install RGtk2?
[1]
Liviu

[1] https://code.google.com/p/playwith/

> Learn more about GTK+ at http://www.gtk.org
> If the package still does not load, please ensure that GTK+ is installed and that it is on your PATH environment variable
> IN ANY CASE, RESTART R BEFORE TRYING TO LOAD THE PACKAGE AGAIN
> Error : .onAttach failed in attachNamespace() for 'gWidgetsRGtk2', details:
>   call: .Call(name, ..., PACKAGE = PACKAGE)
>   error: C symbol name "S_gtk_icon_factory_new" not in DLL for package "RGtk2"
> Error: package 'gWidgetsRGtk2' could not be loaded
>>
>> Sys.getenv("PATH")
> PATH
> "H:\\R/GTK/bin;H:\\R/GTK/lib;H:\\R/ImageMagick;C:\\windows\\system32;C:\\windows;C:\\windows\\System32\\Wbem;C:\\windows\\System32\\WindowsPowerShell\\v1.0\\;C:\\Program Files\\Common Files\\Ulead Systems\\MPEG;C:\\Program Files\\QuickTime\\QTSystem\\;H:\\R\\GTK\\GTK2-Runtime\\bin;H:\\PortableUSB/PortableApps/MikeTex/miktex/bin"
>>
> packages(lattice, cairoDevice, gWidgetsRGtk2, gWidgets, RGtk2, playwith) were reinstalled
> program GTK was reinstalled.
> using R-2-12-2 on Windows 7
> Can anybody suggest a solution?
> thanks
>
> R.Heberto Ghezzo Ph.D.
> Montreal - Canada
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail From landronimirc at gmail.com Tue Mar 1 18:11:03 2011 From: landronimirc at gmail.com (Liviu Andronic) Date: Tue, 1 Mar 2011 18:11:03 +0100 Subject: [R] Is there any Command showing correlation of all variables in a dataset? In-Reply-To: <1298976060344-3329599.post@n4.nabble.com> References: <1298976060344-3329599.post@n4.nabble.com> Message-ID: On Tue, Mar 1, 2011 at 11:41 AM, JoonGi wrote: > > Thanks in advance. > > I want to derive correlations of variables in a dataset > > Specifically > > library(Ecdat) > data(Housing) > attach(Housing) > cor(lotsize, bathrooms) > > this code results only the correlationship between two variables. > But I want to examine all the combinations of variables in this dataset. > And I will finally make a table in Latex. > > How can I test correlations for all combinations of variables? > with one simple command? > See Rcmdr for an example. Liviu > > -- > View this message in context: http://r.789695.n4.nabble.com/Is-there-any-Command-showing-correlation-of-all-variables-in-a-dataset-tp3329599p3329599.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Do you know how to read? http://www.alienetworks.com/srtest.cfm http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader Do you know how to write? http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail From millerlp at gmail.com Tue Mar 1 18:17:43 2011 From: millerlp at gmail.com (Luke Miller) Date: Tue, 1 Mar 2011 12:17:43 -0500 Subject: [R] Does POSIXlt extract date components properly? In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From ligges at statistik.tu-dortmund.de Tue Mar 1 18:28:06 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Tue, 01 Mar 2011 18:28:06 +0100 Subject: [R] SetInternet2, RCurl and proxy In-Reply-To: <1298977247406-3329624.post@n4.nabble.com> References: <1298977247406-3329624.post@n4.nabble.com> Message-ID: <4D6D2CA6.9020506@statistik.tu-dortmund.de> On 01.03.2011 12:00, Manta wrote: > Dear all, > > I am facing a problem. I am trying to install packages using a proxy, but I > am not able to call the setInternet2 function, either with the small or > capital s. What package do I have to call then? And, could there be a reason > why this does not function? What does the error message say? Is this a recent version of R? Uwe Ligges > > Thanks, > Marco > From spector at stat.berkeley.edu Tue Mar 1 18:30:43 2011 From: spector at stat.berkeley.edu (Phil Spector) Date: Tue, 1 Mar 2011 09:30:43 -0800 (PST) Subject: [R] Speed up sum of outer products? In-Reply-To: <1298998430979-3330160.post@n4.nabble.com> References: <1298998430979-3330160.post@n4.nabble.com> Message-ID: What you're doing is breaking up the calculation of X'X into n steps. I'm not sure what you mean by "very slow": > X = matrix(rnorm(1000*50),1000,50) > n = 1000 > system.time({C=matrix(0,50,50);for(i in 1:n)C = C + (X[i,] %o% X[i,])}) user system elapsed 0.096 0.008 0.104 Of course, you could just do the calculation directly: > system.time({C1 = t(X) %*% X}) user system elapsed 0.008 0.000 0.007 > all.equal(C,C1) [1] TRUE - Phil Spector Statistical Computing Facility Department of Statistics UC Berkeley spector at stat.berkeley.edu On Tue, 1 Mar 2011, AjayT wrote: > Hi, I'm new to R and stats, and I'm trying to speed up the following sum, > > for (i in 1:n){ > C = C + (X[i,] %o% X[i,]) # the sum of outer products - this is very slow > according to Rprof() > } > > where X is a data matrix (nrows=1000 X ncols=50), and n=1000. 
The sum has to > be calculated over 10,000 times for different X. > > I think it is similar to estimating a co-variance matrix for demeaned data > X. I tried using cov, but got different answers, and it was'nt much quicker? > > Any help gratefully appreciated, > > -- > View this message in context: http://r.789695.n4.nabble.com/Speed-up-sum-of-outer-products-tp3330160p3330160.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From HDoran at air.org Tue Mar 1 18:43:16 2011 From: HDoran at air.org (Doran, Harold) Date: Tue, 1 Mar 2011 12:43:16 -0500 Subject: [R] Speed up sum of outer products? In-Reply-To: References: <1298998430979-3330160.post@n4.nabble.com> Message-ID: Isn't the following the canonical (R-ish) way of doing this: X = matrix(rnorm(1000*50),1000,50) system.time({C1 = t(X) %*% X}) # Phil's example C2 <- crossprod(X) # use crossprod instead > all.equal(C1,C2) [1] TRUE > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On > Behalf Of Phil Spector > Sent: Tuesday, March 01, 2011 12:31 PM > To: AjayT > Cc: r-help at r-project.org > Subject: Re: [R] Speed up sum of outer products? > > What you're doing is breaking up the calculation of X'X > into n steps. 
I'm not sure what you mean by "very slow": > > > X = matrix(rnorm(1000*50),1000,50) > > n = 1000 > > system.time({C=matrix(0,50,50);for(i in 1:n)C = C + (X[i,] %o% X[i,])}) > user system elapsed > 0.096 0.008 0.104 > > Of course, you could just do the calculation directly: > > > system.time({C1 = t(X) %*% X}) > user system elapsed > 0.008 0.000 0.007 > > all.equal(C,C1) > [1] TRUE > > > - Phil Spector > Statistical Computing Facility > Department of Statistics > UC Berkeley > spector at stat.berkeley.edu > > > > On Tue, 1 Mar 2011, AjayT wrote: > > > Hi, I'm new to R and stats, and I'm trying to speed up the following sum, > > > > for (i in 1:n){ > > C = C + (X[i,] %o% X[i,]) # the sum of outer products - this is very > slow > > according to Rprof() > > } > > > > where X is a data matrix (nrows=1000 X ncols=50), and n=1000. The sum has to > > be calculated over 10,000 times for different X. > > > > I think it is similar to estimating a co-variance matrix for demeaned data > > X. I tried using cov, but got different answers, and it was'nt much quicker? > > > > Any help gratefully appreciated, > > > > -- > > View this message in context: http://r.789695.n4.nabble.com/Speed-up-sum-of- > outer-products-tp3330160p3330160.html > > Sent from the R help mailing list archive at Nabble.com. > > > > ______________________________________________ > > R-help at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
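
A minimal, self-contained sketch of the point made in this thread (it is not part of the original messages): the loop over outer products, t(X) %*% X and crossprod(X) should all give the same matrix. It also checks the relation to cov(X), which centres each column and divides by n - 1; that may be why the original poster saw "different answers" from cov. The sizes and the seed below are arbitrary choices for illustration, not values from the posts.

```r
set.seed(42)                      # arbitrary seed, only for reproducibility
n <- 200; p <- 5                  # small illustrative sizes (the post used 1000 x 50)
X <- matrix(rnorm(n * p), n, p)

# Sum of outer products, one row at a time (the loop from the question)
C_loop <- matrix(0, p, p)
for (i in seq_len(n)) C_loop <- C_loop + X[i, ] %o% X[i, ]

# The same matrix computed in a single call
C_mat   <- t(X) %*% X
C_cross <- crossprod(X)

same_loop_cross <- isTRUE(all.equal(C_loop, C_cross))
same_mat_cross  <- isTRUE(all.equal(C_mat, C_cross))

# cov() centres each column and divides by n - 1, so it matches
# crossprod() only after centring and rescaling
Xc <- scale(X, center = TRUE, scale = FALSE)
same_cov <- isTRUE(all.equal(crossprod(Xc) / (n - 1), cov(X),
                             check.attributes = FALSE))

print(c(same_loop_cross, same_mat_cross, same_cov))
```

If the sum really has to be recomputed for many different X, calling crossprod(X) once per X avoids the inner loop entirely.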
From ehlers at ucalgary.ca Tue Mar 1 18:57:27 2011 From: ehlers at ucalgary.ca (Peter Ehlers) Date: Tue, 01 Mar 2011 09:57:27 -0800 Subject: [R] nls not solving In-Reply-To: <1298990330157-3329936.post@n4.nabble.com> References: <1298924582492-3328647.post@n4.nabble.com> <1298931261660-3328862.post@n4.nabble.com> <4D6C3E6F.8030101@ucalgary.ca> <1298990330157-3329936.post@n4.nabble.com> Message-ID: <4D6D3387.3080601@ucalgary.ca> On 2011-03-01 06:38, Schatzi wrote: > Here is a reply by Bart: > Yes you're right (I should have taken off my glasses and looked closer). > However, the argument is essentially the same: > > Suppose you have a solution with a,b,k,l. Then for any positive c, [a+b-bc] > + [bc] + (bc) *exp(kl')exp(-kx) is also a solution, where l' > = l - log(c)/k . > > Cheers, > Bert > > (Feel free to post this correction if you like) > > > This is from me: > The problem with dropping the "l" parameter is that it is supposed to > account for the lag component. This equation was published in the literature > and has been being solved in SAS. When I put it in excel, it solves, but not > very well as it comes to a different solution for each time that I change > the starting values. As such, I'm not sure how SAS solves for it and I'm not > sure what I should do about the equation. Maybe I should just drop the > parameter "a." Thanks for the help. When you say 'published in the literature' you should provide a reference; you may be misinterpreting what's published. If SAS provides a 'solution', then there's an added assumption being made (perhaps 'l' is being fixed?). What Excel does is of little interest. 'Dropping' the parameter 'a' is equivalent to setting a=0. You could also set, say, a = -10 or l = 50, or ... The point is that, as Bert says, the model is nonidentifiable. 
Peter Ehlers From mantino84 at libero.it Tue Mar 1 18:41:07 2011 From: mantino84 at libero.it (Manta) Date: Tue, 1 Mar 2011 09:41:07 -0800 (PST) Subject: [R] SetInternet2, RCurl and proxy In-Reply-To: <4D6D2CA6.9020506@statistik.tu-dortmund.de> References: <1298977247406-3329624.post@n4.nabble.com> <4D6D2CA6.9020506@statistik.tu-dortmund.de> Message-ID: <1299001267732-3330244.post@n4.nabble.com> It says the function does not exist. The version is around 2.8, cant check right now. Is it because it's an older version? If so, is there any way to do it in a different way then? -- View this message in context: http://r.789695.n4.nabble.com/SetInternet2-RCurl-and-proxy-tp3248576p3330244.html Sent from the R help mailing list archive at Nabble.com. From chen_1002 at fisher.osu.edu Tue Mar 1 19:06:24 2011 From: chen_1002 at fisher.osu.edu (chen jia) Date: Tue, 1 Mar 2011 13:06:24 -0500 Subject: [R] Data type problem when extract data from SQLite to R by using RSQLite In-Reply-To: References: Message-ID: Hi Seth, Thanks so much for identifying the problem and explaining everything. I think the first solution that you suggest--make sure the schema has well defined types--would work the best for me. But, I have one question about how to implement it, which is more about sqlite itself. First, I found out that the columns that don't have the expected data types in the table annual_data3 are created by aggregate functions in a separate table. These columns are later combined with other columns that do. I read the link that you provide, http://www.sqlite.org/datatype3.html. One paragraph says "When grouping values with the GROUP BY clause values with different storage classes are considered distinct, except for INTEGER and REAL values which are considered equal if they are numerically equal. No affinities are applied to any values as the result of a GROUP by clause." 
If I understand it correctly, the columns created by aggregate functions with a GROUP BY clause do not have any expected data types.

My solution is to use a CREATE TABLE statement to declare the expected data types and then insert the values of the columns created by the aggregate functions with the GROUP BY clause. However, this solution requires a CREATE TABLE statement every time an aggregate function and a GROUP BY clause are used.

My question is: Is this the best way to make sure that the columns that result from a GROUP BY clause have the expected data types? Thanks.

Best,
Jia

On Tue, Mar 1, 2011 at 1:16 AM, Seth Falcon wrote:
> Hi Jia,
>
> On Mon, Feb 28, 2011 at 6:57 PM, chen jia wrote:
>> The .schema of table annual_data3 is
>> sqlite> .schema annual_data3
>> CREATE TABLE "annual_data3"(
>>   PERMNO INT,
>>   DATE INT,
>>   CUSIP TEXT,
>>   EXCHCD INT,
>>   SICCD INT,
>>   SHROUT INT,
>>   PRC REAL,
>>   RET REAL,
>>   ...
>>   pret_var,
>>   pRET_sd,
>>   nmret,
>>   pya_var,
>
> [snip]
>
> Is there a reason that you've told SQLite the expected data type for
> only some of the columns?
>
>> Interestingly, I find that the problem I reported does not for columns
>> labeled real in the schema info. For example, the type of column RET
>> never changes no matter what the first observation is.
>
> Yes, that is expected and I think it is the solution to your problem:
> setup your schema so that all columns have a declared type.  For some
> details on SQLite's type system see
> http://www.sqlite.org/datatype3.html.
>
> RSQLite currently maps NA values to NULL in the database.  Pulling
> data out of a SELECT query, RSQLite uses the sqlite3_column_type
> SQLite API to determine the data type and map it to an R type.  If
> NULL is encountered, then the schema is inspected using
> sqlite3_column_decltype to attempt to obtain a type.  If that fails,
> the data is mapped to a character vector at the R level.  The type
> selection is done once after the first row has been fetched.
>
> To work around this you can:
>
> - make sure your schema has well defined
>   types (which will help SQLite perform its operations);
>
> - check whether the returned column has the expected type and convert
>   if needed at the R level.
>
> - remove NA/NULL values from the db or decide on a different way of
>   encoding them (e.g you might be able to use -1 in the db in some
>   situation to indicate missing).  Your R code would then need to map
>   these to proper NA.
>
> Hope that helps.
>
> + seth
>
> --
> Seth Falcon | @sfalcon | http://userprimary.net/
>

--
700 Fisher Hall
2100 Neil Ave.
Columbus, Ohio 43210
http://www.fisher.osu.edu/~chen_1002/

From avsmith at gmail.com Tue Mar 1 19:10:41 2011
From: avsmith at gmail.com (Albert Vernon Smith)
Date: Tue, 1 Mar 2011 18:10:41 +0000
Subject: [R] Adjusting values via vectorization
Message-ID:

I'm adjusting values in a list based on a couple of matrixes. One matrix specifies the row to be taken from the adjustment matrix, while using the aligned column values. I have an approach which works, but I might find an approach with vectorization.

Here is code with my solution:

--
nids <- 10
npredictors <- 2
ncol <- 4
values <- sample(c(-1,0,1),nids,replace=TRUE)
input <- matrix(sample(1:ncol,nids*npredictors,replace=TRUE),nrow=nids)
values.adjust <- matrix(rnorm(ncol*npredictors),ncol=npredictors)

for(i in 1:nids){
  for(j in 1:npredictors){
    values[i] <- values[i] + values.adjust[input[i,j],j]
  }
}
--

I'm using this as an example to hopefully better understand R syntax w.r.t. vectorization. Is there such a way to replace my for loops?

Thanks,
-albert

From ripley at stats.ox.ac.uk Tue Mar 1 19:14:29 2011
From: ripley at stats.ox.ac.uk (Prof Brian Ripley)
Date: Tue, 1 Mar 2011 18:14:29 +0000 (GMT)
Subject: [R] Does POSIXlt extract date components properly?
In-Reply-To:
References:
Message-ID:

On Tue, 1 Mar 2011, Seth W Bigelow wrote:

> I would like to use POSIX classes to store dates and extract components of
> dates.
> Following the example in Spector ("Data Manipulation in R"), I
> create a date
>
>> mydate = as.POSIXlt('2005-4-19 7:01:00')
>
> I then successfully extract the day with the command
>
>> mydate$day
> [1] 19
>
> But when I try to extract the month
>
> > mydate$mon
> [1] 3
>
> it returns the wrong month. And mydate$year is off by about 2,000 years.
> Am I doing something wrong?

Not reading the documentation (nor the posting guide). ?DateTimeClasses says

     'mon' 0-11: months after the first of the year.

     'year' years since 1900.

That is the POSIX standard ... you could also have looked there.

> Dr. Seth W. Bigelow
> Biologist, USDA-FS Pacific Southwest Research Station
> 1731 Research Park Drive, Davis California
> sbigelow at fs.fed.us / ph. 530 759 1718
> [[alternative HTML version deleted]]

Please note what the posting guide said about that!

> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595

From stp08emj at shef.ac.uk Tue Mar 1 19:09:29 2011
From: stp08emj at shef.ac.uk (emj83)
Date: Tue, 1 Mar 2011 10:09:29 -0800 (PST)
Subject: [R] boa library and plots
In-Reply-To:
References: <1298550107927-3322508.post@n4.nabble.com>
Message-ID: <1299002969632-3330299.post@n4.nabble.com>

Many thanks for your response, and I am sorry I did not post correctly. I have found dev.copy2eps() useful.

Emma

--
View this message in context: http://r.789695.n4.nabble.com/boa-library-and-plots-tp3322508p3330299.html
Sent from the R help mailing list archive at Nabble.com.
From jdaily at usgs.gov Tue Mar 1 19:18:20 2011 From: jdaily at usgs.gov (Jonathan P Daily) Date: Tue, 1 Mar 2011 13:18:20 -0500 Subject: [R] bootstrap resampling question In-Reply-To: <1298990251.1675.1757.camel@definetti> References: <1298990251.1675.1757.camel@definetti> Message-ID: I'm not sure that is equivalent to sampling with replacement, since if the first "draw" is 1, then the probability that the next draw will be one is 4/100 instead of the 1/20 it would be in sampling with replacement. I think the way to do this would be what Greg suggested - something like: bigsamp <- sample(1:20, 100, T) idx <- sort(unlist(sapply(1:20, function(x) which(bigsamp == x)[1:5])))[1:20] samp <- bigsamp[idx] -------------------------------------- Jonathan P. Daily Technician - USGS Leetown Science Center 11649 Leetown Road Kearneysville WV, 25430 (304) 724-4480 "Is the room still a room when its empty? Does the room, the thing itself have purpose? Or do we, what's the word... imbue it." - Jubal Early, Firefly r-help-bounces at r-project.org wrote on 03/01/2011 09:37:31 AM: > [image removed] > > Re: [R] bootstrap resampling question > > Giovanni Petris > > to: > > Bodnar Laszlo EB_HU > > 03/01/2011 11:58 AM > > Sent by: > > r-help-bounces at r-project.org > > Cc: > > "'r-help at r-project.org'" > > A simple way of sampling with replacement from 1:20, with the additional > constraint that each number can be selected at most five times is > > > sample(rep(1:20, 5), 20) > > HTH, > Giovanni > > On Tue, 2011-03-01 at 11:30 +0100, Bodnar Laszlo EB_HU wrote: > > Hello there, > > > > I have a problem concerning bootstrapping in R - especially > focusing on the resampling part of it. I try to sum it up in a > simplified way so that I would not confuse anybody. > > > > I have a small database consisting of 20 observations (basically > numbers from 1 to 20, I mean: 1, 2, 3, 4, 5, ... 18, 19, 20). 
> > > > I would like to resample this database many times for the > bootstrap process with the following two conditions. The resampled > databases should also have 20 observations and you can select each > of the previously mentioned 20 numbers with replacement. I guess it > is obvious so far. Now the more difficult second condition is that > one number can be selected only maximum 5 times. In order to make > this clear I try to show you an example. So there can be resampled > databases like the following ones: > > > > (1st database) 1,2,1,2,1,2,1,2,1,2,3,3,3,3,3,4,4,4,4,4 > > (4 different numbers are chosen, each selected 5 times) > > > > (2nd database) 1,8,8,6,8,8,8,2,3,4,5,6,6,6,6,7,19,1,1,1 > > (Two numbers - 8 and 6 - selected 5 times, number "1" selected > four times, the others selected less than 4 times) > > > > My very first guess that came to my mind whilst thinking about the > problem was the sample function where there are settings like > replace=TRUE and prob=... where you can create a probability vector > i.e. how much should be the probability of selecting a number. So I > tried to calculate probabilities first. I thought the problem can > basically described as a k-combination with repetitions. > Unfortunately the only thing I could calculate so far is the total > number of all possible selections which amounts to 137 846 527 049. > > > > Anybody knows how to implement my second "tricky" condition into > one of the R functions? Are 'boot' and 'bootstrap' packages capable > of managing this? I guess they are, I just couldn't figure it out yet... > > > > Thanks very much! Best regards, > > Laszlo Bodnar > >
> > ______________________________________________ > > R-help at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > -- > > Giovanni Petris > Associate Professor > Department of Mathematical Sciences > University of Arkansas - Fayetteville, AR 72701 > Ph: (479) 575-6324, 575-8630 (fax) > http://definetti.uark.edu/~gpetris/ > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From stp08emj at shef.ac.uk Tue Mar 1 19:15:45 2011 From: stp08emj at shef.ac.uk (emj83) Date: Tue, 1 Mar 2011 10:15:45 -0800 (PST) Subject: [R] more boa plots questions Message-ID: <1299003345958-3330312.post@n4.nabble.com> I have MCMC output chains A and B for example, I want to produce trace plots for them using the boa command line...
#loads boa boa.init() #reads in chains boa.chain.add(boa.importMatrix('A'), 'A') boa.chain.add(boa.importMatrix('B'), 'B') #plot trace plot problems arise here! I know I can get trace plots using boa.plot('trace') but this plots the parameter chains on the same plot- I want separate plots using boa.plot.trace() #from the manual.. boa.plot.trace(lnames, pname, annotate = boa.par("legend")) lnames: Character vector giving the name of the desired MCMC sequence in the working session list of sequences. pname: Character string giving the name of the parameters to be plotted. annotate: Logical value indicating that a legend be included in the plot. I tried boa.plot.trace(B) and boa.plot.trace(B,"B") but both do not give me a trace plot for chain B and print FALSE. I am obviously misinterpreting lnames and pnames. Can anyone help please? Thanks in advance Emma -- View this message in context: http://r.789695.n4.nabble.com/more-boa-plots-questions-tp3330312p3330312.html Sent from the R help mailing list archive at Nabble.com. From price_ja at hotmail.com Tue Mar 1 19:29:57 2011 From: price_ja at hotmail.com (Jim Price) Date: Tue, 1 Mar 2011 10:29:57 -0800 (PST) Subject: [R] Lattice: useOuterStrips and axes Message-ID: <1299004197583-3330338.post@n4.nabble.com> Consider the following: library(lattice) library(latticeExtra) temp <- expand.grid( subject = factor(paste('Subject', 1:3)), var = factor(paste('Variable', 1:3)), time = 1:10 ) temp$resp <- rnorm(nrow(temp), 10 * as.numeric(temp$var), 1) ylimits <- by(temp$resp, temp$var, function(x) range(pretty(x))) useOuterStrips(xyplot( resp ~ time | subject * var, data = temp, as.table = TRUE, scales = list( alternating = 1, tck = c(1, 0), y = list(relation = 'free', rot = 0, limits = rep(ylimits, each = 3)) ) )) This is a matrix of variables on subjects, where it makes sense to have panel-specific y-axes because of the differing variable ranges. 
In fact, it
makes sense to have row-specific y-axes, because of the similarity of
intra-variable inter-subject response. However the graphic as presented
gives per-panel y-axis annotation - I'd like to drop the 2nd and 3rd columns
of y-axes labels, so the panels become a contiguous array.

Is this possible?

Thanks,
Jim Price.
Cardiome Pharma. Corp.

--
View this message in context: http://r.789695.n4.nabble.com/Lattice-useOuterStrips-and-axes-tp3330338p3330338.html
Sent from the R help mailing list archive at Nabble.com.

From ajaytalati at googlemail.com Tue Mar 1 19:52:57 2011
From: ajaytalati at googlemail.com (AjayT)
Date: Tue, 1 Mar 2011 10:52:57 -0800 (PST)
Subject: [R] Speed up sum of outer products?
In-Reply-To:
References: <1298998430979-3330160.post@n4.nabble.com>
Message-ID: <1299005577480-3330378.post@n4.nabble.com>

Hey thanks a lot guys !!! That really speeds things up !!! I didn't know %*%
and crossprod could operate on matrices. I think you've saved me hours in
calculation time. Thanks again.

> system.time({C=matrix(0,50,50);for(i in 1:n)C = C + (X[i,] %o% X[i,])})
   user  system elapsed
   0.45    0.00    0.90
> system.time({C1 = t(X) %*% X})
   user  system elapsed
   0.02    0.00    0.05
> system.time({C2 = crossprod(X)})
   user  system elapsed
   0.02    0.00    0.02

--
View this message in context: http://r.789695.n4.nabble.com/Speed-up-sum-of-outer-products-tp3330160p3330378.html
Sent from the R help mailing list archive at Nabble.com.

From rex.dwyer at syngenta.com Tue Mar 1 20:11:45 2011
From: rex.dwyer at syngenta.com (rex.dwyer at syngenta.com)
Date: Tue, 1 Mar 2011 14:11:45 -0500
Subject: [R] Finding pairs with least magnitude difference from mean
In-Reply-To:
References: <36180405F8418449918AD20618D110FC095BF0B0F1@USETCMSXMB02.NAFTA.SYNGENTA.ORG>
Message-ID: <36180405F8418449918AD20618D110FC095BF5470F@USETCMSXMB02.NAFTA.SYNGENTA.ORG>

No, that's not what I meant, but maybe I didn't understand the question.
What I suggested would involve sorting y, not x: "sort the *distances*".
If you want to minimize the sd of a subset of numbers, you sort the numbers and find a subset that is clumped together. If the numbers are a function of pairs, you compute the function for all pairs of numbers, and find a subset that's clumped together. Anyway, it's an idea, not a theorem, so proof is left as an exercise for the esteemed reader. -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Hans W Borchers Sent: Monday, February 28, 2011 2:17 PM To: r-help at stat.math.ethz.ch Subject: Re: [R] Finding pairs with least magnitude difference from mean syngenta.com> writes: > James, > It seems the 2*mean(x) term is irrelevant if you are seeking to > minimize sd. Then you want to sort the distances from smallest to > largest. Then it seems clear that your five values will be adjacent in > the list, since if you have a set of five adjacent values, exchanging > any of them for one further away in the list will increase the sd. The > only problem I see with this is that you can't use a number more than > once. In any case, you need to compute the best five pairs beginning > at position i in the sorted list, for 1<=i<=choose(n,2), then take the > max over all i. > There no R in my answer such as you'd notice, but I hope it helps just > the same. > Rex You probably mean something like the following: x <- rnorm(10) y <- outer(x, x, "+") - (2 * mean(x)) o <- order(x) sd(c(y[o[1],o[10]], y[o[2],o[9]], y[o[3],o[8]], y[o[4],o[7]], y[o[5],o[6]])) This seems reasonable, though you would have to supply a more stringent argument. I did two tests and it works alright. --Hans Werner ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. message may contain confidential information. 
If you are not the designated recipient, please notify the sender immediately, and delete the original and any copies. Any use of the message by you is prohibited. From djmuser at gmail.com Tue Mar 1 20:13:37 2011 From: djmuser at gmail.com (Dennis Murphy) Date: Tue, 1 Mar 2011 11:13:37 -0800 Subject: [R] bootstrap resampling - simplified In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From ivowel at gmail.com Tue Mar 1 20:19:26 2011 From: ivowel at gmail.com (ivo welch) Date: Tue, 1 Mar 2011 14:19:26 -0500 Subject: [R] inefficient ifelse() ? Message-ID: dear R experts--- t <- 1:30 f <- function(t) { cat("f for", t, "\n"); return(2*t) } g <- function(t) { cat("g for", t, "\n"); return(3*t) } s <- ifelse( t%%2==0, g(t), f(t)) shows that the ifelse function actually evaluates both f() and g() for all values first, and presumably then just picks left or right results based on t%%2. uggh... wouldn't it make more sense to evaluate only the relevant parts of each vector and then reassemble them? /iaw ---- Ivo Welch From wwwhsd at gmail.com Tue Mar 1 20:33:52 2011 From: wwwhsd at gmail.com (Henrique Dallazuanna) Date: Tue, 1 Mar 2011 16:33:52 -0300 Subject: [R] inefficient ifelse() ? In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From arrayprofile at yahoo.com Tue Mar 1 20:38:22 2011 From: arrayprofile at yahoo.com (array chip) Date: Tue, 1 Mar 2011 11:38:22 -0800 (PST) Subject: [R] glht() used with coxph() Message-ID: <647233.14657.qm@web56305.mail.re3.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From darylm at uw.edu Tue Mar 1 20:32:13 2011 From: darylm at uw.edu (Daryl Morris) Date: Tue, 01 Mar 2011 11:32:13 -0800 Subject: [R] expression help Message-ID: <4D6D49BD.2070007@uw.edu> Hello, I am trying to write math-type on a plot. 
Due to space limitations on the plot, I want 2 short expressions written on top of each other. It is certainly possible to write them in two separate calls, but that involves fine-tuning locations and a lot of trial and error (and I'm trying to write a general purpose function). Here's where I've gotten to: plot(0:1,0:1,xaxt="n") axis(side=1,at=.3,expression(paste("IFN-", gamma, "\n", "TNF-", alpha))) axis(side=1,at=.6,"a label\n2nd line") What I am trying to do is illustrated by the the non-expression axis label ("a label\n2nd line"). The "\n" forces a new line when I'm not using expressions, but doesn't work for my real example. I have googled for general documentation on expressions, but the only thing I've been able to find is the help for plotmath(), so if someone can point me to more complete documentation that would also be helpful. thanks, Daryl SCHARP, FHCRC, UW Biostatistics From mark.lyman at ngc.com Tue Mar 1 20:07:03 2011 From: mark.lyman at ngc.com (Mark Lyman) Date: Tue, 1 Mar 2011 19:07:03 +0000 Subject: [R] odbcConnectExcel2007 creates corrupted files Message-ID: I tried creating a .xlsx file using odbcConnectExcel2007 and adding a worksheet with sqlSave. This seems to work, I am even able to query the worksheet, but when I try opening the file in Excel I get the following message: "Excel cannot open the file 'test.xlx' because the file format or file extension is not valid. Verify that the file has not been corrupted and that the file extension matches the format of the file." Is this a known issue? Or just user error? The RODBC manual seemed to indicate that sqlSave worked fine with Excel, however it did not mention Excel 2007. I am running Excel 2007 and R 2.12.1 on Windows XP. Below is my example code. 
$ library(RODBC)
$
$ # This doesn't work
$ # Connect to a previously non-existent Excel file
$ out <- odbcConnectExcel2007("test.xlsx", readOnly=FALSE)
$ test <- data.frame(x=1:10, y=rnorm(10))
$ sqlSave(out, test)
$ sqlTables(out)
                                                    TABLE_CAT TABLE_SCHEM TABLE_NAME   TABLE_TYPE REMARKS
1 C:\\Documents and Settings\\G69974\\My Documents\\test.xlsx                  test$ SYSTEM TABLE
2 C:\\Documents and Settings\\G69974\\My Documents\\test.xlsx                   test        TABLE
$ sqlFetch(out, "test")
    x          y
1   1  0.5832882
2   2  0.4387569
3   3 -0.6444048
4   4 -1.0013450
5   5  1.0324718
6   6 -0.7844128
7   7 -1.6789266
8   8  0.1402672
9   9  0.8650061
10 10 -0.0420201
$ close(out)
$ # Opening test.xlsx now fails
$
$ # This works
$ out <- odbcConnectExcel("test.xls", readOnly=FALSE)
$ test <- data.frame(x=1:10, y=rnorm(10))
$ sqlSave(out, test)
$ sqlTables(out)
                                               TABLE_CAT TABLE_SCHEM TABLE_NAME   TABLE_TYPE REMARKS
1 C:\\Documents and Settings\\G69974\\My Documents\\test                  test$ SYSTEM TABLE
2 C:\\Documents and Settings\\G69974\\My Documents\\test                   test        TABLE
$ sqlFetch(out, "test")
    x          y
1   1  0.5955787
2   2  1.0517528
3   3  0.3884892
4   4 -2.1408813
5   5 -0.7081686
6   6  0.1511828
7   7  2.0560555
8   8 -0.5801912
9   9 -0.6988058
10 10 -0.1237739
$ close(out)
$ # Opening test.xls now works

From mmsilva3 at uc.cl Tue Mar 1 20:10:34 2011
From: mmsilva3 at uc.cl (maxsilva)
Date: Tue, 1 Mar 2011 11:10:34 -0800 (PST)
Subject: [R] Export R dataframes to excel
Message-ID: <1299006634234-3330399.post@n4.nabble.com>

I'm trying to do this in several ways but haven't had any result. I'm asked to
install python, or perl.... etc. Can anybody suggest a direct, easy and
understandable way? Every help would be appreciated.

Thx.

--
View this message in context: http://r.789695.n4.nabble.com/Export-R-dataframes-to-excel-tp3330399p3330399.html
Sent from the R help mailing list archive at Nabble.com.
From ADRIAN.KATSCHKE at DFAS.MIL Tue Mar 1 20:54:55 2011 From: ADRIAN.KATSCHKE at DFAS.MIL (KATSCHKE, ADRIAN CIV DFAS) Date: Tue, 1 Mar 2011 14:54:55 -0500 Subject: [R] Export R dataframes to excel In-Reply-To: <1299006634234-3330399.post@n4.nabble.com> References: <1299006634234-3330399.post@n4.nabble.com> Message-ID: write.table() using the sep="," and file extension as .csv works great to pull directly into excel. ?write.table Without more detail as to the problem, it is difficult to give a more specific answer. Adrian -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of maxsilva Sent: Tuesday, March 01, 2011 2:11 PM To: r-help at r-project.org Subject: [R] Export R dataframes to excel I'm trying to do this in several ways but havent had any result. Im asked to install python, or perl.... etc. Can anybody suggest a direct, easy and understandable way? Every help would be appreciated. Thx. -- View this message in context: http://r.789695.n4.nabble.com/Export-R-dataframes-to-excel-tp3330399p3330399.html Sent from the R help mailing list archive at Nabble.com. ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. From seth at userprimary.net Tue Mar 1 20:56:53 2011 From: seth at userprimary.net (Seth Falcon) Date: Tue, 1 Mar 2011 11:56:53 -0800 Subject: [R] Data type problem when extract data from SQLite to R by using RSQLite In-Reply-To: References: Message-ID: On Tue, Mar 1, 2011 at 10:06 AM, chen jia wrote: > Hi Seth, > > Thanks so much for identifying the problem and explaining everything. > I think the first solution that you suggest--make sure the schema has > well defined types--would work the best for me. 
But, I have one > question about how to implement it, which is more about sqlite itself. > > First, I found out that the columns that don't have the expected data > types in the table annual_data3 are created by aggregate functions in > a separate table. These columns are later combined with other columns > that do. > > I read the link that you provide, > http://www.sqlite.org/datatype3.html. One paragraph says "When > grouping values with the GROUP BY clause values with different storage > classes are considered distinct, except for INTEGER and REAL values > which are considered equal if they are numerically equal. No > affinities are applied to any values as the result of a GROUP by > clause." > > If I understand it correctly, the columns created by aggregate > functions with a GROUP by clause do not have any expected data types. > > My solution is to use CREATE TABLE clause to declare the expected > datatype and then insert the values of columns created by the > aggregate functions with the GROUP by clause. However, this solution > requires a CREATE TABLE cause every time the aggregate function and > the GROUP by clause is used. > > My question is: Is this the best way to make sure that the columns as > a result of a GROUP by clause have the expected data types? Thanks. That might be a good question to post to the SQLite user's list :-) I don't have an answer off the top of my head. My reading of the SQLite docs would lead me to expect that a GROUP BY clause would not change/remove type if the column being grouped contains all the same declared type affinity. + seth > > Best, > Jia > > On Tue, Mar 1, 2011 at 1:16 AM, Seth Falcon wrote: >> Hi Jia, >> >> On Mon, Feb 28, 2011 at 6:57 PM, chen jia wrote: >>> The .schema of table annual_data3 is >>> sqlite> .schema annual_data3 >>> CREATE TABLE "annual_data3"( >>> ?PERMNO INT, >>> ?DATE INT, >>> ?CUSIP TEXT, >>> ?EXCHCD INT, >>> ?SICCD INT, >>> ?SHROUT INT, >>> ?PRC REAL, >>> ?RET REAL, >>> ?... 
>>> ?pret_var, >>> ?pRET_sd, >>> ?nmret, >>> ?pya_var, >> >> [snip] >> >> Is there a reason that you've told SQLite the expected data type for >> only some of the columns? >> >>> Interestingly, I find that the problem I reported does not for columns >>> labeled real in the schema info. For example, the type of column RET >>> never changes no matter what the first observation is. >> >> Yes, that is expected and I think it is the solution to your problem: >> setup your schema so that all columns have a declared type. ?For some >> details on SQLite's type system see >> http://www.sqlite.org/datatype3.html. >> >> RSQLite currently maps NA values to NULL in the database. ?Pulling >> data out of a SELECT query, RSQLite uses the sqlite3_column_type >> SQLite API to determine the data type and map it to an R type. ?If >> NULL is encountered, then the schema is inspected using >> sqlite3_column_decltype to attempt to obtain a type. ?If that fails, >> the data is mapped to a character vector at the R level. ?The type >> selection is done once after the first row has been fetched. >> >> To work around this you can: >> >> - make sure your schema has well defined >> ?types (which will help SQLite perform its operations); >> >> - check whether the returned column has the expected type and convert >> ?if needed at the R level. >> >> - remove NA/NULL values from the db or decide on a different way of >> ?encoding them (e.g you might be able to use -1 in the db in some >> ?situation to indicate missing). ?Your R code would then need to map >> ?these to proper NA. >> >> Hope that helps. >> >> + seth >> >> >> >> -- >> Seth Falcon | @sfalcon | http://userprimary.net/ >> > > > > -- > 700 Fisher Hall > 2100 Neil Ave. > Columbus, Ohio? 
43210 > http://www.fisher.osu.edu/~chen_1002/ > -- Seth Falcon | @sfalcon | http://userprimary.net/ From steve.taylor at aut.ac.nz Tue Mar 1 21:15:37 2011 From: steve.taylor at aut.ac.nz (Steve Taylor) Date: Wed, 02 Mar 2011 09:15:37 +1300 Subject: [R] Export R dataframes to excel In-Reply-To: <1299006634234-3330399.post@n4.nabble.com> References: <1299006634234-3330399.post@n4.nabble.com> Message-ID: <4D6E0AB8.BF7C.0029.1@aut.ac.nz> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From bhh at xs4all.nl Tue Mar 1 21:17:53 2011 From: bhh at xs4all.nl (Berend Hasselman) Date: Tue, 1 Mar 2011 12:17:53 -0800 (PST) Subject: [R] Export R dataframes to excel In-Reply-To: <1299009938878-3330491.post@n4.nabble.com> References: <1299006634234-3330399.post@n4.nabble.com> <1299009938878-3330491.post@n4.nabble.com> Message-ID: <1299010673914-3330518.post@n4.nabble.com> maxsilva wrote: > > Thx, but im looking for a more direct solution... my problem is very > simple, I have a dataframe and I want to create a standard excel > spreadsheet. My dataframe could be something like this > More or less the same question was answered several hours ago. See http://rwiki.sciviews.org/doku.php?id=tips:data-io:ms_windows&s=excel as Gabor Grothendieck suggested. /Berend -- View this message in context: http://r.789695.n4.nabble.com/Export-R-dataframes-to-excel-tp3330399p3330518.html Sent from the R help mailing list archive at Nabble.com. 
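A minimal sketch of the CSV route suggested earlier in this thread (write.table() with sep = ","; write.csv() is the convenience wrapper for the same thing). The data frame df and the file name are made up for illustration and are not from the original posts. The result is a plain-text .csv file that Excel can open, not a native .xls/.xlsx workbook.

```r
# Hypothetical example data; not from the original post.
df <- data.frame(id    = 1:3,
                 group = c("a", "b", "c"),
                 value = c(1.5, NA, 3.25))

# Written to a temporary file here; replace with a real path in practice.
out_file <- file.path(tempdir(), "example_export.csv")

# row.names = FALSE drops the row-name column;
# na = "" leaves missing cells empty instead of writing "NA".
write.csv(df, file = out_file, row.names = FALSE, na = "")

# Read the file back to confirm the round trip.
check <- read.csv(out_file, stringsAsFactors = FALSE)
```

In locales where Excel expects a semicolon as the field separator, write.csv2() may be the better fit.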
From dmck at u.washington.edu Tue Mar 1 21:23:24 2011 From: dmck at u.washington.edu (Don McKenzie) Date: Tue, 1 Mar 2011 12:23:24 -0800 Subject: [R] Export R dataframes to excel In-Reply-To: <1299010673914-3330518.post@n4.nabble.com> References: <1299006634234-3330399.post@n4.nabble.com> <1299009938878-3330491.post@n4.nabble.com> <1299010673914-3330518.post@n4.nabble.com> Message-ID: <6413FAF9-78BC-4343-86CA-A8D75D6C9D82@u.washington.edu> Or ?write.csv which excel will import On 1-Mar-11, at 12:17 PM, Berend Hasselman wrote: > > maxsilva wrote: >> >> Thx, but im looking for a more direct solution... my problem is very >> simple, I have a dataframe and I want to create a standard excel >> spreadsheet. My dataframe could be something like this >> > > More or less the same question was answered several hours ago. > See http://rwiki.sciviews.org/doku.php?id=tips:data- > io:ms_windows&s=excel > as Gabor Grothendieck suggested. > > /Berend > > -- > View this message in context: http://r.789695.n4.nabble.com/Export- > R-dataframes-to-excel-tp3330399p3330518.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. Why does the universe go to all the bother of existing? 
-- Stephen Hawking #define QUESTION ((bb) || !(bb)) -- William Shakespeare Don McKenzie, Research Ecologist Pacific WIldland Fire Sciences Lab US Forest Service Affiliate Professor School of Forest Resources, College of the Environment CSES Climate Impacts Group University of Washington desk: 206-732-7824 cell: 206-321-5966 dmck at uw.edu donaldmckenzie at fs.fed.us From ehlers at ucalgary.ca Tue Mar 1 21:27:13 2011 From: ehlers at ucalgary.ca (Peter Ehlers) Date: Tue, 01 Mar 2011 12:27:13 -0800 Subject: [R] Lattice: useOuterStrips and axes In-Reply-To: <1299004197583-3330338.post@n4.nabble.com> References: <1299004197583-3330338.post@n4.nabble.com> Message-ID: <4D6D56A1.5020902@ucalgary.ca> On 2011-03-01 10:29, Jim Price wrote: > Consider the following: > > > library(lattice) > library(latticeExtra) > > temp<- expand.grid( > subject = factor(paste('Subject', 1:3)), > var = factor(paste('Variable', 1:3)), > time = 1:10 > ) > temp$resp<- rnorm(nrow(temp), 10 * as.numeric(temp$var), 1) > > ylimits<- by(temp$resp, temp$var, function(x) range(pretty(x))) > > useOuterStrips(xyplot( > resp ~ time | subject * var, > data = temp, > as.table = TRUE, > scales = list( > alternating = 1, tck = c(1, 0), > y = list(relation = 'free', rot = 0, limits = rep(ylimits, each = > 3)) > ) > )) > > > This is a matrix of variables on subjects, where it makes sense to have > panel-specific y-axes because of the differing variable ranges. In fact, it > makes sense to have row-specific y-axes, because of the similarity of > intra-variable inter-subject response. However the graphic as presented > gives per-panel y-axis annotation - I'd like to drop the 2nd and 3rd columns > of y-axes labels, so the panels become a contiguous array. > > Is this possible? Looks like you want the combineLimits() function in latticeExtra. Specifically, p <- [your code] combineLimits(p, margin.y = 1, extend = FALSE) Peter Ehlers > > Thanks, > Jim Price. > Cardiome Pharma. Corp. 
>
>
>

From ivo.welch at gmail.com Tue Mar 1 21:36:12 2011
From: ivo.welch at gmail.com (ivo welch)
Date: Tue, 1 Mar 2011 15:36:12 -0500
Subject: [R] inefficient ifelse() ?
In-Reply-To:
References:
Message-ID:

thanks, Henrique. did you mean

    as.vector(t(mapply(function(x, f)f(x), split(t, ((t %% 2)==0)), list(f, g))))

? otherwise, you get a matrix.

it's a good solution, but unfortunately I don't think this can be used to
redefine ifelse(cond,ift,iff) in a way that is transparent. the ift and iff
functions will always be evaluated before the function call happens, even
with lazy evaluation. :-(

I still think that it makes sense to have a smarter vectorized %if% in a
vectorized language like R. just my 5 cents.

/iaw
----
Ivo Welch (ivo.welch at brown.edu, ivo.welch at gmail.com)

On Tue, Mar 1, 2011 at 2:33 PM, Henrique Dallazuanna wrote:
> Try this:
>
> mapply(function(x, f)f(x), split(t, t %% 2), list(g, f))
>
> On Tue, Mar 1, 2011 at 4:19 PM, ivo welch wrote:
>>
>> dear R experts---
>>
>>  t <- 1:30
>>  f <- function(t) { cat("f for", t, "\n"); return(2*t) }
>>  g <- function(t) { cat("g for", t, "\n"); return(3*t) }
>>  s <- ifelse( t%%2==0, g(t), f(t))
>>
>> shows that the ifelse function actually evaluates both f() and g() for
>> all values first, and presumably then just picks left or right results
>> based on t%%2. uggh... wouldn't it make more sense to evaluate only
>> the relevant parts of each vector and then reassemble them?
>>
>> /iaw
>> ----
>> Ivo Welch
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>
>
> --
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49°
16' 22" O > From djmuser at gmail.com Tue Mar 1 21:42:28 2011 From: djmuser at gmail.com (Dennis Murphy) Date: Tue, 1 Mar 2011 12:42:28 -0800 Subject: [R] Speed up sum of outer products? In-Reply-To: <1299005577480-3330378.post@n4.nabble.com> References: <1298998430979-3330160.post@n4.nabble.com> <1299005577480-3330378.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From djmuser at gmail.com Tue Mar 1 21:55:32 2011 From: djmuser at gmail.com (Dennis Murphy) Date: Tue, 1 Mar 2011 12:55:32 -0800 Subject: [R] expression help In-Reply-To: <4D6D49BD.2070007@uw.edu> References: <4D6D49BD.2070007@uw.edu> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From Greg.Snow at imail.org Tue Mar 1 21:56:27 2011 From: Greg.Snow at imail.org (Greg Snow) Date: Tue, 1 Mar 2011 13:56:27 -0700 Subject: [R] Regression with many independent variables In-Reply-To: References: Message-ID: You can use ^2 to get all 2 way interactions and ^3 to get all 3 way interactions, e.g.: lm(Sepal.Width ~ (. - Sepal.Length)^2, data=iris) The lm.fit function is what actually does the fitting, so you could go directly there, but then you lose the benefits of using . and ^. The Matrix package has ways of dealing with sparse matricies, but I don't know if that would help here or not. You could also just create x'x and x'y matricies directly since the variables are 0/1 then use solve. A lot depends on what you are doing and what questions you are trying to answer. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.snow at imail.org 801.408.8111 > -----Original Message----- > From: Matthew Douglas [mailto:matt.douglas01 at gmail.com] > Sent: Tuesday, March 01, 2011 1:09 PM > To: Greg Snow > Cc: r-help at r-project.org > Subject: Re: [R] Regression with many independent variables > > Hi Greg, > > Thanks for the help, it works perfectly. 
To answer your question,
> there are 339 independent variables but only 10 will be used at one
> time. So at any given line of the data set there will be 10 non zero
> entries for the independent variables and the rest will be zeros.
>
> One more question:
>
> 1. I still want to find a way to look at the interactions of the
> independent variables.
>
> the regression would look like this:
>
> y = b12*X1X2 + b23*X2X3 +...+ bk-1k*Xk-1Xk
>
> so I think the regression in R would look like this:
>
> lm(MARGIN, P235:P236+P236:P237+....,weights = Poss, data = adj0708),
>
> my problem is that since I have technically 339 independent variables,
> when I do this regression I would have 339 Choose 2 = approx 57000
> independent variables (a vast majority will be 0s though) so I dont
> want to have to write all of these out. Is there a way to do this
> quickly in R?
>
> Also just a curious question that I cant seem to find to online:
> is there a more efficient model other than lm() that is better for
> very sparse data sets like mine?
>
> Thanks,
> Matt
>
>
> On Mon, Feb 28, 2011 at 4:30 PM, Greg Snow wrote:
> > Don't put the name of the dataset in the formula, use the data
> > argument to lm to provide that. A single period (".") on the right
> > hand side of the formula will represent all the columns in the data set
> > that are not on the left hand side (you can then use "-" to remove any
> > other columns that you don't want included on the RHS).
> >
> > For example:
> >
> >> lm(Sepal.Width ~ . - Sepal.Length, data=iris)
> >
> > Call:
> > lm(formula = Sepal.Width ~ . - Sepal.Length, data = iris)
> >
> > Coefficients:
> >       (Intercept)       Petal.Length        Petal.Width  Speciesversicolor
> >            3.0485             0.1547             0.6234            -1.7641
> >  Speciesvirginica
> >           -2.1964
> >
> >
> > But, are you sure that a regression model with 339 predictors will be
> > meaningful?
> >
> > --
> > Gregory (Greg) L. Snow Ph.D.
> > Statistical Data Center > > Intermountain Healthcare > > greg.snow at imail.org > > 801.408.8111 > > > > > >> -----Original Message----- > >> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r- > >> project.org] On Behalf Of Matthew Douglas > >> Sent: Monday, February 28, 2011 1:32 PM > >> To: r-help at r-project.org > >> Subject: [R] Regression with many independent variables > >> > >> Hi, > >> > >> I am trying use lm() on some data, the code works fine but I would > >> like to use a more efficient way to do this. > >> > >> The data looks like this (the data is very sparse with a few 1s, -1s > >> and the rest 0s): > >> > >> > head(adj0708) > >> ? ? ? MARGIN Poss P235 P247 P703 P218 P430 P489 P83 P307 P337.... > >> 1 ? 64.28571 ? 29 ? ?0 ? ?0 ? ?0 ? ?0 ? ?0 ? ?0 ? 0 ? ?0 ? ?0 ? ?0 > >> 0 ? ?0 ? ?0 > >> 2 -100.00000 ? ?6 ? ?0 ? ?0 ? ?0 ? ?0 ? ?0 ? ?0 ? 0 ? ?1 ? ?0 ? ?0 > >> 0 ? ?0 ? ?0 > >> 3 ?100.00000 ? ?4 ? ?0 ? ?0 ? ?0 ? ?0 ? ?0 ? ?0 ? 0 ? ?1 ? ?0 ? ?0 > >> 0 ? ?0 ? ?0 > >> 4 ?-33.33333 ? ?7 ? ?0 ? ?0 ? ?0 ? ?0 ? ?0 ? ?0 ? 0 ? ?0 ? ?0 ? ?0 > >> 0 ? ?0 ? ?0 > >> 5 ?200.00000 ? ?2 ? ?0 ? ?0 ? ?0 ? ?0 ? ?0 ? ?0 ? 0 ? ?0 ? ?0 ? ?0 > >> -1 ? ?0 ? ?0 > >> 6 ?-83.33333 ? 12 ? ?0 ? ?-1 ? ?0 ? ?0 ? ?0 ? ?0 ? 0 ? ?0 ? ?0 ? ?0 > >> 0 ? ?0 ? ?0 > >> > >> adj0708 is actually a 35657x341 data set. Each column after "Poss" > is > >> an independent variable, the dependent variable is "MARGIN" and it > is > >> weighted by "Poss" > >> > >> > >> The regression is below: > >> fit.adj0708 <- lm( adj0708$MARGIN~adj0708$P235 + adj0708$P247 + > >> adj0708$P703 + adj0708$P430 + adj0708$P489 + adj0708$P218 + > >> adj0708$P605 + adj0708$P337 + .... + > >> adj0708$P510,weights=adj0708$Poss) > >> > >> I have two questions: > >> > >> 1. Is there a way to to condense how I write the independent > variables > >> in the lm(), instead of having such a long line of code (I have 339 > >> independent variables to be exact)? > >> 2. 
I would like to pair the data to look a regression of the > >> interactions between two independent variables. I think it would > look > >> something like this.... > >> fit.adj0708 <- lm( adj0708$MARGIN~adj0708$P235:adj0708$P247 + > >> adj0708$P703:adj0708$P430 + adj0708$P489:adj0708$P218 + > >> adj0708$P605:adj0708$P337 + ....,weights=adj0708$Poss) > >> but there will be 339 Choose 2 combinations, so a lot of independent > >> variables! Is there a more efficient way of writing this code. Is > >> there a way I can do this? > >> > >> Thanks, > >> Matt > >> > >> ______________________________________________ > >> R-help at r-project.org mailing list > >> https://stat.ethz.ch/mailman/listinfo/r-help > >> PLEASE do read the posting guide http://www.R-project.org/posting- > >> guide.html > >> and provide commented, minimal, self-contained, reproducible code. > > From iaingallagher at btopenworld.com Tue Mar 1 22:11:22 2011 From: iaingallagher at btopenworld.com (Iain Gallagher) Date: Tue, 1 Mar 2011 21:11:22 +0000 (GMT) Subject: [R] Export R dataframes to excel In-Reply-To: <4D6E0AB8.BF7C.0029.1@aut.ac.nz> Message-ID: <229111.65291.qm@web86704.mail.ird.yahoo.com> This appeared today on the r-bloggers site and might be useful for you. http://www.r-bloggers.com/release-of-xlconnect-0-1-3/ cheers i --- On Tue, 1/3/11, Steve Taylor wrote: > From: Steve Taylor > Subject: Re: [R] Export R dataframes to excel > To: r-help at r-project.org, "maxsilva" > Date: Tuesday, 1 March, 2011, 20:15 > You can copy it with the following > function and then paste into Excel... > > copy = function (df, buffer.kb=256) { > ? write.table(df, > file=paste("clipboard-",buffer.kb,sep=""), > ? ? ? sep="\t", na='', quote=FALSE, > row.names=FALSE) > } > > > >>> > > From: maxsilva > To: > Date: 2/Mar/2011 8:50a > Subject: [R] Export R dataframes to excel > > I'm trying to do this in several ways but havent had any > result. Im asked to > install python, or perl.... etc. 
Can anybody suggest a > direct, easy and > understandable way?? Every help would be appreciated. > > > Thx. > > -- > View this message in context: http://r.789695.n4.nabble.com/Export-R-dataframes-to-excel-tp3330399p3330399.html > > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org > mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R ( http://www.r/ > )-project.org/posting-guide.html > and provide commented, minimal, self-contained, > reproducible code. > > ??? [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org > mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, > reproducible code. > From tamas.barjak02 at gmail.com Tue Mar 1 22:36:20 2011 From: tamas.barjak02 at gmail.com (Tamas Barjak) Date: Tue, 1 Mar 2011 22:36:20 +0100 Subject: [R] error in saved .csv Message-ID: An embedded and charset-unspecified text was scrubbed... Name: nem el?rhet? URL: From djmuser at gmail.com Tue Mar 1 22:47:07 2011 From: djmuser at gmail.com (Dennis Murphy) Date: Tue, 1 Mar 2011 13:47:07 -0800 Subject: [R] inefficient ifelse() ? In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From f.harrell at vanderbilt.edu Tue Mar 1 22:49:49 2011 From: f.harrell at vanderbilt.edu (Frank Harrell) Date: Tue, 1 Mar 2011 15:49:49 -0600 Subject: [R] [R-pkgs] Major update to rms package Message-ID: A new version of rms is now available on CRAN for Linux and Windows (Mac will probably be available very soon). 
Largest changes include latex methods for validate.* and adding the capability to force a subset of variables to be included in all backwards stepdown models (single model or validation by resampling). Recent updates: * In survplot.rms, fixed bug (curves were undefined if conf='bands' and labelc was FALSE) * In survfit.cph, fixed bug by which n wasn't always defined * In cph, put survival::: on exact fit call * Quit ignoring zlim argument in bplot; added xlabrot argument * Added caption argument for latex.anova.rms * Changed predab to not print summaries of variables selected if bw=TRUE * Changed predab to pass force argument to fastbw * fastbw: implemented force argument * Added force argument to validate.lrm, validate.bj, calibrate.default, calibrate.cph, calibrate.psm, validate.bj, validate.cph, validate.ols * print.validate: added B argument to limit how many resamples are printed summarizing variables selected if BW=TRUE * print.calibrate, print.calibrate.default: added B argument * Added latex method for results produced by validate functions * Fixed survest.cph to convert summary.survfit std.err to log S(t) scale * Fixed val.surv by pulling surv object from survest result * Clarified in predict.lrm help file that doesn't always use the first intercept * lrm.fit, lrm: linear predictor stored in fit object now uses first intercept and not middle one (NOT DOWNWARD COMPATIBLE but makes predict work when using stored linear.predictors) * Fixed argument consistency with validate methods More information is at http://biostat.mc.vanderbilt.edu/Rrms -- Frank E Harrell Jr Professor and Chairman School of Medicine Department of Biostatistics Vanderbilt University _______________________________________________ R-packages mailing list R-packages at r-project.org https://stat.ethz.ch/mailman/listinfo/r-packages From matt.douglas01 at gmail.com Tue Mar 1 21:09:01 2011 From: matt.douglas01 at gmail.com (Matthew Douglas) Date: Tue, 1 Mar 2011 15:09:01 -0500 Subject: [R] 
Regression with many independent variables In-Reply-To: References: Message-ID: Hi Greg, Thanks for the help, it works perfectly. To answer your question, there are 339 independent variables but only 10 will be used at one time . So at any given line of the data set there will be 10 non zero entries for the independent variables and the rest will be zeros. One more question: 1. I still want to find a way to look at the interactions of the independent variables. the regression would look like this: y = b12*X1X2 + b23*X2X3 +...+ bk-1k*Xk-1Xk so I think the regression in R would look like this: lm(MARGIN, P235:P236+P236:P237+....,weights = Poss, data = adj0708), my problem is that since I have technically 339 independent variables, when I do this regression I would have 339 Choose 2 = approx 57000 independent variables (a vast majority will be 0s though) so I dont want to have to write all of these out. Is there a way to do this quickly in R? Also just a curious question that I cant seem to find to online: is there a more efficient model other than lm() that is better for very sparse data sets like mine? Thanks, Matt On Mon, Feb 28, 2011 at 4:30 PM, Greg Snow wrote: > Don't put the name of the dataset in the formula, use the data argument to lm to provide that. ?A single period (".") on the right hand side of the formula will represent all the columns in the data set that are not on the left hand side (you can then use "-" to remove any other columns that you don't want included on the RHS). > > For example: > >> lm(Sepal.Width ~ . - Sepal.Length, data=iris) > > Call: > lm(formula = Sepal.Width ~ . - Sepal.Length, data = iris) > > Coefficients: > ? ? ?(Intercept) ? ? ? Petal.Length ? ? ? ?Petal.Width ?Speciesversicolor > ? ? ? ? ? 3.0485 ? ? ? ? ? ? 0.1547 ? ? ? ? ? ? 0.6234 ? ? ? ? ? ?-1.7641 > ?Speciesvirginica > ? ? ? ? ?-2.1964 > > > But, are you sure that a regression model with 339 predictors will be meaningful? > > -- > Gregory (Greg) L. Snow Ph.D. 
> Statistical Data Center > Intermountain Healthcare > greg.snow at imail.org > 801.408.8111 > > >> -----Original Message----- >> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r- >> project.org] On Behalf Of Matthew Douglas >> Sent: Monday, February 28, 2011 1:32 PM >> To: r-help at r-project.org >> Subject: [R] Regression with many independent variables >> >> Hi, >> >> I am trying use lm() on some data, the code works fine but I would >> like to use a more efficient way to do this. >> >> The data looks like this (the data is very sparse with a few 1s, -1s >> and the rest 0s): >> >> > head(adj0708) >> ? ? ? MARGIN Poss P235 P247 P703 P218 P430 P489 P83 P307 P337.... >> 1 ? 64.28571 ? 29 ? ?0 ? ?0 ? ?0 ? ?0 ? ?0 ? ?0 ? 0 ? ?0 ? ?0 ? ?0 >> 0 ? ?0 ? ?0 >> 2 -100.00000 ? ?6 ? ?0 ? ?0 ? ?0 ? ?0 ? ?0 ? ?0 ? 0 ? ?1 ? ?0 ? ?0 >> 0 ? ?0 ? ?0 >> 3 ?100.00000 ? ?4 ? ?0 ? ?0 ? ?0 ? ?0 ? ?0 ? ?0 ? 0 ? ?1 ? ?0 ? ?0 >> 0 ? ?0 ? ?0 >> 4 ?-33.33333 ? ?7 ? ?0 ? ?0 ? ?0 ? ?0 ? ?0 ? ?0 ? 0 ? ?0 ? ?0 ? ?0 >> 0 ? ?0 ? ?0 >> 5 ?200.00000 ? ?2 ? ?0 ? ?0 ? ?0 ? ?0 ? ?0 ? ?0 ? 0 ? ?0 ? ?0 ? ?0 >> -1 ? ?0 ? ?0 >> 6 ?-83.33333 ? 12 ? ?0 ? ?-1 ? ?0 ? ?0 ? ?0 ? ?0 ? 0 ? ?0 ? ?0 ? ?0 >> 0 ? ?0 ? ?0 >> >> adj0708 is actually a 35657x341 data set. Each column after "Poss" is >> an independent variable, the dependent variable is "MARGIN" and it is >> weighted by "Poss" >> >> >> The regression is below: >> fit.adj0708 <- lm( adj0708$MARGIN~adj0708$P235 + adj0708$P247 + >> adj0708$P703 + adj0708$P430 + adj0708$P489 + adj0708$P218 + >> adj0708$P605 + adj0708$P337 + .... + >> adj0708$P510,weights=adj0708$Poss) >> >> I have two questions: >> >> 1. Is there a way to to condense how I write the independent variables >> in the lm(), instead of having such a long line of code (I have 339 >> independent variables to be exact)? >> 2. I would like to pair the data to look a regression of the >> interactions between two independent variables. I think it would look >> something like this.... 
>> fit.adj0708 <- lm( adj0708$MARGIN~adj0708$P235:adj0708$P247 + >> adj0708$P703:adj0708$P430 + adj0708$P489:adj0708$P218 + >> adj0708$P605:adj0708$P337 + ....,weights=adj0708$Poss) >> but there will be 339 Choose 2 combinations, so a lot of independent >> variables! Is there a more efficient way of writing this code. Is >> there a way I can do this? >> >> Thanks, >> Matt >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting- >> guide.html >> and provide commented, minimal, self-contained, reproducible code. > From mmsilva3 at uc.cl Tue Mar 1 21:05:38 2011 From: mmsilva3 at uc.cl (maxsilva) Date: Tue, 1 Mar 2011 12:05:38 -0800 (PST) Subject: [R] Export R dataframes to excel In-Reply-To: References: <1299006634234-3330399.post@n4.nabble.com> Message-ID: <1299009938878-3330491.post@n4.nabble.com> Thx, but im looking for a more direct solution... my problem is very simple, I have a dataframe and I want to create a standard excel spreadsheet. My dataframe could be something like this id sex weight 1 M 5'8 2 F 6'2 3 F 5'5 4 M 5'7 5 F 6'3 -- View this message in context: http://r.789695.n4.nabble.com/Export-R-dataframes-to-excel-tp3330399p3330491.html Sent from the R help mailing list archive at Nabble.com. From manototh at gmail.com Tue Mar 1 21:43:51 2011 From: manototh at gmail.com (Mano Gabor Toth) Date: Tue, 1 Mar 2011 21:43:51 +0100 Subject: [R] Logistic Stepwise Criterion Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From pjmiller_57 at yahoo.com Tue Mar 1 22:07:53 2011 From: pjmiller_57 at yahoo.com (Paul Miller) Date: Tue, 1 Mar 2011 13:07:53 -0800 (PST) Subject: [R] Pairwise T-Tests and Dunnett's Test (possibly using multcomp) In-Reply-To: Message-ID: <454646.25040.qm@web161620.mail.bf1.yahoo.com> An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From price_ja at hotmail.com Tue Mar 1 22:09:19 2011 From: price_ja at hotmail.com (Jim Price) Date: Tue, 1 Mar 2011 13:09:19 -0800 (PST) Subject: [R] Lattice: useOuterStrips and axes In-Reply-To: <4D6D56A1.5020902@ucalgary.ca> References: <1299004197583-3330338.post@n4.nabble.com> <4D6D56A1.5020902@ucalgary.ca> Message-ID: <1299013759458-3330613.post@n4.nabble.com> Thank you, that's exactly what I needed. -- View this message in context: http://r.789695.n4.nabble.com/Lattice-useOuterStrips-and-axes-tp3330338p3330613.html Sent from the R help mailing list archive at Nabble.com. From dfrankow at gmail.com Tue Mar 1 22:25:04 2011 From: dfrankow at gmail.com (Dan Frankowski) Date: Tue, 1 Mar 2011 15:25:04 -0600 Subject: [R] How to understand output from R's polr function (ordered logistic regression)? Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dfrankow at gmail.com Tue Mar 1 22:29:46 2011 From: dfrankow at gmail.com (Dan Frankowski) Date: Tue, 1 Mar 2011 15:29:46 -0600 Subject: [R] How to understand output from R's polr function (ordered logistic regression)? In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From darcy.webber at gmail.com Tue Mar 1 22:55:27 2011 From: darcy.webber at gmail.com (Darcy Webber) Date: Wed, 2 Mar 2011 10:55:27 +1300 Subject: [R] splitting and stacking matrices Message-ID: Dear R users, I am having some difficulty arranging some matrices and wondered if anyone could help out. 
As an example, consider the following matrix: a <- matrix(1:32, nrow = 4, ncol = 8) a [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [1,] 1 5 9 13 17 21 25 29 [2,] 2 6 10 14 18 22 26 30 [3,] 3 7 11 15 19 23 27 31 [4,] 4 8 12 16 20 24 28 32 I would like it to look like the following matrix: [,1] [,2] [,3] [,4] [1,] 1 5 9 13 [2,] 2 6 10 14 [3,] 3 7 11 15 [4,] 4 8 12 16 [5,] 17 21 25 29 [6,] 18 22 26 30 [7,] 19 23 27 31 [8,] 20 24 28 32 I can achieve this using the following: a1 <- a[, 1:4] a2 <- a[, 5:8] b <- rbind(a1, a2) However, my initial matrix often has a variable number of columns (in multiples of 4), and I still want to split the columns into blocks of 4 and stack these. I have considered working out how many blocks the matrix must be split into using: no.blocks <- ncol(a)/4. My problem is then implementing this information to actually split the matrix up and then stack it. Any guidance on this would be much appreciated. Regards Darcy Webber From tlumley at uw.edu Tue Mar 1 22:59:27 2011 From: tlumley at uw.edu (Thomas Lumley) Date: Wed, 2 Mar 2011 10:59:27 +1300 Subject: [R] inefficient ifelse() ? In-Reply-To: References: Message-ID: On Wed, Mar 2, 2011 at 9:36 AM, ivo welch wrote: > thanks, Henrique. did you mean > > as.vector(t(mapply(function(x, f)f(x), split(t, ((t %% 2)==0)), > list(f, g)))) ? > > otherwise, you get a matrix. > > its a good solution, but unfortunately I don't think this can be used > to redefine ifelse(cond,ift,iff) in a way that is transparent. the > ift and iff functions will always be evaluated before the function > call happens, even with lazy evaluation. :-( > > I still think that it makes sense to have a smarter vectorized %if% in > a vectorized language like R. just my 5 cents.
> Ivo, There is no guarantee in general that f(x[3,5,7]) is the same as f(x)[3,5,7] -thomas -- Thomas Lumley Professor of Biostatistics University of Auckland From wdunlap at tibco.com Tue Mar 1 23:13:04 2011 From: wdunlap at tibco.com (William Dunlap) Date: Tue, 1 Mar 2011 14:13:04 -0800 Subject: [R] inefficient ifelse() ? In-Reply-To: References: Message-ID: <77EB52C6DD32BA4D87471DCD70C8D70003F7474A@NA-PA-VBE03.na.tibco.com> An ifelse-like function that only evaluated what was needed would be fine, but it would have to be different from ifelse itself. The trick is to come up with a good parameterization. E.g., how would it deal with things like ifelse(is.na(x), mean(x, na.rm=TRUE), x) or ifelse(x>1, log(x), runif(length(x),-1,0)) or ifelse(x>1, log(x), -seq_along(x)) Would it reject such things? Deciding that the x in mean(x,na.rm=TRUE) should be replaced by x[is.na(x)] would be wrong. Deciding that runif(length(x)) should be replaced by runif(sum(x>1)) seems a bit much to expect. Replacing seq_along(x) with seq_len(sum(x>1)) is wrong. It would be better to parameterize the new function so it wouldn't have to think about those cases. Would you want it to depend only on a logical vector or perhaps also on a factor (a vectorized switch/case function)? Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com > -----Original Message----- > From: r-help-bounces at r-project.org > [mailto:r-help-bounces at r-project.org] On Behalf Of ivo welch > Sent: Tuesday, March 01, 2011 12:36 PM > To: Henrique Dallazuanna > Cc: r-help > Subject: Re: [R] inefficient ifelse() ? > > thanks, Henrique. did you mean > > as.vector(t(mapply(function(x, f)f(x), split(t, ((t %% 2)==0)), > list(f, g)))) ? > > otherwise, you get a matrix. > > its a good solution, but unfortunately I don't think this can be used > to redefine ifelse(cond,ift,iff) in a way that is transparent. the > ift and iff functions will always be evaluated before the function > call happens, even with lazy evaluation. 
:-( > > I still think that it makes sense to have a smarter vectorized %if% in > a vectorized language like R. just my 5 cents. > > /iaw > > ---- > Ivo Welch (ivo.welch at brown.edu, ivo.welch at gmail.com) > > > > > > On Tue, Mar 1, 2011 at 2:33 PM, Henrique Dallazuanna > wrote: > > Try this: > > > > mapply(function(x, f)f(x), split(t, t %% 2), list(g, f)) > > > > On Tue, Mar 1, 2011 at 4:19 PM, ivo welch wrote: > >> > >> dear R experts--- > >> > >> ?t <- 1:30 > >> ?f <- function(t) { cat("f for", t, "\n"); return(2*t) } > >> ?g <- function(t) { cat("g for", t, "\n"); return(3*t) } > >> ?s <- ifelse( t%%2==0, g(t), f(t)) > >> > >> shows that the ifelse function actually evaluates both f() > and g() for > >> all values first, and presumably then just picks left or > right results > >> based on t%%2. ?uggh... wouldn't it make more sense to > evaluate only > >> the relevant parts of each vector and then reassemble them? > >> > >> /iaw > >> ---- > >> Ivo Welch > >> > >> ______________________________________________ > >> R-help at r-project.org mailing list > >> https://stat.ethz.ch/mailman/listinfo/r-help > >> PLEASE do read the posting guide > >> http://www.R-project.org/posting-guide.html > >> and provide commented, minimal, self-contained, reproducible code. > > > > > > > > -- > > Henrique Dallazuanna > > Curitiba-Paran?-Brasil > > 25? 25' 40" S 49? 16' 22" O > > > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From ivo.welch at gmail.com Tue Mar 1 23:20:26 2011 From: ivo.welch at gmail.com (ivo welch) Date: Tue, 1 Mar 2011 17:20:26 -0500 Subject: [R] inefficient ifelse() ? 
In-Reply-To: <77EB52C6DD32BA4D87471DCD70C8D70003F7474A@NA-PA-VBE03.na.tibco.com> References: <77EB52C6DD32BA4D87471DCD70C8D70003F7474A@NA-PA-VBE03.na.tibco.com> Message-ID: yikes. you are asking me too much. thanks everybody for the information. I learned something new. my suggestion would be for the much smarter language designers (than I) to offer us more or less blissfully ignorant users another vector-related construct in R. It could perhaps be named %if% %else%, analogous to if else (with naming inspired by %in%, and with evaluation only of relevant parts [just as if else for scalars]), with different outcomes in some cases, but with the advantage of typically evaluating only half as many conditions as the ifelse() vector construct. %if% %else% may work only in a subset of cases, but when it does work, it would be nice to have. it would probably be my first "goto" function, with ifelse() use only as a fallback. of course, I now know how to fix my specific issue. I was just surprised that my first choice, ifelse(), was not as optimized as I had thought. best, /iaw On Tue, Mar 1, 2011 at 5:13 PM, William Dunlap wrote: > An ifelse-like function that only evaluated > what was needed would be fine, but it would > have to be different from ifelse itself. ?The > trick is to come up with a good parameterization. > > E.g., how would it deal with things like > ? ifelse(is.na(x), mean(x, na.rm=TRUE), x) > or > ? ifelse(x>1, log(x), runif(length(x),-1,0)) > or > ? ifelse(x>1, log(x), -seq_along(x)) > Would it reject such things? ?Deciding that the > x in mean(x,na.rm=TRUE) should be replaced by > x[is.na(x)] would be wrong. ?Deciding that > runif(length(x)) should be replaced by runif(sum(x>1)) > seems a bit much to expect. ?Replacing seq_along(x) with > seq_len(sum(x>1)) is wrong. ?It would be better to > parameterize the new function so it wouldn't have to > think about those cases. 
> > Would you want it to depend only on a logical > vector or perhaps also on a factor (a vectorized > switch/case function)? > > Bill Dunlap > Spotfire, TIBCO Software > wdunlap tibco.com > >> -----Original Message----- >> From: r-help-bounces at r-project.org >> [mailto:r-help-bounces at r-project.org] On Behalf Of ivo welch >> Sent: Tuesday, March 01, 2011 12:36 PM >> To: Henrique Dallazuanna >> Cc: r-help >> Subject: Re: [R] inefficient ifelse() ? >> >> thanks, Henrique. ?did you mean >> >> ? ? as.vector(t(mapply(function(x, f)f(x), split(t, ((t %% 2)==0)), >> list(f, g)))) ? ? >> >> otherwise, you get a matrix. >> >> its a good solution, but unfortunately I don't think this can be used >> to redefine ifelse(cond,ift,iff) in a way that is transparent. ?the >> ift and iff functions will always be evaluated before the function >> call happens, even with lazy evaluation. ?:-( >> >> I still think that it makes sense to have a smarter vectorized %if% in >> a vectorized language like R. ?just my 5 cents. >> >> /iaw >> >> ---- >> Ivo Welch (ivo.welch at brown.edu, ivo.welch at gmail.com) >> >> >> >> >> >> On Tue, Mar 1, 2011 at 2:33 PM, Henrique Dallazuanna >> wrote: >> > Try this: >> > >> > mapply(function(x, f)f(x), split(t, t %% 2), list(g, f)) >> > >> > On Tue, Mar 1, 2011 at 4:19 PM, ivo welch wrote: >> >> >> >> dear R experts--- >> >> >> >> ?t <- 1:30 >> >> ?f <- function(t) { cat("f for", t, "\n"); return(2*t) } >> >> ?g <- function(t) { cat("g for", t, "\n"); return(3*t) } >> >> ?s <- ifelse( t%%2==0, g(t), f(t)) >> >> >> >> shows that the ifelse function actually evaluates both f() >> and g() for >> >> all values first, and presumably then just picks left or >> right results >> >> based on t%%2. ?uggh... wouldn't it make more sense to >> evaluate only >> >> the relevant parts of each vector and then reassemble them? 
>> >> >> >> /iaw >> >> ---- >> >> Ivo Welch >> >> >> >> ______________________________________________ >> >> R-help at r-project.org mailing list >> >> https://stat.ethz.ch/mailman/listinfo/r-help >> >> PLEASE do read the posting guide >> >> http://www.R-project.org/posting-guide.html >> >> and provide commented, minimal, self-contained, reproducible code. >> > >> > >> > >> > -- >> > Henrique Dallazuanna >> > Curitiba-Paraná-Brasil >> > 25° 25' 40" S 49° 16' 22" O >> > >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > From jholtman at gmail.com Tue Mar 1 23:18:36 2011 From: jholtman at gmail.com (jim holtman) Date: Tue, 1 Mar 2011 17:18:36 -0500 Subject: [R] error in saved .csv In-Reply-To: References: Message-ID: I am not sure what you are saying your problem is? Is the format incorrect? BTW, notice that write.csv does not have a 'sep' parameter. Maybe you should be using write.table. On Tue, Mar 1, 2011 at 4:36 PM, Tamas Barjak wrote: > Help me please! > > I would like to be saved a data table: > > write.csv(random.t1, "place", dec=",", append = T, quote = FALSE, sep = " ", > qmethod = "double", eol = "\n", row.names=F) > > It's OK! > > But the rows of file > > 1,1,21042,-4084.87179487179,2457.66483516483,-582.275562799881 > 2,2,23846,-6383.86480186479,-3409.98451548449,-3569.72145340269 > and no > > 1 > 21042 - 4084.87179487179 2457.66483516483 > Not proportional... > > What's the problem??? > > Thanks! > >
[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? From peter.langfelder at gmail.com Tue Mar 1 23:24:28 2011 From: peter.langfelder at gmail.com (Peter Langfelder) Date: Tue, 1 Mar 2011 14:24:28 -0800 Subject: [R] error in saved .csv In-Reply-To: References: Message-ID: On Tue, Mar 1, 2011 at 1:36 PM, Tamas Barjak wrote: > Help me please! > > I would like to be saved a data table: > > write.csv(random.t1, "place", dec=",", append = T, quote = FALSE, sep = " ", > qmethod = "double", eol = "\n", row.names=F) > > It's OK! > > But the rows of file > 1,1,21042,-4084.87179487179,2457.66483516483,-582.275562799881 > 2,2,23846,-6383.86480186479,-3409.98451548449,-3569.72145340269 > and no > > 1 > 21042 - 4084.87179487179 2457.66483516483 > Not proportional... > > What's the problem??? If I understand you correctly, you want a text file where the separator is a white space. You cannot get that with write.csv - you have to use write.table(). The function write.csv does not allow you to change the sep argument. Here's what the help file says: 'write.csv' and 'write.csv2' provide convenience wrappers for writing CSV files. They set 'sep', 'dec' and 'qmethod', and 'col.names' to 'NA' if 'row.names = TRUE' and 'TRUE' otherwise. 'write.csv' uses '"."' for the decimal point and a comma for the separator. 'write.csv2' uses a comma for the decimal point and a semicolon for the separator, the Excel convention for CSV files in some Western European locales. These wrappers are deliberately inflexible: they are designed to ensure that the correct conventions are used to write a valid file.
Attempts to change 'append', 'col.names', 'sep', 'dec' or 'qmethod' are ignored, with a warning. From Bill.Venables at csiro.au Tue Mar 1 23:39:50 2011 From: Bill.Venables at csiro.au (Bill.Venables at csiro.au) Date: Wed, 2 Mar 2011 09:39:50 +1100 Subject: [R] Logistic Stepwise Criterion In-Reply-To: References: Message-ID: <1BDAE2969943D540934EE8B4EF68F95FB27CB86DE2@EXNSW-MBX03.nexus.csiro.au> The "probability OF the residual deviance" is zero. The significance level for the residual deviance according to its asymptotic Chi-squared distribution is a possible criterion, but a silly one. If you want to minimise that, just fit no variables at all. That's the best you can do. If you want to maximise it, just minimise the deviance itself, which means include all possible variables in the regression, together with as many interactions as you can as well. (Incidentally, R doesn't have restrictions on how many interaction terms it can handle; those are imposed by your computer.) I suggest you think again about what criterion you really want to use. Somehow you need to balance fit in the training sample against some complexity measure. AIC and BIC are commonly used criteria, but not the only ones. I suggest you start with these and see if either does the kind of job you want. Stepwise regression with interaction terms can be a bit tricky if you want to impose the marginality constraints, but that is a bigger issue. Bill Venables. -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Mano Gabor Toth Sent: Wednesday, 2 March 2011 6:44 AM To: r-help at r-project.org Subject: [R] Logistic Stepwise Criterion Dear R-help members, I'd like to run a binomial logistic stepwise regression with ten explanatory variables and as many interaction terms as R can handle.
I'll come up with the right R command sooner or later, but my real question is whether and how the criterion for the evaluation of the different models can be set to be the probability of the residual deviance in the Chi-Square distribution (which would be more informative of overall model fit than AIC). Thanks in advance for all your help. Kind regards, Mano Gabor TOTH MA Political Science Central European University [[alternative HTML version deleted]] ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. From wakinchauemil at gmail.com Tue Mar 1 23:20:58 2011 From: wakinchauemil at gmail.com (Ning Cheng) Date: Tue, 1 Mar 2011 16:20:58 -0600 Subject: [R] How to prove the MLE estimators are normal distributed? Message-ID: Dear List, I'm now working on MLE and OLS estimators. I just noticed that the textbook argues they are joint normal distributed. But how to prove the conclusion? Thanks for your time in advance! Best, Ning From tamas.barjak02 at gmail.com Tue Mar 1 23:50:53 2011 From: tamas.barjak02 at gmail.com (Tamas Barjak) Date: Tue, 1 Mar 2011 23:50:53 +0100 Subject: [R] error in saved .csv In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From ggrothendieck at gmail.com Tue Mar 1 23:56:40 2011 From: ggrothendieck at gmail.com (Gabor Grothendieck) Date: Tue, 1 Mar 2011 17:56:40 -0500 Subject: [R] splitting and stacking matrices In-Reply-To: References: Message-ID: On Tue, Mar 1, 2011 at 4:55 PM, Darcy Webber wrote: > Dear R users, > > I am having some difficulty arranging some matrices and wondered if > anyone could help out. As an example, consider the following matrix: > > a <- matrix(1:32, nrow = 4, ncol = 8) > a > [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] > [1,] 1 5 9 13 17 21 25 29 > [2,] 2 6 10 14 18 22 26 30 > [3,] 3 7 11 15 19 23 27 31 > [4,] 4 8 12 16 20 24 28 32 > > I would like it to look like the following matrix: > > [,1] [,2] [,3] [,4] > [1,] 1 5 9 13 > [2,] 2 6 10 14 > [3,] 3 7 11 15 > [4,] 4 8 12 16 > [5,] 17 21 25 29 > [6,] 18 22 26 30 > [7,] 19 23 27 31 > [8,] 20 24 28 32 > > I can achieve this using the following: > > a1 <- a[, 1:4] > a2 <- a[, 5:8] > b <- rbind(a1, a2) > > However, my initial matrix often has a variable number of columns (in > multiples of 4), and I still want to split the columns into blocks of 4 > and stack these. I have considered working out how many blocks the > matrix must be split into using: no.blocks <- ncol(a)/4. My problem is > then implementing this information to actually split the matrix up and > then stack it. Any guidance on this would be much appreciated. > > Regards > Darcy Webber > Try converting to a 3d array, swapping the last two dimensions and reconstituting it as a matrix: > matrix(aperm(array(a, c(4, 4, 2)), c(1, 3, 2)), nc = 4) [,1] [,2] [,3] [,4] [1,] 1 5 9 13 [2,] 2 6 10 14 [3,] 3 7 11 15 [4,] 4 8 12 16 [5,] 17 21 25 29 [6,] 18 22 26 30 [7,] 19 23 27 31 [8,] 20 24 28 32 -- Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com From manototh at gmail.com Tue Mar 1 23:58:47 2011 From: manototh at gmail.com (Mano Gabor Toth) Date: Tue, 1 Mar 2011 23:58:47 +0100 Subject: [R] Logistic Stepwise Criterion In-Reply-To: <1BDAE2969943D540934EE8B4EF68F95FB27CB86DE2@EXNSW-MBX03.nexus.csiro.au> References: <1BDAE2969943D540934EE8B4EF68F95FB27CB86DE2@EXNSW-MBX03.nexus.csiro.au> Message-ID: An embedded and charset-unspecified text was scrubbed...
Name: not available URL: From dmck at u.washington.edu Wed Mar 2 00:01:35 2011 From: dmck at u.washington.edu (Don McKenzie) Date: Tue, 1 Mar 2011 15:01:35 -0800 Subject: [R] error in saved .csv In-Reply-To: References: Message-ID: <5D47CDDE-A0AC-4F72-BFFA-9A298429AD3C@u.washington.edu> If you have ONE data frame that you want to export to excel (I believe that was the original request), you probably don't need to change any of the default arguments to write.csv(), except "row.names", which will give you an extra column. On Mar 1, 2011, at 2:50 PM, Tamas Barjak wrote: > Yes, the format is incorrect. I have already tried the write.table, but it > didn't work. > > > 2011/3/1 jim holtman > >> I am not sure what you are saying your problem is? Is the format >> incorrect? BTW, notice that write.csv does not have a 'sep' >> parameter. Maybe you should be using write.table. >> >> On Tue, Mar 1, 2011 at 4:36 PM, Tamas Barjak >> wrote: >>> Help me please! >>> >>> I would like to be saved a data table: >>> >>> write.csv(random.t1, "place", dec=",", append = T, quote = FALSE, sep = " >> ", >>> qmethod = "double", eol = "\n", row.names=F) >>> >>> It's OK! >>> >>> But the rows of file >>> >>> 1,1,21042,-4084.87179487179,2457.66483516483,-582.275562799881 >>> 2,2,23846,-6383.86480186479,-3409.98451548449,-3569.72145340269 >>> and no >>> >>> 1 >>> 21042 - 4084.87179487179 2457.66483516483 >>> Not proportional... >>> >>> What's the problem??? >>> >>> Thanks! >>> >>> [[alternative HTML version deleted]] >>> >>> ______________________________________________ >>> R-help at r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >>> >> >> >> >> -- >> Jim Holtman >> Data Munger Guru >> >> What is the problem that you are trying to solve? 
>> > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. Don McKenzie Research Ecologist Pacific Wildland Fire Sciences Lab US Forest Service Affiliate Professor School of Forest Resources and CSES Climate Impacts Group University of Washington phone: 206-732-7824 cell: 206-321-5966 dmck at uw.edu From felipe.parra at quantil.com.co Wed Mar 2 00:06:34 2011 From: felipe.parra at quantil.com.co (Luis Felipe Parra) Date: Wed, 2 Mar 2011 07:06:34 +0800 Subject: [R] Difference in numeric Dates between Excel and R Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From Bill.Venables at csiro.au Wed Mar 2 00:07:13 2011 From: Bill.Venables at csiro.au (Bill.Venables at csiro.au) Date: Wed, 2 Mar 2011 10:07:13 +1100 Subject: [R] How to prove the MLE estimators are normal distributed? In-Reply-To: References: Message-ID: <1BDAE2969943D540934EE8B4EF68F95FB27CB86DE5@EXNSW-MBX03.nexus.csiro.au> This is a purely statistical question and you should try asking it on some statistics list. This is for help with using R, mostly for data analysis and graphics. A glance at the posting guide (see the footnote below) might be a good idea. -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Ning Cheng Sent: Wednesday, 2 March 2011 8:21 AM To: r-help at r-project.org Subject: [R] How to prove the MLE estimators are normal distributed? Dear List, I'm now working on MLE and OLS estimators. I just noticed that the textbook argues they are joint normal distributed. But how to prove the conclusion? Thanks for your time in advance!
Best, Ning ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. From wdunlap at tibco.com Wed Mar 2 00:08:33 2011 From: wdunlap at tibco.com (William Dunlap) Date: Tue, 1 Mar 2011 15:08:33 -0800 Subject: [R] inefficient ifelse() ? In-Reply-To: References: <77EB52C6DD32BA4D87471DCD70C8D70003F7474A@NA-PA-VBE03.na.tibco.com> Message-ID: <77EB52C6DD32BA4D87471DCD70C8D70003F74789@NA-PA-VBE03.na.tibco.com> Try using [<- more, instead of ifelse(). I rarely find myself really using both of the calls to [<- that ifelse makes. E.g., I use
   x[x==999] <- NA
instead of
   x <- ifelse(x==999, NA, x)
But if you find yourself using ifelse in a certain way often, try writing a function that only allows that case. E.g.,
   transform2 <- function(x, test, ifTrueFunction, ifFalseFunction) {
      # test must be a logical vector of the same length as x
      stopifnot(is.logical(test), length(x) == length(test),
                is.function(ifTrueFunction), is.function(ifFalseFunction))
      retval <- x # assume output is of same type as input
      retval[test] <- ifTrueFunction(x[test])
      retval[!test] <- ifFalseFunction(x[!test])
      retval
   }
   transform2(x, x<=0, f, g)
Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com > -----Original Message----- > From: ivowel at gmail.com [mailto:ivowel at gmail.com] On Behalf Of > ivo welch > Sent: Tuesday, March 01, 2011 2:20 PM > To: William Dunlap > Cc: r-help > Subject: Re: [R] inefficient ifelse() ? > > yikes. you are asking me too much. > > thanks everybody for the information. I learned something new. > > my suggestion would be for the much smarter language designers (than > I) to offer us more or less blissfully ignorant users another > vector-related construct in R.
It could perhaps be named %if% %else%, > analogous to if else (with naming inspired by %in%, and with > evaluation only of relevant parts [just as if else for scalars]), with > different outcomes in some cases, but with the advantage of typically > evaluating only half as many conditions as the ifelse() vector > construct. %if% %else% may work only in a subset of cases, but when > it does work, it would be nice to have. it would probably be my first > "goto" function, with ifelse() use only as a fallback. > > of course, I now know how to fix my specific issue. I was just > surprised that my first choice, ifelse(), was not as optimized as I > had thought. > > best, > > /iaw > > > On Tue, Mar 1, 2011 at 5:13 PM, William Dunlap > wrote: > > An ifelse-like function that only evaluated > > what was needed would be fine, but it would > > have to be different from ifelse itself. The > > trick is to come up with a good parameterization. > > > > E.g., how would it deal with things like > > ifelse(is.na(x), mean(x, na.rm=TRUE), x) > > or > > ifelse(x>1, log(x), runif(length(x),-1,0)) > > or > > ifelse(x>1, log(x), -seq_along(x)) > > Would it reject such things? Deciding that the > > x in mean(x,na.rm=TRUE) should be replaced by > > x[is.na(x)] would be wrong. Deciding that > > runif(length(x)) should be replaced by runif(sum(x>1)) > > seems a bit much to expect. Replacing seq_along(x) with > > seq_len(sum(x>1)) is wrong. It would be better to > > parameterize the new function so it wouldn't have to > > think about those cases. > > > > Would you want it to depend only on a logical > > vector or perhaps also on a factor (a vectorized > > switch/case function)?
> > > > Bill Dunlap > > Spotfire, TIBCO Software > > wdunlap tibco.com > > > >> -----Original Message----- > >> From: r-help-bounces at r-project.org > >> [mailto:r-help-bounces at r-project.org] On Behalf Of ivo welch > >> Sent: Tuesday, March 01, 2011 12:36 PM > >> To: Henrique Dallazuanna > >> Cc: r-help > >> Subject: Re: [R] inefficient ifelse() ? > >> > >> thanks, Henrique. did you mean > >> > >> as.vector(t(mapply(function(x, f)f(x), split(t, ((t %% 2)==0)), > >> list(f, g)))) ? > >> > >> otherwise, you get a matrix. > >> > >> its a good solution, but unfortunately I don't think this > can be used > >> to redefine ifelse(cond,ift,iff) in a way that is transparent. the > >> ift and iff functions will always be evaluated before the function > >> call happens, even with lazy evaluation. :-( > >> > >> I still think that it makes sense to have a smarter > vectorized %if% in > >> a vectorized language like R. just my 5 cents. > >> > >> /iaw > >> > >> ---- > >> Ivo Welch (ivo.welch at brown.edu, ivo.welch at gmail.com) > >> > >> > >> > >> > >> > >> On Tue, Mar 1, 2011 at 2:33 PM, Henrique Dallazuanna > >> wrote: > >> > Try this: > >> > > >> > mapply(function(x, f)f(x), split(t, t %% 2), list(g, f)) > >> > > >> > On Tue, Mar 1, 2011 at 4:19 PM, ivo welch > wrote: > >> >> > >> >> dear R experts--- > >> >> > >> >> t <- 1:30 > >> >> f <- function(t) { cat("f for", t, "\n"); return(2*t) } > >> >> g <- function(t) { cat("g for", t, "\n"); return(3*t) } > >> >> s <- ifelse( t%%2==0, g(t), f(t)) > >> >> > >> >> shows that the ifelse function actually evaluates both f() > >> and g() for > >> >> all values first, and presumably then just picks left or > >> right results > >> >> based on t%%2. uggh... wouldn't it make more sense to > >> evaluate only > >> >> the relevant parts of each vector and then reassemble them?
> >> >> > >> >> /iaw > >> >> ---- > >> >> Ivo Welch > >> >> > >> >> ______________________________________________ > >> >> R-help at r-project.org mailing list > >> >> https://stat.ethz.ch/mailman/listinfo/r-help > >> >> PLEASE do read the posting guide > >> >> http://www.R-project.org/posting-guide.html > >> >> and provide commented, minimal, self-contained, > reproducible code. > >> > > >> > > >> > > >> > -- > >> > Henrique Dallazuanna > >> > Curitiba-Paraná-Brasil > >> > 25° 25' 40" S 49° 16' 22" O > >> > > >> > >> ______________________________________________ > >> R-help at r-project.org mailing list > >> https://stat.ethz.ch/mailman/listinfo/r-help > >> PLEASE do read the posting guide > >> http://www.R-project.org/posting-guide.html > >> and provide commented, minimal, self-contained, reproducible code. > >> > > > From arrayprofile at yahoo.com Wed Mar 2 00:28:17 2011 From: arrayprofile at yahoo.com (array chip) Date: Tue, 1 Mar 2011 15:28:17 -0800 (PST) Subject: [R] glht() used with coxph() In-Reply-To: <647233.14657.qm@web56305.mail.re3.yahoo.com> References: <647233.14657.qm@web56305.mail.re3.yahoo.com> Message-ID: <175385.92649.qm@web56301.mail.re3.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From NordlDJ at dshs.wa.gov Wed Mar 2 00:31:41 2011 From: NordlDJ at dshs.wa.gov (Nordlund, Dan (DSHS/RDA)) Date: Tue, 1 Mar 2011 15:31:41 -0800 Subject: [R] Difference in numeric Dates between Excel and R In-Reply-To: References: Message-ID: <941871A13165C2418EC144ACB212BDB001CD9808@dshsmxoly1504g.dshs.wa.lcl> > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r- > project.org] On Behalf Of Luis Felipe Parra > Sent: Tuesday, March 01, 2011 3:07 PM > To: r-help > Subject: [R] Difference in numeric Dates between Excel and R > > Hello. I am using some dates I read in excel in R. I know the excel > origin > is supposed to be 1900-1-1.
But when I used as.Date with origin=1900-1- > 1 the > dates that R reported me where two days ahead than the ones I read from > Excel. I noticed that when I did in R the following: > > > as.Date("2011-3-4")-as.Date("1900-1-1") > Time difference of 40604 days > > but if I do the same operation in Excel the answer is 40605. Does > anybody > know what can be going on? > I think so. It is a known problem that Excel thinks 1900 was a leap year, but it was not. So Excel counts an extra day (for nonexistent Feb 29, 1900). In addition, Excel considers "1900-01-01" as day 1, not day 0. Hope this is helpful, Dan Daniel J. Nordlund Washington State Department of Social and Health Services Planning, Performance, and Accountability Research and Data Analysis Division Olympia, WA 98504-5204 From peter.shepard at gmail.com Wed Mar 2 00:36:20 2011 From: peter.shepard at gmail.com (Pete Shepard) Date: Tue, 1 Mar 2011 15:36:20 -0800 Subject: [R] does rpy support R 2.12.2 Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From d.scott at auckland.ac.nz Wed Mar 2 02:16:47 2011 From: d.scott at auckland.ac.nz (David Scott) Date: Wed, 02 Mar 2011 14:16:47 +1300 Subject: [R] Difference in numeric Dates between Excel and R In-Reply-To: <941871A13165C2418EC144ACB212BDB001CD9808@dshsmxoly1504g.dshs.wa.lcl> References: <941871A13165C2418EC144ACB212BDB001CD9808@dshsmxoly1504g.dshs.wa.lcl> Message-ID: <4D6D9A7F.4070808@auckland.ac.nz> On 2/03/2011 12:31 p.m., Nordlund, Dan (DSHS/RDA) wrote: >> -----Original Message----- From: r-help-bounces at r-project.org >> [mailto:r-help-bounces at r- project.org] On Behalf Of Luis Felipe >> Parra Sent: Tuesday, March 01, 2011 3:07 PM To: r-help Subject: [R] >> Difference in numeric Dates between Excel and R >> >> Hello. I am using some dates I read in excel in R. I know the >> excel origin is supposed to be 1900-1-1. 
But when I used as.Date >> with origin=1900-1- 1 the dates that R reported me where two days >> ahead than the ones I read from Excel. I noticed that when I did in >> R the following: >> >>> as.Date("2011-3-4")-as.Date("1900-1-1") >> Time difference of 40604 days >> >> but if I do the same operation in Excel the answer is 40605. Does >> anybody know what can be going on? >> > > I think so. It is a known problem that Excel thinks 1900 was a leap > year, but it was not. So Excel counts an extra day (for nonexistent > Feb 29, 1900). In addition, Excel considers "1900-01-01" as day 1, > not day 0. > > Hope this is helpful, > > Dan An explanation which seems reasonably authoritative is given here: http://www.cpearson.com/excel/datetime.htm David Scott > > Daniel J. Nordlund Washington State Department of Social and Health > Services Planning, Performance, and Accountability Research and Data > Analysis Division Olympia, WA 98504-5204 > > > ______________________________________________ R-help at r-project.org > mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do > read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -- _________________________________________________________________ David Scott Department of Statistics The University of Auckland, PB 92019 Auckland 1142, NEW ZEALAND Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055 Email: d.scott at auckland.ac.nz, Fax: +64 9 373 7018 Director of Consulting, Department of Statistics From rhelpacc at gmail.com Wed Mar 2 02:50:14 2011 From: rhelpacc at gmail.com (Robert A'gata) Date: Tue, 1 Mar 2011 20:50:14 -0500 Subject: [R] Plotting a 3D histogram Message-ID: Hi - I am wondering if there is any package that does plotting of joint histogram between 2 variables, i.e. f(x,y). I found rgl but it seems not so intuitive to use. I'm wondering if there is any alternative. Thank you. 
Robert From ted.rosenbaum at yale.edu Wed Mar 2 03:45:01 2011 From: ted.rosenbaum at yale.edu (Ted Rosenbaum) Date: Tue, 1 Mar 2011 21:45:01 -0500 Subject: [R] merge in data.tables -- "non-visible" In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From zlistserv at gmail.com Wed Mar 2 04:04:43 2011 From: zlistserv at gmail.com (zListserv) Date: Tue, 1 Mar 2011 22:04:43 -0500 Subject: [R] accessing variables inside a function, inside a loop Message-ID: <8E716C90-2889-4A0D-985B-B81EA053BB46@gmail.com> Joshua Great solution. Taking off on your code, the following works but does not display the names of the variables in the formula. Any suggestions about how to modify the function so that it displays the correct formula (e.g., "glm(formula = y1 ~ x1 * x2, data = dat)" instead of "glm(formula = frm, data = dat)")? R> x = runif(2000) R> y = runif(2000) R> z = runif(2000) R> R> y1 = y * x R> y2 = y * sqrt(x) R> R> x1 = y1 / y2 + z R> x2 = y2 / y1 * z + z R> R> dat = data.frame(y1,y2,x1,x2) R> R> xReg = function(y) { + + frm = eval(substitute(p ~ x1 * x2, list(p = as.name(y)))) + mod = glm(frm, data=dat) + } R> R> lapply(names(dat[,1:2]), xReg) [[1]] Call: glm(formula = frm, data = dat) Coefficients: (Intercept) x1 x2 x1:x2 -0.1882452 0.4932059 0.0667401 -0.1310084 Degrees of Freedom: 1999 Total (i.e. Null); 1996 Residual Null Deviance: 99.15032 Residual Deviance: 67.71775 AIC: -1085.354 [[2]] Call: glm(formula = frm, data = dat) Coefficients: (Intercept) x1 x2 x1:x2 -0.005464627 0.386937367 0.037363416 -0.094136334 Degrees of Freedom: 1999 Total (i.e. 
Null); 1996 Residual Null Deviance: 112.7078 Residual Deviance: 90.24796 AIC: -510.9287 --- Thanks, Alan From lawrence.michael at gene.com Wed Mar 2 04:11:29 2011 From: lawrence.michael at gene.com (Michael Lawrence) Date: Tue, 1 Mar 2011 19:11:29 -0800 Subject: [R] problems with playwith In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From mailinglist.honeypot at gmail.com Wed Mar 2 06:25:50 2011 From: mailinglist.honeypot at gmail.com (Steve Lianoglou) Date: Wed, 2 Mar 2011 00:25:50 -0500 Subject: [R] merge in data.tables -- "non-visible" In-Reply-To: References: Message-ID: Hi Ted, On Tue, Mar 1, 2011 at 9:45 PM, Ted Rosenbaum wrote: > Hi, > I am trying to use the merge command in the data.tables package. > However, when I run the command I am not sure if it is running the merge > command from the base package or the merge command from data.tables. > When I run "methods(generic.function="merge")' it informs me that > 'merge.data.table" is "non-visible". > I am just trying to run the merge command on two data tables using the > index, is there anything else that I need to do (my googling has simply left > me uncertain about how to get this to work). > Thanks for your help! Assuming everything is "normal", I'm going to bet the merge.data.table function is the one that is being used. Assuming you are using version <= 1.5.3, though, an easy way to check is to see if the result of the merge ignores the `suffixes` argument. 
The behavior of merge is being changed for the next version, but this "feature" is an easy way for you to check which merge function is being used in the current version ;-) Hope that helps, -steve -- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact From deliverable at gmail.com Wed Mar 2 06:28:40 2011 From: deliverable at gmail.com (Alexy Khrabrov) Date: Wed, 2 Mar 2011 00:28:40 -0500 Subject: [R] the features of the truth Message-ID: <027D0B7A-1C20-45CA-866C-6CDB172391AD@gmail.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From jwiley.psych at gmail.com Wed Mar 2 06:43:33 2011 From: jwiley.psych at gmail.com (Joshua Wiley) Date: Tue, 1 Mar 2011 21:43:33 -0800 Subject: [R] accessing variables inside a function, inside a loop In-Reply-To: <8E716C90-2889-4A0D-985B-B81EA053BB46@gmail.com> References: <8E716C90-2889-4A0D-985B-B81EA053BB46@gmail.com> Message-ID: Hi Alan, Other more knowledgeable people may have better opinions on this than I do. Manipulating language and call objects is seriously stretching my skills. In any case, two ways come to mind, both of them sufficiently cumbersome I would seriously question the value (btw, this is a completely different question, right?). To borrow from Barry Rowlingson, I'd like to prefix all these 'solutions' with "Here's how to do it, but don't actually do it." The first option would be to manually construct a call to glm() and then evaluate it. That is, rather than pass "frm" to the formula argument, construct a text string of the entire glm call.
Something like:

#####################################
foo <- function(y) {
  paste("glm(formula = ", y, " ~ hp * wt, data = mtcars)", sep = '')
}

foo("mpg")

eval(parse(text = foo("mpg")))
#####################################

The other thought would be to just update the part of the model object containing the call (I actually like this better than my first option, though I'm still not fond of it). Assuming you do not actually need the entire call, you could easily use the formula. Here's an example:

#####################################
xReg <- function(y) {
  frm <- eval(substitute(p ~ hp * wt, list(p = as.name(y))))
  mod <- glm(frm, data = mtcars)
  mod$call <- frm
  return(mod)
}

xReg("mpg")
#####################################

If you want to know what the formula used for a model is, my suggestion would be to simply have your function xReg() return both the model object AND the formula you used (i.e., "frm"). Here is an example:

#####################################
## *my* preference
xReg <- function(y) {
  frm <- eval(substitute(p ~ hp * wt, list(p = as.name(y))))
  mod <- glm(frm, data = mtcars)
  output <- list(formula = frm, model = mod)
  attributes(output$formula) <- NULL
  return(output)
}

xReg("mpg")
#####################################

Side note, Dr. Bates (author of lme4, genius, and a nice, helpful person to boot) taught me how to use substitute() for something I tried once on the ggplot2 list. Cheers, Josh

On Tue, Mar 1, 2011 at 7:04 PM, zListserv wrote:
> Joshua
>
> Great solution. Taking off on your code, the following works but does not display the names of the variables in the formula. Any suggestions about how to modify the function so that it displays the correct formula (e.g., "glm(formula = y1 ~ x1 * x2, data = dat)" instead of "glm(formula = frm, data = dat)")?
>
> R> x = runif(2000)
> R> y = runif(2000)
> R> z = runif(2000)
> R>
> R> y1 = y * x
> R> y2 = y * sqrt(x)
> R>
> R> x1 = y1 / y2 + z
> R> x2 = y2 / y1 * z + z
> R>
> R> dat = data.frame(y1,y2,x1,x2)
> R>
> R> xReg = function(y) {
> +
> +       frm = eval(substitute(p ~ x1 * x2, list(p = as.name(y))))
> +       mod = glm(frm, data=dat)
> +       }
> R>
> R> lapply(names(dat[,1:2]), xReg)
> [[1]]
>
> Call:  glm(formula = frm, data = dat)
>
> Coefficients:
> (Intercept)           x1           x2        x1:x2
>  -0.1882452    0.4932059    0.0667401   -0.1310084
>
> Degrees of Freedom: 1999 Total (i.e. Null);  1996 Residual
> Null Deviance:      99.15032
> Residual Deviance: 67.71775     AIC: -1085.354
>
> [[2]]
>
> Call:  glm(formula = frm, data = dat)
>
> Coefficients:
>  (Intercept)            x1            x2         x1:x2
> -0.005464627   0.386937367   0.037363416  -0.094136334
>
> Degrees of Freedom: 1999 Total (i.e. Null);  1996 Residual
> Null Deviance:      112.7078
> Residual Deviance: 90.24796     AIC: -510.9287
>
> ---
>
> Thanks,
>
> Alan
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
-- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ From jeroenooms at gmail.com Wed Mar 2 07:59:44 2011 From: jeroenooms at gmail.com (Jeroen Ooms) Date: Tue, 1 Mar 2011 22:59:44 -0800 Subject: [R] getting attributes of list without the "names". In-Reply-To: <4D6CCAC3.1080504@gmail.com> References: <1298953048526-3329209.post@n4.nabble.com> <4D6CCAC3.1080504@gmail.com> Message-ID: An embedded and charset-unspecified text was scrubbed...
Name: not available URL: From pjmiller_57 at yahoo.com Wed Mar 2 00:45:31 2011 From: pjmiller_57 at yahoo.com (Paul Miller) Date: Tue, 1 Mar 2011 15:45:31 -0800 (PST) Subject: [R] Pairwise T-Tests and Dunnett's Test (possibly using multcomp) Message-ID: <614025.85637.qm@web161601.mail.bf1.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From xinwei at stat.psu.edu Wed Mar 2 05:52:51 2011 From: xinwei at stat.psu.edu (xin wei) Date: Tue, 1 Mar 2011 20:52:51 -0800 (PST) Subject: [R] a question on sqldf's handling of missing value and factor Message-ID: <1299041571876-3331007.post@n4.nabble.com> Dear subscribers: I am using the following code to read a large number of big text files:
library(sqldf)
tempd <- file(XXXX)
tempdx <- sqldf("select * from tempd", dbname = tempfile(), file.format = list(header = T, sep="\t", row.names = F))
The problem is: all my numerical variables become factors (maybe because these columns all contain missing values). It would be quite cumbersome to convert them to numeric variables using as.numeric one by one. Does anyone know how to reset sqldf so that it automatically reads a numeric column with missing rows as real numeric instead of factor? many thanks -- View this message in context: http://r.789695.n4.nabble.com/a-question-on-sqldf-s-handling-of-missing-value-and-factor-tp3331007p3331007.html Sent from the R help mailing list archive at Nabble.com. From ripley at stats.ox.ac.uk Wed Mar 2 08:15:23 2011 From: ripley at stats.ox.ac.uk (Prof Brian Ripley) Date: Wed, 2 Mar 2011 07:15:23 +0000 (GMT) Subject: [R] Difference in numeric Dates between Excel and R In-Reply-To: References: Message-ID: On Wed, 2 Mar 2011, Luis Felipe Parra wrote: > Hello. I am using some dates I read in excel in R. I know the excel origin > is supposed to be 1900-1-1. But when I used as.Date with origin=1900-1-1 the > dates that R reported me where two days ahead than the ones I read from > Excel.
I noticed that when I did in R the following: > >> as.Date("2011-3-4")-as.Date("1900-1-1") > Time difference of 40604 days > > but if I do the same operation in Excel the answer is 40605. Does anybody > know what can be going on? We cannot know: you say a difference of 2 and report 1! As the examples from as.Date says
     ## Excel is said to use 1900-01-01 as day 1 (Windows default) or
     ## 1904-01-01 as day 0 (Mac default), but this is complicated by Excel
     ## thinking 1900 was a leap year.
     ## So for recent dates from Windows Excel
     as.Date(35981, origin="1899-12-30") # 1998-07-05
     ## and Mac Excel
     as.Date(34519, origin="1904-01-01") # 1998-07-05
So the origin you used is off by 2 days: one for the origin being day 1 and one for Windows Excel's ignorance of the calendar. Note too that these are *default*: they can be changed in Excel. > Thank you > Felipe Parra > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. PLEASE do try to do your own homework (and not send HTML), as we requested there. It is galling that you ask here about bugs in Excel, bugs that are even documented in R's help. In future, please use the Microsoft help you paid for with Excel if it disagrees with R. -- Brian D.
Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 From jwiley.psych at gmail.com Wed Mar 2 08:27:54 2011 From: jwiley.psych at gmail.com (Joshua Wiley) Date: Tue, 1 Mar 2011 23:27:54 -0800 Subject: [R] Pairwise T-Tests and Dunnett's Test (possibly using multcomp) In-Reply-To: <614025.85637.qm@web161601.mail.bf1.yahoo.com> References: <614025.85637.qm@web161601.mail.bf1.yahoo.com> Message-ID: Hi Paul, Changing the factor levels will work (as you saw). In this case, you could also edit the contrast matrix.

## look at default contrasts
contrasts(gad$dosegrp)
model1 <- lm(hama ~ dosegrp, data = gad)
summary(model1)

## choose group 3 as base (comparison)
contrasts(gad$dosegrp) <- contr.treatment(n = 3, base = 3)
model1 <- lm(hama ~ dosegrp, data = gad)
summary(model1)

If you have MASS 4th ed., I believe a discussion of contrast matrices starts on page 146. As far as improvements, in general, I think it would be preferable to let R's method dispatch system choose the method for summary rather than specifying summary.aov() yourself. You might also consider more spaces (e.g., between arguments). Cheers, Josh On Tue, Mar 1, 2011 at 3:45 PM, Paul Miller wrote: > Hello Everyone, > > Figured out one part of the code. Setting the reference level for a factor is accomplished using the relevel function (pg. 383 of MASS; pg. 70 of Data Manipulation with R): > > gad$dosegrp <- relevel(gad$dosegrp,3) > > This works very well. Much better than using a format in SAS procedures that don't allow the "ref=" option for instance. > > Does anyone have suggestions about how to improve other aspects of my code?
> > Thanks, > > Paul > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ From r.user.spain at gmail.com Wed Mar 2 09:23:00 2011 From: r.user.spain at gmail.com (Usuario R) Date: Wed, 2 Mar 2011 09:23:00 +0100 Subject: [R] R2PPT - Insert data.frame Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From sk at seldakorkmaz.com Wed Mar 2 09:41:07 2011 From: sk at seldakorkmaz.com (Selda Korkmaz) Date: Wed, 2 Mar 2011 09:41:07 +0100 Subject: [R] Rcommander Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From tal.galili at gmail.com Wed Mar 2 10:40:45 2011 From: tal.galili at gmail.com (Tal Galili) Date: Wed, 2 Mar 2011 11:40:45 +0200 Subject: [R] tricky (for me) merging of data...more clarity In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From s067835 at alumni.cuhk.net Wed Mar 2 10:00:20 2011 From: s067835 at alumni.cuhk.net (vikkiyft) Date: Wed, 2 Mar 2011 01:00:20 -0800 (PST) Subject: [R] what does the "S.D." returned by {Hmisc} rcorr.cens measure? In-Reply-To: <1298988344382-3329899.post@n4.nabble.com> References: <1298976559588-3329609.post@n4.nabble.com> <1298988344382-3329899.post@n4.nabble.com> Message-ID: <1299056420913-3331186.post@n4.nabble.com> Thanks for your reply Prof Harrell!! Could you kindly list some references for the formula for calculating the SD of Somers' D in this kind of application? Because I couldn't find any..
-- View this message in context: http://r.789695.n4.nabble.com/what-does-the-S-D-returned-by-Hmisc-rcorr-cens-measure-tp3329609p3331186.html Sent from the R help mailing list archive at Nabble.com. From xavier.bodin at univ-savoie.fr Wed Mar 2 10:19:16 2011 From: xavier.bodin at univ-savoie.fr (Xavier Bodin) Date: Wed, 2 Mar 2011 01:19:16 -0800 (PST) Subject: [R] pb with Date format using filled.contour Message-ID: <1299057556730-3331207.post@n4.nabble.com> Hi R-help community, Can anyone tell me why, while using : x <- seq(as.Date("2001-01-01"),as.Date("2001-01-01") + nrow(volcano)-1,1) y <- seq(1, ncol(volcano),1) when I plot the volcano matrix with that command : filled.contour(x,y,volcano) the graph has a Date format on X-axis, ok ... ... but when adding a contour plot to the filled contour, using this command: filled.contour(x,y,volcano, plot.axes={axis(1);axis(2);contour(x,y,volcano, add = TRUE)}) the Date format doesn't appear anymore ... ?? Thanks in advance for any help, Xavier -- View this message in context: http://r.789695.n4.nabble.com/pb-with-Date-format-using-filled-contour-tp3331207p3331207.html Sent from the R help mailing list archive at Nabble.com. From Juerg.Schulze at stud.unibas.ch Wed Mar 2 11:01:42 2011 From: Juerg.Schulze at stud.unibas.ch (=?iso-8859-1?b?SvxyZw==?= Schulze) Date: Wed, 02 Mar 2011 11:01:42 +0100 Subject: [R] problem with glm(family=binomial) when some levels have only 0 proportion values Message-ID: <20110302110142.7554696fy5afjqti@webmail.unibas.ch> Hello everybody I want to compare the proportions of germinated seeds (seed batches of size 10) of three plant types (1,2,3) with a glm with binomial data (following the method in Crawley: Statistics,an introduction using R, p.247). The problem seems to be that in two plant types (2,3) all plants have proportions = 0. 
I give you my data and the model I'm running:

      success failure type
 [1,]       0      10    3
 [2,]       0      10    2
 [3,]       0      10    2
 [4,]       0      10    2
 [5,]       0      10    2
 [6,]       0      10    2
 [7,]       0      10    2
 [8,]       4       6    1
 [9,]       4       6    1
[10,]       3       7    1
[11,]       5       5    1
[12,]       7       3    1
[13,]       4       6    1
[14,]       0      10    3
[15,]       0      10    3
[16,]       0      10    3
[17,]       0      10    3
[18,]       0      10    3
[19,]       0      10    3
[20,]       0      10    2
[21,]       0      10    2
[22,]       0      10    2
[23,]       9       1    1
[24,]       6       4    1
[25,]       4       6    1
[26,]       0      10    3
[27,]       0      10    3

y<- cbind(success, failure)

Call:
glm(formula = y ~ type, family = binomial)

Deviance Residuals:
       Min          1Q      Median          3Q         Max
-1.3521849  -0.0000427  -0.0000427  -0.0000427   2.6477556

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)   0.04445    0.21087   0.211    0.833
typeFxC     -23.16283 6696.13233  -0.003    0.997
typeFxD     -23.16283 6696.13233  -0.003    0.997

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 134.395  on 26  degrees of freedom
Residual deviance:  12.622  on 24  degrees of freedom
AIC: 42.437

Number of Fisher Scoring iterations: 20

Huge standard errors are calculated and there is no difference between plant type 1 and 2 or between plant type 1 and 3. If I add 1 to all successes, so that all the 0 values disappear, the standard error becomes lower and I find highly significant differences between the plant types.

suc<- success + 1
fail<- 11 - suc
Y<- cbind(suc,fail)

Call:
glm(formula = Y ~ type, family = binomial)

Deviance Residuals:
       Min          1Q      Median          3Q         Max
-1.279e+00  -4.712e-08  -4.712e-08   0.000e+00   2.584e+00

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   0.2231     0.2023   1.103     0.27
typeFxC      -2.5257     0.4039  -6.253 4.02e-10 ***
typeFxD      -2.5257     0.4039  -6.253 4.02e-10 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 86.391  on 26  degrees of freedom
Residual deviance: 11.793  on 24  degrees of freedom
AIC: 76.77

Number of Fisher Scoring iterations: 4

So I think the 0 values of all plants of group 2 and 3 are the problem, do you agree?
I don't know why this is a problem, or how I can explain to a reviewer why a data transformation (+ 1) is necessary with such a dataset.

I would greatly appreciate any comments.
Juerg
______________________________________

Jürg Schulze
Department of Environmental Sciences
Section of Conservation Biology
University of Basel
St. Johanns-Vorstadt 10
4056 Basel, Switzerland
Tel.: ++41/61/267 08 47

From erich.neuwirth at univie.ac.at Wed Mar 2 12:11:50 2011
From: erich.neuwirth at univie.ac.at (Erich Neuwirth)
Date: Wed, 02 Mar 2011 12:11:50 +0100
Subject: [R] Difference in numeric Dates between Excel and R
In-Reply-To:
References:
Message-ID: <4D6E25F6.6090307@univie.ac.at>

A detailed description of the Excel problem as seen through the eyes of MS can be found at

http://support.microsoft.com/kb/214326

On 3/2/2011 8:15 AM, Prof Brian Ripley wrote:
>
> ## Excel is said to use 1900-01-01 as day 1 (Windows default) or
> ## 1904-01-01 as day 0 (Mac default), but this is complicated by Excel
> ## thinking 1900 was a leap year.
> ## So for recent dates from Windows Excel
> as.Date(35981, origin="1899-12-30") # 1998-07-05
> ## and Mac Excel
> as.Date(34519, origin="1904-01-01") # 1998-07-05
>
> So the origin you used is off by 2 days: one for the origin being day 1
> and one for Windows Excel's ignorance of the calendar.
>
> Note too that these are *default*: they can be changed in Excel.
>
>> Thank you
>> Felipe Parra
>>
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> PLEASE do try to do your own homework (and not send HTML), as we
> requested there. It is galling that you ask here about bugs in Excel,
> bugs that are even documented in R's help.
In future, please use the > Microsoft help you paid for with Excel if it disagrees with R. > From ripley at stats.ox.ac.uk Wed Mar 2 12:14:19 2011 From: ripley at stats.ox.ac.uk (Prof Brian Ripley) Date: Wed, 2 Mar 2011 11:14:19 +0000 (GMT) Subject: [R] pb with Date format using filled.contour In-Reply-To: <1299057556730-3331207.post@n4.nabble.com> References: <1299057556730-3331207.post@n4.nabble.com> Message-ID: On Wed, 2 Mar 2011, Xavier Bodin wrote: > Hi R-help community, > > Can anyone tell me why, while using : > x <- seq(as.Date("2001-01-01"),as.Date("2001-01-01") + > nrow(volcano)-1,1) > y <- seq(1, ncol(volcano),1) > > when I plot the volcano matrix with that command : > filled.contour(x,y,volcano) > the graph has a Date format on X-axis, ok ... > > ... but when adding a contour plot to the filled contour, using this > command: > filled.contour(x,y,volcano, > plot.axes={axis(1);axis(2);contour(x,y,volcano, add = TRUE)}) > the Date format doesn't appear anymore ... ?? You should not use using axis(1). Look at the code for filled.contour: if (missing(plot.axes)) { if (axes) { title(main = "", xlab = "", ylab = "") Axis(x, side = 1) Axis(y, side = 2) } } and then at the help for Axis and axis. > Thanks in advance for any help, > > Xavier -- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 From ali.zolfaghari at gmail.com Wed Mar 2 12:44:41 2011 From: ali.zolfaghari at gmail.com (Dr. Alireza Zolfaghari) Date: Wed, 2 Mar 2011 11:44:41 +0000 Subject: [R] R and Android Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From ripley at stats.ox.ac.uk Wed Mar 2 13:09:53 2011 From: ripley at stats.ox.ac.uk (Prof Brian Ripley) Date: Wed, 2 Mar 2011 12:09:53 +0000 (GMT) Subject: [R] Difference in numeric Dates between Excel and R In-Reply-To: <4D6E25F6.6090307@univie.ac.at> References: <4D6E25F6.6090307@univie.ac.at> Message-ID: On Wed, 2 Mar 2011, Erich Neuwirth wrote: > A detailed description of the Excel problem as seen through the eyes of > MS can be found at > > http://support.microsoft.com/kb/214326 No, that's only half the problem. The description at http://support.microsoft.com/kb/214330 (as cited in the as.Date.Rd file for the MS-approved numeric values) is wrong, because one of those systems starts at day 1 and one at day 0. Which description is wrong depends how you interpret 'the number of elapsed days since', but you can't have two meanings in one article. They say, correctly, that the two systems are 1462 different, but there were only 1460 (real world) or 1461 (MS world) days from 1900-01-01 to 1904-01-01. > On 3/2/2011 8:15 AM, Prof Brian Ripley wrote: >> >> ## Excel is said to use 1900-01-01 as day 1 (Windows default) or >> ## 1904-01-01 as day 0 (Mac default), but this is complicated by Excel >> ## thinking 1900 was a leap year. >> ## So for recent dates from Windows Excel >> as.Date(35981, origin="1899-12-30") # 1998-07-05 >> ## and Mac Excel >> as.Date(34519, origin="1904-01-01") # 1998-07-05 >> >> So the origin you used is off by 2 days: one for the origin being day 1 >> and one for Windows Excel's ignorance of the calendar. >> >> Note too that these are *default*: they can be changed in Excel. -- Brian D. 
Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 From ggrothendieck at gmail.com Wed Mar 2 14:02:07 2011 From: ggrothendieck at gmail.com (Gabor Grothendieck) Date: Wed, 2 Mar 2011 08:02:07 -0500 Subject: [R] a question on sqldf's handling of missing value and factor In-Reply-To: <1299041571876-3331007.post@n4.nabble.com> References: <1299041571876-3331007.post@n4.nabble.com> Message-ID: On Tue, Mar 1, 2011 at 11:52 PM, xin wei wrote: > Dear subscribers: > > I am using the following code to read a large number of big text files: > library(sqldf) > tempd <- file(XXXX) > tempdx <- sqldf("select * from tempd", dbname = tempfile(), file.format = > list(header = T, sep="\t", row.names = F)) > > The problem is: all my numberical variable become factor (maybe because > these columns all contain missing value). It would be quite cubersome to > convert them to numeric variable using as.numeric one by one. Does anyone > know how to re-set SQLDF so that it would automatically read the numeric > column with missing row as real numeric instead of factor? > If you can provide a minimal ***reproducible*** example it would help. Maybe sqldf(..., method = "raw") will give you what you want but I can't say for sure without the example. -- Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com From jfox at mcmaster.ca Wed Mar 2 14:19:20 2011 From: jfox at mcmaster.ca (John Fox) Date: Wed, 2 Mar 2011 08:19:20 -0500 Subject: [R] Rcommander In-Reply-To: References: Message-ID: <011201cbd8dc$7127f580$5377e080$@mcmaster.ca> Dear Selda, Most likely you haven't installed Tcl/Tk for X-windows. If you haven't already done so, please see the Rcmdr installation instructions for Mac users at . 
I hope this helps,
John

--------------------------------
John Fox
Senator William McMaster Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Selda Korkmaz
> Sent: March-02-11 3:41 AM
> To: r-help at r-project.org
> Subject: [R] Rcommander
>
> Dear Sirs,
>
> I just downloaded the R program on my MacBook, but I can't open Rcmdr,
> although I installed the needed Rcmdr packages. I would be very happy
> if you could help me. Telephone: +49 151 10868600 (Germany) or e-mail
>
> Yours sincerely,
>
> Selda Korkmaz
>
> sk at seldakorkmaz.com
> www.seldakorkmaz.com
>
> [[alternative HTML version deleted]]

From benjamin.ward at bathspa.org Wed Mar 2 15:04:04 2011
From: benjamin.ward at bathspa.org (Ben Ward)
Date: Wed, 2 Mar 2011 14:04:04 +0000
Subject: [R] R and Android
In-Reply-To:
References:
Message-ID:

Is there really one for the iPhone? As far as I was aware, Apple had beef about their policy agreements and the fact such software is open source/free as in freedom. I actually expected the situation would be the other way round: console for Android but none for iPhone?

Ben W.

On 02/03/2011 11:44, Dr. Alireza Zolfaghari wrote:
> Hi List,
> Is anybody aware of any R console available for Android mobile? I know that
> there is one for Iphone.
>
> thanks,
> Alireza
> > From f.harrell at vanderbilt.edu Wed Mar 2 15:18:23 2011 From: f.harrell at vanderbilt.edu (Frank Harrell) Date: Wed, 2 Mar 2011 06:18:23 -0800 (PST) Subject: [R] what does the "S.D." returned by {Hmisc} rcorr.cens measure? In-Reply-To: <1299056420913-3331186.post@n4.nabble.com> References: <1298976559588-3329609.post@n4.nabble.com> <1298988344382-3329899.post@n4.nabble.com> <1299056420913-3331186.post@n4.nabble.com> Message-ID: <1299075503976-3331566.post@n4.nabble.com> Dxy is a U-statistic. In the U-statistic literature there is a combinatoric approach to estimating variances, requiring one to examine all possible pairs of observations. The general formula I use is a bit messy. You can look at the Fortran code that comes with the Hmisc package to see the algorithm. Frank vikkiyft wrote: > > Thanks for your reply Prof Harrell!! > > Could you kindly list some references of the fomula for calculating the SD > of Somer's D in this kind of application? Because I couldnt find any.. > ----- Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/what-does-the-S-D-returned-by-Hmisc-rcorr-cens-measure-tp3329609p3331566.html Sent from the R help mailing list archive at Nabble.com. From chuse22 at gmail.com Wed Mar 2 15:19:15 2011 From: chuse22 at gmail.com (Chuse chuse) Date: Wed, 2 Mar 2011 16:19:15 +0200 Subject: [R] Refine ARMA model Message-ID: Dear users, I tried to fit an AR(2) model to data. This the result: > arima(vw,c(3,0,0)) Call: arima(x = vw, order = c(3, 0, 0)) Coefficients: ar1 ar2 ar3 intercept 0.1052 -0.0102 -0.1203 0.0099 s.e. 0.0337 0.0339 0.0338 0.0018 sigma^2 estimated as 0.002934: log likelihood = 1293.16, aic = -2576.33 Now, ar2 is not significantly different from zero. I would like to refine the model considering ar1 and ar3 only so I fit a model x[t]=c+m*x[t-1] + n*x[t-3]. Anyone could help me and tell me how to do it? Thank you very much. 
Chuse From gpetris at uark.edu Wed Mar 2 15:22:18 2011 From: gpetris at uark.edu (Giovanni Petris) Date: Wed, 02 Mar 2011 08:22:18 -0600 Subject: [R] bootstrap resampling question In-Reply-To: References: <1298990251.1675.1757.camel@definetti> Message-ID: <1299075738.1688.155.camel@definetti> Good point. I'll take my suggestion back... Giovanni On Tue, 2011-03-01 at 13:18 -0500, Jonathan P Daily wrote: > I'm not sure that is equivalent to sampling with replacement, since if the > first "draw" is 1, then the probability that the next draw will be one is > 4/100 instead of the 1/20 it would be in sampling with replacement. I > think the way to do this would be what Greg suggested - something like: > > bigsamp <- sample(1:20, 100, T) > idx <- sort(unlist(sapply(1:20, function(x) which(bigsamp == > x)[1:5])))[1:20] > samp <- bigsamp[idx] > > -------------------------------------- > Jonathan P. Daily > Technician - USGS Leetown Science Center > 11649 Leetown Road > Kearneysville WV, 25430 > (304) 724-4480 > "Is the room still a room when its empty? Does the room, > the thing itself have purpose? Or do we, what's the word... imbue it." > - Jubal Early, Firefly > > r-help-bounces at r-project.org wrote on 03/01/2011 09:37:31 AM: > > > [image removed] > > > > Re: [R] bootstrap resampling question > > > > Giovanni Petris > > > > to: > > > > Bodnar Laszlo EB_HU > > > > 03/01/2011 11:58 AM > > > > Sent by: > > > > r-help-bounces at r-project.org > > > > Cc: > > > > "'r-help at r-project.org'" > > > > A simple way of sampling with replacement from 1:20, with the additional > > constraint that each number can be selected at most five times is > > > > > sample(rep(1:20, 5), 20) > > > > HTH, > > Giovanni > > > > On Tue, 2011-03-01 at 11:30 +0100, Bodnar Laszlo EB_HU wrote: > > > Hello there, > > > > > > I have a problem concerning bootstrapping in R - especially > > focusing on the resampling part of it. 
I try to sum it up in a > > simplified way so that I would not confuse anybody. > > > > > > I have a small database consisting of 20 observations (basically > > numbers from 1 to 20, I mean: 1, 2, 3, 4, 5, ... 18, 19, 20). > > > > > > I would like to resample this database many times for the > > bootstrap process with the following two conditions. The resampled > > databases should also have 20 observations and you can select each > > of the previously mentioned 20 numbers with replacement. I guess it > > is obvious so far. Now the more difficult second condition is that > > one number can be selected only maximum 5 times. In order to make > > this clear I try to show you an example. So there can be resampled > > databases like the following ones: > > > > > > (1st database) 1,2,1,2,1,2,1,2,1,2,3,3,3,3,3,4,4,4,4,4 > > > (4 different numbers are chosen, each selected 5 times) > > > > > > (2nd database) 1,8,8,6,8,8,8,2,3,4,5,6,6,6,6,7,19,1,1,1 > > > (Two numbers - 8 and 6 - selected 5 times, number "1" selected > > four times, the others selected less than 4 times) > > > > > > My very first guess that came to my mind whilst thinking about the > > problem was the sample function where there are settings like > > replace=TRUE and prob=... where you can create a probability vector > > i.e. how much should be the probability of selecting a number. So I > > tried to calculate probabilities first. I thought the problem can > > basically described as a k-combination with repetitions. > > Unfortunately the only thing I could calculate so far is the total > > number of all possible selections which amounts to 137 846 527 049. > > > > > > Anybody knows how to implement my second "tricky" condition into > > one of the R functions? Are 'boot' and 'bootstrap' packages capable > > of managing this? I guess they are, I just couldn't figure it out yet... > > > > > > Thanks very much! 
Best regards, > > > Laszlo Bodnar
> > > > -- > > > > Giovanni Petris > > Associate Professor > > Department of Mathematical Sciences > > University of Arkansas - Fayetteville, AR 72701 > > Ph: (479) 575-6324, 575-8630 (fax) > > http://definetti.uark.edu/~gpetris/ > > > > ______________________________________________ > > R-help at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > From jon.skoien at jrc.ec.europa.eu Wed Mar 2 15:28:54 2011 From: jon.skoien at jrc.ec.europa.eu (Jon Olav Skoien) Date: Wed, 02 Mar 2011 15:28:54 +0100 Subject: [R] Find downstream values in a network Message-ID: <4D6E5426.3040407@jrc.ec.europa.eu> Dear list, I have a data.frame with segments between river junctions and dimensionless predictions of runoff (runoff/area) at some of these junctions. As I want to plot my values on a continuous river network (this data.frame is part of a SpatialLinesDataFrame), I would like to change NA values to the closest non-NA value downstream. Here is a simple example: > examp = data.frame(FROMJCT = c(1,2,3,4,5,7,8,9,10,11,12,13,14),TOJCT = c(2,3,4,5,6,4,7,8,8,10,8,12,9)) > examp$pred = NA > examp$pred[c(2,4,5,7,13)] = c(1,2,3,4,5) > examp FROMJCT TOJCT pred 1 1 2 NA 2 2 3 1 3 3 4 NA 4 4 5 2 5 5 6 3 6 7 4 NA 7 8 7 4 8 9 8 NA 9 10 8 NA 10 11 10 NA 11 12 8 NA 12 13 12 NA 13 14 9 5 "FROMJCT" describes the upstream and "TOJCT" the downstream junction. examp$pred[7] above should hence get the value 3, as its "TOJCT" junction is the same as the "FROMJCT" junction of examp$pred[6]. examp$pred[8] should get the same value, as it is linked to examp$pred[6] through examp$pred[7]. 
I can do this iteratively by propagating values upwards in the river network by combining a while and a for-loop:

ichange = 1
while (ichange > 0) {
  ichange = 0
  for (i in 1:dim(examp)[1]) {
    if (!is.na(examp$pred[i])) {
      ## segments draining into the upstream junction of segment i
      ## that do not have a value yet
      toid = which(examp$TOJCT == examp$FROMJCT[i] & is.na(examp$pred))
      if (length(toid) > 0) {
        examp$pred[toid] = examp$pred[i]
        ichange = ichange + 1
      }
    }
  }
  print(ichange)
}

But this looks messy and is rather slow when the river network is described through a large number of segments. I am quite sure that I have missed a better way of propagating the values. This is a preprocessing step before plotting a result in a documentation example, so I am looking for a short, intuitive and nice solution... Any hints?

Thanks,
Jon

From gavin.simpson at ucl.ac.uk Wed Mar 2 15:39:48 2011
From: gavin.simpson at ucl.ac.uk (Gavin Simpson)
Date: Wed, 02 Mar 2011 14:39:48 +0000
Subject: [R] Rioja package, creating transfer function, WA, "Error in FUN"
In-Reply-To: <1297359613709-3299636.post@n4.nabble.com>
References: <1297359613709-3299636.post@n4.nabble.com>
Message-ID: <1299076788.25572.39.camel@prometheus.geog.ucl.ac.uk>

On Thu, 2011-02-10 at 09:40 -0800, mdc wrote:
> Hi, I am a new R user and am trying to construct a palaeoenvironmental
> transfer function (weighted averaging method) using the package rioja.
> I've managed to insert the two matrices (the species abundance and the
> environmental data) and have assigned them to the y and x values
> respectively. When I try and enter the 'WA' function though, I get an 'Error
> in FUN' message (see below for full values). Alas, I do not know what this
> means and have struggled to find similar problems to this online. Is there a
> step I've missed out between assigning the matrices and the WA function?
> > > SWED=odbcConnectExcel(file.choose()) (SWED is the environmental data > > file) > > sqlTables(SWED) > > Env=sqlFetch(SWED, "Sheet1") > > odbcClose(SWED) > > Env > > SampleId WTD Moisture pH EC > 1 "N1_1" "20" "91.72700" "3.496674" " 85.02688" > 2 "N1_2" " 2" "93.88913" "3.550794" " 85.69465" > 3 "N1_3" "26" "90.30269" "3.948559" "113.19206" > 4 "N1_4" " 5" "94.14427" "3.697213" " 48.56375" > 5 "N1_5" "30" "90.04269" "3.745020" "108.57278" > .... > 90 "GAL_15" "70" "94.07849" "3.777932" " 66.77673" That's your problem, the odbc stuff has read the data in as characters. CSV would be a lot simpler, just save your excel sheets as CSV files and read them in with: Env <- read.csv("my_excel_sheet.csv", row.names = 1) etc... where my_excel_sheet.csv is the name of your saved csv file or just use: Env <- read.csv(file.choose(), row.names = 1) if finding files via the GUI is helpful to you. It is odd that the species data set has been read in OK though - I say OK, but you still need to get the F1 column out of the species data and set it as the row names of your data. Sorry I'm coming to this late; I've been away and not really following the list for a few weeks. If you can't get things working, contact me off list and send the Excel files and I'll send back a script that will load the files and do the WA for you to look at. HTH G > > > STEST=odbcConnectExcel(file.choose()) > > sqlTables(STEST) (STEST is the > > species abundance file) > > Spe=sqlFetch(STEST, "Sheet8") > > odbcClose(STEST) > > Spe > > (The species data contains the abundance of 32 species over 90 sites, set > out like this) > F1 AmpFlav AmpWri ArcCat ArcDis > 1 N1_1 22.2929936 0.0000000 0.0000000 0.0000000 > 2 N1_2 30.9677419 0.0000000 0.0000000 3.2258065 > > > library(rioja) > > y <-as.matrix(Spe) > > x <-as.matrix(Env) > > > WA(y, x, tolDW = FALSE, use.N2=TRUE, check.data=TRUE, lean=FALSE) (the > > command from the WA section of the rioja booklet) > Error in FUN(newX[, i], ...) 
: invalid 'type' (character) of argument > > > Any help would be most appreciated, > Best wishes, > Matthew -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% From ripley at stats.ox.ac.uk Wed Mar 2 15:59:27 2011 From: ripley at stats.ox.ac.uk (Prof Brian Ripley) Date: Wed, 2 Mar 2011 14:59:27 +0000 (GMT) Subject: [R] Refine ARMA model In-Reply-To: References: Message-ID: Hint: the 'fixed' argument can be used to set a parameter to a fixed value such as zero. With the reproducible example we asked you for, we might have shown you how to use it .... On Wed, 2 Mar 2011, Chuse chuse wrote: > Dear users, > > I tried to fit an AR(2) model to data. This the result: >> arima(vw,c(3,0,0)) > > Call: > arima(x = vw, order = c(3, 0, 0)) > > Coefficients: > ar1 ar2 ar3 intercept > 0.1052 -0.0102 -0.1203 0.0099 > s.e. 0.0337 0.0339 0.0338 0.0018 > > sigma^2 estimated as 0.002934: log likelihood = 1293.16, aic = -2576.33 > > Now, ar2 is not significantly different from zero. > I would like to refine the model considering ar1 and ar3 only so I fit a model > x[t]=c+m*x[t-1] + n*x[t-3]. > > Anyone could help me and tell me how to do it? Thank you very much. > Chuse -- Brian D. 
Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 From ligges at statistik.tu-dortmund.de Wed Mar 2 16:00:16 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Wed, 02 Mar 2011 16:00:16 +0100 Subject: [R] RWinEdt difficulties In-Reply-To: References: Message-ID: <4D6E5B80.6050502@statistik.tu-dortmund.de> On 01.03.2011 11:01, John Seers wrote: > Hello Everyone > > I have just upgraded my PC to Windows 7 (64 bit) and I have installed R > 2.12.2. R seems to be working fine. > > I am having problems getting RWinEdt working with it though. > > I have tried installing WinEdt 6.0 and WinEdt 5.5. But both fail with the > same error using R as 64 bit or 32 bit. I install the package using > Administrator rights. > > >> library(RWinEdt) > Warning message: > In shell(paste("\"\"", .gW$InstallRoot, "\\WinEdt.exe\" -C=\"R-WinEdt\" > -E=", : > '""C:\Program Files (x86)\WinEdt Team\WinEdt 6\WinEdt.exe" -C="R-WinEdt" > -E="C:\Program Files (x86)\WinEdt Team\WinEdt 6\R.ini""' execution failed > with error code 1 >> One installing RWinEdt the first time, please run R with Administrator privileges (right click to do so). Then installation should work smoothly with WinEdt < 6.0. > Does it matter if you are using 64 bit R and the 32 bit WinEdt? (I have > tried 32 bit R). > > Does RWinEdt work with WinEdt 6.0? No, not yet, unfortunately. But some free time is scheduled for this in April. Uwe Ligges > Can anybody suggest a solution? > Thanks for any help. 
> > Regards > > John Seers > > > ********************************************************************************************************** > ********************************************************************************************************** > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From KINLEY_ROBERT at lilly.com Wed Mar 2 16:10:48 2011 From: KINLEY_ROBERT at lilly.com (Robert Kinley) Date: Wed, 2 Mar 2011 15:10:48 +0000 Subject: [R] RWinEdt difficulties In-Reply-To: <4D6E5B80.6050502@statistik.tu-dortmund.de> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From ral at lcfltd.com Wed Mar 2 16:08:23 2011 From: ral at lcfltd.com (Robert A LaBudde) Date: Wed, 02 Mar 2011 10:08:23 -0500 Subject: [R] problem with glm(family=binomial) when some levels have only 0 proportion values In-Reply-To: <20110302110142.7554696fy5afjqti@webmail.unibas.ch> References: <20110302110142.7554696fy5afjqti@webmail.unibas.ch> Message-ID: <0LHF003WMRE5IJ20@vms173007.mailsrvcs.net> The algorithm is not converging. Your iterations are at the maximum. It won't do any good to add a fractional number to all data, as the result will depend on the number added (try 1.0, 0.5 and 0.1 to see this). The root problem is that your data are degenerate. Firstly, your types '2' and '3' are indistinguishable in your data. Secondly, consider the case without 'type'. If you have all zero data for 10 trials, you cannot discriminate among mu = 0, 0.00001, 0.0001, 0.001 or 0.01. This leads to numerical instability. Thirdly, the variance estimate in the IRLS will start at 0.0, which gives a singularity. 
Fundamentally, the algorithm is failing because you are at the boundary of possibilities for a parameter, so special techniques are needed to do maximum likelihood estimation. The simple solution is to deal with the data for your types separately. Another is to do more batches for '2' and '3' to get an observed failure. At 05:01 AM 3/2/2011, J?rg Schulze wrote: >Hello everybody > >I want to compare the proportions of germinated seeds (seed batches of >size 10) of three plant types (1,2,3) with a glm with binomial data >(following the method in Crawley: Statistics,an introduction using R, >p.247). >The problem seems to be that in two plant types (2,3) all plants have >proportions = 0. >I give you my data and the model I'm running: > > success failure type > [1,] 0 10 3 > [2,] 0 10 2 > [3,] 0 10 2 > [4,] 0 10 2 > [5,] 0 10 2 > [6,] 0 10 2 > [7,] 0 10 2 > [8,] 4 6 1 > [9,] 4 6 1 >[10,] 3 7 1 >[11,] 5 5 1 >[12,] 7 3 1 >[13,] 4 6 1 >[14,] 0 10 3 >[15,] 0 10 3 >[16,] 0 10 3 >[17,] 0 10 3 >[18,] 0 10 3 >[19,] 0 10 3 >[20,] 0 10 2 >[21,] 0 10 2 >[22,] 0 10 2 >[23,] 9 1 1 >[24,] 6 4 1 >[25,] 4 6 1 >[26,] 0 10 3 >[27,] 0 10 3 > > y<- cbind(success, failure) > > Call: >glm(formula = y ~ type, family = binomial) > >Deviance Residuals: > Min 1Q Median 3Q >-1.3521849 -0.0000427 -0.0000427 -0.0000427 > Max > 2.6477556 > >Coefficients: > Estimate Std. Error z value Pr(>|z|) >(Intercept) 0.04445 0.21087 0.211 0.833 >typeFxC -23.16283 6696.13233 -0.003 0.997 >typeFxD -23.16283 6696.13233 -0.003 0.997 > >(Dispersion parameter for binomial family taken to be 1) > > Null deviance: 134.395 on 26 degrees of freedom >Residual deviance: 12.622 on 24 degrees of freedom >AIC: 42.437 > >Number of Fisher Scoring iterations: 20 > > >Huge standard errors are calculated and there is no difference between >plant type 1 and 2 or between plant type 1 and 3. 
>If I add 1 to all successes, so that all the 0 values disappear, the >standard error becomes lower and I find highly significant differences >between the plant types. > >suc<- success + 1 >fail<- 11 - suc >Y<- cbind(suc,fail) > >Call: >glm(formula = Y ~ type, family = binomial) > >Deviance Residuals: > Min 1Q Median 3Q >-1.279e+00 -4.712e-08 -4.712e-08 0.000e+00 > Max > 2.584e+00 > >Coefficients: > Estimate Std. Error z value Pr(>|z|) >(Intercept) 0.2231 0.2023 1.103 0.27 >typeFxC -2.5257 0.4039 -6.253 4.02e-10 *** >typeFxD -2.5257 0.4039 -6.253 4.02e-10 *** >--- >Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 > >(Dispersion parameter for binomial family taken to be 1) > > Null deviance: 86.391 on 26 degrees of freedom >Residual deviance: 11.793 on 24 degrees of freedom >AIC: 76.77 > >Number of Fisher Scoring iterations: 4 > > >So I think the 0 values of all plants of group 2 and 3 are the >problem, do you agree? >I don't know why this is a problem, or how I can explain to a reviewer >why a data transformation (+ 1) is necessary with such a dataset. > >I would greatly appreciate any comments. >Juerg >______________________________________ > >J?rg Schulze >Department of Environmental Sciences >Section of Conservation Biology >University of Basel >St. Johanns-Vorstadt 10 >4056 Basel, Switzerland >Tel.: ++41/61/267 08 47 > >______________________________________________ >R-help at r-project.org mailing list >https://stat.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >and provide commented, minimal, self-contained, reproducible code. ================================================================ Robert A. LaBudde, PhD, PAS, Dpl. ACAFS e-mail: ral at lcfltd.com Least Cost Formulations, Ltd. 
URL: http://lcfltd.com/ 824 Timberlake Drive Tel: 757-467-0954 Virginia Beach, VA 23464-3239 Fax: 757-467-2947 "Vere scire est per causas scire" ================================================================ From simone.gabbriellini at gmail.com Wed Mar 2 16:10:20 2011 From: simone.gabbriellini at gmail.com (Simone Gabbriellini) Date: Wed, 2 Mar 2011 16:10:20 +0100 Subject: [R] how to simplify a data.frame and add the counts of duplicate rows as a new column Message-ID: <44BDC344-6A0B-4061-8565-71DE46D98BB7@gmail.com> Hello List, I would like to simplify a data.frame like this columnA columnB user10 proj12 user10 proj19 user10 proj12 into something like: columnA columnB columnC user10 proj12 2 user10 proj19 1 I know unique() can simplify the data.frame, but how to count and store the duplicates? thanks in advance for any help. best regards, Simone From scttchamberlain4 at gmail.com Wed Mar 2 16:22:59 2011 From: scttchamberlain4 at gmail.com (Scott Chamberlain) Date: Wed, 2 Mar 2011 09:22:59 -0600 Subject: [R] how to simplify a data.frame and add the counts of duplicate rows as a new column In-Reply-To: <44BDC344-6A0B-4061-8565-71DE46D98BB7@gmail.com> References: <44BDC344-6A0B-4061-8565-71DE46D98BB7@gmail.com> Message-ID: <377289C507D84F14920840F6D6D87A45@gmail.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From simone.gabbriellini at gmail.com Wed Mar 2 16:34:14 2011 From: simone.gabbriellini at gmail.com (Simone Gabbriellini) Date: Wed, 2 Mar 2011 16:34:14 +0100 Subject: [R] how to simplify a data.frame and add the counts of duplicate rows as a new column In-Reply-To: <377289C507D84F14920840F6D6D87A45@gmail.com> References: <44BDC344-6A0B-4061-8565-71DE46D98BB7@gmail.com> <377289C507D84F14920840F6D6D87A45@gmail.com> Message-ID: <7C56F84F-6646-4CA8-A347-60176108E86A@gmail.com> many thanks, this is really a great solution! 
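
For readers without plyr, the same count can be sketched in base R. This assumes a data frame with the two columns from the question; the object names df and counts are made up for illustration:

```r
# Toy data frame matching the example in the question
df <- data.frame(columnA = c("user10", "user10", "user10"),
                 columnB = c("proj12", "proj19", "proj12"))

# Add a column of ones and sum it within each (columnA, columnB) combination
counts <- aggregate(columnC ~ columnA + columnB,
                    data = cbind(df, columnC = 1L),
                    FUN = sum)
counts
```

Each distinct row of df appears once in counts, with columnC holding the number of times it occurred.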
best,
Simone

On 02 Mar 2011, at 16:22, Scott Chamberlain wrote:

> see package plyr, especially the function ddply(), e.g., in your case:
>
> ddply(dataframe, .(columnA, columnB), summarise,
>   columnC = length(columnB)
> )
>
> Scott
> On Wednesday, March 2, 2011 at 9:10 AM, Simone Gabbriellini wrote:
>
>> Hello List,
>>
>> I would like to simplify a data.frame like this
>>
>> columnA columnB
>> user10  proj12
>> user10  proj19
>> user10  proj12
>>
>> into something like:
>>
>> columnA columnB columnC
>> user10  proj12  2
>> user10  proj19  1
>>
>> I know unique() can simplify the data.frame, but how to count and store the duplicates?
>>
>> thanks in advance for any help.
>>
>> best regards,
>> Simone
>

From kitty.a1000 at gmail.com Wed Mar 2 16:40:49 2011
From: kitty.a1000 at gmail.com (sadz a)
Date: Wed, 2 Mar 2011 15:40:49 +0000
Subject: [R] How to extrapolate a model
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From a.strelniece at eurotransplant.org Wed Mar 2 16:47:23 2011
From: a.strelniece at eurotransplant.org (Aggita)
Date: Wed, 2 Mar 2011 07:47:23 -0800 (PST)
Subject: [R] message: please select CRAN mirror
Message-ID: <1299080842948-3331711.post@n4.nabble.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From b.hartley at cardinal-sys.com Wed Mar 2 16:33:30 2011
From: b.hartley at cardinal-sys.com (Benjamin Hartley)
Date: Wed, 02 Mar 2011 16:33:30 +0100
Subject: [R] Question regarding vector manipulation
Message-ID: <4D6E634A.2010700@cardinal-sys.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available URL: From b.k.andreassen at medisin.uio.no Wed Mar 2 15:31:57 2011 From: b.k.andreassen at medisin.uio.no (Bettina Kulle Andreassen) Date: Wed, 02 Mar 2011 15:31:57 +0100 Subject: [R] *** caught segfault *** when using impute.knn (impute package) Message-ID: <4D6E54DD.1050600@medisin.uio.no> hi, i am getting an error when calling the impute.knn function (see the screenshot below). what is the problem here and how can it be solved? screenshot: ################## *** caught segfault *** address 0x513c7b84, cause 'memory not mapped' Traceback: 1: .Fortran("knnimp", x, ximp = x, p, n, imiss = imiss, irmiss, as.integer(k), double(p), double(n), integer(p), integer(n), PACKAGE = "impute") 2: knnimp.internal(x, k, imiss, irmiss, p, n, maxp = maxp) 3: knnimp(x, k, maxmiss = rowmax, maxp = maxp) 4: impute.knn(dummy0, k) Possible actions: 1: abort (with core dump, if enabled) 2: normal R exit 3: exit R without saving workspace 4: exit R saving workspace ################## thanks for your help in advance! tina -- Bettina Kulle Andreassen University of Oslo Department of Biostatistics and Institute for Epi-Gen (Faculty Division Ahus) tel: +47 22851193 +47 67963923 From benhartley903 at googlemail.com Wed Mar 2 16:42:12 2011 From: benhartley903 at googlemail.com (Benjamin Hartley) Date: Wed, 2 Mar 2011 16:42:12 +0100 Subject: [R] Vector manipulations Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From brynedal at gmail.com Wed Mar 2 16:14:02 2011 From: brynedal at gmail.com (Bryo) Date: Wed, 2 Mar 2011 07:14:02 -0800 (PST) Subject: [R] Selecting a subsample so that it follows a distribution. Message-ID: <1299078842639-3331659.post@n4.nabble.com> Hi All, I want to select rows at random from a large data.frame while achieving a particular distribution defined my a given subset of this data.frame. How can I do this? More details and what I've done so far is given below. 
I have gene expression data and gene sets of interest. In order to look at enrichment of differential expression I'm doing a simple permutation approach: selecting a random set of genes (same size as the differentially expressed set) and recording the overlap, repeating 10 000 times.

The problem: expression level and significance of differential expression are correlated (more power). Hence I want to do a biased permutation, selecting random genes that together follow the same expression level distribution.

This is what I've done so far:

geneExp is my data.frame with DE statistics. 6585 rows of genes, col one is gene ID.
geneSet is my gene set, column one is gene ID.
index is the index of the genes DE in my geneExp.

dSign=density(geneExp[index,'baseMean']) #baseMean is a measure of expression level
prob=lapply(geneExp[,"baseMean"],function(x) approx(dSign$x,dSign$y,x)$y)
prob=unlist(prob)

So when I am doing my permutation I do:

overlap=numeric(10000) # one overlap count per permutation
for (i in 1:10000) {
  index=sample(1:6585,543,prob=prob)
  overlap[i]=sum(!is.na(match(geneSet[,1],geneExp[index,1])))
}

And thereafter look at the distribution of random overlaps compared to the initially observed overlap.

But the distribution of values that this permutation gives is NOT equal to the distribution of the significant genes; it is a lot narrower, simply because my method assumes a uniform distribution of values to choose from.

Sorry if this was a complicated message, I would highly appreciate any help or comments!

Best,
Bryo
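
One way to see the narrowing described in this post: drawing with probability equal to the target density gives a sample whose density is proportional to target times pool, which is narrower than the target. A minimal sketch of the usual correction, dividing the target density by the density of the pool being sampled from, is given here on simulated data. All names (expr, de, w and so on) and the simulated expression values are assumptions for illustration, not the poster's data:

```r
set.seed(42)

# Simulated stand-in for the expression level of 6585 genes
n_genes <- 6585
expr <- rnorm(n_genes, mean = 5, sd = 2)

# Simulated DE set: highly expressed genes are more likely to be called DE
de <- sample(n_genes, 543, prob = plogis(expr - 6))

# Kernel density estimates for the target (DE genes) and for the whole pool
d_target <- density(expr[de])
d_pool   <- density(expr)

# Evaluate both densities at every gene; the target density is set to 0
# for genes outside the range covered by its estimate
f_target <- approx(d_target$x, d_target$y, xout = expr, yleft = 0, yright = 0)$y
f_pool   <- approx(d_pool$x, d_pool$y, xout = expr)$y

# Sampling weight: target density relative to the density of the pool
w <- f_target / f_pool
w[!is.finite(w)] <- 0

draw_weighted <- sample(n_genes, 543, prob = w)         # corrected weights
draw_density  <- sample(n_genes, 543, prob = f_target)  # weights as in the post

# Compare location and spread of the three sets
sapply(list(target = de, weighted = draw_weighted, density_only = draw_density),
       function(idx) c(mean = mean(expr[idx]), sd = sd(expr[idx])))
```

Because sample() draws without replacement, the weights are only approximately the inclusion probabilities, so this should be read as a sketch of the idea rather than an exact matching procedure.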
From colleen.t.kenney at gmail.com Wed Mar 2 16:34:37 2011 From: colleen.t.kenney at gmail.com (Colleen Kenney) Date: Wed, 2 Mar 2011 10:34:37 -0500 Subject: [R] Probit Analysis and Interval Calculations for different LD50s Message-ID: I am encountering a problem with the calculation of Fieller and Delta Method confidence intervals when performing probit analysis on simulated data; my code is included below. I am testing 5 dose groups, with log doses (-0.2, -0.1, 0, 0.1, 0.2) and (1.8, 1.9, 2, 2.1, 2.2) so that the log(LD50) are 0 and 2, respectively. However, while I get the coverage as seen in the literature for the log doses surrounding 0, I get very wide intervals when log(LD50)=2, with everything else remaining constant. Can anyone help please? nd=100 N=10000 m=5 alpha=0.05 x<-c(-0.2, -0.1, 0, 0.1, 0.2) logLD50<-0 slope<-10 for (i in 1:N){ dose1[i]<-sum(rbinom(nd, 1, pnorm((x[1]-logLD50)*slope))) dose2[i]<-sum(rbinom(nd, 1, pnorm((x[2]-logLD50)*slope))) dose3[i]<-sum(rbinom(nd, 1, pnorm((x[3]-logLD50)*slope))) dose4[i]<-sum(rbinom(nd, 1, pnorm((x[4]-logLD50)*slope))) dose5[i]<-sum(rbinom(nd, 1, pnorm((x[5]-logLD50)*slope))) } ld50<-function(b) -b[1]/b[2] for (i in 1:N){ pw<-data.frame(x=x, n=rep(nd, m), y=c(dose1[i], dose2[i], dose3[i], dose4[i], dose5[i])) pw$Ymat<-cbind(pw$y, nd-pw$y) pwp.1<-glm(Ymat~x, family=binomial(link=probit), data=pw) pwp<-summary(pwp.1) iter[i]<-pwp.1$iter ld[i]<-ld50(coef(pwp.1)) a[i]<-coef(pwp.1)[1] b[i]<-coef(pwp.1)[2] nu11<-pwp$cov.unscaled[1,1] nu12<-pwp$cov.unscaled[1,2] nu22[i]<-pwp$cov.unscaled[2,2] mse[i]<- nu11/b^2+nu22*a^2/b^4-2*nu12*a/(b^3) s.ab<-sqrt(nu11/b^2+nu22*a^2/b^4-2*nu12*a/(b^3)) z.alpha<-qnorm(1-alpha/2) g[i]<-z.alpha^2*nu22/b[i]^2 fl.lower[i]<-ld[i]+g[i]/(1-g[i])*(ld[i]-nu12/nu22)-z.alpha/(b[i]*(1-g[i]))*sqrt(nu11-2*ld[i]*nu12+ld[i]^2*nu22-g[i]*(nu11-nu12^2/nu22)) #Fieller interval 
fl.upper[i]<-ld[i]+g[i]/(1-g[i])*(ld[i]-nu12/nu22)+z.alpha/(b[i]*(1-g[i]))*sqrt(nu11-2*ld[i]*nu12+ld[i]^2*nu22-g[i]*(nu11-nu12^2/nu22)) ci.lower[i]<-ld[i]-z.alpha*s.ab #delta method interval ci.upper[i]<-ld[i]+z.alpha*s.ab } From crabak at acm.org Wed Mar 2 16:01:06 2011 From: crabak at acm.org (csrabak) Date: Wed, 2 Mar 2011 13:01:06 -0200 Subject: [R] problem with glm(family=binomial) when some levels have only 0 proportion values In-Reply-To: <20110302110142.7554696fy5afjqti@webmail.unibas.ch> References: <20110302110142.7554696fy5afjqti@webmail.unibas.ch> Message-ID: Em 2/3/2011 08:01, J?rg Schulze escreveu: > Hello everybody This is not a R related problem, but rather more theoretic one, anyway: > > I want to compare the proportions of germinated seeds (seed batches of > size 10) of three plant types (1,2,3) with a glm with binomial data > (following the method in Crawley: Statistics,an introduction using R, > p.247). > The problem seems to be that in two plant types (2,3) all plants have > proportions = 0. > I give you my data and the model I'm running: > > success failure type > [1,] 0 10 3 [snipped] > [26,] 0 10 3 > [27,] 0 10 3 > > y<- cbind(success, failure) > > Call: > glm(formula = y ~ type, family = binomial) > > Deviance Residuals: > Min 1Q Median 3Q > -1.3521849 -0.0000427 -0.0000427 -0.0000427 > Max > 2.6477556 > > Coefficients: > Estimate Std. Error z value Pr(>|z|) > (Intercept) 0.04445 0.21087 0.211 0.833 > typeFxC -23.16283 6696.13233 -0.003 0.997 > typeFxD -23.16283 6696.13233 -0.003 0.997 > > (Dispersion parameter for binomial family taken to be 1) > > Null deviance: 134.395 on 26 degrees of freedom > Residual deviance: 12.622 on 24 degrees of freedom > AIC: 42.437 > > Number of Fisher Scoring iterations: 20 > > > Huge standard errors are calculated and there is no difference between > plant type 1 and 2 or between plant type 1 and 3. 
> If I add 1 to all successes, so that all the 0 values disappear, the > standard error becomes lower and I find highly significant differences > between the plant types. > > suc<- success + 1 > fail<- 11 - suc > Y<- cbind(suc,fail) > > Call: > glm(formula = Y ~ type, family = binomial) > > Deviance Residuals: > Min 1Q Median 3Q > -1.279e+00 -4.712e-08 -4.712e-08 0.000e+00 > Max > 2.584e+00 > > Coefficients: > Estimate Std. Error z value Pr(>|z|) > (Intercept) 0.2231 0.2023 1.103 0.27 > typeFxC -2.5257 0.4039 -6.253 4.02e-10 *** > typeFxD -2.5257 0.4039 -6.253 4.02e-10 *** > --- > Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 > > (Dispersion parameter for binomial family taken to be 1) > > Null deviance: 86.391 on 26 degrees of freedom > Residual deviance: 11.793 on 24 degrees of freedom > AIC: 76.77 > > Number of Fisher Scoring iterations: 4 > > > So I think the 0 values of all plants of group 2 and 3 are the problem, > do you agree? It depends on the definition of "problem" here, if the result of your experiment, maybe, for the difference in the two regressions, not. > I don't know why this is a problem, or how I can explain to a reviewer > why a data transformation (+ 1) is necessary with such a dataset. You need to ascertain the modeling of your statistic test against the epistemological analysis you're performing. Caveat: I'm not an expert in agriculture, so this is just a comment. If the success rates of your dataframe are the germinations of three types of plants in a certain period of time, then perhaps it could make sense to add one to all the values in the success column (and subtract ones from the failure?) because that would cope with the possibility that a certain time after the experiment has been stopped, it could have germinated. If in the other hand, the non germinated seeds are known to not germinate anymore, then the calculation device would put you on wrong path. > > I would greatly appreciate any comments. 
Get a look at the zero inflated (and perhaps hurdle as well) distributions and the regressions associated with them. Using sos I get more than 100 entries to look at, so I'll refrain to put specific links here. HTH -- Cesar Rabak DC Consulting LTDA From crosspide at hotmail.com Wed Mar 2 14:50:08 2011 From: crosspide at hotmail.com (agent dunham) Date: Wed, 2 Mar 2011 05:50:08 -0800 (PST) Subject: [R] how many records for suitable regression Message-ID: <1299073808387-3331522.post@n4.nabble.com> Dear community, I was wondering if it's possible to know if you have enough data for a regression study. I remember you must have more data than parameters to obtain, but I'd like to know if there was something more sophisticated. Thanks, user at host.com -- View this message in context: http://r.789695.n4.nabble.com/how-many-records-for-suitable-regression-tp3331522p3331522.html Sent from the R help mailing list archive at Nabble.com. From marchywka at hotmail.com Wed Mar 2 13:26:25 2011 From: marchywka at hotmail.com (Mike Marchywka) Date: Wed, 2 Mar 2011 07:26:25 -0500 Subject: [R] R and Android In-Reply-To: References: Message-ID: ---------------------------------------- > Date: Wed, 2 Mar 2011 11:44:41 +0000 > From: ali.zolfaghari at gmail.com > To: r-help at r-project.org > Subject: [R] R and Android > > Hi List, > Is anybody aware of any R console available for Android mobile? I know that > there is one for Iphone. I was just looking at PHP libraries for use with Rserve and I guess I'd ask if you or others using R on a phone could comment on how that would compare to, say, just using a web app and running R on a remote machine through a browser? What exactly do you use R for on a small device like that? Thanks. 
> > thanks, > Alireza From nandan.amar at gmail.com Wed Mar 2 14:48:59 2011 From: nandan.amar at gmail.com (Amar) Date: Wed, 2 Mar 2011 13:48:59 +0000 Subject: [R] finding model order components for arima() Message-ID: Hi, I am trying to model a time series using arima(). For getting the model order components(p, d, q and P,D,Q) I am using procedure discussed in [1] in section 3.2 . It is most likely hit and trial method based on lower AIC value. I want to know what is the correct way to find model order components or the method described in [1] is the appropriate one. thanks in advance. From newhonewind at gmail.com Wed Mar 2 11:47:14 2011 From: newhonewind at gmail.com (newhonewind) Date: Wed, 2 Mar 2011 02:47:14 -0800 (PST) Subject: [R] The other Question of "Censored Quantile Regression for Longitudinal Data" In-Reply-To: <1297607305459-3303675.post@n4.nabble.com> References: <1287220445414-883458.post@n4.nabble.com> <1287220445413-2998110.post@n4.nabble.com> <1287335534874-2999206.post@n4.nabble.com> <1287484347858-3001875.post@n4.nabble.com> <1297607038599-3303668.post@n4.nabble.com> <1297607305459-3303675.post@n4.nabble.com> Message-ID: <1299062834295-3331318.post@n4.nabble.com> How to solve the panel data of cqr by writing the R code or using the "quantreg" ? Thank you! -- View this message in context: http://r.789695.n4.nabble.com/Question-of-Quantile-Regression-for-Longitudinal-Data-tp883458p3331318.html Sent from the R help mailing list archive at Nabble.com. From news at jonasstein.de Wed Mar 2 12:48:10 2011 From: news at jonasstein.de (Jonas Stein) Date: Wed, 2 Mar 2011 12:48:10 +0100 Subject: [R] Plot with same font like in LaTeX Message-ID: Hi, i want to make my plots look uniform in LaTeX documents. - usage of the same font on axes and in legend like LaTeX uses (for example "Computer Modern") - put real LaTeX formulas on the axes Have you any hints how i can achieve that? I had no luck two years ago, but i want to try it again now. 
kind regards, -- Jonas Stein From scttchamberlain4 at gmail.com Wed Mar 2 14:01:46 2011 From: scttchamberlain4 at gmail.com (Scott Chamberlain) Date: Wed, 2 Mar 2011 07:01:46 -0600 Subject: [R] Rcommander In-Reply-To: References: Message-ID: <3063944CD9914F22B4C761ADB4719DD4@gmail.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From skmaidulhaque at gmail.com Wed Mar 2 14:43:52 2011 From: skmaidulhaque at gmail.com (SK MAIDUL HAQUE) Date: Wed, 2 Mar 2011 19:13:52 +0530 Subject: [R] transform table to matrix Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From violagirl470 at msn.com Wed Mar 2 15:13:38 2011 From: violagirl470 at msn.com (Laura Clasemann) Date: Wed, 2 Mar 2011 14:13:38 +0000 Subject: [R] Contingency table in R Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From william.a.simpson at gmail.com Wed Mar 2 09:27:52 2011 From: william.a.simpson at gmail.com (William Simpson) Date: Wed, 2 Mar 2011 08:27:52 +0000 Subject: [R] trigonometric regression In-Reply-To: References: Message-ID: Useful guide to trig regression -------------- next part -------------- A non-text attachment was scrubbed... Name: trig_reg.pdf Type: application/pdf Size: 423687 bytes Desc: not available URL: From xinwei at stat.psu.edu Wed Mar 2 16:14:44 2011 From: xinwei at stat.psu.edu (xin wei) Date: Wed, 2 Mar 2011 07:14:44 -0800 (PST) Subject: [R] a question on sqldf's handling of missing value and factor In-Reply-To: References: <1299041571876-3331007.post@n4.nabble.com> Message-ID: <1299078884773-3331662.post@n4.nabble.com> Dear Mr. Grothendieck : thank you so much for your attention. You are the real expert here. 
the following is a mock text file: a b c aa 23 aaa 34 aaaa 77 note that both b and c column contain missing value (blank) I save it under my C drive and use both read.table and sqldf to import it to R and then use identical() function to compare the result. The following is the result: > setwd("c:/") > library(sqldf) > test <- file("test.txt") > testx <- sqldf("select * from test", + dbname = tempfile(), file.format = list(header = T, sep="\t", row.names = F)) > testy<- read.table("test.txt", header = T, sep="\t") > identical(testx, testy) [1] FALSE > testx a b c 1 aa 23.0 2 aaa 34.6 0.0 3 aaaa 77.8 > testy a b c 1 aa NA 23.0 2 aaa 34.6 NA 3 aaaa NA 77.8 > class(testx$b) [1] "factor" > class(testy$b) [1] "numeric" > read.table seems to get it right while sqldf treats b as factor (if I add method="raw", b become character). what is more troubling is that column C has number 0 at the second row while in the original file it is missing. In my real world situation with a much larger text file, the problem is that many cells are empty when they all actually have values in the original text file. I would greatly appreciate your help if you can shed some light on this. thanks -- View this message in context: http://r.789695.n4.nabble.com/a-question-on-sqldf-s-handling-of-missing-value-and-factor-tp3331007p3331662.html Sent from the R help mailing list archive at Nabble.com. From xinwei at stat.psu.edu Wed Mar 2 16:17:51 2011 From: xinwei at stat.psu.edu (xin wei) Date: Wed, 2 Mar 2011 07:17:51 -0800 (PST) Subject: [R] a question on sqldf's handling of missing value and factor In-Reply-To: <1299078884773-3331662.post@n4.nabble.com> References: <1299041571876-3331007.post@n4.nabble.com> <1299078884773-3331662.post@n4.nabble.com> Message-ID: <1299079071033-3331667.post@n4.nabble.com> I am sorry for posting the wrong source file. 
the correct source file is as follows:

a     b     c
aa          23
aaa   34.6
aaaa        77.8

They are tab delimited but somehow could not be displayed correctly in browser.

From ligges at statistik.tu-dortmund.de Wed Mar 2 16:53:03 2011
From: ligges at statistik.tu-dortmund.de (Uwe Ligges)
Date: Wed, 02 Mar 2011 16:53:03 +0100
Subject: [R] *** caught segfault *** when using impute.knn (impute package)
In-Reply-To: <4D6E54DD.1050600@medisin.uio.no>
References: <4D6E54DD.1050600@medisin.uio.no>
Message-ID: <4D6E67DF.6080702@statistik.tu-dortmund.de>

On 02.03.2011 15:31, Bettina Kulle Andreassen wrote:
> hi,
>
> i am getting an error when calling the impute.knn
> function (see the screenshot below).
> what is the problem here and how can it be solved?

Please write to the package maintainers. This is probably a bug in the package.

Uwe Ligges

>
>
> screenshot:
>
> ##################
> *** caught segfault ***
> address 0x513c7b84, cause 'memory not mapped'
>
> Traceback:
> 1: .Fortran("knnimp", x, ximp = x, p, n, imiss = imiss, irmiss,
> as.integer(k), double(p), double(n), integer(p), integer(n), PACKAGE =
> "impute")
> 2: knnimp.internal(x, k, imiss, irmiss, p, n, maxp = maxp)
> 3: knnimp(x, k, maxmiss = rowmax, maxp = maxp)
> 4: impute.knn(dummy0, k)
>
> Possible actions:
> 1: abort (with core dump, if enabled)
> 2: normal R exit
> 3: exit R without saving workspace
> 4: exit R saving workspace
>
> ##################
>
> thanks for your help in advance!
>
> tina
>

From izahn at psych.rochester.edu Wed Mar 2 16:53:48 2011
From: izahn at psych.rochester.edu (Ista Zahn)
Date: Wed, 2 Mar 2011 10:53:48 -0500
Subject: [R] Plot with same font like in LaTeX
In-Reply-To:
References:
Message-ID:

Have a look at the tikzDevice package.
Best,
Ista

On Wed, Mar 2, 2011 at 6:48 AM, Jonas Stein wrote:
> Hi,
>
> i want to make my plots look uniform in LaTeX documents.
>
> - usage of the same font on axes and in legend like LaTeX uses
>   (for example "Computer Modern")
>
> - put real LaTeX formulas on the axes
>
> Have you any hints how i can achieve that?
> I had no luck two years ago, but i want to try it again now.
>
> kind regards,
>
> --
> Jonas Stein
>

--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org

From rex.dwyer at syngenta.com Wed Mar 2 16:53:46 2011
From: rex.dwyer at syngenta.com (rex.dwyer at syngenta.com)
Date: Wed, 2 Mar 2011 10:53:46 -0500
Subject: [R] inefficient ifelse() ?
In-Reply-To:
References: <77EB52C6DD32BA4D87471DCD70C8D70003F7474A@NA-PA-VBE03.na.tibco.com>
Message-ID: <36180405F8418449918AD20618D110FC095BF54E1B@USETCMSXMB02.NAFTA.SYNGENTA.ORG>

Hi Ivo,

It might be useful for you to study the examples below. The key from a programming language point of view is that functions like ifelse are functions of whole vectors, not elements of vectors. You either evaluate an argument or you don't; you don't evaluate only part of an argument. (Somebody correct me if I'm wrong.)

As you can see from the examples, if there are no TRUEs or no FALSEs in the condition, the corresponding arms are not evaluated, but if there are some of each, both must be evaluated. This is a property of the entire condition vector. You can see all this if you type ifelse (not ?ifelse, just ifelse) and look at the definition.
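
The whole-vector point can be sketched with a small helper that takes the two arms as functions and calls each one only on the elements that need it. The name ifelse_by_subset and its argument names are hypothetical; this is an illustration under those assumptions, not a replacement for ifelse():

```r
f <- function(t) { cat("f for", t, "\n"); 2 * t }
g <- function(t) { cat("g for", t, "\n"); 3 * t }

# Hypothetical helper: the arms are passed as functions, so each one is
# called only on the subset of x where it applies
ifelse_by_subset <- function(x, test, yes_fun, no_fun) {
  out <- rep(NA_real_, length(x))
  is_yes <- !is.na(test) & test    # elements where the condition is TRUE
  is_no  <- !is.na(test) & !test   # elements where the condition is FALSE
  if (any(is_yes)) out[is_yes] <- yes_fun(x[is_yes])
  if (any(is_no))  out[is_no]  <- no_fun(x[is_no])
  out                              # NA in the condition stays NA
}

t <- 1:30
s <- ifelse_by_subset(t, t %% 2 == 0, g, f)   # g sees only even t, f only odd t
s_na <- ifelse_by_subset(c(1, 2, NA), c(1, 2, NA) %% 2 == 0, g, f)
```

Passing functions instead of already-evaluated vectors is what makes this possible; it also means the helper only covers arms that can be computed element by element from x.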
If you want to operate on elements of vectors, you need to use subsetting, e.g.: s = rep(NA,length(t)); b=t%%2==0; s[b]=g(t[b]); s[!b]=f(t[!b]) I agree that it might be counterintuitive for a beginner, but so is 0!=0^0=1, and both follow from first principles. (e.g. n! = n(n-1)!) "Counterintuitive" is not the same as "incorrect", and "correct" is not the same as "efficient". :) HTH Rex > t = 1:30 > ifelse(t%%2==0,g(t),f(t)) g for 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 f for 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 [1] 2 6 6 12 10 18 14 24 18 30 22 36 26 42 30 48 34 54 38 60 42 66 46 72 50 [26] 78 54 84 58 90 > t = 2*(1:30) > ifelse(t%%2==0,g(t),f(t)) g for 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54 56 58 60 [1] 6 12 18 24 30 36 42 48 54 60 66 72 78 84 90 96 102 108 114 [20] 120 126 132 138 144 150 156 162 168 174 180 > t = 2*(1:30)+1 > ifelse(t%%2==0,g(t),f(t)) f for 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 55 57 59 61 [1] 6 10 14 18 22 26 30 34 38 42 46 50 54 58 62 66 70 74 78 [20] 82 86 90 94 98 102 106 110 114 118 122 > t = rep(c(1,2,NA),3) > ifelse(t%%2==0,g(t),f(t)) g for 1 2 NA 1 2 NA 1 2 NA f for 1 2 NA 1 2 NA 1 2 NA [1] 2 6 NA 2 6 NA 2 6 NA > t = rep(NA,10) > ifelse(t%%2==0,g(t),f(t)) [1] NA NA NA NA NA NA NA NA NA NA > t=1:30 > ifelse(c(TRUE,FALSE,FALSE,TRUE),g(t),f(t)) g for 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 f for 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 [1] 3 4 6 12 > -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of ivo welch Sent: Tuesday, March 01, 2011 5:20 PM To: William Dunlap Cc: r-help Subject: Re: [R] inefficient ifelse() ? yikes. you are asking me too much. thanks everybody for the information. I learned something new. 
my suggestion would be for the much smarter language designers (than I) to offer us more or less blissfully ignorant users another vector-related construct in R. It could perhaps be named %if% %else%, analogous to if else (with naming inspired by %in%, and with evaluation only of relevant parts [just as if else for scalars]), with different outcomes in some cases, but with the advantage of typically evaluating only half as many conditions as the ifelse() vector construct. %if% %else% may work only in a subset of cases, but when it does work, it would be nice to have. it would probably be my first "goto" function, with ifelse() use only as a fallback. of course, I now know how to fix my specific issue. I was just surprised that my first choice, ifelse(), was not as optimized as I had thought. best, /iaw On Tue, Mar 1, 2011 at 5:13 PM, William Dunlap wrote: > An ifelse-like function that only evaluated > what was needed would be fine, but it would > have to be different from ifelse itself. The > trick is to come up with a good parameterization. > > E.g., how would it deal with things like > ifelse(is.na(x), mean(x, na.rm=TRUE), x) > or > ifelse(x>1, log(x), runif(length(x),-1,0)) > or > ifelse(x>1, log(x), -seq_along(x)) > Would it reject such things? Deciding that the > x in mean(x,na.rm=TRUE) should be replaced by > x[is.na(x)] would be wrong. Deciding that > runif(length(x)) should be replaced by runif(sum(x>1)) > seems a bit much to expect. Replacing seq_along(x) with > seq_len(sum(x>1)) is wrong. It would be better to > parameterize the new function so it wouldn't have to > think about those cases. > > Would you want it to depend only on a logical > vector or perhaps also on a factor (a vectorized > switch/case function)? 
> > Bill Dunlap > Spotfire, TIBCO Software > wdunlap tibco.com > >> -----Original Message----- >> From: r-help-bounces at r-project.org >> [mailto:r-help-bounces at r-project.org] On Behalf Of ivo welch >> Sent: Tuesday, March 01, 2011 12:36 PM >> To: Henrique Dallazuanna >> Cc: r-help >> Subject: Re: [R] inefficient ifelse() ? >> >> thanks, Henrique. did you mean >> >> as.vector(t(mapply(function(x, f)f(x), split(t, ((t %% 2)==0)), >> list(f, g)))) ? >> >> otherwise, you get a matrix. >> >> its a good solution, but unfortunately I don't think this can be used >> to redefine ifelse(cond,ift,iff) in a way that is transparent. the >> ift and iff functions will always be evaluated before the function >> call happens, even with lazy evaluation. :-( >> >> I still think that it makes sense to have a smarter vectorized %if% in >> a vectorized language like R. just my 5 cents. >> >> /iaw >> >> ---- >> Ivo Welch (ivo.welch at brown.edu, ivo.welch at gmail.com) >> >> >> >> >> >> On Tue, Mar 1, 2011 at 2:33 PM, Henrique Dallazuanna >> wrote: >> > Try this: >> > >> > mapply(function(x, f)f(x), split(t, t %% 2), list(g, f)) >> > >> > On Tue, Mar 1, 2011 at 4:19 PM, ivo welch wrote: >> >> >> >> dear R experts--- >> >> >> >> t <- 1:30 >> >> f <- function(t) { cat("f for", t, "\n"); return(2*t) } >> >> g <- function(t) { cat("g for", t, "\n"); return(3*t) } >> >> s <- ifelse( t%%2==0, g(t), f(t)) >> >> >> >> shows that the ifelse function actually evaluates both f() >> and g() for >> >> all values first, and presumably then just picks left or >> right results >> >> based on t%%2. uggh... wouldn't it make more sense to >> evaluate only >> >> the relevant parts of each vector and then reassemble them? 
>> >> >> >> /iaw >> >> ---- >> >> Ivo Welch >> >> >> >> ______________________________________________ >> >> R-help at r-project.org mailing list >> >> https://stat.ethz.ch/mailman/listinfo/r-help >> >> PLEASE do read the posting guide >> >> http://www.R-project.org/posting-guide.html >> >> and provide commented, minimal, self-contained, reproducible code. >> > >> > >> > >> > -- >> > Henrique Dallazuanna >> > Curitiba-Paran?-Brasil >> > 25? 25' 40" S 49? 16' 22" O >> > >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. message may contain confidential information. If you are not the designated recipient, please notify the sender immediately, and delete the original and any copies. Any use of the message by you is prohibited. From ligges at statistik.tu-dortmund.de Wed Mar 2 16:54:42 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Wed, 02 Mar 2011 16:54:42 +0100 Subject: [R] message: please select CRAN mirror In-Reply-To: <1299080842948-3331711.post@n4.nabble.com> References: <1299080842948-3331711.post@n4.nabble.com> Message-ID: <4D6E6842.60709@statistik.tu-dortmund.de> On 02.03.2011 16:47, Aggita wrote: >> chooseCRANmirror() > Error in m[, 1L] : incorrect number of dimensions > > Can someone explain me why I can't choose the cran mirror, but get again and > again this error message. Have searched for this on several engines but > can't find explanation. > > Thanks a lot in advance! 
> > -- > View this message in context: http://r.789695.n4.nabble.com/message-please-select-CRAN-mirror-tp3331711p3331711.html > Sent from the R help mailing list archive at Nabble.com. > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. Yes, PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Which R version? Which OS? Have you tried with a recent version of R? Uwe Ligges From aikidasgupta at gmail.com Wed Mar 2 16:56:54 2011 From: aikidasgupta at gmail.com (Abhijit Dasgupta) Date: Wed, 2 Mar 2011 10:56:54 -0500 Subject: [R] Plot with same font like in LaTeX In-Reply-To: References: Message-ID: <4D6E68C6.7000305@araastat.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From cbeleites at units.it Wed Mar 2 16:58:38 2011 From: cbeleites at units.it (Claudia Beleites) Date: Wed, 2 Mar 2011 16:58:38 +0100 Subject: [R] Plot with same font like in LaTeX In-Reply-To: References: Message-ID: <4D6E692E.50409@units.it> Jonas, have a look at tikzdevice Claudia -- Claudia Beleites Dipartimento dei Materiali e delle Risorse Naturali Universit? 
degli Studi di Trieste Via Alfonso Valerio 6/a I-34127 Trieste phone: +39 0 40 5 58-37 68 email: cbeleites at units.it From sarah.goslee at gmail.com Wed Mar 2 17:05:13 2011 From: sarah.goslee at gmail.com (Sarah Goslee) Date: Wed, 2 Mar 2011 11:05:13 -0500 Subject: [R] transform table to matrix In-Reply-To: References: Message-ID: If I understand you correctly, the easiest thing to do is import the data without converting the strings to factors (the default behavior) using: mydata <- read.table("mydata.csv", as.is=TRUE) If that isn't actually your problem, the output of str(mydata) would be helpful, as would an actual example of what you are trying to do. Sarah On Wed, Mar 2, 2011 at 8:43 AM, SK MAIDUL HAQUE wrote: > ?I have a text file that I have imported into R. It contains 3 columns and > 316940 rows. The first column is vegetation plot ID, the second species > names and the third is a cover value (numeric). I imported using the > read.table function. > > My problem is this. I need to reformat the information as a matrix, with the > first column becoming the row labels and the second the column labels and > the cover values as the matrix cell data. However, since the > read.tablefunction imported the data as an indexed data frame, I can't use > the columns > as vectors. Is there a way around this, to convert the data frame as 3 > separate vectors? I have been looking all over for a function, and my > programming skills are not great. 
> > > -- > Sk Maidul Haque > Scientific Officer-C > Applied Spectroscopy Division > Bhabha Atomic Research Centre, Vizag > > Mo: 09666429050/09093458503 > -- Sarah Goslee http://www.functionaldiversity.org From peter.langfelder at gmail.com Wed Mar 2 17:07:55 2011 From: peter.langfelder at gmail.com (Peter Langfelder) Date: Wed, 2 Mar 2011 08:07:55 -0800 Subject: [R] *** caught segfault *** when using impute.knn (impute package) In-Reply-To: <4D6E54DD.1050600@medisin.uio.no> References: <4D6E54DD.1050600@medisin.uio.no> Message-ID: On Wed, Mar 2, 2011 at 6:31 AM, Bettina Kulle Andreassen wrote: > hi, > > i am getting an error when calling the impute.knn > function (see the screenshot below). > what is the problem here and how can it be solved? > > > screenshot: > > ################## > ?*** caught segfault *** > address 0x513c7b84, cause 'memory not mapped' > > Traceback: > ?1: .Fortran("knnimp", x, ximp = x, p, n, imiss = imiss, irmiss, > as.integer(k), double(p), double(n), integer(p), integer(n), ? ? PACKAGE = > "impute") > ?2: knnimp.internal(x, k, imiss, irmiss, p, n, maxp = maxp) > ?3: knnimp(x, k, maxmiss = rowmax, maxp = maxp) > ?4: impute.knn(dummy0, k) > > Possible actions: > 1: abort (with core dump, if enabled) > 2: normal R exit > 3: exit R without saving workspace > 4: exit R saving workspace > > ################## > > thanks for your help in advance! > I've been seeing the same problem for some time. It tends to happen when one of the clusters the function splits the data into has the same size as k. Make sure k is smaller than your data size, too. Try moving k a little bit, for example set k=9 or k=11 (the default is 10) and see if the crash goes away. I am CCing the package maintainer. 
HTH, Peter > Bettina Kulle Andreassen > > University of Oslo > > Department of Biostatistics > > and > > Institute for Epi-Gen (Faculty Division Ahus) > > tel: > +47 22851193 > +47 67963923 > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From marc_schwartz at me.com Wed Mar 2 17:08:54 2011 From: marc_schwartz at me.com (Marc Schwartz) Date: Wed, 02 Mar 2011 10:08:54 -0600 Subject: [R] R and Android In-Reply-To: References: Message-ID: If you search the list archives (using keywords such as iPhone or iPad), you will see extensive discussions on this point. There is/was an option to install a full R application on so-called "jail broken" Apple mobile units **only**. Otherwise, it's client/server. HTH, Marc Schwartz On Mar 2, 2011, at 8:04 AM, Ben Ward wrote: > Is there really one for the iphone? As far as I was aware, apple had beef about their policy agreements and the fact such software is open source/free as in freedom. > I actually expected the situation would be the other way round: console for android but none for iphone? > > Ben W. > > On 02/03/2011 11:44, Dr. Alireza Zolfaghari wrote: >> Hi List, >> Is anybody aware of any R console available for Android mobile? I know that >> there is one for Iphone. >> >> thanks, >> Alireza >> From jfox at mcmaster.ca Wed Mar 2 17:09:19 2011 From: jfox at mcmaster.ca (John Fox) Date: Wed, 2 Mar 2011 11:09:19 -0500 Subject: [R] Rcommander In-Reply-To: <3063944CD9914F22B4C761ADB4719DD4@gmail.com> References: <3063944CD9914F22B4C761ADB4719DD4@gmail.com> Message-ID: <014701cbd8f4$300d02d0$90270870$@mcmaster.ca> Dear Scott, I assume that Selda has already done this, but if not, the Rcmdr would still start and then offer to install its dependencies. 
Best, John -------------------------------- John Fox Senator William McMaster Professor of Social Statistics Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] > On Behalf Of Scott Chamberlain > Sent: March-02-11 8:02 AM > To: Selda Korkmaz > Cc: r-help at r-project.org > Subject: Re: [R] Rcommander > > install.packages("Rcmdr", dependencies=TRUE) > library(Rcmdr) > > > Scott > On Wednesday, March 2, 2011 at 2:41 AM, Selda Korkmaz wrote: > > Dear Sirs, > > > > i just downloaded the R programm on my Macbook, but I canB4t open > Rcmdr, although I installed the needed Rcmdr-packages. I would be very > happy, if you could help me. Telephone: +49 151 10868600 (Germany) or e- > mail > > > > Yous sincerely, > > > > > > Selda Korkmaz > > > > sk at seldakorkmaz.com > > www.seldakorkmaz.com > > > > > > > > > > > > > > > > > > [[alternative HTML version deleted]] > > ______________________________________________ > > R-help at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > > [[alternative HTML version deleted]] From eriki at ccbr.umn.edu Wed Mar 2 17:10:45 2011 From: eriki at ccbr.umn.edu (Erik Iverson) Date: Wed, 2 Mar 2011 10:10:45 -0600 Subject: [R] Plot with same font like in LaTeX In-Reply-To: References: Message-ID: <4D6E6C05.4040601@ccbr.umn.edu> Jonas, Try looking at the tikzDevice package, and/or the pgfSweave package. --Erik Jonas Stein wrote: > Hi, > > i want to make my plots look uniform in LaTeX documents. > > - usage of the same font on axes and in legend like LaTeX uses > (for example "Computer Modern") > > - put real LaTeX formulas on the axes > > Have you any hints how i can achieve that? 
> I had no luck two years ago, but i want to try it again now.
>
> kind regards,
>

From scttchamberlain4 at gmail.com Wed Mar 2 17:17:00 2011
From: scttchamberlain4 at gmail.com (Scott Chamberlain)
Date: Wed, 2 Mar 2011 10:17:00 -0600
Subject: [R] transform table to matrix
In-Reply-To:
References:
Message-ID: <11AF2783CA6F4B2F98FF604196B1CB6C@gmail.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From jaomatos at gmail.com Wed Mar 2 17:19:19 2011
From: jaomatos at gmail.com (=?iso-8859-1?q?Jos=E9_Matos?=)
Date: Wed, 2 Mar 2011 16:19:19 +0000
Subject: [R] does rpy support R 2.12.2
In-Reply-To:
References:
Message-ID: <201103021619.19633.jaomatos@gmail.com>

On Tuesday 01 March 2011 23:36:20 Pete Shepard wrote:
> Hi,
>
> I am getting the following error when I try to run import rpy from the the
> python IDE:
>
> Traceback (most recent call last):
> File "", line 1, in
> File "/usr/lib/python2.6/dist-packages/rpy.py", line 134, in
> """ % RVERSION)
> RuntimeError: No module named _rpy2122
>
> RPy module can not be imported. Please check if your rpy
> installation supports R 2.12.2. If you have multiple R versions
> installed, you may need to set RHOME before importing rpy. For
>
> example:
> >>> from rpy_options import set_options
> >>> set_options(RHOME='c:/progra~1/r/rw2011/')
> >>> from rpy import *
>
> I am wondering if rpy supports R 2.12.2?
>
> Thanks

Yes it does but you need to recompile it for the new R version.

FWIW your example above is strange in the sense that you are using a
linux version and passing it a windows path... regardless it does not
work because the installed rpy was compiled against a previous version
of R.

I hope it helps,
--
José Abílio

From izahn at psych.rochester.edu Wed Mar 2 17:21:08 2011
From: izahn at psych.rochester.edu (Ista Zahn)
Date: Wed, 2 Mar 2011 11:21:08 -0500
Subject: [R] transform table to matrix
In-Reply-To:
References:
Message-ID:

Hi Sk,

On Wed, Mar 2, 2011 at 8:43 AM, SK MAIDUL HAQUE wrote:
> I have a text file that I have imported into R. It contains 3 columns and
> 316940 rows. The first column is vegetation plot ID, the second species
> names and the third is a cover value (numeric). I imported using the
> read.table function.
>
> My problem is this. I need to reformat the information as a matrix, with the
> first column becoming the row labels and the second the column labels and
> the cover values as the matrix cell data.

dat.m <- as.matrix(dat)
rownames(dat.m) <- dat[, 1]
dat.m <- dat.m[, -1]

> However, since the
> read.table function imported the data as an indexed data frame, I can't use
> the columns
> as vectors.

I'm not sure why you can't access the columns in a data.frame. Of
course you can -- see ?"[" or ?subset

> Is there a way around this, to convert the data frame as 3
> separate vectors?

?"["

Best,
Ista

> I have been looking all over for a function, and my
> programming skills are not great.
>
>
> --
> Sk Maidul Haque
> Scientific Officer-C
> Applied Spectroscopy Division
> Bhabha Atomic Research Centre, Vizag
>
> Mo: 09666429050/09093458503
>
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
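For the reshaping itself (plot IDs as row labels, species as column labels, cover as the cell values), one possible approach in base R is xtabs(). This is only a minimal sketch: the column names plot, species and cover and the toy data are assumptions made for illustration, since the original post does not show the file.

```r
# Toy stand-in for the imported file. The column names plot, species and
# cover are assumptions for illustration, not taken from the original post.
dat <- data.frame(
  plot    = c("P1", "P1", "P2", "P3", "P3"),
  species = c("Acer", "Betula", "Acer", "Betula", "Carex"),
  cover   = c(10, 5, 20, 15, 2),
  stringsAsFactors = FALSE
)

# xtabs() sums cover within each plot/species cell and fills
# combinations that do not occur with 0.
tab <- xtabs(cover ~ plot + species, data = dat)

# Drop the table classes and the stored call to get a plain numeric
# matrix with row and column names.
m <- unclass(tab)
attr(m, "call") <- NULL
m
```

If a plot/species pair occurs more than once, the cover values are added together; with many plots and species the resulting dense matrix can be large.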
>

--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org

From izahn at psych.rochester.edu Wed Mar 2 17:23:18 2011
From: izahn at psych.rochester.edu (Ista Zahn)
Date: Wed, 2 Mar 2011 11:23:18 -0500
Subject: [R] Contingency table in R
In-Reply-To:
References:
Message-ID:

Hi Laura,

On Wed, Mar 2, 2011 at 9:13 AM, Laura Clasemann wrote:
>
> Hi,
>
> I have a table in R with data I needed and need to create a contingency table out of it. The table I have so far looks like this:
>
>
>                    Binger
> r
> DietType     No Yes
>  Dangerous   15  12
>  Healthy     52   9
>  None       134  24
>  Unhealthy   72  23
>
> These are the error messages that I keep getting whenever I try to get a contingency table. I'm not sure why it won't work for me, any help would be appreciated!
>> nametable<-table(excat,recat)
> Error in table(excat, recat) : object 'excat' not found

That error seems pretty clear. The table function can't find the excat
data. Is it in a data.frame or a list? Perhaps ?with will point you in
the right direction.

Best,
Ista

>
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org

From izahn at psych.rochester.edu Wed Mar 2 17:26:57 2011
From: izahn at psych.rochester.edu (Ista Zahn)
Date: Wed, 2 Mar 2011 11:26:57 -0500
Subject: [R] Vector manipulations
In-Reply-To:
References:
Message-ID:

Hi Benjamin,
There may be faster ways, but

v <- 1:100
x <- 10
n <- which(cumsum(v) == x)
w <- v[1:n]

seems pretty straightforward.
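
When no prefix of v sums to exactly x, which(cumsum(v) == x) returns integer(0) and v[1:n] fails. A minimal sketch of one way to guard against that case follows; prefix_with_sum is a hypothetical helper name, and returning NULL for "no such prefix" is an assumption about what the caller wants.

```r
# Return the shortest prefix of v whose sum is exactly x, or NULL if no
# prefix sums to x. prefix_with_sum is an illustrative name only.
# Note: the comparison is exact, so non-integer data may need a tolerance.
prefix_with_sum <- function(v, x) {
  n <- match(x, cumsum(v))   # index of the first prefix sum equal to x, else NA
  if (is.na(n)) return(NULL)
  v[seq_len(n)]
}

prefix_with_sum(1:100, 10)   # the first four elements: 1 2 3 4
prefix_with_sum(1:100, 11)   # NULL, since no prefix of 1:100 sums to 11
```

match() picks the first matching position, so vectors containing zeros (where several prefixes share the same sum) yield the shortest prefix.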
Best, Ista On Wed, Mar 2, 2011 at 10:42 AM, Benjamin Hartley wrote: > I have a question regarding the most efficient way to select a substring of > a vector: > > I have a vector of value v, and I want to select a subspace of this vector > called w such that: > > w=v[1:n] > > where > > sum(w) = x > > I am interested in what you thing would be the most efficient way to do this > - I would like to avoid slowing down my simulations as much as possible. > > Thank you very much for any help that anyone is able to give. > > ? ? ? ?[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Ista Zahn Graduate student University of Rochester Department of Clinical and Social Psychology http://yourpsyche.org From jdaily at usgs.gov Wed Mar 2 17:33:04 2011 From: jdaily at usgs.gov (Jonathan P Daily) Date: Wed, 2 Mar 2011 11:33:04 -0500 Subject: [R] Vector manipulations In-Reply-To: References: Message-ID: Is this what you want? I don't know what your v looks like, but this won't work if there are cases in which v won't sum to exactly x. x <- 20 v <- sample(0:1, 100, T) w <- v[1:which(cumsum(v)==x)] -------------------------------------- Jonathan P. Daily Technician - USGS Leetown Science Center 11649 Leetown Road Kearneysville WV, 25430 (304) 724-4480 "Is the room still a room when its empty? Does the room, the thing itself have purpose? Or do we, what's the word... imbue it." 
- Jubal Early, Firefly r-help-bounces at r-project.org wrote on 03/02/2011 10:42:12 AM: > [image removed] > > [R] Vector manipulations > > Benjamin Hartley > > to: > > r-help > > 03/02/2011 11:08 AM > > Sent by: > > r-help-bounces at r-project.org > > I have a question regarding the most efficient way to select a substring of > a vector: > > I have a vector of value v, and I want to select a subspace of this vector > called w such that: > > w=v[1:n] > > where > > sum(w) = x > > I am interested in what you thing would be the most efficient way to do this > - I would like to avoid slowing down my simulations as much as possible. > > Thank you very much for any help that anyone is able to give. > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From david.croll at gmx.ch Wed Mar 2 17:55:04 2011 From: david.croll at gmx.ch (David Croll) Date: Wed, 02 Mar 2011 17:55:04 +0100 Subject: [R] power regression: which package? Message-ID: <4D6E7668.8050301@gmx.ch> Dear R users and R friends, I have a little problem... I don't know anymore which package to use if I want to perform a power regression analysis. To be clear, I want to fit a regression model like this: fit <- ....(y ~ a * x ^ b + c) where a, b and c are coefficients of the model. The R Site does not have the answer I want... Thanks in advance and with kind regards, David From john.seers at googlemail.com Wed Mar 2 17:57:34 2011 From: john.seers at googlemail.com (John Seers) Date: Wed, 2 Mar 2011 16:57:34 +0000 Subject: [R] RWinEdt difficulties In-Reply-To: References: <4D6E5B80.6050502@statistik.tu-dortmund.de> Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available
URL:

From kjetilbrinchmannhalvorsen at gmail.com Wed Mar 2 18:04:38 2011
From: kjetilbrinchmannhalvorsen at gmail.com (Kjetil Halvorsen)
Date: Wed, 2 Mar 2011 14:04:38 -0300
Subject: [R] power regression: which package?
In-Reply-To: <4D6E7668.8050301@gmx.ch>
References: <4D6E7668.8050301@gmx.ch>
Message-ID:

?nls
install.packages("nls2",dep=T)
library(nls2)
?nls2
install.packages("nlstools")
library(help=nlstools)
install.packages("NISTnls", dep=T)
library(help=NISTnls)

the last one gives access to many examples.

On Wed, Mar 2, 2011 at 1:55 PM, David Croll wrote:
>
>
> Dear R users and R friends,
>
>
> I have a little problem... I don't know anymore which package to use if
> I want to perform a power regression analysis.
>
>
> To be clear, I want to fit a regression model like this:
>
> fit <- ....(y ~ a * x ^ b + c)
>
> where a, b and c are coefficients of the model.
>
>
> The R Site does not have the answer I want...
>
>
> Thanks in advance and with kind regards,
>
>
> David
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

From john.seers at googlemail.com Wed Mar 2 18:06:44 2011
From: john.seers at googlemail.com (John Seers)
Date: Wed, 2 Mar 2011 17:06:44 +0000
Subject: [R] RWinEdt difficulties
In-Reply-To: <4D6E5B80.6050502@statistik.tu-dortmund.de>
References: <4D6E5B80.6050502@statistik.tu-dortmund.de>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available URL: From ligges at statistik.tu-dortmund.de Wed Mar 2 18:10:16 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Wed, 02 Mar 2011 18:10:16 +0100 Subject: [R] RWinEdt difficulties In-Reply-To: References: <4D6E5B80.6050502@statistik.tu-dortmund.de> Message-ID: <4D6E79F8.7060200@statistik.tu-dortmund.de> On 02.03.2011 18:06, John Seers wrote: > ********************************************************************************************************** > ********************************************************************************************************** > > > 2011/3/2 Uwe Ligges > >> >> >> On 01.03.2011 11:01, John Seers wrote: >> >>> Hello Everyone >>> >>> I have just upgraded my PC to Windows 7 (64 bit) and I have installed R >>> 2.12.2. R seems to be working fine. >>> >>> I am having problems getting RWinEdt working with it though. >>> >>> I have tried installing WinEdt 6.0 and WinEdt 5.5. But both fail with the >>> same error using R as 64 bit or 32 bit. I install the package using >>> Administrator rights. >>> >>> >>> library(RWinEdt) >>>> >>> Warning message: >>> In shell(paste("\"\"", .gW$InstallRoot, "\\WinEdt.exe\" -C=\"R-WinEdt\" >>> -E=", : >>> '""C:\Program Files (x86)\WinEdt Team\WinEdt 6\WinEdt.exe" -C="R-WinEdt" >>> -E="C:\Program Files (x86)\WinEdt Team\WinEdt 6\R.ini""' execution failed >>> with error code 1 >>> >>>> >>>> >> One installing RWinEdt the first time, please run R with Administrator >> privileges (right click to do so). Then installation should work smoothly >> with WinEdt< 6.0. >> >> >> Does it matter if you are using 64 bit R and the 32 bit WinEdt? (I have >>> tried 32 bit R). >>> >>> Does RWinEdt work with WinEdt 6.0? >>> >> >> >> No, not yet, unfortunately. But some free time is scheduled for this in >> April. >> >> Uwe Ligges >> >> >> >> Can anybody suggest a solution? >>> >> >> >> >> >> Thanks for any help. 
>>>
>>> Regards
>>>
>>> John Seers
>>>
>>>
>>>
>>> **********************************************************************************************************
>>>
>>> **********************************************************************************************************
>>>
>>>
>
>
> Hello Uwe
>
> Thank you for your reply.
>
>> One installing RWinEdt the first time, please run R with Administrator
> privileges (right click to do so). Then installation should work smoothly
> with WinEdt < 6.0.
>
> Hmmm. I think I did the first time. But I have tried again.
>
> Removed WinEdt 6.0 and installed 5.5. Uninstalled R and reinstalled only 64
> bit version this time. Ensured all traces of RWinEdt were removed. Started R
> with admin privileges. Installed RWinEdt. Loaded RWinEdt. Same problem.
>
>> library(RWinEdt)
> Warning message:
> In shell(paste("\"\"", .gW$InstallRoot, "\\WinEdt.exe\" -C=\"R-WinEdt\"
> -E=", :
> '""C:\Program Files (x86)\WinEdt Team\WinEdt\WinEdt.exe" -C="R-WinEdt"
> -E="C:\Program Files (x86)\WinEdt Team\WinEdt\R.ini""' execution failed with
> error code 1
>>
>
> Any other suggestions?

Well, use the manual setup as indicated in the readme contained in the
package. No idea what went wrong in this case.

Uwe

> Regards
>
> John Seers
>
>
>
>
>
>
>
>
>
>
>> [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
From john.seers at googlemail.com Wed Mar 2 18:14:20 2011 From: john.seers at googlemail.com (John Seers) Date: Wed, 2 Mar 2011 17:14:20 +0000 Subject: [R] RWinEdt difficulties In-Reply-To: <4D6E79F8.7060200@statistik.tu-dortmund.de> References: <4D6E5B80.6050502@statistik.tu-dortmund.de> <4D6E79F8.7060200@statistik.tu-dortmund.de> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From gpetris at uark.edu Wed Mar 2 18:30:30 2011 From: gpetris at uark.edu (Giovanni Petris) Date: Wed, 02 Mar 2011 11:30:30 -0600 Subject: [R] Line numbering in Sweave Message-ID: <1299087030.1779.2.camel@definetti> Is there a way of getting line numbers in Schunks? Ideally, I would like to have numbers printed every two or five lines. Thank you in advance, Giovanni -- Giovanni Petris Associate Professor Department of Mathematical Sciences University of Arkansas - Fayetteville, AR 72701 Ph: (479) 575-6324, 575-8630 (fax) http://definetti.uark.edu/~gpetris/ From ted.rosenbaum at yale.edu Wed Mar 2 18:26:45 2011 From: ted.rosenbaum at yale.edu (Ted Rosenbaum) Date: Wed, 2 Mar 2011 12:26:45 -0500 Subject: [R] merge in data.tables -- "non-visible" In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From m.lathouri06 at imperial.ac.uk Wed Mar 2 18:51:40 2011 From: m.lathouri06 at imperial.ac.uk (Lathouri, Maria) Date: Wed, 2 Mar 2011 17:51:40 +0000 Subject: [R] please help with interaction.plot Message-ID: <43C6B76B98F4E14E9A0AF286AAB5C38C3F94F781@ICEXM1.ic.ac.uk> An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From vokey at uleth.ca Wed Mar 2 19:05:01 2011 From: vokey at uleth.ca (Vokey, John) Date: Wed, 2 Mar 2011 11:05:01 -0700 Subject: [R] bootstrap resampling - simplified In-Reply-To: References: Message-ID: On 2011-03-02, at 4:00 AM, r-help-request at r-project.org wrote: > Hello there, > > I have a problem concerning bootstrapping in R - especially focusing on the resampling part of it. I try to sum it up in a simplified way so that I would not confuse anybody. > > I have a small database consisting of 20 observations (basically numbers from 1 to 20, I mean: 1, 2, 3, 4, 5, ... 18, 19, 20). > > I would like to resample this database many times for the bootstrap process with the following conditions. Firstly, every resampled database should also include 20 observations. Secondly, when selecting a number from the above-mentioned 20 numbers, you can do this selection with replacement. The difficult part comes now: one number can be selected only maximum 5 times. In order to make this clear I show you a couple of examples. So the resampled databases might be like the following ones: > > (1st database) 1,2,1,2,1,2,1,2,1,2,3,3,3,3,3,4,4,4,4,4 > 4 different numbers are chosen (1, 2, 3, 4), each selected - for the maximum possible - 5 times. > > (2nd database) 1,8,8,6,8,8,8,2,3,4,5,6,6,6,6,7,19,1,1,1 > Two numbers - 8 and 6 - selected 5 times (the maximum possible times), number 1 selected 4 times, the others selected less than 4 times. > > (3rd database) 1,1,2,2,3,3,4,4,9,9,9,10,10,13,10,9,3,9,2,1 > Number 9 chosen for the maximum possible 5 times, number 10, 3, 2, 1 chosen for 3 times, number 4 selected twice and number 13 selected only once. > > ... > > Anybody knows how to implement my "tricky" condition into one of the R functions - that one number can be selected only 5 times at most? Are 'boot' and 'bootstrap' packages capable of managing this? I guess they are, I just couldn't figure it out yet... > > Thanks very much! 
Best regards, > Laszlo Bodnar Laszlo, Create a vector consisting of 5 of each number. Then, for each sample, scramble the order of the items in the vector, and select the first 20. -- Please avoid sending me Word or PowerPoint attachments. See -Dr. John R. Vokey From jdaily at usgs.gov Wed Mar 2 19:32:18 2011 From: jdaily at usgs.gov (Jonathan P Daily) Date: Wed, 2 Mar 2011 13:32:18 -0500 Subject: [R] bootstrap resampling - simplified In-Reply-To: References: Message-ID: I will point out again that sampling a five-fold replicate of 1:20 is not the same as resampling with replacement, although I made an error in reporting probabilities - the P(x2 = 1 | x1 = 1) = 4/99 and not 4/100. When sampling with replacement, P(x2 = 1 | x1 = 1) = P(x2 = 1 | x1 != 1) = 1/20. -------------------------------------- Jonathan P. Daily Technician - USGS Leetown Science Center 11649 Leetown Road Kearneysville WV, 25430 (304) 724-4480 "Is the room still a room when its empty? Does the room, the thing itself have purpose? Or do we, what's the word... imbue it." - Jubal Early, Firefly r-help-bounces at r-project.org wrote on 03/02/2011 01:05:01 PM: > [image removed] > > Re: [R] bootstrap resampling - simplified > > Vokey, John > > to: > > r-help > > 03/02/2011 01:07 PM > > Sent by: > > r-help-bounces at r-project.org > > On 2011-03-02, at 4:00 AM, r-help-request at r-project.org wrote: > > > Hello there, > > > > I have a problem concerning bootstrapping in R - especially > focusing on the resampling part of it. I try to sum it up in a > simplified way so that I would not confuse anybody. > > > > I have a small database consisting of 20 observations (basically > numbers from 1 to 20, I mean: 1, 2, 3, 4, 5, ... 18, 19, 20). > > > > I would like to resample this database many times for the > bootstrap process with the following conditions. Firstly, every > resampled database should also include 20 observations. 
Secondly, > when selecting a number from the above-mentioned 20 numbers, you can > do this selection with replacement. The difficult part comes now: > one number can be selected only maximum 5 times. In order to make > this clear I show you a couple of examples. So the resampled > databases might be like the following ones: > > > > (1st database) 1,2,1,2,1,2,1,2,1,2,3,3,3,3,3,4,4,4,4,4 > > 4 different numbers are chosen (1, 2, 3, 4), each selected - for > the maximum possible - 5 times. > > > > (2nd database) 1,8,8,6,8,8,8,2,3,4,5,6,6,6,6,7,19,1,1,1 > > Two numbers - 8 and 6 - selected 5 times (the maximum possible > times), number 1 selected 4 times, the others selected less than 4 times. > > > > (3rd database) 1,1,2,2,3,3,4,4,9,9,9,10,10,13,10,9,3,9,2,1 > > Number 9 chosen for the maximum possible 5 times, number 10, 3, 2, > 1 chosen for 3 times, number 4 selected twice and number 13 selectedonly once. > > > > ... > > > > Anybody knows how to implement my "tricky" condition into one of > the R functions - that one number can be selected only 5 times at > most? Are 'boot' and 'bootstrap' packages capable of managing this? > I guess they are, I just couldn't figure it out yet... > > > > Thanks very much! Best regards, > > Laszlo Bodnar > > Laszlo, > Create a vector consisting of 5 of each number. Then, for each > sample, scramble the order of the items in the vector, and select > the first 20. > > > -- > Please avoid sending me Word or PowerPoint attachments. > See > > -Dr. John R. Vokey > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
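
The distinction drawn above can be made concrete. Shuffling a five-fold replicate of 1:20 and keeping the first 20 values is sampling without replacement from 100 items, so successive draws are dependent. A different reading of the original question is an ordinary bootstrap draw that is simply thrown away and redrawn whenever some value appears more than 5 times. The sketch below implements that second reading by rejection; capped_resample is a hypothetical helper name, and whether this conditional distribution is what the original poster needs is an assumption.

```r
# Draw n values from x with replacement, redrawing the whole sample
# whenever any value appears more than max_rep times.
# capped_resample is an illustrative name, not a function from boot/bootstrap.
capped_resample <- function(x, n = length(x), max_rep = 5) {
  repeat {
    s <- sample(x, n, replace = TRUE)
    if (max(table(s)) <= max_rep) return(s)
  }
}

set.seed(1)
s <- capped_resample(1:20)
table(s)   # frequency of each selected value; none exceeds 5
```

For 20 draws from 20 values, a count above 5 should be uncommon, so few draws are expected to be rejected; with a tighter cap the loop could take much longer.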
From gpetris at uark.edu Wed Mar 2 19:33:31 2011 From: gpetris at uark.edu (Giovanni Petris) Date: Wed, 02 Mar 2011 12:33:31 -0600 Subject: [R] bootstrap resampling - simplified In-Reply-To: References: Message-ID: <1299090811.1779.4.camel@definetti> But this seems to me to be equivalent to sample(rep(1:20, 5), 20), which I previously suggested and was pointed out to be wrong.... Giovanni On Wed, 2011-03-02 at 11:05 -0700, Vokey, John wrote: > On 2011-03-02, at 4:00 AM, r-help-request at r-project.org wrote: > > > Hello there, > > > > I have a problem concerning bootstrapping in R - especially focusing on the resampling part of it. I try to sum it up in a simplified way so that I would not confuse anybody. > > > > I have a small database consisting of 20 observations (basically numbers from 1 to 20, I mean: 1, 2, 3, 4, 5, ... 18, 19, 20). > > > > I would like to resample this database many times for the bootstrap process with the following conditions. Firstly, every resampled database should also include 20 observations. Secondly, when selecting a number from the above-mentioned 20 numbers, you can do this selection with replacement. The difficult part comes now: one number can be selected only maximum 5 times. In order to make this clear I show you a couple of examples. So the resampled databases might be like the following ones: > > > > (1st database) 1,2,1,2,1,2,1,2,1,2,3,3,3,3,3,4,4,4,4,4 > > 4 different numbers are chosen (1, 2, 3, 4), each selected - for the maximum possible - 5 times. > > > > (2nd database) 1,8,8,6,8,8,8,2,3,4,5,6,6,6,6,7,19,1,1,1 > > Two numbers - 8 and 6 - selected 5 times (the maximum possible times), number 1 selected 4 times, the others selected less than 4 times. > > > > (3rd database) 1,1,2,2,3,3,4,4,9,9,9,10,10,13,10,9,3,9,2,1 > > Number 9 chosen for the maximum possible 5 times, number 10, 3, 2, 1 chosen for 3 times, number 4 selected twice and number 13 selected only once. > > > > ... 
> > > > Anybody knows how to implement my "tricky" condition into one of the R functions - that one number can be selected only 5 times at most? Are 'boot' and 'bootstrap' packages capable of managing this? I guess they are, I just couldn't figure it out yet... > > > > Thanks very much! Best regards, > > Laszlo Bodnar > > Laszlo, > Create a vector consisting of 5 of each number. Then, for each sample, scramble the order of the items in the vector, and select the first 20. > > > -- > Please avoid sending me Word or PowerPoint attachments. > See > > -Dr. John R. Vokey > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From izahn at psych.rochester.edu Wed Mar 2 19:48:28 2011 From: izahn at psych.rochester.edu (Ista Zahn) Date: Wed, 2 Mar 2011 13:48:28 -0500 Subject: [R] Line numbering in Sweave In-Reply-To: <1299087030.1779.2.camel@definetti> References: <1299087030.1779.2.camel@definetti> Message-ID: SweaveListingUtils might do it. http://cran.r-project.org/web/packages/SweaveListingUtils/index.html Best, Ista On Wed, Mar 2, 2011 at 12:30 PM, Giovanni Petris wrote: > Is there a way of getting line numbers in Schunks? Ideally, I would like > to have numbers printed every two or five lines. > > Thank you in advance, > Giovanni > > > > -- > > Giovanni Petris ? > Associate Professor > Department of Mathematical Sciences > University of Arkansas - Fayetteville, AR 72701 > Ph: (479) 575-6324, 575-8630 (fax) > http://definetti.uark.edu/~gpetris/ > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
>

--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org

From bbolker at gmail.com Wed Mar 2 20:05:05 2011
From: bbolker at gmail.com (Ben Bolker)
Date: Wed, 2 Mar 2011 19:05:05 +0000
Subject: [R] message: please select CRAN mirror
References: <1299080842948-3331711.post@n4.nabble.com>
Message-ID:

Aggita eurotransplant.org> writes:

>
> > chooseCRANmirror()
> Error in m[, 1L] : incorrect number of dimensions
>
> Can someone explain me why I can't choose the cran mirror, but get again and
> again this error message. Have searched for this on several engines but
> can't find explanation.
>

It's hard for us to diagnose this if we can't reproduce it. I will take
a shot though. The chooseCRANmirror function looks like this:

function (graphics = getOption("menu.graphics"))
{
    if (!interactive())
        stop("cannot choose a CRAN mirror non-interactively")
    m <- getCRANmirrors(all = FALSE, local.only = FALSE)
    res <- menu(m[, 1L], graphics, "CRAN mirror")
    if (res > 0L) {
        URL <- m[res, "URL"]
        repos <- getOption("repos")
        repos["CRAN"] <- gsub("/$", "", URL[1L])
        options(repos = repos)
    }
    invisible()
}

Looking in the guts of the function, it is clear that it is failing
when it looks at the list of mirrors that it has gotten -- this list
of mirrors has somehow turned into a vector instead of a matrix.
You could use

debug(chooseCRANmirror)

to step through the function and inspect the value of m just
before the function crashes.

This is probably the result of some sort of network problem -- you're
not getting a decent list of mirrors. What is the result of

str(getCRANmirrors(all=FALSE,local.only=FALSE))

?
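
The diagnosis above can be illustrated without a network connection. The sketch below rests on some assumptions: the helper looks_like_mirror_table is hypothetical, m_vec is only a stand-in for a malformed mirror list, and the repository URL in the commented-out workaround is just one example. It shows that two-dimensional indexing on a plain vector produces the reported error, and how one might check the shape of whatever getCRANmirrors() returns.

```r
# A plain vector standing in for a mirror list that lost its table structure.
m_vec <- c("Austria", "Belgium")

# m[, 1L] needs a two-dimensional object; on a vector it fails.
# In an English locale the message should be "incorrect number of dimensions".
msg <- tryCatch(m_vec[, 1L], error = function(e) conditionMessage(e))

# Hypothetical helper: is the mirror list a non-empty table with a URL column?
looks_like_mirror_table <- function(m) {
  is.data.frame(m) && nrow(m) > 0 && "URL" %in% names(m)
}

ok_example  <- looks_like_mirror_table(data.frame(Name = "x", URL = "u"))
bad_example <- looks_like_mirror_table(m_vec)

# On the affected machine (needs network access, so left commented out here):
# m <- getCRANmirrors(all = FALSE, local.only = FALSE)
# str(m); looks_like_mirror_table(m)

# Possible workaround that skips the menu entirely (URL is an example):
# options(repos = c(CRAN = "https://cran.r-project.org"))
```

If the check fails on the real mirror list, setting the repository directly as in the last line sidesteps chooseCRANmirror() altogether.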
From gunter.berton at gene.com Wed Mar 2 20:42:40 2011 From: gunter.berton at gene.com (Bert Gunter) Date: Wed, 2 Mar 2011 11:42:40 -0800 Subject: [R] bootstrap resampling - simplified In-Reply-To: References: Message-ID: Folks: On Wed, Mar 2, 2011 at 10:32 AM, Jonathan P Daily wrote: > I will point out again that sampling a five-fold replicate of 1:20 is not > the same as resampling with replacement, -- Correct. In sampling with replacement from 1:20 there is positive probability of getting all 1's or all 2's, etc. The poster specifically said that he wanted 0 probability of such results. So, obviously, the poster does NOT want to "sample with replacement from 1:20." What he does want (I think) is a re-sample of size n from the set of all **vectors** of length 20, each element of which is an integer from 1 to 20, and for which no individual values occur more than 5 times in the vector. Of course I'm just interpreting/paraphrasing the original post (if I got it right), but I think doing so makes the nature of the task clearer: one needs to find some way to sample with replacement from the space of all such **sequences**. I think it is now clear that one may do so by rejection sampling: i.e. sample with replacement from 1:20 and throw away any sequences that fail the at most 5 criterion. The sequences that remain are samples of size 1 from the population of sequences that satisfy the poster's criteria (in theory, anyway; this might tax a pseudo RNG in practice). A collection of n such sequences is a bootstrap sample from this population. I **think** that's what the poster wants -- and what others have already provided. However, maybe this clarifies why it works. If I have made any error in this, **Please** post a message pointing out my error. I sometimes get confused about this stuff, too. Cheers, Bert although I made an error in > reporting probabilities - the P(x2 = 1 | x1 = 1) = 4/99 and not 4/100. 
> When sampling with replacement, P(x2 = 1 | x1 = 1) = P(x2 = 1 | x1 != 1) = > 1/20. > -------------------------------------- > Jonathan P. Daily > Technician - USGS Leetown Science Center > 11649 Leetown Road > Kearneysville WV, 25430 > (304) 724-4480 > "Is the room still a room when its empty? Does the room, > ?the thing itself have purpose? Or do we, what's the word... imbue it." > ? ? - Jubal Early, Firefly > > r-help-bounces at r-project.org wrote on 03/02/2011 01:05:01 PM: > >> [image removed] >> >> Re: [R] bootstrap resampling - simplified >> >> Vokey, John >> >> to: >> >> r-help >> >> 03/02/2011 01:07 PM >> >> Sent by: >> >> r-help-bounces at r-project.org >> >> On 2011-03-02, at 4:00 AM, r-help-request at r-project.org wrote: >> >> > Hello there, >> > >> > I have a problem concerning bootstrapping in R - especially >> focusing on the resampling part of it. I try to sum it up in a >> simplified way so that I would not confuse anybody. >> > >> > I have a small database consisting of 20 observations (basically >> numbers from 1 to 20, I mean: 1, 2, 3, 4, 5, ... 18, 19, 20). >> > >> > I would like to resample this database many times for the >> bootstrap process with the following conditions. Firstly, every >> resampled database should also include 20 observations. Secondly, >> when selecting a number from the above-mentioned 20 numbers, you can >> do this selection with replacement. The difficult part comes now: >> one number can be selected only maximum 5 times. In order to make >> this clear I show you a couple of examples. So the resampled >> databases might be like the following ones: >> > >> > (1st database) ? ? ? ? ?1,2,1,2,1,2,1,2,1,2,3,3,3,3,3,4,4,4,4,4 >> > 4 different numbers are chosen (1, 2, 3, 4), each selected - for >> the maximum possible - 5 times. >> > >> > (2nd database) ? ? ? ? 
?1,8,8,6,8,8,8,2,3,4,5,6,6,6,6,7,19,1,1,1 >> > Two numbers - 8 and 6 - selected 5 times (the maximum possible >> times), number 1 selected 4 times, the others selected less than 4 > times. >> > >> > (3rd database) ? ? ? ? ?1,1,2,2,3,3,4,4,9,9,9,10,10,13,10,9,3,9,2,1 >> > Number 9 chosen for the maximum possible 5 times, number 10, 3, 2, >> 1 chosen for 3 times, number 4 selected twice and number 13 selectedonly > once. >> > >> > ... >> > >> > Anybody knows how to implement my "tricky" condition into one of >> the R functions - that one number can be selected only 5 times at >> most? Are 'boot' and 'bootstrap' packages capable of managing this? >> I guess they are, I just couldn't figure it out yet... >> > >> > Thanks very much! Best regards, >> > Laszlo Bodnar >> >> Laszlo, >> ? Create a vector consisting of 5 of each number. ?Then, for each >> sample, scramble the order of the items in the vector, and select >> the first 20. >> >> >> -- >> Please avoid sending me Word or PowerPoint attachments. >> See >> >> -Dr. John R. Vokey >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
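
The rejection-sampling idea described earlier in this message could look roughly like the following in R. It is a sketch under stated assumptions: the name rejection_resample and the max_tries safeguard are illustrative and do not come from the thread. Each candidate is an ordinary with-replacement draw from 1:20, and candidates in which some value occurs more than five times are discarded.

```r
# Hypothetical helper: draw with replacement, reject any draw that breaks the cap.
rejection_resample <- function(values = 1:20, cap = 5,
                               size = length(values), max_tries = 10000) {
  for (i in seq_len(max_tries)) {
    x <- sample(values, size, replace = TRUE)
    # count occurrences of each value and accept only if all are within the cap
    if (max(tabulate(match(x, values), nbins = length(values))) <= cap) {
      return(x)
    }
  }
  stop("no admissible sample found within max_tries attempts")
}

set.seed(2)
boot_samples <- replicate(1000, rejection_resample())  # one resample per column
worst_count <- max(apply(boot_samples, 2, function(col) max(table(col))))
```

With 20 draws from 20 values, very few candidates should be rejected, so the loop normally ends on the first or second attempt.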
>

--
Bert Gunter
Genentech Nonclinical Biostatistics
467-7374
http://devo.gene.com/groups/devo/depts/ncb/home.shtml

From bt_jannis at yahoo.de Wed Mar 2 20:59:27 2011
From: bt_jannis at yahoo.de (Jannis)
Date: Wed, 02 Mar 2011 20:59:27 +0100
Subject: [R] How to extrapolate a model
In-Reply-To:
References:
Message-ID: <4D6EA19F.7060207@yahoo.de>

I have no experience with this quantreg package, and you did not include
any code for us to reproduce your problem. But these models in R all work
similarly. Have a look at the model result object returned by the fitting
call (str(modelresults)). I would expect it to contain some formula
component. Otherwise, see what predict(modelresults), formula(modelresults)
and fitted(modelresults) return (using str()), or have a look at the
documentation of these functions.

HTH
Jannis

On 03/02/2011 04:40 PM, sadz a wrote:
> I am using a multiple additive model (in the quantreg package) and I would
> like to 'extract' the fitted model formulae
>
> ie- for a straight line the formula would be y= 'a+b*c'
> for my multiple model I would expect something more complex because the model
> is not linear (it's a bit like a GAM) but given I can plot the model using
> # f<-fitted(model)
> #lines(f)
> there must be a formula that I can extract
>
> Thank you for your time
> Kitty
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

From jdaily at usgs.gov Wed Mar 2 21:00:28 2011
From: jdaily at usgs.gov (Jonathan P Daily)
Date: Wed, 2 Mar 2011 15:00:28 -0500
Subject: [R] bootstrap resampling - simplified
In-Reply-To:
References:
Message-ID:

I apologize if I was not clear in my response.
I only mentioned x1, x2 in my example, but I did not clarify that I also
knew that P(x6 = 1 | x1..5 = 1) = 0 in the original request. I also see
that if he meant that he wanted to sample with replacement from the set of
sequences that sample(rep(1:20, 5), 20) is fine for generating said
sequences.

My interpretation was that the sequences themselves should be sampling
with replacement until frequency hits 5, whereupon it is not replaced.
Hence my suggestion of:

bigsamp <- sample(1:20, 100, T)
idx <- sort(unlist(sapply(1:20, function(x) which(bigsamp == x)[1:5])))[1:20]
samp <- bigsamp[idx]

I apologize for my lack of clarity, though after reading the original post
I'm not sure which solution the OP was looking for.

Cheers,
Jon
--------------------------------------
Jonathan P. Daily
Technician - USGS Leetown Science Center
11649 Leetown Road
Kearneysville WV, 25430
(304) 724-4480
"Is the room still a room when its empty? Does the room,
 the thing itself have purpose? Or do we, what's the word... imbue it."
     - Jubal Early, Firefly

Bert Gunter wrote on 03/02/2011 02:42:40 PM:

> [image removed]
>
> Re: [R] bootstrap resampling - simplified
>
> Bert Gunter
>
> to:
>
> Jonathan P Daily
>
> 03/02/2011 02:42 PM
>
> Cc:
>
> "Vokey, John", r-help, r-help-bounces
>
> Folks:
>
> On Wed, Mar 2, 2011 at 10:32 AM, Jonathan P Daily wrote:
> > I will point out again that sampling a five-fold replicate of 1:20 is not
> > the same as resampling with replacement,
>
> -- Correct. In sampling with replacement from 1:20 there is positive
> probability of getting all 1's or all 2's, etc. The poster
> specifically said that he wanted 0 probability of such results. So,
> obviously, the poster does NOT want to "sample with replacement from
> 1:20." What he does want (I think) is a re-sample of size n from the
> set of all **vectors** of length 20, each element of which is an
> integer from 1 to 20, and for which no individual values occur more
> than 5 times in the vector.
Of course I'm just > interpreting/paraphrasing the original post (if I got it right), but I > think doing so makes the nature of the task clearer: one needs to find > some way to sample with replacement from the space of all such > **sequences**. > > I think it is now clear that one may do so by rejection sampling: i.e. > sample with replacement from 1:20 and throw away any sequences that > fail the at most 5 criterion. The sequences that remain are samples of > size 1 from the population of sequences that satisfy the poster's > criteria (in theory, anyway; this might tax a pseudo RNG in practice). > A collection of n such sequences is a bootstrap sample from this > population. I **think** that's what the poster wants -- and what > others have already provided. However, maybe this clarifies why it > works. > > If I have made any error in this, **Please** post a message pointing > out my error. I sometimes get confused about this stuff, too. > > Cheers, > Bert > > > > > > although I made an error in > > reporting probabilities - the P(x2 = 1 | x1 = 1) = 4/99 and not 4/100. > > When sampling with replacement, P(x2 = 1 | x1 = 1) = P(x2 = 1 | x1 != 1) = > > 1/20. > > -------------------------------------- > > Jonathan P. Daily > > Technician - USGS Leetown Science Center > > 11649 Leetown Road > > Kearneysville WV, 25430 > > (304) 724-4480 > > "Is the room still a room when its empty? Does the room, > > the thing itself have purpose? Or do we, what's the word... imbue it." 
> > - Jubal Early, Firefly > > > > r-help-bounces at r-project.org wrote on 03/02/2011 01:05:01 PM: > > > >> [image removed] > >> > >> Re: [R] bootstrap resampling - simplified > >> > >> Vokey, John > >> > >> to: > >> > >> r-help > >> > >> 03/02/2011 01:07 PM > >> > >> Sent by: > >> > >> r-help-bounces at r-project.org > >> > >> On 2011-03-02, at 4:00 AM, r-help-request at r-project.org wrote: > >> > >> > Hello there, > >> > > >> > I have a problem concerning bootstrapping in R - especially > >> focusing on the resampling part of it. I try to sum it up in a > >> simplified way so that I would not confuse anybody. > >> > > >> > I have a small database consisting of 20 observations (basically > >> numbers from 1 to 20, I mean: 1, 2, 3, 4, 5, ... 18, 19, 20). > >> > > >> > I would like to resample this database many times for the > >> bootstrap process with the following conditions. Firstly, every > >> resampled database should also include 20 observations. Secondly, > >> when selecting a number from the above-mentioned 20 numbers, you can > >> do this selection with replacement. The difficult part comes now: > >> one number can be selected only maximum 5 times. In order to make > >> this clear I show you a couple of examples. So the resampled > >> databases might be like the following ones: > >> > > >> > (1st database) 1,2,1,2,1,2,1,2,1,2,3,3,3,3,3,4,4,4,4,4 > >> > 4 different numbers are chosen (1, 2, 3, 4), each selected - for > >> the maximum possible - 5 times. > >> > > >> > (2nd database) 1,8,8,6,8,8,8,2,3,4,5,6,6,6,6,7,19,1,1,1 > >> > Two numbers - 8 and 6 - selected 5 times (the maximum possible > >> times), number 1 selected 4 times, the others selected less than 4 > > times. > >> > > >> > (3rd database) 1,1,2,2,3,3,4,4,9,9,9,10,10,13,10,9,3,9,2,1 > >> > Number 9 chosen for the maximum possible 5 times, number 10, 3, 2, > >> 1 chosen for 3 times, number 4 selected twice and number 13 selectedonly > > once. > >> > > >> > ... 
> >> > > >> > Anybody knows how to implement my "tricky" condition into one of > >> the R functions - that one number can be selected only 5 times at > >> most? Are 'boot' and 'bootstrap' packages capable of managing this? > >> I guess they are, I just couldn't figure it out yet... > >> > > >> > Thanks very much! Best regards, > >> > Laszlo Bodnar > >> > >> Laszlo, > >> Create a vector consisting of 5 of each number. Then, for each > >> sample, scramble the order of the items in the vector, and select > >> the first 20. > >> > >> > >> -- > >> Please avoid sending me Word or PowerPoint attachments. > >> See > >> > >> -Dr. John R. Vokey > >> > >> ______________________________________________ > >> R-help at r-project.org mailing list > >> https://stat.ethz.ch/mailman/listinfo/r-help > >> PLEASE do read the posting guide > > http://www.R-project.org/posting-guide.html > >> and provide commented, minimal, self-contained, reproducible code. > > > > ______________________________________________ > > R-help at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > > > > -- > Bert Gunter > Genentech Nonclinical Biostatistics > 467-7374 > http://devo.gene.com/groups/devo/depts/ncb/home.shtml From benhartley903 at googlemail.com Wed Mar 2 19:36:21 2011 From: benhartley903 at googlemail.com (Benjamin Hartley) Date: Wed, 2 Mar 2011 19:36:21 +0100 Subject: [R] Vector manipulations In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available
URL:

From dschruth at gmail.com Wed Mar 2 21:16:27 2011
From: dschruth at gmail.com (dms)
Date: Wed, 2 Mar 2011 12:16:27 -0800 (PST)
Subject: [R] merge( , by='row.names') slowness
Message-ID: <117a4deb-9e29-4bba-8db1-5c78571bdfd7@a11g2000pri.googlegroups.com>

I noticed that when joining two data.frames in R with the "merge"
function, using by='row.names' slows things down substantially compared
to just joining on a common index column.

Using a dataframe size of ~10,000 rows: it's as slow as 10 minutes in
the by='row.names' case versus merely 1 second using an index column.
Beyond the 10^6 range, it's unusably slow.

n <- 5
a <- data.frame(id=as.character(1:10^n), x=rnorm(10^n)); rownames(a) <- a$id
b <- data.frame(id=as.character(1:10^n + 10^(n-1)), y=rnorm(10^n)); rownames(b) <- b$id

date()
fast <- merge(a, b, all=TRUE)
date()
slow <- merge(a, b, all=TRUE, by='row.names')
date()

Has anybody else noticed this?

From embracegod1 at yahoo.com Wed Mar 2 17:39:37 2011
From: embracegod1 at yahoo.com (Makuachukwu Ojide)
Date: Wed, 2 Mar 2011 08:39:37 -0800 (PST)
Subject: [R] Step by step procedure for the application of Threshold model
Message-ID: <563471.81827.qm@web33408.mail.mud.yahoo.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From linsleys at comcast.net Wed Mar 2 19:51:18 2011
From: linsleys at comcast.net (linsleyp)
Date: Wed, 2 Mar 2011 10:51:18 -0800 (PST)
Subject: [R] trouble loading ggplot2 using R
Message-ID: <1299091878638-3332044.post@n4.nabble.com>

I'm having trouble loading ggplot2 on my mac (Snow Leopard) using R
version 2.12.1, as shown below. I can't find a posting relevant to this
problem, so any help would be very much appreciated.
Thanks, peter l > install.packages('ggplot2', dep = TRUE) trying URL 'http://cran.cnr.Berkeley.edu/bin/macosx/leopard/contrib/2.12/ggplot2_0.8.9.tgz' Content type 'application/x-gzip' length 2481399 bytes (2.4 Mb) opened URL ================================================== downloaded 2.4 Mb The downloaded packages are in /var/folders/XF/XF0tU7gdGTeF4Th7KhKYDk+++TI/-Tmp-//RtmpFMnqLB/downloaded_packages > library(ggplot2) Error in assign(names[i], dots[[i]], env = envir) : invalid first argument Error : unable to load R code in package 'ggplot2' Error: package/namespace load failed for 'ggplot2' -- View this message in context: http://r.789695.n4.nabble.com/trouble-loading-ggplot2-using-R-tp3332044p3332044.html Sent from the R help mailing list archive at Nabble.com. From louise at sinfield.uk.net Wed Mar 2 17:50:06 2011 From: louise at sinfield.uk.net (LouiseS) Date: Wed, 2 Mar 2011 08:50:06 -0800 (PST) Subject: [R] Creating a weighted sample - Help Message-ID: <1299084606398-3331842.post@n4.nabble.com> Hi I'm new to R and most things I want to do I can do but I'm stuck on how to weight a sample. I have had a look through the post but I can't find anything that addresses my specific problem. I am wanting to scale up a sample which has been taken based on a single variable (perf) which has 4 attributes H,I, J and K. The make up of the sample is shown below:- Perf Factored Count (A) Raw Count (B) Factor (A/B) H 5,945 2,924 2.033174 I 1,305 2,436 0.535714 J 2,000 2,092 0.956023 K 750 1,225 0.612245 I then want to produce all further analysis based on this factored sample. I can produce a weighted sample in SAS using the weight function which I have shown below wt=0; if perf='H' then wt=2.033174; if perf='I ' then wt=0.535714; if perf='J ' then wt=0.956023; if perf='K ' then wt=0.612245; proc freq data=DD.new; tables resdstat; weight wt; run; Does anyone know how to reproduce this in R? 
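
For reference, one possible R counterpart to the SAS step above is a weighted frequency table built with xtabs(), where the weight goes on the left-hand side of the formula. This is only a sketch and not a reply from the thread: the toy data frame dd stands in for DD.new, and its resdstat values are invented for illustration. Only the four weights are taken from the post.

```r
# Toy stand-in for DD.new; the resdstat values are made up for this example.
dd <- data.frame(
  perf     = c("H", "H", "I", "J", "K", "K"),
  resdstat = c("a", "b", "a", "b", "a", "a"),
  stringsAsFactors = FALSE
)

# Weights from the post (Factor = A/B), looked up by perf.
wts <- c(H = 2.033174, I = 0.535714, J = 0.956023, K = 0.612245)
dd$wt <- unname(wts[dd$perf])

# Weighted frequency table: sums wt within each level of resdstat,
# roughly what "proc freq; tables resdstat; weight wt;" reports as counts.
weighted_counts <- xtabs(wt ~ resdstat, data = dd)
weighted_shares <- prop.table(weighted_counts)
```

For analyses beyond simple tables, a survey-weighting package may be more appropriate than carrying a weight column by hand, but that goes beyond what the post asks.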
Thanks very much

--
View this message in context: http://r.789695.n4.nabble.com/Creating-a-weighted-sample-Help-tp3331842p3331842.html
Sent from the R help mailing list archive at Nabble.com.

From mailtoantonyraj at gmail.com Wed Mar 2 17:10:39 2011
From: mailtoantonyraj at gmail.com (Antony Raj)
Date: Wed, 2 Mar 2011 21:40:39 +0530
Subject: [R] Contingency table in R
In-Reply-To:
References:
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From nandan.amar at gmail.com Wed Mar 2 17:02:33 2011
From: nandan.amar at gmail.com (Amar)
Date: Wed, 2 Mar 2011 16:02:33 +0000
Subject: [R] finding model order components for arima()
References:
Message-ID:

Amar gmail.com> writes:

>
> Hi,
> I am trying to model a time series using arima(). For getting the
> model order components(p, d, q and P,D,Q) I am using procedure
> discussed in [1] in section 3.2 . It is most likely hit and trial
> method based on lower AIC value.
> I want to know what is the correct way to find model order components
> or the method described in [1] is the appropriate one.
> thanks in advance.
>
> [1]Automatic Time Series Forecasting: The forecast Package for R (http://www.jstatsoft.org/v27/i03)

From patsko at gmx.de Wed Mar 2 17:10:43 2011
From: patsko at gmx.de (patsko at gmx.de)
Date: Wed, 02 Mar 2011 17:10:43 +0100
Subject: [R] GLM / Logistic Regression Problem
Message-ID: <20110302161043.309870@gmx.net>

Hi there,

I am encountering a problem with the GLM tool performing logistic regression. After computing a warning appears, saying "glm.fit: fitted probabilities numerically 0 or 1 occurred". A prediction of new values confirms the problem as the model does not produce regular probability estimates but values which are way higher than 1 and lower than 0 in many cases.
I have tried both methods setting the family=binomial and family=binomial("logit") so this can't be the reason that causes the error.

As an alternative solution I have considered resorting to the Logistic tool from the RWeka package. The manual says that it exists for building multinomial logistic regression models. I can't imagine it would be a problem, but can anyone confirm that it is indeed possible to use the algorithm for computing binary models as well?

Best regards

Patrick

From rosyaraur at gmail.com Wed Mar 2 20:38:16 2011
From: rosyaraur at gmail.com (Umesh Rosyara)
Date: Wed, 2 Mar 2011 14:38:16 -0500
Subject: [R] thank you
In-Reply-To:
References: <003e01cbd830$14a07140$3de153c0$@com> <6BB40054316B42E0B973E9E829B89EA0@OwnerPC>
Message-ID: <008001cbd911$6106ad10$23140730$@com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From m2m at gmx.at Wed Mar 2 21:25:12 2011
From: m2m at gmx.at (Spindoctor)
Date: Wed, 2 Mar 2011 12:25:12 -0800 (PST)
Subject: [R] spplot() - costumize the color-legend
Message-ID: <1299097512016-3332225.post@n4.nabble.com>

Hi!

Is there a way to manually customize the color legend in an spplot() -
especially where to draw ticks and labels for the ticks?

The reason I'm asking: Usually spplot() automatically divides the data into
fitting slices and makes a color legend (also automatically). I want to
assign the slices myself and have a fixed scale instead of an
automatic/dynamic scale.

I think what I want gets clear in this example:

library(sp)
data(meuse.grid)
gridded(meuse.grid) = ~x+y

## DATA GENERATION
meuse.grid$random <- rnorm(nrow(meuse.grid), 7, 2)  # generate random data
meuse.grid$random[meuse.grid$random < 0] <- 0   # make sure no value is smaller than zero ...
meuse.grid$random[meuse.grid$random > 10] <- 10 # ... or bigger than ten
## DATA GENERATION FINISHED

## making a factor out of meuse.grid$random to have absolute values plotted
meuse.grid$random <- cut(meuse.grid$random, seq(0, 10, 0.1)) # here I assign the levels I want to use in my plot!!!
spplot(meuse.grid, c("random"), col.regions = rainbow(100, start = 4/6, end = 1)) # look at the color-legend - not so good.

The graphic itself is like I want it, but the legend doesn't look too good.
Although I assign 100 factors, I want just a few ticks in the legend (and
also just a few labels). How can this be achieved?

Thank you!

--
View this message in context: http://r.789695.n4.nabble.com/spplot-costumize-the-color-legend-tp3332225p3332225.html
Sent from the R help mailing list archive at Nabble.com.

From gunter.berton at gene.com Wed Mar 2 21:49:25 2011
From: gunter.berton at gene.com (Bert Gunter)
Date: Wed, 2 Mar 2011 12:49:25 -0800
Subject: [R] GLM / Logistic Regression Problem
In-Reply-To: <20110302161043.309870@gmx.net>
References: <20110302161043.309870@gmx.net>
Message-ID:

Please read the Help for predict.glm carefully to make sure you are
not confusing predicted response on the linear scale (log odds) with
that on the probability scale.

The warning is just that: a warning. It means that you have fitted
PROBABILITIES on the boundary, which might compromise the iterative
fitting algorithm and inference thereon. Ergo: examine this carefully
before blithely proceeding.

-- Bert

On Wed, Mar 2, 2011 at 8:10 AM, wrote:
> Hi there,
>
> I am encountering a problem with the GLM tool performing logistic regression. After computing a warning appears, saying "glm.fit: fitted probabilities numerically 0 or 1 occurred". A prediction of new values confirms the problem as the model does not produce regular probability estimates but values which are way higher than 1 and lower than 0 in many cases.
> I have tried both methods setting the family=binomial and family=binomial("logit") so this can't be the reason that causes the error.
>
> As an alternative solution I have considered to resort to the Logistic tool from the RWeka package. The manual says that it exists for building multinomial logistic regression models.
I can?t image it would be a problem but can anyone confirm that it indeed is possible to use the algorithm for also computing binary models?! > > Best regards > > Patrick > -- > Schon geh?rt? GMX hat einen genialen Phishing-Filter in die > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Bert Gunter Genentech Nonclinical Biostatistics From Greg.Snow at imail.org Wed Mar 2 22:02:58 2011 From: Greg.Snow at imail.org (Greg Snow) Date: Wed, 2 Mar 2011 14:02:58 -0700 Subject: [R] how many records for suitable regression In-Reply-To: <1299073808387-3331522.post@n4.nabble.com> References: <1299073808387-3331522.post@n4.nabble.com> Message-ID: It really depends on what question you are trying to answer. Things like the relative importance of type I and type II errors could matter a lot. Correlation among the predictors can affect things. What effect size are you looking for and what power do you want? And much more. There is a general rule of thumb that you need at least 10-20 observations per predictor variable (categorical variables need to be thought of as their indicator variables for this rule) to have any chance that the coefficients will be meaningful, but this is very much a lower bound and you may need more depending on some of the above questions. If you have some idea of what the structure of your data will be, then you can simulate various sample sizes, analyze them, and see which sizes start to give meaningful answers. -- Gregory (Greg) L. Snow Ph.D. 
Statistical Data Center Intermountain Healthcare greg.snow at imail.org 801.408.8111 > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r- > project.org] On Behalf Of agent dunham > Sent: Wednesday, March 02, 2011 6:50 AM > To: r-help at r-project.org > Subject: [R] how many records for suitable regression > > Dear community, > > I was wondering if it's possible to know if you have enough data for a > regression study. > > I remember you must have more data than parameters to obtain, but I'd > like > to know if there was something more sophisticated. > > Thanks, user at host.com > > -- > View this message in context: http://r.789695.n4.nabble.com/how-many- > records-for-suitable-regression-tp3331522p3331522.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. From deeepersound at googlemail.com Wed Mar 2 22:07:53 2011 From: deeepersound at googlemail.com (Maxim) Date: Wed, 2 Mar 2011 22:07:53 +0100 Subject: [R] clustering problem Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From rex.dwyer at syngenta.com Wed Mar 2 22:25:31 2011 From: rex.dwyer at syngenta.com (rex.dwyer at syngenta.com) Date: Wed, 2 Mar 2011 16:25:31 -0500 Subject: [R] clustering problem In-Reply-To: References: Message-ID: <36180405F8418449918AD20618D110FC095BF553D4@USETCMSXMB02.NAFTA.SYNGENTA.ORG> Don't you expect it to be a lot faster if you cluster 20 items instead of 25000? 

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Maxim
Sent: Wednesday, March 02, 2011 4:08 PM
To: r-help at r-project.org
Subject: [R] clustering problem

Hi,

I have a gene expression experiment with 20 samples and 25000 genes each.
I'd like to perform clustering on these. It turned out to become much faster
when I transform the underlying matrix with t(matrix). Unfortunately, I am
then no longer able to use cutree to access individual clusters.

In general I do something like this:

hc <- hclust(dist(USArrests), "ave")
library(RColorBrewer)
library(gplots)
clrno=3
cols<-rainbow(clrno, alpha = 1)
clstrs <- cutree(hc, k=clrno)
ccols <- cols[as.vector(clstrs)]
heatcol<-colorRampPalette(c(3,1,2), bias = 1.0)(32)
heatmap.2(as.matrix(USArrests), Rowv=as.dendrogram(hc),col=heatcol, trace="none",RowSideColors=ccols)

Nice, I can access 3 main clusters with cutree.

But what about a situation when I perform hclust like

hc <- hclust(dist(t(USArrests)), "ave")

which I have to do in order to speed up the clustering process. This I can
plot with:

heatmap.2(as.matrix(USArrests), Colv=as.dendrogram(hc),col=heatcol, trace="none")

But where do I find information about the clustering that was applied to the
rows?
cutree(hc, k=clrno) delivers the clustering on the columns, so what can I do
to access the levels for the rows?

I guess the solution is easy, but after hours of playing around I thought it
might be a good time to contact the mailing list!

Maxim

[[alternative HTML version deleted]]

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

message may contain confidential information.
If you are not the designated recipient, please notify the sender immediately, and delete the original and any copies. Any use of the message by you is prohibited.

From ggrothendieck at gmail.com Wed Mar 2 22:42:40 2011
From: ggrothendieck at gmail.com (Gabor Grothendieck)
Date: Wed, 2 Mar 2011 16:42:40 -0500
Subject: [R] a question on sqldf's handling of missing value and factor
In-Reply-To: <1299079071033-3331667.post@n4.nabble.com>
References: <1299041571876-3331007.post@n4.nabble.com> <1299078884773-3331662.post@n4.nabble.com> <1299079071033-3331667.post@n4.nabble.com>
Message-ID:

On Wed, Mar 2, 2011 at 10:17 AM, xin wei wrote:
> I am sorry for posting the wrong source file. the correct source file is as
> follows:
> a       b       c
> aa              23
> aaa     34.6
> aaaa            77.8
>
> They are tab delimited but somehow could not be displayed correctly in
> browser.

The problem is that you are using empty fields to represent missing values but SQLite regarded them as zero length character fields. See FAQ 14 on the sqldf home page for a solution:

http://code.google.com/p/sqldf/#14._How_does_one_read_files_where_numeric_NAs_are_represented_as

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

From zmring at gmail.com Wed Mar 2 22:52:29 2011
From: zmring at gmail.com (John Smith)
Date: Wed, 2 Mar 2011 16:52:29 -0500
Subject: [R] how to delete empty levels from lattice xyplot
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From deeepersound at googlemail.com Wed Mar 2 23:14:50 2011
From: deeepersound at googlemail.com (Maxim)
Date: Wed, 2 Mar 2011 23:14:50 +0100
Subject: [R] clustering problem
In-Reply-To: <36180405F8418449918AD20618D110FC095BF553D4@USETCMSXMB02.NAFTA.SYNGENTA.ORG>
References: <36180405F8418449918AD20618D110FC095BF553D4@USETCMSXMB02.NAFTA.SYNGENTA.ORG>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available URL: From r_s_poole at hotmail.co.uk Wed Mar 2 22:58:22 2011 From: r_s_poole at hotmail.co.uk (Ryan Poole) Date: Wed, 2 Mar 2011 21:58:22 +0000 Subject: [R] looping through data in sections and performing a function on each one Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From aquanyc at gmail.com Wed Mar 2 23:14:12 2011 From: aquanyc at gmail.com (rivercode) Date: Wed, 2 Mar 2011 14:14:12 -0800 (PST) Subject: [R] Create a zoo/xts Time Series with Millisecond jumps Message-ID: <1299104052334-3332427.post@n4.nabble.com> Is there a easy way to create the time index for a zoo/xts object for every 100 milliseconds. eg. time Index would be: 10:00:00:100 10:00:00:200 10:00:00:300 10:00:00:400 I am looking to build an empty zoo/xts object with time index from 10am to 3pm, index jumps by 100ms each row. Thanks, Chris -- View this message in context: http://r.789695.n4.nabble.com/Create-a-zoo-xts-Time-Series-with-Millisecond-jumps-tp3332427p3332427.html Sent from the R help mailing list archive at Nabble.com. From adick at fiu.edu Wed Mar 2 23:38:49 2011 From: adick at fiu.edu (Anthony Dick) Date: Wed, 2 Mar 2011 17:38:49 -0500 Subject: [R] parallel bootstrap linear model on multicore mac (re-post) Message-ID: <4D6EC6F9.60202@fiu.edu> An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From rex.dwyer at syngenta.com Thu Mar 3 00:12:36 2011 From: rex.dwyer at syngenta.com (rex.dwyer at syngenta.com) Date: Wed, 2 Mar 2011 18:12:36 -0500 Subject: [R] merge( , by='row.names') slowness In-Reply-To: <117a4deb-9e29-4bba-8db1-5c78571bdfd7@a11g2000pri.googlegroups.com> References: <117a4deb-9e29-4bba-8db1-5c78571bdfd7@a11g2000pri.googlegroups.com> Message-ID: <36180405F8418449918AD20618D110FC095BF554DF@USETCMSXMB02.NAFTA.SYNGENTA.ORG> -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of dms Sent: Wednesday, March 02, 2011 3:16 PM To: r-help at r-project.org Subject: [R] merge( , by='row.names') slowness I noticed that joining two data.frames in R using the "merge" function that using by='row.names' slows things down substantially when compared to just joining on a common index column. Using a dataframe size of ~10,000 rows: it's as slow as 10 minutes in the by='row.names' case versus merely 1 second using an index column. Beyond the 10^6 range, it's unusably slow. n <- 5 a <- data.frame(id=as.character(1:10^n), x=rnorm(10^n)); rownames(a) <- a$id b <- data.frame(id=as.character(1:10^n + 10^(n-1)), y=rnorm(10^n)); rownames(b) <- b$id date() fast <- merge(a, b, all=T) date() slow <- merge(a, b, all=T, by='row.names') date() Has anybody else noticed this? _________________________________________________ HI DMS, Well, first off, they don't give the same answer... in fact, not even the same dimension. Even so, from looking at merge.data.frame, it's not immediately obvious what would make a difference of this magnitude. The answer might be buried in the internal merge. 
Here for n=3:

> system.time(print(dim(merge(a,b,all=T))))
[1] 1100    3
   user  system elapsed
   0.01    0.00    0.01
> system.time(print(dim(merge(a,b,all=T,by=1))))
[1] 1100    3
   user  system elapsed
   0.01    0.00    0.02
> system.time(print(dim(merge(a,b,all=T,by=0))))
[1] 1100    5
   user  system elapsed
   3.26    0.00    3.17
> system.time(print(dim(merge(a,b,all=T,by="row.names"))))
[1] 1100    5
   user  system elapsed
   3.17    0.00    3.17
>

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

message may contain confidential information. If you are not the designated recipient, please notify the sender immediately, and delete the original and any copies. Any use of the message by you is prohibited.

From ggrothendieck at gmail.com Thu Mar 3 00:20:42 2011
From: ggrothendieck at gmail.com (Gabor Grothendieck)
Date: Wed, 2 Mar 2011 18:20:42 -0500
Subject: [R] Create a zoo/xts Time Series with Millisecond jumps
In-Reply-To: <1299104052334-3332427.post@n4.nabble.com>
References: <1299104052334-3332427.post@n4.nabble.com>
Message-ID:

On Wed, Mar 2, 2011 at 5:14 PM, rivercode wrote:
> Is there a easy way to create the time index for a zoo/xts object for every
> 100 milliseconds.
>
> eg. time Index would be:
>
> 10:00:00:100
> 10:00:00:200
> 10:00:00:300
> 10:00:00:400
>
> I am looking to build an empty zoo/xts object with time index from 10am to
> 3pm, index jumps by 100ms each row.
>

Here are three ways. as.xts(z2) could be used to turn the second one into xts.
library(zoo)
library(chron)

len <- 5 * 60 * 60 * 10 + 1

# use chron times class
z1 <- zoo(, seq(times("10:00:00"), times("15:00:00"), length = len))

# use POSIXct times
z2 <- zoo(, seq(as.POSIXct("2011-01-01 10:00:00"), as.POSIXct("2011-01-01 15:00:00"), length = len))

# number intervals from 1 to len
z3 <- zoo(, seq_len(len))

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

From ehlers at ucalgary.ca Thu Mar 3 01:10:14 2011
From: ehlers at ucalgary.ca (P Ehlers)
Date: Wed, 02 Mar 2011 16:10:14 -0800
Subject: [R] transform table to matrix
In-Reply-To: <11AF2783CA6F4B2F98FF604196B1CB6C@gmail.com>
References: <11AF2783CA6F4B2F98FF604196B1CB6C@gmail.com>
Message-ID: <4D6EDC66.2000606@ucalgary.ca>

Scott Chamberlain wrote:
> This thread seems freakishly similar to what you are asking....Scott

Even to the point of including the same typo as well as proof that neither poster bothered to read the posting guide. Great spot, Scott!

Peter Ehlers

>
> http://tolstoy.newcastle.edu.au/R/help/06/07/30127.html
> On Wednesday, March 2, 2011 at 7:43 AM, SK MAIDUL HAQUE wrote:
>> I have a text file that I have imported into R. It contains 3 columns and
>> 316940 rows. The first column is vegetation plot ID, the second species
>> names and the third is a cover value (numeric). I imported using the
>> read.table function.
>>
>> My problem is this. I need to reformat the information as a matrix, with the
>> first column becoming the row labels and the second the column labels and
>> the cover values as the matrix cell data. However, since the
>> read.tablefunction imported the data as an indexed data frame, I can't use
>> the columns
>> as vectors. Is there a way around this, to convert the data frame as 3
>> separate vectors? I have been looking all over for a function, and my
>> programming skills are not great.
>> >> >> -- >> Sk Maidul Haque >> Scientific Officer-C >> Applied Spectroscopy Division >> Bhabha Atomic Research Centre, Vizag >> >> Mo: 09666429050/09093458503 >> >> [[alternative HTML version deleted]] >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From scttchamberlain4 at gmail.com Thu Mar 3 03:19:18 2011 From: scttchamberlain4 at gmail.com (Scott Chamberlain) Date: Wed, 2 Mar 2011 20:19:18 -0600 Subject: [R] transform table to matrix In-Reply-To: <4D6EDC66.2000606@ucalgary.ca> References: <11AF2783CA6F4B2F98FF604196B1CB6C@gmail.com> <4D6EDC66.2000606@ucalgary.ca> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From rsaber at comcast.net Thu Mar 3 04:45:49 2011 From: rsaber at comcast.net (Gregory Ryslik) Date: Wed, 2 Mar 2011 22:45:49 -0500 Subject: [R] Non-conformable arrays Message-ID: <2836F980-47B1-4B5F-B8F3-DA9E46D1E232@comcast.net> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From djmuser at gmail.com Thu Mar 3 04:46:35 2011 From: djmuser at gmail.com (Dennis Murphy) Date: Wed, 2 Mar 2011 19:46:35 -0800 Subject: [R] how to delete empty levels from lattice xyplot In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available
URL:

From Bill.Venables at csiro.au Thu Mar 3 04:55:48 2011
From: Bill.Venables at csiro.au (Bill.Venables at csiro.au)
Date: Thu, 3 Mar 2011 14:55:48 +1100
Subject: [R] Non-conformable arrays
In-Reply-To: <2836F980-47B1-4B5F-B8F3-DA9E46D1E232@comcast.net>
References: <2836F980-47B1-4B5F-B8F3-DA9E46D1E232@comcast.net>
Message-ID: <1BDAE2969943D540934EE8B4EF68F95FB27F0D4A87@EXNSW-MBX03.nexus.csiro.au>

Here is one way.

1. make sure y.test is a factor

2. Use

table(y.test,
      factor(PredictedTestCurrent, levels = levels(y.test)))

3. If PredictedTestCurrent is already a factor with the wrong levels, turn it back into a character string vector first.

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Gregory Ryslik
Sent: Thursday, 3 March 2011 1:46 PM
To: r-help Help
Subject: [R] Non-conformable arrays

Hi Everyone,

I'm running some simulations where eventually I need to table the results. The problem is that, while in most simulations I have at least one predicted outcome for each of the six possible categories, sometimes the algorithm assigns all the outcomes and one category is left out. Thus when I try to add the misclassification matrices I get an error. Is there a simple way I can make sure that all my tables have the same size (with a row or column of zeros if appropriate) without a messy "if" structure checking each condition?

To be more specific, here's my line of code for the table command and the two matrices that I sometimes have. Notice that in the second matrix, the "fad" column is missing. Basically, I want all the columns and rows to be predetermined so that no columns/rows go missing. Thanks for your help!
Kind regards,
Greg

table(y.test,PredictedTestCurrent):

      PredictedTestCurrent
y.test adi car con fad gla mas
   adi   9   0   0   0   0   0
   car   0   6   1   0   0   3
   con   1   0   3   0   0   0
   fad   0   0   0   2   5   4
   gla   0   1   0   0   6   3
   mas   0   0   0   1   4   4

      PredictedTestCurrent
y.test adi car con gla mas
   adi   8   0   0   0   0
   car   0   8   0   0   1
   con   2   0   3   0   0
   fad   0   1   0   4   7
   gla   0   0   0   3   5
   mas   0   2   0   6   3

[[alternative HTML version deleted]]

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

From ericstrom at aol.com Thu Mar 3 04:19:22 2011
From: ericstrom at aol.com (eric)
Date: Wed, 2 Mar 2011 19:19:22 -0800 (PST)
Subject: [R] What am I doing wrong with this loop ?
Message-ID: <1299122362375-3332703.post@n4.nabble.com>

What is wrong with this loop ? I am getting an error saying incorrect number of dimensions y[i,2]

x <- as.data.frame(runif(2000, 12, 38))
z <-numeric(length(x))
y <- as.data.frame(z)
for(i in 1:length(x)) {
y <- ifelse(i < 500, as.data.frame(lowess(x[1:i,1], f=1/9)) , as.data.frame(lowess(x[(i-499):i,1], f=1/9)))
z[i] <-y[i,2]
}

--
View this message in context: http://r.789695.n4.nabble.com/What-am-I-doing-wrong-with-this-loop-tp3332703p3332703.html
Sent from the R help mailing list archive at Nabble.com.

From Bill.Venables at csiro.au Thu Mar 3 05:31:24 2011
From: Bill.Venables at csiro.au (Bill.Venables at csiro.au)
Date: Thu, 3 Mar 2011 15:31:24 +1100
Subject: [R] What am I doing wrong with this loop ?
In-Reply-To: <1299122362375-3332703.post@n4.nabble.com>
References: <1299122362375-3332703.post@n4.nabble.com>
Message-ID: <1BDAE2969943D540934EE8B4EF68F95FB27F0D4A89@EXNSW-MBX03.nexus.csiro.au>

Here is a start

> x <- as.data.frame(runif(2000, 12, 38))
> length(x)
[1] 1
> names(x)
[1] "runif(2000, 12, 38)"
>

Why are you turning x and y into data frames? It also looks as if you should be using if(...)
... else ... rather than ifelse(.,.,), too. You need to sort out a few issues, it seems. -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of eric Sent: Thursday, 3 March 2011 1:19 PM To: r-help at r-project.org Subject: [R] What am I doing wrong with this loop ? What is wrong with this loop ? I am getting an error saying incorrect number of dimensions y[i,2] x <- as.data.frame(runif(2000, 12, 38)) z <-numeric(length(x)) y <- as.data.frame(z) for(i in 1:length(x)) { y <- ifelse(i < 500, as.data.frame(lowess(x[1:i,1], f=1/9)) , as.data.frame(lowess(x[(i-499):i,1], f=1/9))) z[i] <-y[i,2] } -- View this message in context: http://r.789695.n4.nabble.com/What-am-I-doing-wrong-with-this-loop-tp3332703p3332703.html Sent from the R help mailing list archive at Nabble.com. ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. From rsaber at comcast.net Thu Mar 3 06:06:57 2011 From: rsaber at comcast.net (Gregory Ryslik) Date: Thu, 3 Mar 2011 00:06:57 -0500 Subject: [R] Non-conformable arrays In-Reply-To: <1BDAE2969943D540934EE8B4EF68F95FB27F0D4A87@EXNSW-MBX03.nexus.csiro.au> References: <2836F980-47B1-4B5F-B8F3-DA9E46D1E232@comcast.net> <1BDAE2969943D540934EE8B4EF68F95FB27F0D4A87@EXNSW-MBX03.nexus.csiro.au> Message-ID: <02551021-5F03-4042-8045-08FAF9C38B59@comcast.net> Perfect! Thank you! On Mar 2, 2011, at 10:55 PM, wrote: > Here is one way. > > 1. make sure y.test is a factor > > 2. Use > > table(y.test, > factor(PredictedTestCurrent, levels = levels(y.test)) > > 3. If PredictedTestCurrent is already a factor with the wrong levels, turn it back into a character string vector first. 
> > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Gregory Ryslik > Sent: Thursday, 3 March 2011 1:46 PM > To: r-help Help > Subject: [R] Non-conformable arrays > > Hi Everyone, > > I'm running some simulations where eventually I need to table the results. The problem is, that while most simulations I have at least one predicted outcome for each of the six possible categories, sometimes the algorithm assigns all the outcomes and one category is left out. Thus when I try to add the misclassification matrices I get an error. Is there a simple way I can make sure that all my tables have the same size (with a row or column of zeros) if appropriate without a messy "if" structure checking each condition? > > To be more specific, > > here's my line of code for the table command and the two matrices that I sometimes have. Notice that in the second matrix, the "fad" column is missing. Basically, I want all the columns and rows to be predetermined so that no columns/rows go missing. Thanks for your help! > > Kind regards, > Greg > > table(y.test,PredictedTestCurrent): > > > PredictedTestCurrent > y.test adi car con fad gla mas > adi 9 0 0 0 0 0 > car 0 6 1 0 0 3 > con 1 0 3 0 0 0 > fad 0 0 0 2 5 4 > gla 0 1 0 0 6 3 > mas 0 0 0 1 4 4 > > > PredictedTestCurrent > y.test adi car con gla mas > adi 8 0 0 0 0 > car 0 8 0 0 1 > con 2 0 3 0 0 > fad 0 1 0 4 7 > gla 0 0 0 3 5 > mas 0 2 0 6 3 > > > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
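The approach suggested in this thread can be sketched on a small made-up example. The vectors below are invented for illustration (they are not Greg's data), and the names lv, pred, tab1 and tab2 are assumptions. The point is only that fixing the factor levels keeps every table 6 x 6, with a zero column for any category that is never predicted, so the tables can be added.

```r
# The six categories from the thread; the data vectors are invented.
lv     <- c("adi", "car", "con", "fad", "gla", "mas")
y.test <- factor(c("adi", "car", "con", "fad", "gla", "mas"), levels = lv)

# A prediction vector in which "fad" never occurs (plain character strings).
pred <- c("adi", "car", "con", "gla", "gla", "mas")

# Without fixed levels the "fad" column is dropped ...
tab_short <- table(y.test, pred)

# ... with fixed levels the table always has all six columns.
tab1 <- table(y.test, factor(pred, levels = levels(y.test)))
tab2 <- table(y.test, factor(rev(pred), levels = levels(y.test)))

dim(tab_short)   # fewer columns than rows
dim(tab1)        # square
tab1 + tab2      # conformable, so the tables can be summed
```

If the predictions are already a factor with different levels, converting with as.character() before calling factor(), as suggested in point 3 of the reply, avoids carrying the wrong level set along.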
From ehlers at ucalgary.ca Thu Mar 3 07:12:12 2011 From: ehlers at ucalgary.ca (P Ehlers) Date: Wed, 02 Mar 2011 22:12:12 -0800 Subject: [R] how to delete empty levels from lattice xyplot In-Reply-To: References: Message-ID: <4D6F313C.1040201@ucalgary.ca> Dennis Murphy wrote: > Hi: > > On Wed, Mar 2, 2011 at 1:52 PM, John Smith wrote: > >> Hello All, >> >> I try to use the attached code to produce a cross over plot. There are 13 >> subjects, 7 of them in for/sal group, and 6 of them in sal/for group. But >> in >> xyplot, all the subjects are listed in both subgraphs. Could anyone help me >> figure out how to get rid of the empty levels? >> > > Let's start with the data frame construction. Using the vectors from your > original post, > > studyLong <- data.frame(id = factor(rep(id, 2)), > sequence = rep(sequence, 2), > treat = c(treat1, treat2), > pef = c(pef1, pef2)) > > Notice how I read the data frame from your original vectors. You made the > mistake of mixing character with numeric vectors in forming a matrix - by > doing that, > the matrix type is coerced to character. You then coerced the matrix to a > data frame, > but because you didn't set stringsAsFactors = FALSE, all four of the > variables > in your data frame were factors. Something looked weird to me in your first > graph, so I checked it with str(): >> str(studyLong) > 'data.frame': 26 obs. of 4 variables: > $ id : Factor w/ 13 levels "1","10","11",..: 1 9 11 12 2 3 6 7 8 10 > ... > $ sequence: Factor w/ 2 levels "for/sal","sal/for": 1 1 1 1 1 1 1 2 2 2 ... > $ treat : Factor w/ 2 levels "for","sal": 1 1 1 1 1 1 1 2 2 2 ... > $ pef : Factor w/ 20 levels " 90","210","220",..: 9 9 15 20 4 16 11 15 > 9 16 ... > > After I re-read it per above, it looked like this: >> str(studyLong) > 'data.frame': 26 obs. of 4 variables: > $ id : Factor w/ 13 levels "1","2","3","4",..: 1 4 6 7 10 11 14 2 3 5 > ... > $ sequence: Factor w/ 2 levels "for/sal","sal/for": 1 1 1 1 1 1 1 2 2 2 ... 
> $ treat : Factor w/ 2 levels "for","sal": 1 1 1 1 1 1 1 2 2 2 ... > $ pef : num 310 310 370 410 250 380 330 370 310 380 ... > > If I understand your question, you want the only the levels for each > treatment group plotted in each panel. This is one way, but I'm guessing > it's not quite what you were expecting ( I wasn't :): > > xyplot(pef ~ id | sequence, groups=treat, data=studyLong, > auto.key=list(columns=2), scales = list(x = list(relation = 'free'))) > > I tried a couple of things without much success in lattice (I'm not > terribly good at writing panel functions off the top of my head, I'm > afraid), > but it was pretty easy to get what I think you want from ggplot2: > > library(ggplot2) > g <- ggplot(studyLong, aes(x = id, y = pef, colour = treat)) > g + geom_point() + facet_wrap( ~ sequence, scales = 'free_x') > > # or, using the simpler qplot() function, > qplot(x = id, y = pef, colour = treat, data = studyLong, geom = 'point') + > facet_wrap( ~ sequence, scales = 'free_x') > > If that's what you're after, one of the Lattice mavens can probably show > you how to do it easily in Lattice as well. I'll probably learn something, > too... I'm no lattice expert, but it seems to me that here one simple answer is to set the levels of 'id' appropriately: just replace the line studyLong <- data.frame(id = factor(rep(id, 2)), with studyLong <- data.frame(id = factor(rep(id, 2), levels=id), and run the lattice code. Peter Ehlers > > For more on ggplot2: http://had.co.nz/ggplot2 > Scroll to the bottom to find the on-line help pages with examples. 
> > HTH, > Dennis > > Thanks >> >> >> >> library(lattice) >> >> pef1 <- c(310,310,370,410,250,380,330,370,310,380,290,260,90) >> pef2 <- c(270,260,300,390,210,350,365,385,400,410,320,340,220) >> id <- c("1","4","6","7","10","11","14","2","3","5","9","12","13") >> sequence <- c(rep('for/sal', 7), rep('sal/for', 6)) >> treat1 <- c(rep('for', 7), rep('sal', 6)) >> treat2 <- c(rep('sal', 7), rep('for', 6)) >> study <- data.frame(id, sequence, treat1, pef1, treat2, pef2) >> >> studyLong <- as.data.frame(rbind(as.matrix(study[,c('id', 'sequence', >> 'treat1', 'pef1')]), >> as.matrix(study[,c('id', 'sequence', >> 'treat2', 'pef2')]))) >> colnames(studyLong) <- c('id', 'sequence', 'treat', 'pef') >> >> xyplot(pef ~ id | sequence, groups=treat, data=studyLong, >> auto.key=list(columns=2)) >> From ehlers at ucalgary.ca Thu Mar 3 07:28:42 2011 From: ehlers at ucalgary.ca (P Ehlers) Date: Wed, 02 Mar 2011 22:28:42 -0800 Subject: [R] Creating a weighted sample - Help In-Reply-To: <1299084606398-3331842.post@n4.nabble.com> References: <1299084606398-3331842.post@n4.nabble.com> Message-ID: <4D6F351A.7030903@ucalgary.ca> LouiseS wrote: > Hi > > I'm new to R and most things I want to do I can do but I'm stuck on how to > weight a sample. I have had a look through the post but I can't find > anything that addresses my specific problem. I am wanting to scale up a > sample which has been taken based on a single variable (perf) which has 4 > attributes H,I, J and K. The make up of the sample is shown below:- > > Perf Factored Count (A) Raw Count (B) Factor (A/B) > H 5,945 2,924 2.033174 > I 1,305 2,436 0.535714 > J 2,000 2,092 0.956023 > K 750 1,225 0.612245 > > > I then want to produce all further analysis based on this factored sample. 
> I can produce a weighted sample in SAS using the weight function which I > have shown below > > wt=0; > if perf='H' then wt=2.033174; > if perf='I ' then wt=0.535714; > if perf='J ' then wt=0.956023; > if perf='K ' then wt=0.612245; > > proc freq data=DD.new; > tables resdstat; > weight wt; > run; > > Does anyone know how to reproduce this in R? I don't know what you mean by "all further analysis", but if you want weighted mean, variance, quantile, have a look at ?wtd.mean in the Hmisc package. Just use your A/B values in a weights vector. Peter Ehlers > > Thanks very much > > -- > View this message in context: http://r.789695.n4.nabble.com/Creating-a-weighted-sample-Help-tp3331842p3331842.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From rsaber at comcast.net Thu Mar 3 07:44:21 2011 From: rsaber at comcast.net (Gregory Ryslik) Date: Thu, 3 Mar 2011 01:44:21 -0500 Subject: [R] read a text file with variable number of spaces In-Reply-To: <1BDAE2969943D540934EE8B4EF68F95FB27F0D4A89@EXNSW-MBX03.nexus.csiro.au> References: <1299122362375-3332703.post@n4.nabble.com> <1BDAE2969943D540934EE8B4EF68F95FB27F0D4A89@EXNSW-MBX03.nexus.csiro.au> Message-ID: Hi, I seem to be having somewhat of an unusual data input problem with some of the data sets I'm working with and want to run a simulation on. in the first data set I'm looking at, I have a text file where the spacing between columns varies. I've attached a snippet. Is there a way to read this into R? Basically, I want to ignore all the spaces to make new columns. In a slightly different case, I have a long sequence of nucleotides (the letters are always either g,a,t,c). 
Is there a way to get each letter into it's own column so that I can then use it as a data set? I'm kind of loathe to program a java/C program to do this if I don't have to and was wondering if a way in R exists for this. Thanks! Greg Case1: ACE2_YEAST 0.42 0.37 0.59 0.20 0.50 0.00 0.52 0.29 NUC ACH1_YEAST 0.40 0.42 0.57 0.35 0.50 0.00 0.53 0.25 CYT ACON_YEAST 0.60 0.40 0.52 0.46 0.50 0.00 0.53 0.22 MIT ACR1_YEAST 0.66 0.55 0.45 0.19 0.50 0.00 0.46 0.22 MIT ACT_YEAST 0.46 0.44 0.52 0.11 0.50 0.00 0.50 0.22 CYT ACT2_YEAST 0.47 0.39 0.50 0.11 0.50 0.00 0.49 0.40 CYT ACT3_YEAST 0.58 0.47 0.54 0.11 0.50 0.00 0.51 0.26 NUC ACT5_YEAST 0.50 0.34 0.55 0.21 0.50 0.00 0.49 0.22 NUC Case2: gtacagtacgtacgtacgatcgatctagcatgcatgcatgcatgcta From Gerrit.Eichner at math.uni-giessen.de Thu Mar 3 08:57:45 2011 From: Gerrit.Eichner at math.uni-giessen.de (Gerrit Eichner) Date: Thu, 3 Mar 2011 08:57:45 +0100 (MET) Subject: [R] read a text file with variable number of spaces In-Reply-To: References: <1299122362375-3332703.post@n4.nabble.com> <1BDAE2969943D540934EE8B4EF68F95FB27F0D4A89@EXNSW-MBX03.nexus.csiro.au> Message-ID: Hello, Gregory, for your first data set see ?read.table and for you second ?read.fwf may help solving your problem. Hth -- Gerrit On Thu, 3 Mar 2011, Gregory Ryslik wrote: > Hi, > > > I seem to be having somewhat of an unusual data input problem with some > of the data sets I'm working with and want to run a simulation on. > > in the first data set I'm looking at, I have a text file where the > spacing between columns varies. I've attached a snippet. Is there a way > to read this into R? Basically, I want to ignore all the spaces to make > new columns. In a slightly different case, I have a long sequence of > nucleotides (the letters are always either g,a,t,c). Is there a way to > get each letter into it's own column so that I can then use it as a data > set? 
> > I'm kind of loathe to program a java/C program to do this if I don't > have to and was wondering if a way in R exists for this. > > Thanks! > Greg > > Case1: > ACE2_YEAST 0.42 0.37 0.59 0.20 0.50 0.00 0.52 0.29 NUC > ACH1_YEAST 0.40 0.42 0.57 0.35 0.50 0.00 0.53 0.25 CYT > ACON_YEAST 0.60 0.40 0.52 0.46 0.50 0.00 0.53 0.22 MIT > ACR1_YEAST 0.66 0.55 0.45 0.19 0.50 0.00 0.46 0.22 MIT > ACT_YEAST 0.46 0.44 0.52 0.11 0.50 0.00 0.50 0.22 CYT > ACT2_YEAST 0.47 0.39 0.50 0.11 0.50 0.00 0.49 0.40 CYT > ACT3_YEAST 0.58 0.47 0.54 0.11 0.50 0.00 0.51 0.26 NUC > ACT5_YEAST 0.50 0.34 0.55 0.21 0.50 0.00 0.49 0.22 NUC > > Case2: > gtacagtacgtacgtacgatcgatctagcatgcatgcatgcatgcta > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From djnordlund at frontier.com Thu Mar 3 09:11:23 2011 From: djnordlund at frontier.com (Daniel Nordlund) Date: Thu, 3 Mar 2011 00:11:23 -0800 Subject: [R] Creating a weighted sample - Help In-Reply-To: <4D6F351A.7030903@ucalgary.ca> References: <1299084606398-3331842.post@n4.nabble.com> <4D6F351A.7030903@ucalgary.ca> Message-ID: > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] > On Behalf Of P Ehlers > Sent: Wednesday, March 02, 2011 10:29 PM > To: LouiseS > Cc: r-help at r-project.org > Subject: Re: [R] Creating a weighted sample - Help > > LouiseS wrote: > > Hi > > > > I'm new to R and most things I want to do I can do but I'm stuck on how > to > > weight a sample. I have had a look through the post but I can't find > > anything that addresses my specific problem. I am wanting to scale up a > > sample which has been taken based on a single variable (perf) which has > 4 > > attributes H,I, J and K. 
The make up of the sample is shown below:- > > > > Perf Factored Count (A) Raw Count (B) Factor (A/B) > > H 5,945 2,924 > 2.033174 > > I 1,305 2,436 > 0.535714 > > J 2,000 2,092 > 0.956023 > > K 750 1,225 > 0.612245 > > > > > > I then want to produce all further analysis based on this factored > sample. > > I can produce a weighted sample in SAS using the weight function which I > > have shown below > > > > wt=0; > > if perf='H' then wt=2.033174; > > if perf='I ' then wt=0.535714; > > if perf='J ' then wt=0.956023; > > if perf='K ' then wt=0.612245; > > > > proc freq data=DD.new; > > tables resdstat; > > weight wt; > > run; > > > > Does anyone know how to reproduce this in R? > > I don't know what you mean by "all further analysis", > but if you want weighted mean, variance, quantile, have > a look at ?wtd.mean in the Hmisc package. Just use your > A/B values in a weights vector. > > Peter Ehlers > You haven't told us how you obtained these data that you want to weight, but if you used some kind of non-SRS sampling plan (e.g. stratified, or cluster sample) then you should look at the survey package. Dan Daniel Nordlund Bothell, WA USA From aquanyc at gmail.com Thu Mar 3 05:04:08 2011 From: aquanyc at gmail.com (rivercode) Date: Wed, 2 Mar 2011 20:04:08 -0800 (PST) Subject: [R] as.POSIXct show milliseconds with format Message-ID: <1299125048984-3332733.post@n4.nabble.com> Hi, Trying to create a POSIXct index for an xts object that will display the POSIXct index as HH:MM:SS.MMM. First of all, I am trying to get the as.POSIXct to work with format... > as.POSIXct(paste("2011-03-02 09:00:00.000", sep=""), tz="EST", > format="%H:%M:%OS3") [1] NA Why is this returning NA ? I can get Hours and Minutes...but only with the format as %H %M. 
> as.POSIXct(paste("2011-03-02 09:00:00.000", sep=""), tz="EST", format="%H > %M") [1] "2011-03-02 20:11:00 EST" BUT if I do it with format="%H:%M" I also get an NA: > as.POSIXct(paste("2011-03-02 09:00:00.000", sep=""), tz="EST", > format="%H:%M") [1] NA What am I not understanding ? Is it possible to create a POSIXct index for xts (or zoo) that will display (eg. with head(my_xts_object) ) the index in format HH:MM:SS.MMM so I can see the milliseconds. Thanks, Chris -- View this message in context: http://r.789695.n4.nabble.com/as-POSIXct-show-milliseconds-with-format-tp3332733p3332733.html Sent from the R help mailing list archive at Nabble.com. From scttchamberlain4 at gmail.com Thu Mar 3 05:17:50 2011 From: scttchamberlain4 at gmail.com (Scott Chamberlain) Date: Wed, 2 Mar 2011 22:17:50 -0600 Subject: [R] What am I doing wrong with this loop ? In-Reply-To: <1299122362375-3332703.post@n4.nabble.com> References: <1299122362375-3332703.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From raji.sankaran at gmail.com Thu Mar 3 06:14:56 2011 From: raji.sankaran at gmail.com (Raji) Date: Wed, 2 Mar 2011 21:14:56 -0800 (PST) Subject: [R] Import/convert PMML to R model Message-ID: <1299129296949-3332772.post@n4.nabble.com> Hi R-helpers, I have saved my kmeansModel as a pmml using the PMML package in R using the following commands. library(pmml) pmml(kmeansModel) I have to import this saved pmml model and save it in a kmeansModel again. Can you please let me know the R commands to do that? Thanks, Raji -- View this message in context: http://r.789695.n4.nabble.com/Import-convert-PMML-to-R-model-tp3332772p3332772.html Sent from the R help mailing list archive at Nabble.com. 
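On the as.POSIXct question earlier in this digest, a hedged sketch of what seems to be going on: the format string has to describe the whole input, including the date, and "%OS3" is an output specification, while "%OS" is what reads fractional seconds on input. The "%H %M" result in the post appears to come from "20" and "11" of "2011" being read as hour and minute. The time zone "UTC", the .500 fraction, and the names s, bad, good and idx are assumptions chosen for this example.

```r
s <- "2011-03-02 09:00:00.500"

# The format does not match the string (which starts with a date), so this is NA.
bad <- as.POSIXct(s, tz = "UTC", format = "%H:%M:%OS3")

# Describe the full string; "%OS" reads seconds including the fraction.
good <- as.POSIXct(s, tz = "UTC", format = "%Y-%m-%d %H:%M:%OS")

# "%OS3" is for output: seconds shown with three decimal places.
format(good, "%H:%M:%OS3")

# A 100 ms index can then be built by stepping in seconds.
idx <- seq(good, by = 0.1, length.out = 5)
format(idx, "%H:%M:%OS3")

# Default printing can also show fractions via the digits.secs option.
op <- options(digits.secs = 3)
print(good)
options(op)
```

Fractional seconds are stored as floating point and "%OSn" truncates rather than rounds, so some values can display one millisecond low; a fraction such as .500 avoids that in this sketch.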
From p_connolly at slingshot.co.nz Thu Mar 3 09:50:43 2011 From: p_connolly at slingshot.co.nz (Patrick Connolly) Date: Thu, 3 Mar 2011 21:50:43 +1300 Subject: [R] Reproducibility issue in gbm (32 vs 64 bit) In-Reply-To: <7F867093C54A3448B1D4088254FBB855041C752E@smmail12.rand.org> References: <7F867093C54A3448B1D4088254FBB855041C752E@smmail12.rand.org> Message-ID: <20110303085043.GD5549@slingshot.co.nz> On Sat, 26-Feb-2011 at 08:46AM -0800, Ridgeway, Greg wrote: |> I have heard about this before happening on other |> platforms. Frankly I'm not positive how this happens. My best guess |> is that there's a tiny bit of numeric instability in the 9+ decimal |> place so that on a given iteration a one variable choice at random |> looks better than the other. Any other ideas? Greg I played around with this some time ago and noticed that it happens only when there's perfect or very nearly perfect correlation. I even tried a third variable and it was ignored almost completely. I concluded it's highly unlikely to cause a problem since real data wouldn't have perfectly correlated variables -- or if they did, they'd be easy enough to detect. |> |> ----- Original Message ----- |> From: Joshua Wiley |> To: Axel Urbiz |> Cc: R-help at r-project.org ; Ridgeway, Greg |> Sent: Fri Feb 25 22:16:02 2011 |> Subject: Re: [R] Reproducibility issue in gbm (32 vs 64 bit) |> |> Hi Axel, |> |> I do not have a nice explanation why the results differ off the top of |> my head. I can say I can replicate what you get on 32/64 (both |> Windows 7) bit with the development version of R and gbm_1.6-3.1. |> |> Here is an even simpler example that shows the difference: |> |> gbmfit <- gbm(1:50 ~ I(50:1) + I(60:11), distribution = "gaussian") |> summary(gbmfit) |> |> I copied that package maintainer. 
|>
|> Cheers,
|>
|> Josh
|>
|> On Fri, Feb 25, 2011 at 7:29 PM, Axel Urbiz wrote:
|> > Dear List,
|> >
|> > The gbm package on Win 7 produces different results for the
|> > relative importance of input variables in R 32-bit relative to R 64-bit. Any
|> > idea why? Any idea which one is correct?
|> >
|> > Based on this example, it looks like the relative importance of 2 perfectly
|> > correlated predictors is "diluted" by half in 32-bit, whereas in 64-bit, one
|> > of these predictors gets all the importance and the other gets none. I found
|> > this interesting.
|> >
|> > ### Sample code
|> >
|> > library(gbm)
|> > set.seed(12345)
|> > xc=matrix(rnorm(100*20),100,20)
|> > y=sample(1:2,100,replace=TRUE)
|> > xc[,2] <- xc[,1]
|> > gbmfit <- gbm(y~xc[,1]+xc[,2] +xc[,3], distribution="gaussian")
|> > summary(gbmfit)
|> >
|> > ### Results on R 2.12.0 (32-bit)
|> >
|> >       var  rel.inf
|> > 1 xc[, 3] 49.76143
|> > 2 xc[, 1] 27.27432
|> > 3 xc[, 2] 22.96425
|> >>
|> > ### Results on R 2.12.0 (64-bit)
|> >> summary(gbmfit)
|> >       var  rel.inf
|> > 1 xc[, 1] 50.23857
|> > 2 xc[, 3] 49.76143
|> > 3 xc[, 2]  0.00000
|> >
|> > Thanks,
|> > Axel.
|> >
|> >        [[alternative HTML version deleted]]
|> >
|> > ______________________________________________
|> > R-help at r-project.org mailing list
|> > https://stat.ethz.ch/mailman/listinfo/r-help
|> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
|> > and provide commented, minimal, self-contained, reproducible code.
|> >
|>
|>
|>
|> --
|> Joshua Wiley
|> Ph.D. Student, Health Psychology
|> University of California, Los Angeles
|> http://www.joshuawiley.com/
|>
|> __________________________________________________________________________
|>
|> This email message is for the sole use of the intended recipient(s) and
|> may contain confidential information. Any unauthorized review, use,
|> disclosure or distribution is prohibited.
If you are not the intended |> recipient, please contact the sender by reply email and destroy all copies |> of the original message. |> ______________________________________________ |> R-help at r-project.org mailing list |> https://stat.ethz.ch/mailman/listinfo/r-help |> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html |> and provide commented, minimal, self-contained, reproducible code. -- ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~. ___ Patrick Connolly {~._.~} Great minds discuss ideas _( Y )_ Average minds discuss events (:_~*~_:) Small minds discuss people (_)-(_) ..... Eleanor Roosevelt ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~. From jim at bitwrit.com.au Thu Mar 3 10:06:22 2011 From: jim at bitwrit.com.au (Jim Lemon) Date: Thu, 03 Mar 2011 20:06:22 +1100 Subject: [R] Contingency table in R In-Reply-To: References: Message-ID: <4D6F5A0E.7080702@bitwrit.com.au> On 03/03/2011 01:13 AM, Laura Clasemann wrote: > > Hi, > > I have a table in R with data I needed and need to create a contingency table out of it. The table I have so far looks like this: > > > Binger > r > DietType No Yes > Dangerous 15 12 > Healthy 52 9 > None 134 24 > Unhealthy 72 23 > > These are the error messages that I keep getting whenever I try to get a contingency table. I'm not sure why it won't work for me, any help would be appreciated! >> nametable<-table(excat,recat) > Error in table(excat, recat) : object 'excat' not found > Hi Laura, The above looks like a contingency table, but I suspect that it is in a format that is not recognized as such by R. If I read in the above, less the top two lines ("Binger" and "r"), I get a data frame. 
lc.df<-read.table("lc.dat",header=TRUE) lc.df DietType No Yes 1 Dangerous 15 12 2 Healthy 52 9 3 None 134 24 4 Unhealthy 72 23 If I then try to run a chi-square test on the numeric columns of the data frame, chisq.test(lc.df[,2:3]) Pearson's Chi-squared test data: lc.df[, 2:3] X-squared = 14.5011, df = 3, p-value = 0.002297 I get the expected result. If the table in your message is something like a table in a word processing document or a text file, R doesn't know what it is. If it is indeed an R object (which I doubt) it probably isn't named "excat" (or "recat" for that matter). That may be what is causing the error message. A final point is that you probably want your table arranged in order of the presumed healthiness of the diet, i.e. Healthy None Unhealthy Dangerous because I think you are trying to discover whether bingers are more likely to report less healthy diets. Jim From singhalblr at gmail.com Thu Mar 3 11:53:03 2011 From: singhalblr at gmail.com (Harsh) Date: Thu, 3 Mar 2011 16:23:03 +0530 Subject: [R] R usage survey Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From ggrothendieck at gmail.com Thu Mar 3 14:30:35 2011 From: ggrothendieck at gmail.com (Gabor Grothendieck) Date: Thu, 3 Mar 2011 08:30:35 -0500 Subject: [R] as.POSIXct show milliseconds with format In-Reply-To: <1299125048984-3332733.post@n4.nabble.com> References: <1299125048984-3332733.post@n4.nabble.com> Message-ID: On Wed, Mar 2, 2011 at 11:04 PM, rivercode wrote: > Hi, > > Trying to create a POSIXct index for an xts object that will display the > POSIXct index as HH:MM:SS.MMM. > > First of all, I am trying to get the as.POSIXct to work with format... > >> as.POSIXct(paste("2011-03-02 09:00:00.000", sep=""), tz="EST", >> format="%H:%M:%OS3") > [1] NA > > Why is this returning NA ? > > I can get Hours and Minutes...but only with the format as %H %M. 
>
>> as.POSIXct(paste("2011-03-02 09:00:00.000", sep=""), tz="EST", format="%H
>> %M")
> [1] "2011-03-02 20:11:00 EST"
>
> BUT if I do it with format="%H:%M" I also get an NA:
>> as.POSIXct(paste("2011-03-02 09:00:00.000", sep=""), tz="EST",
>> format="%H:%M")
> [1] NA
>
> What am I not understanding ?
>
> Is it possible to create a POSIXct index for xts (or zoo) that will display
> (eg. with head(my_xts_object) ) the index in format HH:MM:SS.MMM so I can
> see the milliseconds.
>

options(digits.secs = 3) will cause 3 digits to be displayed.

Another thing you could do with zoo is that you could define your own class that displays any way you like. Here we have defined just enough methods of a "mytime" class. This takes advantage of the fact that zoo does not work with hard coded index classes but rather any index class with sufficient methods will work with zoo:

library(zoo)

as.mytime <- function(x, ...) UseMethod("as.mytime")
as.mytime.character <- function(x, ...) {
  x <- as.POSIXct(paste("1970-01-01", x), ...)
  structure(x, class = c("mytime", class(x)))
}
as.mytime.POSIXt <- function(x, ...) {
  structure(x, class = c("mytime", setdiff(class(x), "mytime")))
}
as.character.mytime <- format.mytime <- function(x, format = "%H:%M:%OS3", ...) {
  format.POSIXct(x, format = format, ...)
}
print.mytime <- function(x, ...) {
  print(format(x), ...)
}
Ops.mytime <- function (e1, e2) {
  as.mytime(NextMethod(.Generic))
}

z <- zooreg(1:15, start = as.mytime("13:14:15.100"), frequency = 10)
head(z)

The last line produces:

> head(z)
13:14:15.100 13:14:15.200 13:14:15.300 13:14:15.400 13:14:15.500 13:14:15.600
           1            2            3            4            5            6

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com From i.petzev at gmail.com Thu Mar 3 10:07:02 2011 From: i.petzev at gmail.com (hazzard) Date: Thu, 3 Mar 2011 01:07:02 -0800 (PST) Subject: [R] Multivariate Granger Causality Tests Message-ID: <1299143222871-3332968.post@n4.nabble.com> Dear Community, For my masters thesis I need to perform a multivariate granger causality test. I have found a code for bivariate testing on this page (http://www.econ.uiuc.edu/~econ472/granger.R.txt), which I think would not be useful for the multivariate case. Does anybody know a code for a multivariate granger causality test. Thank you in advance. Best Regards -- View this message in context: http://r.789695.n4.nabble.com/Multivariate-Granger-Causality-Tests-tp3332968p3332968.html Sent from the R help mailing list archive at Nabble.com. From antujsrv at gmail.com Thu Mar 3 10:22:44 2011 From: antujsrv at gmail.com (antujsrv) Date: Thu, 3 Mar 2011 01:22:44 -0800 (PST) Subject: [R] Developing a web crawler Message-ID: <1299144164900-3332993.post@n4.nabble.com> Hi, I wish to develop a web crawler in R. I have been using the functionalities available under the RCurl package. I am able to extract the html content of the site but i don't know how to go about analyzing the html formatted document. I wish to know the frequency of a word in the document. I am only acquainted with analyzing data sets. So how should i go about analyzing data that is not available in table format. Few chunks of code that i wrote: w <- getURL("http://www.amazon.com/Kindle-Wireless-Reader-Wifi-Graphite/dp/B003DZ1Y8Q/ref=dp_reviewsanchor#FullQuotes") write.table(w,"test.txt") t <- readLines(w) readLines also didnt prove out to be of any help. Any help would be highly appreciated. Thanks in advance. -- View this message in context: http://r.789695.n4.nabble.com/Developing-a-web-crawler-tp3332993p3332993.html Sent from the R help mailing list archive at Nabble.com. 
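On the word-frequency part of the web crawler question above, a rough base-R sketch follows. It assumes the page is already held in a single character string (which is what RCurl's getURL returned in the poster's code) and uses an inline HTML snippet instead of a live download. The helper name word_freq and the sample text are made up for illustration. Stripping tags with regular expressions is only an approximation; a real HTML parser would be more robust on messy pages.

```r
# Count word frequencies in an HTML string using base R only.
word_freq <- function(html) {
  # Drop script and style blocks first, then every remaining tag.
  text <- gsub("(?is)<(script|style)[^>]*>.*?</\\1>", " ", html, perl = TRUE)
  text <- gsub("<[^>]+>", " ", text)
  # Split on anything that is not a letter, lower-case, and drop empty pieces.
  words <- tolower(unlist(strsplit(text, "[^[:alpha:]]+")))
  words <- words[nzchar(words)]
  sort(table(words), decreasing = TRUE)
}

# Hypothetical sample input standing in for the downloaded page.
html <- paste0("<html><body><h1>Kindle review</h1>",
               "<p>The Kindle is light. The screen is sharp.</p></body></html>")
freq <- word_freq(html)
freq[["the"]]
```

The result is an ordinary table object, so head(freq, 10) would list the ten most frequent words.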
From stp08emj at shef.ac.uk Thu Mar 3 10:42:13 2011 From: stp08emj at shef.ac.uk (emj83) Date: Thu, 3 Mar 2011 01:42:13 -0800 (PST) Subject: [R] more boa plots questions In-Reply-To: <1299003345958-3330312.post@n4.nabble.com> References: <1299003345958-3330312.post@n4.nabble.com> Message-ID: <1299145333455-3333016.post@n4.nabble.com> Can anyone help? Thanks in advance Emma -- View this message in context: http://r.789695.n4.nabble.com/more-boa-plots-questions-tp3330312p3333016.html Sent from the R help mailing list archive at Nabble.com. From akshata.rao1908 at gmail.com Thu Mar 3 11:06:37 2011 From: akshata.rao1908 at gmail.com (Akshata Rao) Date: Thu, 3 Mar 2011 15:36:37 +0530 Subject: [R] Applying function to multiple data Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From pmassicotte at hotmail.com Thu Mar 3 14:15:59 2011 From: pmassicotte at hotmail.com (Filoche) Date: Thu, 3 Mar 2011 05:15:59 -0800 (PST) Subject: [R] Greek character and R Message-ID: <1299158159334-3333304.post@n4.nabble.com> Dear R users. In a loop, I set the title of my graph with : mytitle = expression(paste(delta^13,'C Station ', i) title(mytitle) However, instead of using value of i, it will literally use "i" character. Any one know the way to concatenate the value of i to the mathematical expression? With regards, Phil -- View this message in context: http://r.789695.n4.nabble.com/Greek-character-and-R-tp3333304p3333304.html Sent from the R help mailing list archive at Nabble.com. From louise at sinfield.uk.net Thu Mar 3 14:21:16 2011 From: louise at sinfield.uk.net (LouiseS) Date: Thu, 3 Mar 2011 05:21:16 -0800 (PST) Subject: [R] Creating a weighted sample - Help In-Reply-To: References: <1299084606398-3331842.post@n4.nabble.com> <4D6F351A.7030903@ucalgary.ca> Message-ID: <1299158476421-3333311.post@n4.nabble.com> Hi Thanks for responses. The sample I have taken is a random sample from H, I, J and K. 
The further analysis I want to do is all around bad debt rates so it could be (H/H+I)*100 = Bad rate percentage also population stability calculations that are all related to credit scoring. I want to be able to report back on any variable that I have in my data set based on my factored counts (A) of 10,000 - so every calculation is based on 10,000 account in the correct proportions. Does his help? Thanks once again Louise -- View this message in context: http://r.789695.n4.nabble.com/Creating-a-weighted-sample-Help-tp3331842p3333311.html Sent from the R help mailing list archive at Nabble.com. From rguldemond at zoology.up.ac.za Thu Mar 3 14:44:43 2011 From: rguldemond at zoology.up.ac.za (Robert Guldemond) Date: Thu, 3 Mar 2011 15:44:43 +0200 Subject: [R] vector("integer", length) : vector size specified is too large Message-ID: <006001cbd9a9$2706d250$751476f0$@up.ac.za> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From rex.dwyer at syngenta.com Thu Mar 3 14:58:07 2011 From: rex.dwyer at syngenta.com (rex.dwyer at syngenta.com) Date: Thu, 3 Mar 2011 08:58:07 -0500 Subject: [R] Developing a web crawler In-Reply-To: <1299144164900-3332993.post@n4.nabble.com> References: <1299144164900-3332993.post@n4.nabble.com> Message-ID: <36180405F8418449918AD20618D110FC095BFA64B7@USETCMSXMB02.NAFTA.SYNGENTA.ORG> Perl seems like a 10x better choice for the task, but try looking at the examples in ?strsplit to get started. -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of antujsrv Sent: Thursday, March 03, 2011 4:23 AM To: r-help at r-project.org Subject: [R] Developing a web crawler Hi, I wish to develop a web crawler in R. I have been using the functionalities available under the RCurl package. I am able to extract the html content of the site but i don't know how to go about analyzing the html formatted document. I wish to know the frequency of a word in the document. 
I am only acquainted with analyzing data sets. So how should i go about analyzing data that is not available in table format. Few chunks of code that i wrote: w <- getURL("http://www.amazon.com/Kindle-Wireless-Reader-Wifi-Graphite/dp/B003DZ1Y8Q/ref=dp_reviewsanchor#FullQuotes") write.table(w,"test.txt") t <- readLines(w) readLines also didnt prove out to be of any help. Any help would be highly appreciated. Thanks in advance. -- View this message in context: http://r.789695.n4.nabble.com/Developing-a-web-crawler-tp3332993p3332993.html Sent from the R help mailing list archive at Nabble.com. ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. message may contain confidential information. If you are not the designated recipient, please notify the sender immediately, and delete the original and any copies. Any use of the message by you is prohibited. From shawjw at gmail.com Thu Mar 3 15:08:38 2011 From: shawjw at gmail.com (James Shaw) Date: Thu, 3 Mar 2011 08:08:38 -0600 Subject: [R] m out of n bootstrap Message-ID: Can anyone confirm the formula for the m out of n bootstrap variance estimator? rq.boot applies a deflation factor directly to the bootstrap estimates. Presumably, the SE of the estimate of interest is then taken to be the SD of the deflated estimates. I have read Bickel's and others' papers on this subject but have not seen an explicit formula provided for the m out n variance estimator. -- James W. Shaw, Ph.D., Pharm.D., M.P.H. 
Assistant Professor Department of Pharmacy Administration College of Pharmacy University of Illinois at Chicago 833 South Wood Street, M/C 871, Room 266 Chicago, IL 60612 Tel.: 312-355-5666 Fax: 312-996-0868 Mobile Tel.: 215-852-3045 From deliverable at gmail.com Thu Mar 3 15:10:21 2011 From: deliverable at gmail.com (Alexy Khrabrov) Date: Thu, 3 Mar 2011 09:10:21 -0500 Subject: [R] Developing a web crawler In-Reply-To: <1299144164900-3332993.post@n4.nabble.com> References: <1299144164900-3332993.post@n4.nabble.com> Message-ID: <7A3F806C-3093-45EF-B247-D12C5C1A36CC@gmail.com> On Mar 3, 2011, at 4:22 AM, antujsrv wrote: > > I wish to develop a web crawler in R. As Rex said, there are faster languages, but R string processing got better due to the stringr package (R Journal 2010-2). When Hadley is done with it, it will be like having it all in R! -- Alexy From rex.dwyer at syngenta.com Thu Mar 3 15:32:50 2011 From: rex.dwyer at syngenta.com (rex.dwyer at syngenta.com) Date: Thu, 3 Mar 2011 09:32:50 -0500 Subject: [R] Greek character and R In-Reply-To: <1299158159334-3333304.post@n4.nabble.com> References: <1299158159334-3333304.post@n4.nabble.com> Message-ID: <36180405F8418449918AD20618D110FC095BFA6544@USETCMSXMB02.NAFTA.SYNGENTA.ORG> mytitle = parse(text=paste("expression(paste(delta^13,'C Station ',",i,"))")) title(mytitle) -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Filoche Sent: Thursday, March 03, 2011 8:16 AM To: r-help at r-project.org Subject: [R] Greek character and R Dear R users. In a loop, I set the title of my graph with : mytitle = expression(paste(delta^13,'C Station ', i) title(mytitle) However, instead of using value of i, it will literally use "i" character. Any one know the way to concatenate the value of i to the mathematical expression? 
With regards, Phil -- View this message in context: http://r.789695.n4.nabble.com/Greek-character-and-R-tp3333304p3333304.html Sent from the R help mailing list archive at Nabble.com. ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. message may contain confidential information. If you are not the designated recipient, please notify the sender immediately, and delete the original and any copies. Any use of the message by you is prohibited. From rex.dwyer at syngenta.com Thu Mar 3 15:32:50 2011 From: rex.dwyer at syngenta.com (rex.dwyer at syngenta.com) Date: Thu, 3 Mar 2011 09:32:50 -0500 Subject: [R] R usage survey In-Reply-To: References: Message-ID: <36180405F8418449918AD20618D110FC095BFA6545@USETCMSXMB02.NAFTA.SYNGENTA.ORG> Harsh, "Suitably analyzed" for whose purposes? One man's "suitable" is another's "outrageous". That's why people want to see the gowns at the Oscars. Under what auspices are you conducting this survey? What do you intend to do with it? You don't give any assurance that the results you post won't have personally identifiable information. I don't get the impression that you know much about survey design. -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Harsh Sent: Thursday, March 03, 2011 5:53 AM To: r-help at r-project.org Subject: [R] R usage survey Hi R users, I request members of the R community to consider filling a short survey regarding the use of R. The survey can be found at http://goo.gl/jw1ig Please accept my apologies for posting here for a non-technical reason. The data collected will be suitably analyzed and I'll post a link to the results in the coming weeks. Thank you all for your interest and for sharing your R usage information. 
Regards,
Harsh Singhal

[[alternative HTML version deleted]]

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

message may contain confidential information. If you are not the designated recipient, please notify the sender immediately, and delete the original and any copies. Any use of the message by you is prohibited.

From Bernhard_Pfaff at fra.invesco.com Thu Mar 3 15:48:11 2011
From: Bernhard_Pfaff at fra.invesco.com (Pfaff, Bernhard Dr.)
Date: Thu, 3 Mar 2011 14:48:11 -0000
Subject: [R] Multivariate Granger Causality Tests
In-Reply-To: <1299143222871-3332968.post@n4.nabble.com>
References: <1299143222871-3332968.post@n4.nabble.com>
Message-ID:

Dear Hazzard I. Petzev,

you might find causality() in the package vars useful.

Best,
Bernhard

> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of hazzard
> Sent: Thursday, 3 March 2011 10:07
> To: r-help at r-project.org
> Subject: [R] Multivariate Granger Causality Tests
>
> Dear Community,
>
> For my masters thesis I need to perform a multivariate
> granger causality test. I have found a code for bivariate
> testing on this page
> (http://www.econ.uiuc.edu/~econ472/granger.R.txt), which I
> think would not be useful for the multivariate case. Does
> anybody know a code for a multivariate granger causality
> test. Thank you in advance.
>
> Best Regards
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Multivariate-Granger-Causality-Tests-tp3332968p3332968.html
> Sent from the R help mailing list archive at Nabble.com.
> > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > ***************************************************************** Confidentiality Note: The information contained in this ...{{dropped:10}} From lebatsnok at gmail.com Thu Mar 3 15:55:52 2011 From: lebatsnok at gmail.com (Kenn Konstabel) Date: Thu, 3 Mar 2011 16:55:52 +0200 Subject: [R] sqlFetch (RODBC) question Message-ID: Dear all, I've used RODBC a lot to read in files created in MS excel and access but found a strange problem today: a variable in my data file contained both numbers and text; sqlFetch would set text within a row of numbers to NA; but if first 5 or 6 rows would be text then all numbers would be read in as NA. con<-odbcConnectExcel("xample.xls") #the file is attached or at http://psych.ut.ee/~nek/ajutine/xample.xls sqlFetch(con, "TT$") # ID_NO Setting_ID #1 NA NA #2 1220000 12203 # 3 1220001 12203 #etc Whereas the same file saved as csv reads in correctly as: read.csv("xample.csv") # ID_NO Setting_ID #1 b a #2 1220000 12203 #3 1220001 12203 #4 1220002 12202 #5 1220003 12202 #etc Can anyone explain why it would behave like this? 
#just in case: > sessionInfo() R version 2.12.1 (2010-12-16) Platform: i386-pc-mingw32/i386 (32-bit) locale: [1] LC_COLLATE=English_United Kingdom.1252 [2] LC_CTYPE=English_United Kingdom.1252 [3] LC_MONETARY=English_United Kingdom.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United Kingdom.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] RODBC_1.3-2 loaded via a namespace (and not attached): [1] iterators_1.0.3 tools_2.12.1 Thanks in advance, Kenn Kenn Konstabel Department of Chronic Diseases National Institute for Health Development Hiiu 42 Tallinn, Estonia From izahn at psych.rochester.edu Thu Mar 3 16:02:42 2011 From: izahn at psych.rochester.edu (Ista Zahn) Date: Thu, 3 Mar 2011 15:02:42 +0000 Subject: [R] sqlFetch (RODBC) question In-Reply-To: References: Message-ID: Hi Kenn, This is discussed in the package vignette, section 7. Best, Ista On Thu, Mar 3, 2011 at 2:55 PM, Kenn Konstabel wrote: > Dear all, > > I've used RODBC a lot to read in files created in MS excel and access but > found a strange problem today: a variable in my data file contained both > numbers and text; sqlFetch would set text within a row of numbers to NA; but > if first 5 or 6 rows would be text then all numbers would be read in as NA. > > con<-odbcConnectExcel("xample.xls") ? #the file is attached or at > http://psych.ut.ee/~nek/ajutine/xample.xls > sqlFetch(con, "TT$") > # ? ? ID_NO Setting_ID > #1 ? ? ? NA ? ? ? ? NA > #2 ?1220000 ? ? ?12203 > # 3 ?1220001 ? ? ?12203 > #etc > > Whereas the same file saved as csv reads in correctly as: > > read.csv("xample.csv") > # ? ? ID_NO Setting_ID > #1 ? ? ? ?b ? ? ? ? ?a > #2 ?1220000 ? ? ?12203 > #3 ?1220001 ? ? ?12203 > #4 ?1220002 ? ? ?12202 > #5 ?1220003 ? ? ?12202 > #etc > > Can anyone explain why it would behave like this? 
> > #just in case: >> sessionInfo() > R version 2.12.1 (2010-12-16) > Platform: i386-pc-mingw32/i386 (32-bit) > > locale: > [1] LC_COLLATE=English_United Kingdom.1252 > [2] LC_CTYPE=English_United Kingdom.1252 > [3] LC_MONETARY=English_United Kingdom.1252 > [4] LC_NUMERIC=C > [5] LC_TIME=English_United Kingdom.1252 > > attached base packages: > [1] stats ? ? graphics ?grDevices utils ? ? datasets ?methods ? base > > other attached packages: > [1] RODBC_1.3-2 > > loaded via a namespace (and not attached): > [1] iterators_1.0.3 tools_2.12.1 > > > > Thanks in advance, > Kenn > > Kenn Konstabel > Department of Chronic Diseases > National Institute for Health Development > Hiiu 42 > Tallinn, Estonia > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > -- Ista Zahn Graduate student University of Rochester Department of Clinical and Social Psychology http://yourpsyche.org From djbirdnerd at hotmail.com Thu Mar 3 15:00:00 2011 From: djbirdnerd at hotmail.com (djbirdnerd) Date: Thu, 3 Mar 2011 06:00:00 -0800 (PST) Subject: [R] Ordering several histograms Message-ID: <1299160800582-3333382.post@n4.nabble.com> Hallo everyone, I want to evaluate the change of the distribution for several size classes. How can i order these separate histograms with the same y-axis along a common x-axis according to their size classes. It would like it to look a bit like this (http://addictedtor.free.fr/graphiques/RGraphGallery.php?graph=109) without the quantile regression. I can produce the separate histograms, but have no clue how to merge them. i can put them next to each with separate x- and y-axes. 
Much obliged,
Kenneth

--
View this message in context: http://r.789695.n4.nabble.com/Ordering-several-histograms-tp3333382p3333382.html
Sent from the R help mailing list archive at Nabble.com.

From fologodubois at yahoo.com Thu Mar 3 13:07:41 2011
From: fologodubois at yahoo.com (Fologo Dubois)
Date: Thu, 3 Mar 2011 04:07:41 -0800 (PST)
Subject: [R] Fractional degree of differencing, d
Message-ID: <539348.26850.qm@web120205.mail.ne1.yahoo.com>

Formula: Memory Index called delta in Parzen(1983); see pdf attachment p.536

Code:

##########################################################################
# I am using a simulated long memory time series X1 of length 2000;
# I actually used d=.25 for ARFIMA(0,.25,0)
# and I am trying to estimate d through the memory index discussed in
# Parzen(1983) on p.536. I am in need of an assessment of my code for
# the Parzen window as well as the choice of k and n. In my code I used
# k to be 999 and n to be 2000. I am not comfortable with the memory
# index estimator and I will appreciate some help on the code.
# Thank you!
##########################################################################

Pt <- acf(X1,2000)
n <- length(X1)
vv <- 1:(n-1)
T <- 2000
MT <- T/2
MT2 <- MT%/%2

## Parzen window formula on p.536
M_vT <- K <- as.numeric(0)
M_vT = vv/MT
for (v in vv) {
        K[v] <- if (v <= MT2)
            1 - 6 * M_vT[v]^2 * (1 - M_vT[v])
        else if (v <= MT)
            2 * (1 - M_vT[v])^3
        else 0
    }

## Non-parametric kernel spectral density estimator formula on p.536
 p = Pt$acf
 P = g = 0
 for (v in 1:999) {
 g = g + (K[v]*p[v])
 P[v] = g
 }
w
<- seq(.005, 1, by = .005)
i.c <- sqrt(as.complex(-1))
g.w <- 0
f.w <- function(w){
 for (v in 1:999) {
 g.w = g.w+ P[v]*exp(-2*pi*i.c*w*v)
    }
 g.w
 }
# f.w(.015) for w=.015 for instance

## memory index delta formula on p.536
g.d = 0
j = 1:999
j1 = j/n
j2 = 1000/n
f1 = f.w(j1)
f2 = f.w(j2)
delta = 0
deltak = 0
for (i in 1:999){
 g.d = g.d + (log(f1[i]) - log(f2))
   }
  delta = g.d
 deltak = delta/999

-------------- next part --------------
A non-text attachment was scrubbed...
Name: Parzen(1983).pdf
Type: application/pdf
Size: 1132008 bytes
Desc: not available
URL:

From jpmaroco at gmail.com Thu Mar 3 15:03:21 2011
From: jpmaroco at gmail.com (jpmaroco)
Date: Thu, 3 Mar 2011 06:03:21 -0800 (PST)
Subject: [R] Probabilities greather than 1 in HIST
Message-ID: <1299161001818-3333388.post@n4.nabble.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From marchywka at hotmail.com Thu Mar 3 15:07:19 2011
From: marchywka at hotmail.com (Mike Marchywka)
Date: Thu, 3 Mar 2011 09:07:19 -0500
Subject: [R] Developing a web crawler / R "webkit" or something similar?
In-Reply-To: <1299144164900-3332993.post@n4.nabble.com>
References: <1299144164900-3332993.post@n4.nabble.com>
Message-ID:

> Date: Thu, 3 Mar 2011 01:22:44 -0800
> From: antujsrv at gmail.com
> To: r-help at r-project.org
> Subject: [R] Developing a web crawler
>
> Hi,
>
> I wish to develop a web crawler in R. I have been using the functionalities
> available under the RCurl package.
> I am able to extract the html content of the site but i don't know how to go

In general this can be a big effort but there may be things in text processing packages you could adapt to execute html and javascript. However, I guess what I'd be looking for is something like a "webkit" package or other open source browser with or without an "R" interface. This actually may be an ideal solution for a lot of things as you get all the content handlers of at least some browser.
Now that you mention it, I wonder if there are browser plugins to handle "R" content ( I'd have to give this some thought, put a script up as a web page with mime type "test/R" and have it execute it in R. ) > about analyzing the html formatted document. > I wish to know the frequency of a word in the document. I am only acquainted > with analyzing data sets. > So how should i go about analyzing data that is not available in table > format. > > Few chunks of code that i wrote: > w <- > getURL("http://www.amazon.com/Kindle-Wireless-Reader-Wifi-Graphite/dp/B003DZ1Y8Q/ref=dp_reviewsanchor#FullQuotes") > write.table(w,"test.txt") > t <- readLines(w) > > readLines also didnt prove out to be of any help. > > Any help would be highly appreciated. Thanks in advance. > > > -- > View this message in context: http://r.789695.n4.nabble.com/Developing-a-web-crawler-tp3332993p3332993.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From pmassicotte at hotmail.com Thu Mar 3 15:39:29 2011 From: pmassicotte at hotmail.com (Filoche) Date: Thu, 3 Mar 2011 06:39:29 -0800 (PST) Subject: [R] Greek character and R In-Reply-To: <36180405F8418449918AD20618D110FC095BFA6544@USETCMSXMB02.NAFTA.SYNGENTA.ORG> References: <1299158159334-3333304.post@n4.nabble.com> <36180405F8418449918AD20618D110FC095BFA6544@USETCMSXMB02.NAFTA.SYNGENTA.ORG> Message-ID: <1299163169689-3333467.post@n4.nabble.com> Hi and ty for the answer. However, it's not working. It will print "expression(d13C Station 1)". Thank for any help, Phil -- View this message in context: http://r.789695.n4.nabble.com/Greek-character-and-R-tp3333304p3333467.html Sent from the R help mailing list archive at Nabble.com. 
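For the Greek-character title thread above, one commonly used alternative is bquote(), which leaves a plotmath expression unevaluated except for the parts wrapped in .(). The sketch below is offered under that assumption and is not what either poster used; the helper name make_title is hypothetical.

```r
# Build a plotmath title that mixes a Greek symbol with the current loop value.
# bquote() keeps the expression unevaluated except for terms wrapped in .().
make_title <- function(i) {
  bquote(delta^13 * "C Station " * .(i))
}

# Usage inside a loop: each plot gets the value of i, not the letter "i".
for (i in 1:3) {
  plot(1:10, main = make_title(i))
}
```

Here the * operator only juxtaposes the pieces in plotmath, so no paste() call is needed inside the expression.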
From timoumout at yahoo.fr Thu Mar 3 15:09:55 2011 From: timoumout at yahoo.fr (=?UTF-8?Q?Timoth=C3=A9e?=) Date: Thu, 3 Mar 2011 06:09:55 -0800 (PST) Subject: [R] Zero truncated Poisson distribution & R2WinBUGS In-Reply-To: References: Message-ID: <1299161395280-3333406.post@n4.nabble.com> Hi, I have a very similar problem... In some sites, counts data>0 In the other sites, counts = 0 I already applied zero inflated models with the zero-trick (Martin et al 2005 in Ecology letters), but I would like to use truncated distributions (Poisson and negative binomial) to model my counts in order to estimate the number of sites where no counts but presence. The problem is that I can't manage to write the likelihood of such truncated distributions. Did you get any valuable answers to your post ? Would you mind to share with me how you succeeded (if so) ? Thank you very much Timoth?e -- View this message in context: http://r.789695.n4.nabble.com/Zero-truncated-Poisson-distribution-R2WinBUGS-tp3043121p3333406.html Sent from the R help mailing list archive at Nabble.com. From singhalblr at gmail.com Thu Mar 3 16:12:59 2011 From: singhalblr at gmail.com (Harsh) Date: Thu, 3 Mar 2011 20:42:59 +0530 Subject: [R] R usage survey In-Reply-To: <36180405F8418449918AD20618D110FC095BFA6545@USETCMSXMB02.NAFTA.SYNGENTA.ORG> References: <36180405F8418449918AD20618D110FC095BFA6545@USETCMSXMB02.NAFTA.SYNGENTA.ORG> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From ivan.calandra at uni-hamburg.de Thu Mar 3 16:14:53 2011 From: ivan.calandra at uni-hamburg.de (Ivan Calandra) Date: Thu, 03 Mar 2011 16:14:53 +0100 Subject: [R] Applying function to multiple data In-Reply-To: References: Message-ID: <4D6FB06D.3050707@uni-hamburg.de> Hi, It might not be the best approach, but here is what I would do. 
########## 1) If you have your data in 3 different data.frames: #create a named list where each element is one of your data.frame list_df <- vector(mode="list", length=3) names(list_df) <- c("Bank", "Corporate", "Sovereign") list_df[[1]] <- data.frame(k = c(1:8), ratings = c("A", "B", "C", "D", "E", "F", "G","H"), default_frequency = c(0.00229,0.01296,0.01794,0.04303,0.04641,0.06630,0.06862,0.06936)) list_df[[2]] <- data.frame(k = c(1:8), ratings = c("A", "B", "C", "D", "E", "F", "G","H"), default_frequency = c(0.00101,0.01433,0.02711,0.03701,0.04313,0.05600,0.06041,0.07112)) list_df[[3]] <- data.frame(k = c(1:8), ratings = c("A", "B", "C", "D", "E", "F", "G","H"), default_frequency = c(0.00210,0.01014,0.02001,0.04312,0.05114,0.06801,0.06997,0.07404)) #apply your function DP to each element of the list, i.e. to each data.frame: out1 <- lapply(list_df, FUN=function(x) DP(k=x$k, ODF=x$default_frequency, ratings=x$ratings)) ########## 2) If you have your data in a single data.frame, as it looks from your example, I would first fill all the cells, so that it looks like this: df2 <- structure(list(Class = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("Bank", "Corporate", "Sovereign"), class = "factor"), k = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L), rating = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L), .Label = c("A", "B", "C", "D", "E", "F", "G", "H"), class = "factor"), default_frequency = c(0.00229, 0.01296, 0.01794, 0.04303, 0.04641, 0.0663, 0.06862, 0.06936, 0.00101, 0.01433, 0.02711, 0.03701, 0.04313, 0.056, 0.06041, 0.07112, 0.0021, 0.01014, 0.02001, 0.04312, 0.05114, 0.06801, 0.06997, 0.07404)), .Names = c("Class", "k", "ratings", "default_frequency"), class = "data.frame", row.names = c(NA, -24L)) #then split by Class: list_df2 <- split(df2, df2$Class) #and apply as 
before: out2 <- lapply(list_df2, FUN=function(x) DP(k=x$k, ODF=x$default_frequency, ratings=x$ratings)) #or in one step using plyr: library(plyr) out3 <- dlply(.data=df2, .variables="Class", .fun=function(x) DP(k=x$k, ODF=x$default_frequency, ratings=x$ratings)) ########## 3) all solutions give the same results: all.equal(out1, out2, check.attributes=FALSE) [1] TRUE all.equal(out1, out3, check.attributes=FALSE) [1] TRUE all.equal(out2, out3, check.attributes=FALSE) [1] TRUE HTH, Ivan Le 3/3/2011 11:06, Akshata Rao a ?crit : > Dear R helpers, > > I know R language at a preliminary level. This is my first post to this R > forum. I have recently learned the use of function and have been successful > in writing few on my own. However I am not able to figure out how to apply > the function to multiple sets of data. > > # MY QUERY > > Suppose I am having following data.frame > > df = data.frame(k = c(1:8), ratings = c("A", "B", "C", "D", "E", "F", "G", > "H"), > default_frequency = > c(0.00229,0.01296,0.01794,0.04303,0.04641,0.06630,0.06862,0.06936)) > > # ------------------------------- > > DP = function(k, ODF, ratings) > > { > > n<- length(ODF) > tot_klnODF<- sum(k*log(ODF)) > tot_k<- sum(k) > tot_lnODF<- sum(log(ODF)) > tot_k2<- sum(k^2) > slope<- exp((n * tot_klnODF - tot_k * tot_lnODF)/(n * tot_k2 - > tot_k^2)) > intercept<- exp((tot_lnODF - log(slope)* tot_k)/n) > IPD<- intercept * slope^k > > return(data.frame(ratings = ratings, default_probability = round(IPD, digits > = 4))) > > } > > result = DP(k = df$k, ODF = df$default_frequency, ratings = df$ratings) > > # > ________________________________________________________________________________________ > > The above code fetches me following result. However, I am dealing with only > one set of data here as defined in 'df'. 
> >> result > ratings default_probability > 1 A 0.0061 > 2 B 0.0094 > 3 C 0.0145 > 4 D 0.0222 > 5 E 0.0342 > 6 F 0.0527 > 7 G 0.0810 > 8 H 0.1247 > > > # MY PROBLEM > > Suppose I have data as given below > > Class k rating default_frequency > Bank 1 A 0.00229 > 2 B 0.01296 > 3 C 0.01794 > 4 D 0.04303 > 5 E 0.04641 > 6 F 0.06630 > 7 G 0.06862 > 8 H 0.06936 > Corporate 1 A 0.00101 > 2 B 0.01433 > 3 C 0.02711 > 4 D 0.03701 > 5 E 0.04313 > 6 F 0.05600 > 7 G 0.06041 > 8 H 0.07112 > Sovereign 1 A 0.00210 > 2 B 0.01014 > 3 C 0.02001 > 4 D 0.04312 > 5 E 0.05114 > 6 F 0.06801 > 7 G 0.06997 > 8 H 0.07404 > > So I need to use the function "DP" defined above to generate three sets of > results viz. for Bank, Corporate, Sovereign and save each of these results > as diffrent csv files say as bank.csv, corporate.csv etc. Again please note > that there could be say 'm' number of classes. I was trying to use the apply > function but things are not working for me. I will really apprecaite the > guidenace. I hope I am able to put up my query in a neat manner. > > Regards and thanking you all in advance. > > Akshata Rao > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Ivan CALANDRA PhD Student University of Hamburg Biozentrum Grindel und Zoologisches Museum Abt. 
Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calandra at uni-hamburg.de

**********
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php

From murdoch.duncan at gmail.com Thu Mar 3 16:29:55 2011
From: murdoch.duncan at gmail.com (Duncan Murdoch)
Date: Thu, 03 Mar 2011 10:29:55 -0500
Subject: [R] Greek character and R
In-Reply-To: <1299158159334-3333304.post@n4.nabble.com>
References: <1299158159334-3333304.post@n4.nabble.com>
Message-ID: <4D6FB3F3.8020009@gmail.com>

On 11-03-03 8:15 AM, Filoche wrote:
> Dear R users.
>
> In a loop, I set the title of my graph with :
>
> mytitle = expression(paste(delta^13,'C Station ', i)
> title(mytitle)
>
> However, instead of using value of i, it will literally use "i" character.
>
> Any one know the way to concatenate the value of i to the mathematical
> expression?
>

Use

mytitle <- substitute(paste(delta^13,'C Station ', ival), list(ival=i))

Duncan Murdoch

From rex.dwyer at syngenta.com Thu Mar 3 16:30:57 2011
From: rex.dwyer at syngenta.com (rex.dwyer at syngenta.com)
Date: Thu, 3 Mar 2011 10:30:57 -0500
Subject: [R] Greek character and R
In-Reply-To: <1299163169689-3333467.post@n4.nabble.com>
References: <1299158159334-3333304.post@n4.nabble.com> <36180405F8418449918AD20618D110FC095BFA6544@USETCMSXMB02.NAFTA.SYNGENTA.ORG> <1299163169689-3333467.post@n4.nabble.com>
Message-ID: <36180405F8418449918AD20618D110FC095BFA6661@USETCMSXMB02.NAFTA.SYNGENTA.ORG>

Eval it. This works at my house:

plot(0)
title(eval(parse(text=paste("expression(paste(delta^13,'C Station ',",i,"))"))))

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Filoche
Sent: Thursday, March 03, 2011 9:39 AM
To: r-help at r-project.org
Subject: Re: [R] Greek character and R

Hi and ty for the answer.

However, it's not working. It will print "expression(d13C Station 1)".
Thank for any help, Phil -- View this message in context: http://r.789695.n4.nabble.com/Greek-character-and-R-tp3333304p3333467.html Sent from the R help mailing list archive at Nabble.com. ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. message may contain confidential information. If you are not the designated recipient, please notify the sender immediately, and delete the original and any copies. Any use of the message by you is prohibited. From rex.dwyer at syngenta.com Thu Mar 3 16:31:10 2011 From: rex.dwyer at syngenta.com (rex.dwyer at syngenta.com) Date: Thu, 3 Mar 2011 10:31:10 -0500 Subject: [R] R usage survey In-Reply-To: References: <36180405F8418449918AD20618D110FC095BFA6545@USETCMSXMB02.NAFTA.SYNGENTA.ORG> Message-ID: <36180405F8418449918AD20618D110FC095BFA6660@USETCMSXMB02.NAFTA.SYNGENTA.ORG> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From jdaily at usgs.gov Thu Mar 3 16:43:48 2011 From: jdaily at usgs.gov (Jonathan P Daily) Date: Thu, 3 Mar 2011 10:43:48 -0500 Subject: [R] Probabilities greather than 1 in HIST In-Reply-To: <1299161001818-3333388.post@n4.nabble.com> References: <1299161001818-3333388.post@n4.nabble.com> Message-ID: If you read ?hist, you will answer your own question. The issue in your code is the parameter prob = T, which does nothing. By default, hist reports density. -------------------------------------- Jonathan P. Daily Technician - USGS Leetown Science Center 11649 Leetown Road Kearneysville WV, 25430 (304) 724-4480 "Is the room still a room when its empty? Does the room, the thing itself have purpose? Or do we, what's the word... imbue it." 
- Jubal Early, Firefly r-help-bounces at r-project.org wrote on 03/03/2011 09:03:21 AM: > [image removed] > > [R] Probabilities greather than 1 in HIST > > jpmaroco > > to: > > r-help > > 03/03/2011 10:13 AM > > Sent by: > > r-help-bounces at r-project.org > > Dear all, > I am a newbie in R and could not find help on this problem. I am trying to > plot an histogram with probabilities in the y axis. This is the code I am > using: > > #TLC uniform > n=30 > mi=1; mx=6 > nrep=1000 > xbar=rep(0,nrep) > for (i in 1:nrep) {xbar[i]=mean(runif(n,min=mi,max=mx))} > hist(xbar,prob=TRUE,breaks="Sturges",xlim=c(1,6),main=paste("n =",n), > xlab="M??dia", ylab="Probabilidade") > curve(dnorm(x,mean=mean(xbar),sd=sd(xbar)),add=TRUE,lwd=2,col="red") > > The problem is that I am getting greater than 1 probabilities in the Y axis? > Is there a way to correct this? > Many thanks in advance. > Joao > > -- > View this message in context: http://r.789695.n4.nabble.com/ > Probabilities-greather-than-1-in-HIST-tp3333388p3333388.html > Sent from the R help mailing list archive at Nabble.com. > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
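A small sketch of why the y axis can go above 1 in this example, reusing the poster's simulation settings (the seed is an assumption added for reproducibility). As far as ?hist documents it, prob=TRUE partially matches the `probability` argument, an alias for !freq, so the bars are drawn on the density scale. Densities are heights, not probabilities: only height times bin width is a relative frequency.

```r
set.seed(1)                      # assumed seed, not in the original post
n <- 30
nrep <- 1000
xbar <- replicate(nrep, mean(runif(n, min = 1, max = 6)))

h <- hist(xbar, breaks = "Sturges", plot = FALSE)

# Bar heights on the density scale; these may exceed 1 when bins are
# narrower than 1 unit.
h$density

# The bar areas (height * width) sum to 1 ...
area <- sum(h$density * diff(h$breaks))

# ... and each individual area is that bin's relative frequency.
rel_freq <- h$density * diff(h$breaks)
```

So a density above 1 is not an error; it only says the bins are narrow relative to how concentrated the sample means are.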
From wdunlap at tibco.com Thu Mar 3 16:45:51 2011 From: wdunlap at tibco.com (William Dunlap) Date: Thu, 3 Mar 2011 07:45:51 -0800 Subject: [R] Greek character and R In-Reply-To: <4D6FB3F3.8020009@gmail.com> References: <1299158159334-3333304.post@n4.nabble.com> <4D6FB3F3.8020009@gmail.com> Message-ID: <77EB52C6DD32BA4D87471DCD70C8D70003F74B56@NA-PA-VBE03.na.tibco.com> Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com > -----Original Message----- > From: r-help-bounces at r-project.org > [mailto:r-help-bounces at r-project.org] On Behalf Of Duncan Murdoch > Sent: Thursday, March 03, 2011 7:30 AM > To: Filoche > Cc: r-help at r-project.org > Subject: Re: [R] Greek character and R > > On 11-03-03 8:15 AM, Filoche wrote: > > Dear R users. > > > > In a loop, I set the title of my graph with : > > > > mytitle = expression(paste(delta^13,'C Station ', i) > > title(mytitle) > > > > However, instead of using value of i, it will literally use > "i" character. > > > > Any one know the way to concatenate the value of i to the > mathematical > > expression? > > > > Use > > mytitle <- substitute(paste(delta^13,'C Station ', ival), list(ival=i)) Or use bquote(): mytitle <- bquote(paste(delta^13,'C Station ', .(i))) Note that the original could not have worked since 'i' and 'delta' appeared in the same manner and you only wanted i replaced by it value. Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com > > Duncan Murdoch > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From r.m.krug at gmail.com Thu Mar 3 16:53:09 2011 From: r.m.krug at gmail.com (Rainer M Krug) Date: Thu, 3 Mar 2011 16:53:09 +0100 Subject: [R] Analytical Hierarchical Process (AHP) in R? 
Message-ID:

Hi

Does anybody know anything about an implementation of an AHP in R or any other open source tool? I googled but could not find anything

Thanks,

Rainer

--
NEW GERMAN FAX NUMBER!!!

Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, UCT), Dipl. Phys. (Germany)

Centre of Excellence for Invasion Biology
Natural Sciences Building
Office Suite 2039
Stellenbosch University
Main Campus, Merriman Avenue
Stellenbosch
South Africa

Cell: +27 - (0)83 9479 042
Fax: +27 - (0)86 516 2782
Fax: +49 - (0)321 2125 2244
email: Rainer at krugs.de
Skype: RMkrug
Google: R.M.Krug at gmail.com

From sue at xlsolutions-corp.com Thu Mar 3 16:57:44 2011
From: sue at xlsolutions-corp.com (Sue Turner)
Date: Thu, 03 Mar 2011 08:57:44 -0700
Subject: [R] R Course***Advanced Statistical Modeling in R by XLSolutions Corp
Message-ID: <20110303085744.aa8924c5d28ca71e2a043bb294e795eb.8ede2f8a38.wbe@email00.secureserver.net>

Our New York City course: Advanced Statistical Modelling in R/S-PLUS is coming up on March 14-15
http://www.xlsolutions-corp.com/coursedetail.asp?id=13
by XLSolutions Corp
Email sue at xlsolutions-corp.com

Regards - Sue Turner
Senior Account Manager
XLSolutions Corporation
North American Division
1700 7th Ave Suite 2100
Seattle, WA 98101
Phone: 206-686-1578
Email: sue at xlsolutions-corp.com
web: www.xlsolutions-corp.com/rcourses

From dwinsemius at comcast.net Thu Mar 3 16:58:57 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Thu, 3 Mar 2011 09:58:57 -0600
Subject: [R] Probabilities greather than 1 in HIST
In-Reply-To: <1299161001818-3333388.post@n4.nabble.com>
References: <1299161001818-3333388.post@n4.nabble.com>
Message-ID: <2C8232C9-DF3B-4B72-8719-0C2788761E32@comcast.net>

On Mar 3, 2011, at 8:03 AM, jpmaroco wrote:

> Dear all,
> I am a newbie in R and could not find help on this problem. I am
> trying to
> plot an histogram with probabilities in the y axis.
This is the code > I am > using: > > #TLC uniform > n=30 > mi=1; mx=6 > nrep=1000 > xbar=rep(0,nrep) > for (i in 1:nrep) {xbar[i]=mean(runif(n,min=mi,max=mx))} > hist(xbar,prob=TRUE,breaks="Sturges",xlim=c(1,6),main=paste("n =",n), > xlab="M?dia", ylab="Probabilidade") > curve(dnorm(x,mean=mean(xbar),sd=sd(xbar)),add=TRUE,lwd=2,col="red") > > The problem is that I am getting greater than 1 probabilities in the > Y axis? > Is there a way to correct this? Despite the argument name, which I agree suggests that probabilities will be plotted, what is really described in the help page is that densities will be plotted, and densities may be greater than 1. You can suppress plotting of the y-axis, calculate the probabilities for each of the groups returned by hist, and then use the axes function. > xhist <- hist(xbar,breaks="Sturges",plot=FALSE) > yhist <- xhist$counts/sum(xhist$counts) > yhist [1] 0.002 0.027 0.087 0.236 0.287 0.228 0.107 0.021 0.004 0.001 > Many thanks in advance. > Joao > > -- > View this message in context: http://r.789695.n4.nabble.com/Probabilities-greather-than-1-in-HIST-tp3333388p3333388.html > Sent from the R help mailing list archive at Nabble.com. > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT From crosspide at hotmail.com Thu Mar 3 17:07:29 2011 From: crosspide at hotmail.com (agent dunham) Date: Thu, 3 Mar 2011 08:07:29 -0800 (PST) Subject: [R] Recodifying a factor due to results in lm Message-ID: <1299168449357-3333638.post@n4.nabble.com> Dear community, I'm doing a lm. In the independent variables I've got a categorical one. 
Here is its histogram: http://r.789695.n4.nabble.com/file/n3333638/altitude.png I did this regression: lmeo2.52f <- lm(dat82$IncAltuDom ~ dat82$hdom2+log(dat82$CV)+ dat82$CA+ dat82$FCC+ factor(dat82$IdAltitud)) I obtain: Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 8.78222 0.94619 9.282 3.72e-15 *** dat82$hdom2 -0.30859 0.03875 -7.963 2.73e-12 *** log(dat82$CV) -0.42943 0.24568 -1.748 0.0835 . dat82$CA -2.98157 2.29904 -1.297 0.1977 dat82$FCC 0.02300 0.01067 2.156 0.0335 * factor(dat82$IdAltitud)1 -0.12142 0.40361 -0.301 0.7642 factor(dat82$IdAltitud)2 0.24341 0.43451 0.560 0.5766 factor(dat82$IdAltitud)3 -0.64904 0.47114 -1.378 0.1714 factor(dat82$IdAltitud)4 -1.14334 0.67509 -1.694 0.0935 . factor(dat82$IdAltitud)5 -2.13251 0.82463 -2.586 0.0112 * I thought I need to recodify my factor, q1 How can I do it? q2 Apologies I'm pretty newbie with this, ... I don't know how to interpret the regression when factors ... The factors created by default, compare the 1st factor with the other 5? ... but what does it mean??? is it good in my case ?? Thanks in advance, user at host.com -- View this message in context: http://r.789695.n4.nabble.com/Recodifying-a-factor-due-to-results-in-lm-tp3333638p3333638.html Sent from the R help mailing list archive at Nabble.com. From pjmiller_57 at yahoo.com Thu Mar 3 16:07:17 2011 From: pjmiller_57 at yahoo.com (Paul Miller) Date: Thu, 3 Mar 2011 07:07:17 -0800 (PST) Subject: [R] Pairwise T-Tests and Dunnett's Test (possibly using multcomp) In-Reply-To: Message-ID: <319630.74676.qm@web161614.mail.bf1.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From schmitzh at uni-bremen.de Thu Mar 3 16:43:39 2011 From: schmitzh at uni-bremen.de (Heike Schmitz) Date: Thu, 03 Mar 2011 16:43:39 +0100 Subject: [R] Error in model.frame.default Message-ID: <4D6FB72B.5010409@uni-bremen.de> Dear R- Community, to learn i reanalysed some data provided and analysed by Zuur et. al. 
in their book "Mixed effect models and Extensions in Ecology with R". When i run the last command i get a warning message i dont understand. Loyn<- read.table(file = "loyn.txt",header = TRUE) Loyn$L.AREA<- log10(Loyn$AREA) fGRAZE <-factor(Loyn$GRAZE) M0<- lm(ABUND~ L.AREA + fGRAZE, data = Loyn) summary(M0) plot(x = Loyn$L.AREA, y = Loyn$ABUND, xlab = "Log transformed AREA", ylab = "Bird Abundance") D1<- data.frame(L.AREA= Loyn$L.AREA[Loyn$GRAZE==1], fGraze = "1") P1<- predict(M0,newdata = D1) Warning message: Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) : variable lengths differ (found for 'fGRAZE') In addition: Warning message: 'newdata' had 13 rows but variable(s) found have 56 rows I hope anyone has an idea. Thank you in advance. Heike -- Heike Schmitz- Diaspero Population Ecology and Evolutionary Ecology Lab, FB2 University of Bremen Leobener Strasse, Nw2, Room B4050 D-28359 Bremen Germany fon ++49-421-218-62937 email: heike.schmitz at uni-bremen.de http://www.popecol.uni-bremen.de From scttchamberlain4 at gmail.com Thu Mar 3 16:39:20 2011 From: scttchamberlain4 at gmail.com (Scott Chamberlain) Date: Thu, 3 Mar 2011 09:39:20 -0600 Subject: [R] Ordering several histograms In-Reply-To: <1299160800582-3333382.post@n4.nabble.com> References: <1299160800582-3333382.post@n4.nabble.com> Message-ID: <6973DBD07D2141F58EBF744C5AC5A41C@gmail.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From f.harrell at vanderbilt.edu Thu Mar 3 17:19:33 2011 From: f.harrell at vanderbilt.edu (Frank Harrell) Date: Thu, 3 Mar 2011 08:19:33 -0800 (PST) Subject: [R] Multivariate Granger Causality Tests In-Reply-To: References: <1299143222871-3332968.post@n4.nabble.com> Message-ID: <1299169173521-3333652.post@n4.nabble.com> Beware that causality can only be inferred using information that extends far beyond the data at hand. 
Frank ----- Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/Multivariate-Granger-Causality-Tests-tp3332968p3333652.html Sent from the R help mailing list archive at Nabble.com. From stgries at gmail.com Thu Mar 3 17:40:25 2011 From: stgries at gmail.com (Stefan Th. Gries) Date: Thu, 3 Mar 2011 08:40:25 -0800 Subject: [R] Developing a web crawler Message-ID: Hi The book whose companion website is here deals with many of the things you need for a web crawler, and assignment "other 5" on that site () is a web crawler. Best, STG -- Stefan Th. Gries ----------------------------------------------- University of California, Santa Barbara http://www.linguistics.ucsb.edu/faculty/stgries From Achim.Zeileis at uibk.ac.at Thu Mar 3 17:38:17 2011 From: Achim.Zeileis at uibk.ac.at (Achim Zeileis) Date: Thu, 3 Mar 2011 17:38:17 +0100 (CET) Subject: [R] Zero truncated Poisson distribution & R2WinBUGS In-Reply-To: <1299161395280-3333406.post@n4.nabble.com> References: <1299161395280-3333406.post@n4.nabble.com> Message-ID: On Thu, 3 Mar 2011, Timoth?e wrote: > Hi, > I have a very similar problem... > In some sites, counts data>0 > In the other sites, counts = 0 > I already applied zero inflated models with the zero-trick (Martin et al > 2005 in Ecology letters), but I would like to use truncated distributions > (Poisson and negative binomial) to model my counts in order to estimate the > number of sites where no counts but presence. > The problem is that I can't manage to write the likelihood of such truncated > distributions. If you're looking for a maximum likelihood regression model (without random effects), then the "countreg" package on R-Forge may be helpful for you. The package is not yet on CRAN because we are planning to add more functionality, but the code is well tested. It's the same code-base underlying zeroinfl() and hurdle() in "pscl". 
hth, Z > Did you get any valuable answers to your post ? > Would you mind to share with me how you succeeded (if so) ? > Thank you very much > Timoth?e > > -- > View this message in context: http://r.789695.n4.nabble.com/Zero-truncated-Poisson-distribution-R2WinBUGS-tp3043121p3333406.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From E.Vettorazzi at uke.uni-hamburg.de Thu Mar 3 17:42:04 2011 From: E.Vettorazzi at uke.uni-hamburg.de (Eik Vettorazzi) Date: Thu, 03 Mar 2011 17:42:04 +0100 Subject: [R] Greek character and R In-Reply-To: <77EB52C6DD32BA4D87471DCD70C8D70003F74B56@NA-PA-VBE03.na.tibco.com> References: <1299158159334-3333304.post@n4.nabble.com> <4D6FB3F3.8020009@gmail.com> <77EB52C6DD32BA4D87471DCD70C8D70003F74B56@NA-PA-VBE03.na.tibco.com> Message-ID: <4D6FC4DC.1050801@uke.uni-hamburg.de> or even without using 'paste' plot(1,1,main=bquote(delta^13~'C Station'~.(i))) Am 03.03.2011 16:45, schrieb William Dunlap: > > > Bill Dunlap > Spotfire, TIBCO Software > wdunlap tibco.com > >> -----Original Message----- >> From: r-help-bounces at r-project.org >> [mailto:r-help-bounces at r-project.org] On Behalf Of Duncan Murdoch >> Sent: Thursday, March 03, 2011 7:30 AM >> To: Filoche >> Cc: r-help at r-project.org >> Subject: Re: [R] Greek character and R >> >> On 11-03-03 8:15 AM, Filoche wrote: >>> Dear R users. >>> >>> In a loop, I set the title of my graph with : >>> >>> mytitle = expression(paste(delta^13,'C Station ', i) >>> title(mytitle) >>> >>> However, instead of using value of i, it will literally use >> "i" character. >>> >>> Any one know the way to concatenate the value of i to the >> mathematical >>> expression? 
>>> >> >> Use >> >> mytitle <- substitute(paste(delta^13,'C Station ', ival), > list(ival=i)) > > Or use bquote(): > mytitle <- bquote(paste(delta^13,'C Station ', .(i))) > > Note that the original could not have worked since 'i' and 'delta' > appeared in the same manner and you only wanted i replaced by it > value. > > Bill Dunlap > Spotfire, TIBCO Software > wdunlap tibco.com > >> >> Duncan Murdoch >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -- Eik Vettorazzi Department of Medical Biometry and Epidemiology University Medical Center Hamburg-Eppendorf Martinistr. 52 20246 Hamburg T ++49/40/7410-58243 F ++49/40/7410-57790 From jpmaroco at gmail.com Thu Mar 3 17:29:59 2011 From: jpmaroco at gmail.com (jpmaroco) Date: Thu, 3 Mar 2011 08:29:59 -0800 (PST) Subject: [R] Probabilities greather than 1 in HIST In-Reply-To: <2C8232C9-DF3B-4B72-8719-0C2788761E32@comcast.net> References: <1299161001818-3333388.post@n4.nabble.com> <2C8232C9-DF3B-4B72-8719-0C2788761E32@comcast.net> Message-ID: <007801cbd9c0$2fed7330$8fc85990$@gmail.com> An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From kparamas at asu.edu Thu Mar 3 17:12:31 2011 From: kparamas at asu.edu (kparamas) Date: Thu, 3 Mar 2011 08:12:31 -0800 (PST) Subject: [R] Calling a function to store values Message-ID: <1299168751676-3333644.post@n4.nabble.com> Hi, I am calling a function with different arguments to read different files and want the results to be stored in different matrices. Ex: cData1 = NULL cData2 = NULL readData = function(cData, start, end) { .... cData = //reads from the file } I am calling the functions using readData(cData1,1,3) readData(cData2,4,7) But After running the code, cData1 and cData2 are not getting updated. Is there a way in R to do this? -- View this message in context: http://r.789695.n4.nabble.com/Calling-a-function-to-store-values-tp3333644p3333644.html Sent from the R help mailing list archive at Nabble.com. From dwinsemius at comcast.net Thu Mar 3 17:50:50 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Thu, 3 Mar 2011 10:50:50 -0600 Subject: [R] Calling a function to store values In-Reply-To: <1299168751676-3333644.post@n4.nabble.com> References: <1299168751676-3333644.post@n4.nabble.com> Message-ID: On Mar 3, 2011, at 10:12 AM, kparamas wrote: > Hi, > > I am calling a function with different arguments to read different > files and > want the results to be stored in > different matrices. > > Ex: > cData1 = NULL > cData2 = NULL > > readData = function(cData, start, end) > { > > .... > cData = //reads from the file > } > > I am calling the functions using > readData(cData1,1,3) > readData(cData2,4,7) > > But After running the code, cData1 and cData2 are not getting updated. > Is there a way in R to do this? You need to assign the results of the function to an R object in the global environment. At the moment calling readData() only creates the cData object in the environment of the function which then disappears after function completion. 
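The point about function environments can be sketched as follows. This is only an illustration under assumptions: read_rows and the in-memory matrix all_data are hypothetical stand-ins for the poster's file-reading code, which is not shown in the thread.

```r
# Assigning to an argument inside a function changes only the local copy.
bad <- function(x) {
  x <- 99
  invisible(NULL)
}
x0 <- NULL
bad(x0)
is.null(x0)   # TRUE: the caller's x0 is untouched

# Hypothetical reader: return the result and let the caller assign it.
# 'source' stands in for whatever the real function reads from a file.
read_rows <- function(source, start, end) {
  source[start:end, , drop = FALSE]
}

all_data <- matrix(1:21, nrow = 7)   # made-up data, 7 rows by 3 columns
cData1 <- read_rows(all_data, 1, 3)
cData2 <- read_rows(all_data, 4, 7)
```

Returning the value keeps the function free of side effects, which is usually easier to reason about than assigning into the caller's environment.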
Try : cdata_1_1_3 <- readData(cData1,1,3) cdata_2_4_7 <- readData(cData2,4,7) > > -- > View this message in context: http://r.789695.n4.nabble.com/Calling-a-function-to-store-values-tp3333644p3333644.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT From dwinsemius at comcast.net Thu Mar 3 17:58:28 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Thu, 3 Mar 2011 10:58:28 -0600 Subject: [R] Probabilities greather than 1 in HIST In-Reply-To: <007801cbd9c0$2fed7330$8fc85990$@gmail.com> References: <1299161001818-3333388.post@n4.nabble.com> <2C8232C9-DF3B-4B72-8719-0C2788761E32@comcast.net> <007801cbd9c0$2fed7330$8fc85990$@gmail.com> Message-ID: <65E198F0-6444-4391-A1E8-1137F258CB35@comcast.net> On Mar 3, 2011, at 10:29 AM, jpmaroco wrote: > Dear David, > > Thanks for your prompt reply. > > I see your point. But how can I get a histogram with relative > frequencies? If I use > >> plot(xhist,yhist) > > I get absolute frequencies in the Y axis. I do not know of any simple setting to do what you want only within hist. I explained that you need to suppress the y axis ( read help(par) for how to do that, possibly with xaxt="n"), calculate the values you desire, then add axis labeling at the locations on the density scale but with the probabilities you get from the calculations I illustrated. 
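The relabelling described above can be sketched like this. It is a rough illustration, not necessarily what the author had in mind: the seed, the variable names, and the use of pretty() to choose tick values are assumptions. The bars stay on the density scale, so the normal curve still overlays correctly, and only the tick labels are converted to relative frequencies.

```r
set.seed(1)                                   # assumed seed
xbar <- replicate(1000, mean(runif(30, min = 1, max = 6)))

h <- hist(xbar, breaks = "Sturges", plot = FALSE)
width <- diff(h$breaks)[1]                    # Sturges bins are equally wide

# Tick labels in relative-frequency units; tick positions in density units.
prob_ticks <- pretty(c(0, max(h$counts / sum(h$counts))))
dens_at <- prob_ticks / width

pdf(NULL)                                     # throwaway device; remove to view
plot(h, freq = FALSE, yaxt = "n", ylab = "Relative frequency", main = "n = 30")
axis(2, at = dens_at, labels = prob_ticks)
curve(dnorm(x, mean = mean(xbar), sd = sd(xbar)), add = TRUE, lwd = 2, col = "red")
dev.off()
```

This only works cleanly when all bins have the same width; with unequal bins a single probability axis would not be meaningful.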
>
> Best,
>
> Joao
>
> From: David Winsemius [via R] [mailto:ml-node+3333619-1272092771-215234 at n4.nabble.com]
> Sent: Thursday, 3 March 2011 16:01
> To: jpmaroco
> Subject: Re: Probabilities greather than 1 in HIST
>
> On Mar 3, 2011, at 8:03 AM, jpmaroco wrote:
>
>> Dear all,
>> I am a newbie in R and could not find help on this problem. I am trying to
>> plot an histogram with probabilities in the y axis. This is the code I am
>> using:
>>
>> #TLC uniform
>> n=30
>> mi=1; mx=6
>> nrep=1000
>> xbar=rep(0,nrep)
>> for (i in 1:nrep) {xbar[i]=mean(runif(n,min=mi,max=mx))}
>> hist(xbar,prob=TRUE,breaks="Sturges",xlim=c(1,6),main=paste("n =",n),
>> xlab="Média", ylab="Probabilidade")
>> curve(dnorm(x,mean=mean(xbar),sd=sd(xbar)),add=TRUE,lwd=2,col="red")
>>
>> The problem is that I am getting greater than 1 probabilities in the Y axis?
>> Is there a way to correct this?
>
> Despite the argument name, which I agree suggests that probabilities
> will be plotted, what is really described in the help page is that
> densities will be plotted, and densities may be greater than 1. You
> can suppress plotting of the y-axis, calculate the probabilities for
> each of the groups returned by hist, and then use the axes function.
>
> > xhist <- hist(xbar,breaks="Sturges",plot=FALSE)
> > yhist <- xhist$counts/sum(xhist$counts)
> > yhist
> [1] 0.002 0.027 0.087 0.236 0.287 0.228 0.107 0.021 0.004 0.001
>
>> Many thanks in advance.
>> Joao
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

From dwinsemius at comcast.net Thu Mar 3 17:59:34 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Thu, 3 Mar 2011 10:59:34 -0600
Subject: [R] Probabilities greather than 1 in HIST
In-Reply-To: <007801cbd9c0$2fed7330$8fc85990$@gmail.com>
References: <1299161001818-3333388.post@n4.nabble.com> <2C8232C9-DF3B-4B72-8719-0C2788761E32@comcast.net> <007801cbd9c0$2fed7330$8fc85990$@gmail.com>
Message-ID:

On Mar 3, 2011, at 10:29 AM, jpmaroco wrote:

> Dear David,
>
> Thanks for your prompt reply.
>
> I see your point. But how can I get a histogram with relative
> frequencies? If I use
>
>> plot(xhist,yhist)
>
> I get absolute frequencies in the Y axis.

In my earlier reply I meant to type yaxt="n".

--
David.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

From Martyn.Byng at nag.co.uk Thu Mar 3 18:10:18 2011
From: Martyn.Byng at nag.co.uk (Martyn Byng)
Date: Thu, 3 Mar 2011 17:10:18 -0000
Subject: [R] Probabilities greather than 1 in HIST
References: <1299161001818-3333388.post@n4.nabble.com><2C8232C9-DF3B-4B72-8719-0C2788761E32@comcast.net><007801cbd9c0$2fed7330$8fc85990$@gmail.com>
Message-ID: <49E76DF37649DC48A4CE882BC8CE51C9019204BF@nagmail2.nag.co.uk>

Hi,

Does

xx = rnorm(100)
hist(xx,freq=FALSE)
curve(dnorm,add=TRUE)

give you what you want?

Martyn

From djmuser at gmail.com Thu Mar 3 18:51:39 2011
From: djmuser at gmail.com (Dennis Murphy)
Date: Thu, 3 Mar 2011 09:51:39 -0800
Subject: [R] Error in model.frame.default
In-Reply-To: <4D6FB72B.5010409@uni-bremen.de>
References: <4D6FB72B.5010409@uni-bremen.de>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From hilton at meteo.psu.edu Thu Mar 3 18:52:31 2011
From: hilton at meteo.psu.edu (Timothy W. Hilton)
Date: Thu, 3 Mar 2011 09:52:31 -0800
Subject: [R] lattice custom axis function -- right side margins
Message-ID: <20110303175231.GA17009@Tim.local>

Dear R help list,

I have a plot with two different vertical scales that I want to display on either side of the plot.
It's quite similar to the Fahrenheit-Centigrade example in the examples section of the documentation for axis.default. The right-side axis is clipped off, though, and I haven't been able to figure out anything with viewport() and clipping or trellis.par.set to fix that... Any help greatly appreciated! Minimal example below. I would also like to add a label to the right-side vertical axis similar to the "sill..." label on the left. Bonus points if anyone can throw that in... Many thanks, Tim -- Timothy W. Hilton PhD Candidate, Department of Meteorology The Pennsylvania State University 503 Walker Building, University Park, PA 16802 hilton at meteo.psu.edu -------------------------------------------------- code to produce the plot with right-side labels clipped off example_data <- structure(list(year = structure(c(4L, 2L, 2L, 7L, 2L, 2L, 4L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 4L, 4L, 2L, 7L, 3L, 2L, 5L), .Label = c("2000", "2001", "2002", "2003", "2004", "2005", "2006"), class = "factor"), var = structure(c(2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 2L, 2L, 2L, 2L, 3L, 2L, 2L, 2L, 1L, 1L, 1L, 1L), .Label = c("NEE.model", "NEE.model.res", "NEE.obs"), class = "factor"), par_set = structure(c(6L, 1L, 3L, 6L, 7L, 2L, 7L, 5L, 1L, 6L, 3L, 4L, 1L, 1L, 4L, 1L, 7L, 5L, 9L, 9L, 9L), .Label = c("all.all", "all.ann", "all.mon", "pft.all", "pft.ann", "pft.mon", "site.all", "site.ann", "site.mon"), class = "factor"), sigmasq = c(11430.2595455547, 12118.5387166954, 12982.4722525337, 16366.3059675243, 16650.2206047512, 19730.2121989498, 19958.3416187217, 20491.4117984889, 20647.8829877428, 21389.0300281264, 21413.7674128747, 21445.7255788782, 22002.8026436862, 22042.9802472953, 22201.0461487030, 22340.9959465200, 24782.8974616218, 27207.1283451608, 59450.6758048182, 94725.119215293, 694716.769010273 )), .Names = c("year", "var", "par_set", "sigmasq"), row.names = c(94L, 8L, 20L, 43L, 44L, 68L, 100L, 86L, 2L, 92L, 74L, 80L, 62L, 1L, 82L, 64L, 98L, 37L, 57L, 56L, 59L), class = "data.frame") 

library(lattice)  # needed for xyplot, panel.axis, axis.default, current.panel.limits

sigsq2sig <- function(sigmasq) sqrt(2 * sigmasq)
sig2sigsq <- function(sig) 0.5 * (sig)^2

# axis method to add a std deviation axis to the right side of a sill plot
axis.sigmasq <- function(side, ...) {
  switch(side,
         left = {
           ylim <- current.panel.limits()$ylim
           pretty_sigmasq <- pretty(ylim)
           panel.axis(side = side, outside = TRUE,
                      at = pretty_sigmasq, labels = pretty_sigmasq)
         },
         right = {
           ylim <- current.panel.limits()$ylim
           pretty_sigmasq <- pretty(ylim)
           pos_sigmasq <- pretty_sigmasq[pretty_sigmasq >= 0]
           pretty_sigma <- pretty(sigsq2sig(pos_sigmasq))
           panel.axis(side = side, outside = TRUE,
                      at = sig2sigsq(pretty_sigma), labels = pretty_sigma)
         },
         axis.default(side = side, ...))
}

my_plot <- function(best.fits, ...) {
  y.main.label <- expression(sill~group("[", group("(", mu*mol~s^-1~m^-2 ,")")^2, "]"))
  # plot the parameter values, one per year
  plt <- xyplot(sigmasq~interaction(var, par_set),
                data=best.fits,
                groups=year,
                axis = axis.sigmasq,
                scales=list(x=list(rot=45)),
                xlab=list(label="NEE measure"),
                ylab=list(label=y.main.label),
                ...)
  return(plt)
}

print(my_plot(example_data))

From rmh at temple.edu Thu Mar 3 19:05:49 2011
From: rmh at temple.edu (Richard M. Heiberger)
Date: Thu, 3 Mar 2011 13:05:49 -0500
Subject: [R] lattice custom axis function -- right side margins
In-Reply-To: <20110303175231.GA17009@Tim.local>
References: <20110303175231.GA17009@Tim.local>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From djmuser at gmail.com Thu Mar 3 19:20:20 2011
From: djmuser at gmail.com (Dennis Murphy)
Date: Thu, 3 Mar 2011 10:20:20 -0800
Subject: [R] Probabilities greather than 1 in HIST
In-Reply-To: <007801cbd9c0$2fed7330$8fc85990$@gmail.com>
References: <1299161001818-3333388.post@n4.nabble.com> <2C8232C9-DF3B-4B72-8719-0C2788761E32@comcast.net> <007801cbd9c0$2fed7330$8fc85990$@gmail.com>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available URL: From hilton at meteo.psu.edu Thu Mar 3 19:50:22 2011 From: hilton at meteo.psu.edu (Timothy W. Hilton) Date: Thu, 3 Mar 2011 10:50:22 -0800 Subject: [R] lattice custom axis function -- right side margins In-Reply-To: References: Message-ID: <20110303185022.GB17009@Tim.local> Many thanks, Richard -- the position argument does exactly what I needed. I'm not having any luck with the ylab.right argument. My R and lattice are up to date (below); is there something else I should check? Thanks for the help, Tim > sessionInfo() R version 2.12.2 (2011-02-25) Platform: i386-apple-darwin9.8.0/i386 (32-bit) locale: [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] lattice_0.19-17 loaded via a namespace (and not attached): [1] grid_2.12.2 tools_2.12.2 On Thu, Mar 2011, 03 at 01:05:49PM -0500, Richard M. Heiberger wrote: > print(my_plot(example_data, ylab.right=expression(e==mc^2)), > position=c(0,0,.95,1)) > > You will need a recent R version for the ylab.right argument. > > On Thu, Mar 3, 2011 at 12:52 PM, Timothy W. Hilton wrote: > > > Dear R help list, > > From fenerbahcesampiyon2 at gmail.com Thu Mar 3 20:39:12 2011 From: fenerbahcesampiyon2 at gmail.com (fenerbahce sampiyon) Date: Thu, 3 Mar 2011 11:39:12 -0800 Subject: [R] mailing list submission Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From NordlDJ at dshs.wa.gov Thu Mar 3 20:49:04 2011 From: NordlDJ at dshs.wa.gov (Nordlund, Dan (DSHS/RDA)) Date: Thu, 3 Mar 2011 11:49:04 -0800 Subject: [R] Creating a weighted sample - Help In-Reply-To: <1299158476421-3333311.post@n4.nabble.com> References: <1299084606398-3331842.post@n4.nabble.com> <4D6F351A.7030903@ucalgary.ca> <1299158476421-3333311.post@n4.nabble.com> Message-ID: <941871A13165C2418EC144ACB212BDB001CD9BB5@dshsmxoly1504g.dshs.wa.lcl> > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r- > project.org] On Behalf Of LouiseS > Sent: Thursday, March 03, 2011 5:21 AM > To: r-help at r-project.org > Subject: Re: [R] Creating a weighted sample - Help > > Hi > > Thanks for responses. The sample I have taken is a random sample from > H, I, > J and K. The further analysis I want to do is all around bad debt > rates so > it could be (H/H+I)*100 = Bad rate percentage also population stability > calculations that are all related to credit scoring. I want to be able > to > report back on any variable that I have in my data set based on my > factored > counts (A) of 10,000 - so every calculation is based on 10,000 account > in > the correct proportions. > > Does his help? > > Thanks once again > Louise > Louise, It appears that you have done a stratified random sample of four types of accounts and have oversampled the less frequent account types. You definitely should consider doing your analyses using the survey package (or similar package) that appropriately accounts for the sampling variability. Otherwise, your variances / standard errors are going to be incorrect. Dan Daniel J. 
Nordlund Washington State Department of Social and Health Services Planning, Performance, and Accountability Research and Data Analysis Division Olympia, WA 98504-5204 From Bettina.Gruen at jku.at Thu Mar 3 21:11:21 2011 From: Bettina.Gruen at jku.at (Bettina Gruen) Date: Thu, 03 Mar 2011 21:11:21 +0100 Subject: [R] Problem on flexmix when trying to apply signature developed in one model to a new sample In-Reply-To: References: Message-ID: <4D6FF5E9.1040007@jku.at> Jon, if I did understand you correctly the problem is that you did not specify the newdata argument in posterior() correctly. You need to specify it in way such that evaluating the formula uses the correct object. If you have a matrix as dependent variable, you have to use a list which contains an object with the name of the dependent variable which contains the data you want to use for determining the a-posteriori probabilities. The same holds for clusters(). Have a look at the following code: library("flexmix") library("mvtnorm") set.seed(123) BM <- rbind(rmvnorm(100, rep(0, 2)), rmvnorm(100, rep(5, 2))) ex2 <- flexmix(BM ~ 1, k = 2, model = FLXMCmvnorm(diagonal = FALSE)) print(ex2) plotEll(ex2, BM) Data2 <- data.frame(var1 = BM[c(1:5, 101:105), 1], var2 = BM[c(1:5, 101:105), 2]) BM2 <- list(BM = cbind(Data2$var1, Data2$var2)) ProbMCI <- posterior(ex2, BM2) HTH, Bettina On 03/01/2011 05:34 PM, Jon Toledo wrote: > > Problem on flexmix when trying to apply signature developed in one model to a new sample. 
> Dear R Users, R Core Team,
>
> I have a problem when trying to know the classification of the tested
> cases using two variables with the function of flexmix:
>
> After importing the database and creating a matrix:
>
> BM<-cbind(Data$var1,Data$var2)
>
> I see that the best model has 2 groups and use:
>
> ex2 <- flexmix(BM~1, k=2, model=FLXMCmvnorm(diagonal=FALSE))
> print(ex2)
> plotEll(ex2, BM)
>
> Then I want to test to which group one subset of patients belongs, so I
> import a smaller sample of the previous data:
>
> BM2<-data.frame (Data2$var1,Data2$var2)
>
> However when I test the results I get are from the complete training
> sample I used in ex2 and not from the new sample BM2.
>
> ProbMCI<-posterior(ex2, BM2)
>
> And if I do the following I get double the number of entered cases (I
> think because I entered 2 variables):
>
> BM2<-cbind (Data2$var1,Data2$var2)
> p<-posterior(ex2)[BMMCI,]
> max.col(p)
>
> (The same with clusters(ex2)[BM2])
>
> In the future I would like to test the result of this mixture also in
> new samples.
>
> Thank you in advance

--
-------------------------------------------------------------------
Bettina Grün
Institut für Angewandte Statistik / IFAS
Johannes Kepler Universität Linz
Altenbergerstraße 69
4040 Linz, Austria
Tel: +43 732 2468-5889
Fax: +43 732 2468-9846
E-Mail: Bettina.Gruen at jku.at
www.ifas.jku.at

From yanliusun at gmail.com Thu Mar 3 22:09:18 2011
From: yanliusun at gmail.com (yan liu)
Date: Thu, 3 Mar 2011 16:09:18 -0500
Subject: [R] plot, y-axis, uneven scale???
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From Greg.Snow at imail.org Thu Mar 3 22:16:47 2011
From: Greg.Snow at imail.org (Greg Snow)
Date: Thu, 3 Mar 2011 14:16:47 -0700
Subject: [R] Regression with many independent variables
In-Reply-To:
References:
Message-ID:

What you might need to do is create a character string with your formula in it (looping through pairs of variables and using paste or sprintf), then convert that to a formula using the as.formula function.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111

> -----Original Message-----
> From: Matthew Douglas [mailto:matt.douglas01 at gmail.com]
> Sent: Thursday, March 03, 2011 2:09 PM
> To: Greg Snow
> Cc: r-help at r-project.org
> Subject: Re: [R] Regression with many independent variables
>
> Thanks greg,
>
> that formula was exactly what I was looking for. Except now when I
> run it on my data I get the following error:
>
> "Error in model.matrix.default(mt, mf, contrasts) : cannot allocate
> vector of length 2043479998"
>
> I know there are probably many 2-way interactions that are zero so I
> thought I could save space by removing these. Is there some way that
> I can just delete all the two-way interactions that are zero and keep
> the columns that have non-zero entries? I think that will
> significantly cut down the memory needed. Or is there just another way
> to get around this?
>
> thanks,
> Matt
>
> On Tue, Mar 1, 2011 at 3:56 PM, Greg Snow wrote:
> > You can use ^2 to get all 2 way interactions and ^3 to get all 3 way
> > interactions, e.g.:
> >
> > lm(Sepal.Width ~ (. - Sepal.Length)^2, data=iris)
> >
> > The lm.fit function is what actually does the fitting, so you could
> > go directly there, but then you lose the benefits of using . and ^.
> > The Matrix package has ways of dealing with sparse matrices, but I
> > don't know if that would help here or not.
> >
> > You could also just create x'x and x'y matrices directly since the
> > variables are 0/1 then use solve. A lot depends on what you are doing
> > and what questions you are trying to answer.
> >
> > --
> > Gregory (Greg) L. Snow Ph.D.
> > Statistical Data Center
> > Intermountain Healthcare
> > greg.snow at imail.org
> > 801.408.8111
> >
> >> -----Original Message-----
> >> From: Matthew Douglas [mailto:matt.douglas01 at gmail.com]
> >> Sent: Tuesday, March 01, 2011 1:09 PM
> >> To: Greg Snow
> >> Cc: r-help at r-project.org
> >> Subject: Re: [R] Regression with many independent variables
> >>
> >> Hi Greg,
> >>
> >> Thanks for the help, it works perfectly. To answer your question,
> >> there are 339 independent variables but only 10 will be used at one
> >> time. So at any given line of the data set there will be 10 non-zero
> >> entries for the independent variables and the rest will be zeros.
> >>
> >> One more question:
> >>
> >> 1. I still want to find a way to look at the interactions of the
> >> independent variables.
> >>
> >> the regression would look like this:
> >>
> >> y = b12*X1X2 + b23*X2X3 +...+ bk-1k*Xk-1Xk
> >>
> >> so I think the regression in R would look like this:
> >>
> >> lm(MARGIN ~ P235:P236+P236:P237+...., weights = Poss, data = adj0708),
> >>
> >> my problem is that since I have technically 339 independent variables,
> >> when I do this regression I would have 339 Choose 2 = approx 57000
> >> independent variables (a vast majority will be 0s though) so I don't
> >> want to have to write all of these out. Is there a way to do this
> >> quickly in R?
> >>
> >> Also just a curious question that I can't seem to find online:
> >> is there a more efficient model other than lm() that is better for
> >> very sparse data sets like mine?
> >>
> >> Thanks,
> >> Matt
> >>
> >> On Mon, Feb 28, 2011 at 4:30 PM, Greg Snow wrote:
> >> > Don't put the name of the dataset in the formula, use the data
> >> > argument to lm to provide that. A single period (".") on the right
> >> > hand side of the formula will represent all the columns in the data set
> >> > that are not on the left hand side (you can then use "-" to remove any
> >> > other columns that you don't want included on the RHS).
> >> >
> >> > For example:
> >> >
> >> > > lm(Sepal.Width ~ . - Sepal.Length, data=iris)
> >> >
> >> > Call:
> >> > lm(formula = Sepal.Width ~ . - Sepal.Length, data = iris)
> >> >
> >> > Coefficients:
> >> >       (Intercept)       Petal.Length        Petal.Width  Speciesversicolor
> >> >            3.0485             0.1547             0.6234            -1.7641
> >> >  Speciesvirginica
> >> >           -2.1964
> >> >
> >> > But, are you sure that a regression model with 339 predictors will be
> >> > meaningful?
> >> >
> >> > --
> >> > Gregory (Greg) L. Snow Ph.D.
> >> > Statistical Data Center
> >> > Intermountain Healthcare
> >> > greg.snow at imail.org
> >> > 801.408.8111
> >> >
> >> >> -----Original Message-----
> >> >> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Matthew Douglas
> >> >> Sent: Monday, February 28, 2011 1:32 PM
> >> >> To: r-help at r-project.org
> >> >> Subject: [R] Regression with many independent variables
> >> >>
> >> >> Hi,
> >> >>
> >> >> I am trying to use lm() on some data, the code works fine but I would
> >> >> like to use a more efficient way to do this.
> >> >>
> >> >> The data looks like this (the data is very sparse with a few 1s, -1s
> >> >> and the rest 0s):
> >> >>
> >> >> > head(adj0708)
> >> >>       MARGIN Poss P235 P247 P703 P218 P430 P489 P83 P307 P337 ....
> >> >> 1   64.28571   29    0    0    0    0    0    0   0    0    0  0  0  0  0
> >> >> 2 -100.00000    6    0    0    0    0    0    0   0    1    0  0  0  0  0
> >> >> 3  100.00000    4    0    0    0    0    0    0   0    1    0  0  0  0  0
> >> >> 4  -33.33333    7    0    0    0    0    0    0   0    0    0  0  0  0  0
> >> >> 5  200.00000    2    0    0    0    0    0    0   0    0    0  0 -1  0  0
> >> >> 6  -83.33333   12    0   -1    0    0    0    0   0    0    0  0  0  0  0
> >> >>
> >> >> adj0708 is actually a 35657x341 data set. Each column after "Poss" is
> >> >> an independent variable, the dependent variable is "MARGIN" and it is
> >> >> weighted by "Poss"
> >> >>
> >> >> The regression is below:
> >> >> fit.adj0708 <- lm( adj0708$MARGIN~adj0708$P235 + adj0708$P247 +
> >> >> adj0708$P703 + adj0708$P430 + adj0708$P489 + adj0708$P218 +
> >> >> adj0708$P605 + adj0708$P337 + .... +
> >> >> adj0708$P510,weights=adj0708$Poss)
> >> >>
> >> >> I have two questions:
> >> >>
> >> >> 1. Is there a way to condense how I write the independent variables
> >> >> in the lm(), instead of having such a long line of code (I have 339
> >> >> independent variables to be exact)?
> >> >> 2. I would like to pair the data to look at a regression of the
> >> >> interactions between two independent variables. I think it would look
> >> >> something like this....
> >> >> fit.adj0708 <- lm( adj0708$MARGIN~adj0708$P235:adj0708$P247 +
> >> >> adj0708$P703:adj0708$P430 + adj0708$P489:adj0708$P218 +
> >> >> adj0708$P605:adj0708$P337 + ....,weights=adj0708$Poss)
> >> >> but there will be 339 Choose 2 combinations, so a lot of independent
> >> >> variables! Is there a more efficient way of writing this code? Is
> >> >> there a way I can do this?
> >> >>
> >> >> Thanks,
> >> >> Matt
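
The suggestion made earlier in this thread (build the formula as a character string with paste, then convert it with as.formula) might be sketched as below. This is a toy illustration, not the posters' code: the data frame d, its columns P1..P3 and its values are hypothetical stand-ins for adj0708, and dropping pairs whose product is zero in every row is an assumption about what "interactions that are zero" means.

```r
# Hypothetical toy data standing in for adj0708 (names and values are made up)
d <- data.frame(
  MARGIN = c(64.3, -100, 100, -33.3, 200, -83.3, 10, -20),
  Poss   = c(29, 6, 4, 7, 2, 12, 5, 9),
  P1     = c(1, 1, 0, 0, 1, 0, 1, 0),
  P2     = c(1, 0, 1, 0, 1, 0, 0, 1),
  P3     = c(0, 0, 1, 1, 0, 1, 0, 0)
)

preds <- setdiff(names(d), c("MARGIN", "Poss"))
pairs <- combn(preds, 2)           # one column per pair of predictors

# Keep a pair only if its product is non-zero in at least one row
keep  <- apply(pairs, 2, function(p) any(d[[p[1]]] * d[[p[2]]] != 0))
terms <- apply(pairs[, keep, drop = FALSE], 2, paste, collapse = ":")

# Assemble the formula string and convert it
fml <- as.formula(paste("MARGIN ~", paste(terms, collapse = " + ")))
fit <- lm(fml, data = d, weights = Poss)
print(fml)
```

Filtering before the formula is built means the all-zero interaction columns never enter the model matrix, which is where the allocation error was raised; whether that is enough for 339 predictors depends on how many pairs survive the filter.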

From Greg.Snow at imail.org Thu Mar 3 22:18:29 2011
From: Greg.Snow at imail.org (Greg Snow)
Date: Thu, 3 Mar 2011 14:18:29 -0700
Subject: [R] plot, y-axis, uneven scale???
In-Reply-To:
References:
Message-ID:

You probably want to use the gap.plot function in the plotrix package.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of yan liu
> Sent: Thursday, March 03, 2011 2:09 PM
> To: r-help at r-project.org
> Subject: [R] plot, y-axis, uneven scale???
>
> Hello,
>
> I have a question about the y-axis of plots. Actually I had about 60
> values. About 80 percent of these values are less than 0.2, then the other
> 20 percent values are more than 4, max is 10. So when I plot these values
> together, the y-axis's range will go 0 to 10, and my major values (80%
> values <0.2) will be pressed around 0 on the bottom, while other several
> dots will scatter in the major part of the plot area.
>
> Does anyone know how to assign the y-axis with uneven (jumping) ticks with
> same width, like 0, 0.1, 1, 10, 100 with the same width between?
>
> Thanks a lot!
>
> Yan

From m_hofert at web.de Thu Mar 3 22:35:28 2011
From: m_hofert at web.de (Marius Hofert)
Date: Thu, 3 Mar 2011 22:35:28 +0100
Subject: [R] lattice: How to increase space between ticks and labels of z-axis?
Message-ID:

Dear expeRts,

How can I increase the space between the ticks and the labels in the wireframe plot below? I tried some variations with par.settings=list(..) but it just didn't work.

Many thanks,

Marius

library(lattice)
u <- seq(0, 1, length.out=20)
grid <- expand.grid(x=u, y=u)
z <- apply(grid, 1, function(x) 1/(x[1]*x[2]+0.0001))
wireframe(z~grid[,1]*grid[,2], aspect=1, scales=list(col=1, arrows=FALSE))

From ggrothendieck at gmail.com Thu Mar 3 22:41:39 2011
From: ggrothendieck at gmail.com (Gabor Grothendieck)
Date: Thu, 3 Mar 2011 16:41:39 -0500
Subject: [R] plot, y-axis, uneven scale???
In-Reply-To:
References:
Message-ID:

On Thu, Mar 3, 2011 at 4:09 PM, yan liu wrote:
> Hello,
>
> I have a question about the y-axis of plots. Actually I had about 60
> values. About 80 percent of these values are less than 0.2, then the other
> 20 percent values are more than 4, max is 10. So when I plot these values
> together, the y-axis's range will go 0 to 10, and my major values (80%
> values <0.2) will be pressed around 0 on the bottom, while other several
> dots will scatter in the major part of the plot area.
>
> Does anyone know how to assign the y-axis with uneven (jumping) ticks with
> same width, like 0, 0.1, 1, 10, 100 with the same width between?
>
> Thanks a lot!
>
> Yan

See the log= argument to plot (it's actually listed under ?plot.default). Also eaxis in sfsmisc.

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

From jholtman at gmail.com Thu Mar 3 23:04:52 2011
From: jholtman at gmail.com (jim holtman)
Date: Thu, 3 Mar 2011 17:04:52 -0500
Subject: [R] 'merge' function creating duplicate columns names in the output
Message-ID:

The "merge" command is creating duplicate column names in a dataframe that is the result of the merge. The following is the 'merge' command:

x <- merge(invType
         , allocSlots
         , by.x = 'index'
         , by.y = 'indx'
         , all.x = TRUE
         )

The 'invType' dataframe was the result of a previous merge and has the following column names that are probably causing the problem:

height.x
height.y
height

> str(invType)
'data.frame': 2219 obs.
of 30 variables:
 $ loc     : chr  "F0AA63" "F0AA65" "F0AA73" "F0AA75" ...
 $ KLN     : int  3569383 3515513 3565497 3555138 3565162 3555001 3565139 3555886 3565796 3556647 ...
 $ comm    : int  451 57 560 40 560 39 560 40 560 46 ...
 $ case    : num  7.70e+09 1.00e+12 3.00e+12 1.00e+12 1.11e+09 ...
 $ desc    : chr  "PGPR RTC BONELESS WINGS" "GRTN POT CRNCH FISH FILET" "TYSON CORNISH HENS TWN PK" "GGNT RSTD POT GRLC HERB" ...
 $ height.x: num  7.2 12.6 11 7.8 6.8 10.1 11.2 10 11 10.5 ...
 $ length  : num  14.5 15.8 20 15.6 22.2 15 20.2 15 17 19.8 ...
 $ weight  : num  11 16.3 39 11 35.6 6.5 36 4 30 12.5 ...
 $ width   : num  9.7 9.2 14.3 8 15.2 7.5 13.2 8.5 13 10 ...
 $ high    : int  5 2 3 3 4 3 3 3 3 3 ...
 $ pqty    : int  65 26 18 45 20 45 21 39 24 30 ...
 $ boh     : int  4372 58 1199 51 836 116 64 312 371 389 ...
 $ awm     : num  694 44.3 53.8 35 0.8 ...
 $ cubes   : num  0.586 1.06 1.821 0.563 1.328 ...
 $ pallet  : num  42 31.2 39 29.4 33.2 36.3 39.6 36 39 37.5 ...
 $ adm     : num  99.143 6.329 7.686 5 0.114 ...
 $ tie     : num  13 13 6 15 5 15 7 13 8 10 ...
 $ origComm: int  457 57 547 40 541 39 552 40 552 46 ...
 $ days    : num  0.656 6.162 2.342 11.998 216.853 ...
 $ class   : chr  "single" "double" "single" "double" ...
 $ top.x   : logi  TRUE TRUE FALSE TRUE FALSE TRUE ...
 $ comm_ord: Factor w/ 30 levels "37A","38A","43A",..: 25 5 23 15 23 8 23 15 23 16 ...
 $ type.x  : int  2 2 2 2 2 2 2 2 2 2 ...
 $ height.y: num  47 47 47 47 47 47 47 47 47 47 ...
 $ top.y   : logi  FALSE TRUE FALSE TRUE FALSE TRUE ...
 $ noChange: logi  FALSE FALSE FALSE FALSE FALSE FALSE ...
 $ type.y  : int  2 2 2 2 2 2 2 2 2 2 ...
 $ depth   : num  48 48 48 48 48 48 48 48 48 48 ...
 $ height  : num  47 47 47 47 47 47 47 47 47 47 ...
 $ index   : int  1 2 3 4 5 6 7 8 9 10 ...

Now the "allocSlots" dataframe also has a column named 'height':

> str(allocSlots)
'data.frame':   2462 obs. of  6 variables:
 $ loc   : chr  "F1AA02" "F1AA12" "F1AA22" "F1AA32" ...
 $ height: num  72 72 72 72 72 72 72 72 72 72 ...
 $ depth : num  48 48 48 48 48 48 48 48 48 48 ...
 $ bay   : chr  "F1AA0" "F1AA0" "F1AA2" "F1AA2" ...
 $ indx  : int  1675 1617 1386 1096 1077 963 816 471 275 259 ...
 $ type  : int  1 1 1 1 1 1 1 1 1 1 ...

Here is the result of the 'merge' (notice that there are now two
'height.x' and two 'height.y' columns in the dataframe):

> str(x)
'data.frame':   2219 obs. of  35 variables:
 $ index   : int  1 2 3 4 5 6 7 8 9 10 ...
 $ loc.x   : chr  "F0AA63" "F0AA65" "F0AA73" "F0AA75" ...
 $ KLN     : int  3569383 3515513 3565497 3555138 3565162 3555001 3565139 3555886 3565796 3556647 ...
 $ comm    : int  451 57 560 40 560 39 560 40 560 46 ...
 $ case    : num  7.70e+09 1.00e+12 3.00e+12 1.00e+12 1.11e+09 ...
 $ desc    : chr  "PGPR RTC BONELESS WINGS" "GRTN POT CRNCH FISH FILET" "TYSON CORNISH HENS TWN PK" "GGNT RSTD POT GRLC HERB" ...
 $ height.x: num  7.2 12.6 11 7.8 6.8 10.1 11.2 10 11 10.5 ...
 $ length  : num  14.5 15.8 20 15.6 22.2 15 20.2 15 17 19.8 ...
 $ weight  : num  11 16.3 39 11 35.6 6.5 36 4 30 12.5 ...
 $ width   : num  9.7 9.2 14.3 8 15.2 7.5 13.2 8.5 13 10 ...
 $ high    : int  5 2 3 3 4 3 3 3 3 3 ...
 $ pqty    : int  65 26 18 45 20 45 21 39 24 30 ...
 $ boh     : int  4372 58 1199 51 836 116 64 312 371 389 ...
 $ awm     : num  694 44.3 53.8 35 0.8 ...
 $ cubes   : num  0.586 1.06 1.821 0.563 1.328 ...
 $ pallet  : num  42 31.2 39 29.4 33.2 36.3 39.6 36 39 37.5 ...
 $ adm     : num  99.143 6.329 7.686 5 0.114 ...
 $ tie     : num  13 13 6 15 5 15 7 13 8 10 ...
 $ origComm: int  457 57 547 40 541 39 552 40 552 46 ...
 $ days    : num  0.656 6.162 2.342 11.998 216.853 ...
 $ class   : chr  "single" "double" "single" "double" ...
 $ top.x   : logi  TRUE TRUE FALSE TRUE FALSE TRUE ...
 $ comm_ord: Factor w/ 30 levels "37A","38A","43A",..: 25 5 23 15 23 8 23 15 23 16 ...
 $ type.x  : int  2 2 2 2 2 2 2 2 2 2 ...
 $ height.y: num  47 47 47 47 47 47 47 47 47 47 ...
 $ top.y   : logi  FALSE TRUE FALSE TRUE FALSE TRUE ...
 $ noChange: logi  FALSE FALSE FALSE FALSE FALSE FALSE ...
 $ type.y  : int  2 2 2 2 2 2 2 2 2 2 ...
 $ depth.x : num  48 48 48 48 48 48 48 48 48 48 ...
 $ height.x: num  47 47 47 47 47 47 47 47 47 47 ...
 $ loc.y   : chr  "F1KC22" "F1BM34" "F1HC73" "F1FJ65" ...
 $ height.y: num  72 44 72 44 72 44 72 44 72 72 ...
 $ depth.y : num  48 48 48 48 48 48 48 48 48 48 ...
 $ bay     : chr  "F1KC2" "F1BM2" "F1HC7" "F1FJ5" ...
 $ type    : int  1 2 1 2 1 2 1 2 1 1 ...

My workaround is to rename one of the "height" columns to avoid the
problem, but someone else might stumble on the same error. Should we
expect 'merge' to ensure that the column names are unique in the
result?

> sessionInfo()
R version 2.12.1 (2010-12-16)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?

From dwinsemius at comcast.net Thu Mar 3 23:20:31 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Thu, 3 Mar 2011 16:20:31 -0600
Subject: [R] Calling a function to store values
In-Reply-To:
References: <1299168751676-3333644.post@n4.nabble.com>
Message-ID: <11D7B732-DB41-4BD8-99C9-30BFD548830D@comcast.net>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From crabak at acm.org Thu Mar 3 20:16:26 2011
From: crabak at acm.org (csrabak)
Date: Thu, 3 Mar 2011 17:16:26 -0200
Subject: [R] Ordering several histograms
In-Reply-To: <1299160800582-3333382.post@n4.nabble.com>
References: <1299160800582-3333382.post@n4.nabble.com>
Message-ID:

On 3/3/2011 12:00, djbirdnerd wrote:
> Hallo everyone,
>
> I want to evaluate the change of the distribution for several size classes.
> How can i order these separate histograms with the same y-axis along a
> common x-axis according to their size classes. It would like it to look a
> bit like this
> (http://addictedtor.free.fr/graphiques/RGraphGallery.php?graph=109) without
> the quantile regression.
> I can produce the separate histograms, but have no
> clue how to merge them. I can put them next to each other with separate x- and
> y-axes.
>
> Much obliged,
>
Kenneth,

Have you tried to adapt the R code published at the link you posted?

--
Cesar Rabak

From hkwok at emedharbor.edu Thu Mar 3 19:59:41 2011
From: hkwok at emedharbor.edu (Kwok, Heemun)
Date: Thu, 3 Mar 2011 10:59:41 -0800
Subject: [R] embed latex beamer sans serif default font into R plot
Message-ID: <454776AB56045147BB90DDFC184509F901A636B7@hawkeye-2.rei.edu>

Hello,

I have seen instructions on how to embed LaTeX Computer Modern fonts
into R, but these are the default serif fonts. I am trying to embed the
default font used for LaTeX beamer (theme Warsaw), which is a sans serif
font and may be the default LaTeX Computer Modern sans serif font. Does
anyone know the names of the font files?

Thanks
Heemun

-------------------------------------------------
Heemun Kwok, M.D.
Research Fellow
Harbor-UCLA Department of Emergency Medicine
1000 West Carson Street, Box 21
Torrance, CA 90509-2910
office 310-222-3501, fax 310-212-6101

From ianleinwand at hotmail.com Thu Mar 3 19:44:12 2011
From: ianleinwand at hotmail.com (leinwand)
Date: Thu, 3 Mar 2011 10:44:12 -0800 (PST)
Subject: [R] Ploting Histogram with Y axis is percentage of sample for each bin
Message-ID: <1299177852044-3333961.post@n4.nabble.com>

I'm trying to do something very simple...

I want to plot a histogram where the y axis represents the percentage of
the total sample that each bin represents.

I know how to plot a histogram with the counts and density... but can't
find anything that gives me percent of sample on the y axis.

Any help is appreciated.

Below is the script I'm working with:

par(mfrow=c(1,2))
hist(ISIS$ASH_BA1K_ISIS[ISIS$Pest_Status=="-1"], main="Ash BA 1K Negative Detection", xlab="ASH BA 1K")
lines(density(ISIS$ASH_BA1K_ISIS), col="blue")
hist(ISIS$ASH_BA1K_ISIS[ISIS$Pest_Status=="1"], main="Ash BA 1K Positive Detection", xlab="Ash BA 1K")
lines(density(ISIS$ASH_BA1K_ISIS), col="red")

--
View this message in context: http://r.789695.n4.nabble.com/Ploting-Histogram-with-Y-axis-is-percentage-of-sample-for-each-bin-tp3333961p3333961.html
Sent from the R help mailing list archive at Nabble.com.

From jcp512 at york.ac.uk Thu Mar 3 21:03:49 2011
From: jcp512 at york.ac.uk (Jorseff)
Date: Thu, 3 Mar 2011 12:03:49 -0800 (PST)
Subject: [R] Scatter plot with multiple data sets
Message-ID: <1299182629792-3334096.post@n4.nabble.com>

Hi, I have multiple (6) data sets which I would like to plot together on
one scatter graph. The reason they are all separate is that I require a
different symbol to be plotted for each set. Could somebody advise on
how to do this?

Many thanks,

Joe

--
View this message in context: http://r.789695.n4.nabble.com/Scatter-plot-with-multiple-data-sets-tp3334096p3334096.html
Sent from the R help mailing list archive at Nabble.com.

From johnsen at fas.harvard.edu Thu Mar 3 20:56:50 2011
From: johnsen at fas.harvard.edu (Sverre Stausland)
Date: Thu, 3 Mar 2011 14:56:50 -0500
Subject: [R] Interpreting the coefficient of an interaction between continuous variables in a regression model
Message-ID:

Hello,

my question is triggered by an actual model I am running, but I will
pose it as a very general question with a hypothetical example.

Take the following regression model: I have a binomial dependent
variable "Happiness", whose two values are 0 (=unhappy) and 1 (=happy).
My two independent continuous variables are "Income" and "Children".
Imagine that "Income" has no significant effect, and that "Children" has
a significant positive effect (more children give more happiness).

Now I am interested in the interaction between "Income" and "Children",
i.e. 'Income : Children'. Say that the model finds a non-significant
negative coefficient. How do I interpret that? If I understand it
correctly, the model is asking "When the numerical values in "Income"
and "Children" are both increasing, does it significantly affect the
dependent variable?".

But what if the interaction between them is inverse - such that a
decreasing value in "Income" paired with an increasing value in
"Children" will significantly affect the dependent variable "Happiness".
Will the model not be able to capture that connection without doing some
additional tweaking?

Thank you
Sverre

From jon_d_cooke at yahoo.co.uk Thu Mar 3 22:58:28 2011
From: jon_d_cooke at yahoo.co.uk (JonC)
Date: Thu, 3 Mar 2011 13:58:28 -0800 (PST)
Subject: [R] creating a count variable in R
Message-ID: <1299189508645-3334288.post@n4.nabble.com>

Hi R helpers,

I'm trying to create a count in R, but as there is no retain function
like in SAS I'm running into difficulties.

I have the following:

Date_var        and wish to obtain      Date_var        Count_var
01/01/2011                              01/01/2011      1
01/01/2011                              01/01/2011      2
02/01/2011                              02/01/2011      1
02/01/2011                              02/01/2011      2
02/01/2011                              02/01/2011      3
02/01/2011                              02/01/2011      4
03/01/2011                              03/01/2011      1
03/01/2011                              03/01/2011      2
03/01/2011                              03/01/2011      3
03/01/2011                              03/01/2011      4
03/01/2011                              03/01/2011      5
03/01/2011                              03/01/2011      6
03/01/2011                              03/01/2011      7

As can be seen above the count var is re-initialised every time a new
date is found. I hope this is easy.

Many thanks in advance for assistance. It is appreciated.

Cheers

Jon

--
View this message in context: http://r.789695.n4.nabble.com/creating-a-count-variable-in-R-tp3334288p3334288.html
Sent from the R help mailing list archive at Nabble.com.
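
One possible way to get the per-date counter asked for in the message above is sketched here. It is only an illustration, not a reply from the thread: the data frame `dat` is a hypothetical stand-in that copies the dates from the question, and `Count_rle` is an extra column added purely for comparison. The sketch uses base R's `ave()` to number the rows within each date, and shows `sequence(rle(...)$lengths)` as an alternative that assumes equal dates sit next to each other.

```r
# Hypothetical data frame mirroring the dates shown in the question
dat <- data.frame(
  Date_var = rep(c("01/01/2011", "02/01/2011", "03/01/2011"), times = c(2, 4, 7)),
  stringsAsFactors = FALSE
)

# ave() applies FUN separately within each group defined by Date_var;
# seq_along() numbers the rows of each group 1, 2, 3, ...
dat$Count_var <- ave(seq_along(dat$Date_var), dat$Date_var, FUN = seq_along)

# Alternative, assuming rows with the same date are adjacent (sorted data):
# rle() gives the length of each run, sequence() expands them to 1:n each
dat$Count_rle <- sequence(rle(dat$Date_var)$lengths)

print(dat)
```

The `ave()` version does not depend on row order, whereas the `rle()` version restarts the counter whenever the date changes, which is closer to how a SAS retain step behaves on sorted data.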
From jpmaroco at gmail.com Thu Mar 3 20:39:52 2011 From: jpmaroco at gmail.com (jpmaroco) Date: Thu, 3 Mar 2011 11:39:52 -0800 (PST) Subject: [R] Probabilities greather than 1 in HIST In-Reply-To: <49E76DF37649DC48A4CE882BC8CE51C9019204BF@nagmail2.nag.co.uk> References: <1299161001818-3333388.post@n4.nabble.com> <2C8232C9-DF3B-4B72-8719-0C2788761E32@comcast.net> <007801cbd9c0$2fed7330$8fc85990$@gmail.com> <49E76DF37649DC48A4CE882BC8CE51C9019204BF@nagmail2.nag.co.uk> Message-ID: <001201cbd9da$c7581e90$56085bb0$@gmail.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From kparamas at asu.edu Thu Mar 3 20:49:42 2011 From: kparamas at asu.edu (Kumaraguru Paramasivam) Date: Thu, 3 Mar 2011 12:49:42 -0700 Subject: [R] Calling a function to store values In-Reply-To: References: <1299168751676-3333644.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From leebli520 at gmail.com Thu Mar 3 22:09:12 2011 From: leebli520 at gmail.com (Bobby Lee) Date: Thu, 3 Mar 2011 16:09:12 -0500 Subject: [R] Help center Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From matt.douglas01 at gmail.com Thu Mar 3 22:08:46 2011 From: matt.douglas01 at gmail.com (Matthew Douglas) Date: Thu, 3 Mar 2011 16:08:46 -0500 Subject: [R] Regression with many independent variables In-Reply-To: References: Message-ID: Thanks greg, that formula was exactly what I was looking for. Except now when I run it on my data I get the following error: "Error in model.matrix.default(mt, mf, contrasts) : cannot allocate vector of length 2043479998" I know there are probably many 2-way interactions that are zero so I thought I could save space by removing these. Is there some way that can just delete all the two way interactions that are zero and keep the columns that have non-zero entries? I think that will significantly cut down the memory needed. 
Or is there just another way to get around this?

thanks,
Matt

On Tue, Mar 1, 2011 at 3:56 PM, Greg Snow wrote:
> You can use ^2 to get all 2 way interactions and ^3 to get all 3 way interactions, e.g.:
>
> lm(Sepal.Width ~ (. - Sepal.Length)^2, data=iris)
>
> The lm.fit function is what actually does the fitting, so you could go directly there, but then you lose the benefits of using . and ^. The Matrix package has ways of dealing with sparse matrices, but I don't know if that would help here or not.
>
> You could also just create x'x and x'y matrices directly since the variables are 0/1 then use solve. A lot depends on what you are doing and what questions you are trying to answer.
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> greg.snow at imail.org
> 801.408.8111
>
>
>> -----Original Message-----
>> From: Matthew Douglas [mailto:matt.douglas01 at gmail.com]
>> Sent: Tuesday, March 01, 2011 1:09 PM
>> To: Greg Snow
>> Cc: r-help at r-project.org
>> Subject: Re: [R] Regression with many independent variables
>>
>> Hi Greg,
>>
>> Thanks for the help, it works perfectly. To answer your question,
>> there are 339 independent variables but only 10 will be used at one
>> time. So at any given line of the data set there will be 10 non zero
>> entries for the independent variables and the rest will be zeros.
>>
>> One more question:
>>
>> 1. I still want to find a way to look at the interactions of the
>> independent variables.
>>
>> the regression would look like this:
>>
>> y = b12*X1X2 + b23*X2X3 +...+ bk-1k*Xk-1Xk
>>
>> so I think the regression in R would look like this:
>>
>> lm(MARGIN, P235:P236+P236:P237+....,weights = Poss, data = adj0708),
>>
>> my problem is that since I have technically 339 independent variables,
>> when I do this regression I would have 339 Choose 2 = approx 57000
>> independent variables (a vast majority will be 0s though) so I don't
>> want to have to write all of these out. Is there a way to do this
>> quickly in R?
>>
>> Also just a curious question that I can't seem to find online:
>> is there a more efficient model other than lm() that is better for
>> very sparse data sets like mine?
>>
>> Thanks,
>> Matt
>>
>>
>> On Mon, Feb 28, 2011 at 4:30 PM, Greg Snow wrote:
>> > Don't put the name of the dataset in the formula, use the data
>> > argument to lm to provide that. A single period (".") on the right
>> > hand side of the formula will represent all the columns in the data set
>> > that are not on the left hand side (you can then use "-" to remove any
>> > other columns that you don't want included on the RHS).
>> >
>> > For example:
>> >
>> >> lm(Sepal.Width ~ . - Sepal.Length, data=iris)
>> >
>> > Call:
>> > lm(formula = Sepal.Width ~ . - Sepal.Length, data = iris)
>> >
>> > Coefficients:
>> >       (Intercept)       Petal.Length        Petal.Width  Speciesversicolor
>> >            3.0485             0.1547             0.6234            -1.7641
>> >  Speciesvirginica
>> >           -2.1964
>> >
>> >
>> > But, are you sure that a regression model with 339 predictors will be
>> > meaningful?
>> >
>> > --
>> > Gregory (Greg) L. Snow Ph.D.
>> > Statistical Data Center
>> > Intermountain Healthcare
>> > greg.snow at imail.org
>> > 801.408.8111
>> >
>> >
>> >> -----Original Message-----
>> >> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Matthew Douglas
>> >> Sent: Monday, February 28, 2011 1:32 PM
>> >> To: r-help at r-project.org
>> >> Subject: [R] Regression with many independent variables
>> >>
>> >> Hi,
>> >>
>> >> I am trying to use lm() on some data, the code works fine but I would
>> >> like to use a more efficient way to do this.
>> >>
>> >> The data looks like this (the data is very sparse with a few 1s, -1s
>> >> and the rest 0s):
>> >>
>> >> > head(adj0708)
>> >>       MARGIN Poss P235 P247 P703 P218 P430 P489 P83 P307 P337....
>> >> 1   64.28571   29    0    0    0    0    0    0   0    0    0    0    0    0    0
>> >> 2 -100.00000    6    0    0    0    0    0    0   0    1    0    0    0    0    0
>> >> 3  100.00000    4    0    0    0    0    0    0   0    1    0    0    0    0    0
>> >> 4  -33.33333    7    0    0    0    0    0    0   0    0    0    0    0    0    0
>> >> 5  200.00000    2    0    0    0    0    0    0   0    0    0    0   -1    0    0
>> >> 6  -83.33333   12    0   -1    0    0    0    0   0    0    0    0    0    0    0
>> >>
>> >> adj0708 is actually a 35657x341 data set. Each column after "Poss" is
>> >> an independent variable, the dependent variable is "MARGIN" and it is
>> >> weighted by "Poss"
>> >>
>> >>
>> >> The regression is below:
>> >> fit.adj0708 <- lm( adj0708$MARGIN~adj0708$P235 + adj0708$P247 +
>> >> adj0708$P703 + adj0708$P430 + adj0708$P489 + adj0708$P218 +
>> >> adj0708$P605 + adj0708$P337 + .... +
>> >> adj0708$P510,weights=adj0708$Poss)
>> >>
>> >> I have two questions:
>> >>
>> >> 1. Is there a way to condense how I write the independent variables
>> >> in the lm(), instead of having such a long line of code (I have 339
>> >> independent variables to be exact)?
>> >> 2. I would like to pair the data to look at a regression of the
>> >> interactions between two independent variables. I think it would look
>> >> something like this....
>> >> fit.adj0708 <- lm( adj0708$MARGIN~adj0708$P235:adj0708$P247 +
>> >> adj0708$P703:adj0708$P430 + adj0708$P489:adj0708$P218 +
>> >> adj0708$P605:adj0708$P337 + ....,weights=adj0708$Poss)
>> >> but there will be 339 Choose 2 combinations, so a lot of independent
>> >> variables! Is there a more efficient way of writing this code. Is
>> >> there a way I can do this?
>> >> >> >> Thanks, >> >> Matt >> >> >> >> ______________________________________________ >> >> R-help at r-project.org mailing list >> >> https://stat.ethz.ch/mailman/listinfo/r-help >> >> PLEASE do read the posting guide http://www.R-project.org/posting- >> >> guide.html >> >> and provide commented, minimal, self-contained, reproducible code. >> > > From Matt.Shotwell at Vanderbilt.Edu Thu Mar 3 20:04:11 2011 From: Matt.Shotwell at Vanderbilt.Edu (Matt Shotwell) Date: Thu, 3 Mar 2011 13:04:11 -0600 Subject: [R] Developing a web crawler / R "webkit" or something similar? [off topic] In-Reply-To: References: <1299144164900-3332993.post@n4.nabble.com> Message-ID: <4D6FE62B.70209@Vanderbilt.edu> On 03/03/2011 08:07 AM, Mike Marchywka wrote: > > > > > > > >> Date: Thu, 3 Mar 2011 01:22:44 -0800 >> From: antujsrv at gmail.com >> To: r-help at r-project.org >> Subject: [R] Developing a web crawler >> >> Hi, >> >> I wish to develop a web crawler in R. I have been using the functionalities >> available under the RCurl package. >> I am able to extract the html content of the site but i don't know how to go > > In general this can be a big effort but there may be things in > text processing packages you could adapt to execute html and javascript. > However, I guess what I'd be looking for is something like a "webkit" > package or other open source browser with or without an "R" interface. > This actually may be an ideal solution for a lot of things as you get > all the content handlers of at least some browser. > > > Now that you mention it, I wonder if there are browser plugins to handle > "R" content ( I'd have to give this some thought, put a script up as > a web page with mime type "test/R" and have it execute it in R. ) There are server-side solutions for this sort of thing. See http://rapache.net/ . Also, there was a string of messages on R-devel some years ago addressing the mime type issue; beginning here: http://tolstoy.newcastle.edu.au/R/devel/05/11/3054.html . 
Though I don't know whether there was a resolution. Some suggestions were text/x-R, text/x-Rd, application/x-RData. -Matt > > > >> about analyzing the html formatted document. >> I wish to know the frequency of a word in the document. I am only acquainted >> with analyzing data sets. >> So how should i go about analyzing data that is not available in table >> format. >> >> Few chunks of code that i wrote: >> w<- >> getURL("http://www.amazon.com/Kindle-Wireless-Reader-Wifi-Graphite/dp/B003DZ1Y8Q/ref=dp_reviewsanchor#FullQuotes") >> write.table(w,"test.txt") >> t<- readLines(w) >> >> readLines also didnt prove out to be of any help. >> >> Any help would be highly appreciated. Thanks in advance. >> >> >> -- >> View this message in context: http://r.789695.n4.nabble.com/Developing-a-web-crawler-tp3332993p3332993.html >> Sent from the R help mailing list archive at Nabble.com. >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Matthew S Shotwell Assistant Professor School of Medicine Department of Biostatistics Vanderbilt University From sclare at ualberta.ca Thu Mar 3 22:02:19 2011 From: sclare at ualberta.ca (Shari Clare) Date: Thu, 3 Mar 2011 14:02:19 -0700 Subject: [R] PCA - scores Message-ID: <131E1151-2802-4793-9901-5F8E7EBB3FB8@ualberta.ca> An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From simon.boardman at gmail.com Thu Mar 3 18:11:22 2011 From: simon.boardman at gmail.com (sboardman) Date: Thu, 3 Mar 2011 09:11:22 -0800 (PST) Subject: [R] Normalising proportional binomial data that has a set top value Message-ID: <1299172282160-3333754.post@n4.nabble.com> I've got some data in proportional format which has a set maximum value of 0.5 and I want to know how to normalise it. I've calculated a lateralisation index to determine whether an organism deviates from an equal number of left and right turns within a trial (as opposed to measuring the proportion of left and right turns in total). My proportions are therefore between 0 (equal number of left and right turns) and 0.5 (all turns in the same direction, either left or right). I want to analyse the data using an ANOVA to test repeated measures for the repeatability coefficient but recognise I will have to normalise the data for it to work properly. Any help is much appreciated. SB -- View this message in context: http://r.789695.n4.nabble.com/Normalising-proportional-binomial-data-that-has-a-set-top-value-tp3333754p3333754.html Sent from the R help mailing list archive at Nabble.com. From dwinsemius at comcast.net Thu Mar 3 23:36:24 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Thu, 3 Mar 2011 16:36:24 -0600 Subject: [R] Ploting Histogram with Y axis is percentage of sample for each bin In-Reply-To: <1299177852044-3333961.post@n4.nabble.com> References: <1299177852044-3333961.post@n4.nabble.com> Message-ID: <0CACB00C-D103-4A3D-B7B5-13FF9798EFA9@comcast.net> On Mar 3, 2011, at 12:44 PM, leinwand wrote: > I'm trying to do something very simple... > > I wan to plot a histogram where the y axis represent the percentage > of the > total sample that each bin represents. > > I know how to plot a histogram with the counts and density... but > can't find > anything that gives me perenct of sample on the y axis. 
>
> Any help is appriciated
>
> Below is the script I'm working with
>
> par(mfrow=c(1,2))
> hist(ISIS$ASH_BA1K_ISIS[ISIS$Pest_Status=="-1"], main="Ash BA 1K Negative Detection", xlab="ASH BA 1K")
> lines(density(ISIS$ASH_BA1K_ISIS), col="blue")
> hist(ISIS$ASH_BA1K_ISIS[ISIS$Pest_Status=="1"], main="Ash BA 1K Positive Detection", xlab="Ash BA 1K")
> lines(density(ISIS$ASH_BA1K_ISIS), col="red")

Script but no data. A pretty much identical question was asked and
answered earlier today. Despite the fact that the OP there doesn't seem
to understand or be able to apply the directions, you should see if the
thread "Probabilities greather than 1 in HIST" answers your question.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

From matt.douglas01 at gmail.com Thu Mar 3 23:43:00 2011
From: matt.douglas01 at gmail.com (Matthew Douglas)
Date: Thu, 3 Mar 2011 17:43:00 -0500
Subject: [R] Regression with many independent variables
In-Reply-To:
References:
Message-ID:

Thanks for getting back to me so quickly, Greg.

I'm not quite sure how to do what you just said; is there an example
that you can show? I understand how to create the string with a formula
in it, but I'm not sure how to loop through the pairs of variables. How
do I first get these 2-way interaction variables? I can no longer use
the "^", right?
Sorry for so many questions, Matt On Thu, Mar 3, 2011 at 4:16 PM, Greg Snow wrote: > What you might need to do is create a character string with your formula in it (looping through pairs of variables and using paste or sprint) then convert that to a formula using the as.formula function. > > -- > Gregory (Greg) L. Snow Ph.D. > Statistical Data Center > Intermountain Healthcare > greg.snow at imail.org > 801.408.8111 > > >> -----Original Message----- >> From: Matthew Douglas [mailto:matt.douglas01 at gmail.com] >> Sent: Thursday, March 03, 2011 2:09 PM >> To: Greg Snow >> Cc: r-help at r-project.org >> Subject: Re: [R] Regression with many independent variables >> >> Thanks greg, >> >> ?that formula was exactly what I was looking for. Except now when I >> run it on my data I get the following error: >> >> "Error in model.matrix.default(mt, mf, contrasts) : cannot allocate >> vector of length 2043479998" >> >> I know there are probably many 2-way interactions that are zero so I >> thought I could save space by removing these. Is there some way that >> can just delete all the two way interactions that are zero and keep >> the columns that have non-zero entries? I think that will >> significantly cut down the memory needed. Or is there just another way >> to get around this? >> >> thanks, >> Matt >> >> On Tue, Mar 1, 2011 at 3:56 PM, Greg Snow wrote: >> > You can use ^2 to get all 2 way interactions and ^3 to get all 3 way >> interactions, e.g.: >> > >> > lm(Sepal.Width ~ (. - Sepal.Length)^2, data=iris) >> > >> > The lm.fit function is what actually does the fitting, so you could >> go directly there, but then you lose the benefits of using . and ^. >> ?The Matrix package has ways of dealing with sparse matricies, but I >> don't know if ?that would help here or not. >> > >> > You could also just create x'x and x'y matricies directly since the >> variables are 0/1 then use solve. ?A lot depends on what you are doing >> and what questions you are trying to answer. 
>> > >> > -- >> > Gregory (Greg) L. Snow Ph.D. >> > Statistical Data Center >> > Intermountain Healthcare >> > greg.snow at imail.org >> > 801.408.8111 >> > >> > >> >> -----Original Message----- >> >> From: Matthew Douglas [mailto:matt.douglas01 at gmail.com] >> >> Sent: Tuesday, March 01, 2011 1:09 PM >> >> To: Greg Snow >> >> Cc: r-help at r-project.org >> >> Subject: Re: [R] Regression with many independent variables >> >> >> >> Hi Greg, >> >> >> >> Thanks for the help, it works perfectly. To answer your question, >> >> there are 339 independent variables but only 10 will be used at one >> >> time . So at any given line of the data set there will be 10 non >> zero >> >> entries for the independent variables and the rest will be zeros. >> >> >> >> One more question: >> >> >> >> 1. I still want to find a way to look at the interactions of the >> >> independent variables. >> >> >> >> the regression would look like this: >> >> >> >> y = b12*X1X2 + b23*X2X3 +...+ bk-1k*Xk-1Xk >> >> >> >> so I think the regression in R would look like this: >> >> >> >> lm(MARGIN, P235:P236+P236:P237+....,weights = Poss, data = adj0708), >> >> >> >> my problem is that since I have technically 339 independent >> variables, >> >> when I do this regression I would have 339 Choose 2 = approx 57000 >> >> independent variables (a vast majority will be 0s though) so I dont >> >> want to have to write all of these out. Is there a way to do this >> >> quickly in R? >> >> >> >> Also just a curious question that I cant seem to find to online: >> >> is there a more efficient model other than lm() that is better for >> >> very sparse data sets like mine? >> >> >> >> Thanks, >> >> Matt >> >> >> >> >> >> On Mon, Feb 28, 2011 at 4:30 PM, Greg Snow >> wrote: >> >> > Don't put the name of the dataset in the formula, use the data >> >> argument to lm to provide that. 
>> >> > A single period (".") on the right hand side of the formula will
>> >> > represent all the columns in the data set that are not on the left
>> >> > hand side (you can then use "-" to remove any other columns that
>> >> > you don't want included on the RHS).
>> >> >
>> >> > For example:
>> >> >
>> >> > > lm(Sepal.Width ~ . - Sepal.Length, data=iris)
>> >> >
>> >> > Call:
>> >> > lm(formula = Sepal.Width ~ . - Sepal.Length, data = iris)
>> >> >
>> >> > Coefficients:
>> >> >       (Intercept)       Petal.Length        Petal.Width  Speciesversicolor
>> >> >            3.0485             0.1547             0.6234            -1.7641
>> >> >  Speciesvirginica
>> >> >           -2.1964
>> >> >
>> >> >
>> >> > But, are you sure that a regression model with 339 predictors will be
>> >> > meaningful?
>> >> >
>> >> > --
>> >> > Gregory (Greg) L. Snow Ph.D.
>> >> > Statistical Data Center
>> >> > Intermountain Healthcare
>> >> > greg.snow at imail.org
>> >> > 801.408.8111
>> >> >
>> >> >
>> >> >> -----Original Message-----
>> >> >> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Matthew Douglas
>> >> >> Sent: Monday, February 28, 2011 1:32 PM
>> >> >> To: r-help at r-project.org
>> >> >> Subject: [R] Regression with many independent variables
>> >> >>
>> >> >> Hi,
>> >> >>
>> >> >> I am trying to use lm() on some data, the code works fine but I would
>> >> >> like to use a more efficient way to do this.
>> >> >>
>> >> >> The data looks like this (the data is very sparse with a few 1s, -1s
>> >> >> and the rest 0s):
>> >> >>
>> >> >> > head(adj0708)
>> >> >>       MARGIN Poss P235 P247 P703 P218 P430 P489 P83 P307 P337....
>> >> >> 1   64.28571   29    0    0    0    0    0    0   0    0    0    0    0    0    0
>> >> >> 2 -100.00000    6    0    0    0    0    0    0   0    1    0    0    0    0    0
>> >> >> 3  100.00000    4    0    0    0    0    0    0   0    1    0    0    0    0    0
>> >> >> 4  -33.33333    7    0    0    0    0    0    0   0    0    0    0    0    0    0
>> >> >> 5  200.00000    2    0    0    0    0    0    0   0    0    0    0   -1    0    0
>> >> >> 6  -83.33333   12    0   -1    0    0    0    0   0    0    0    0    0    0    0
>> >> >>
>> >> >> adj0708 is actually a 35657x341 data set. Each column after "Poss" is
>> >> >> an independent variable, the dependent variable is "MARGIN" and it is
>> >> >> weighted by "Poss"
>> >> >>
>> >> >>
>> >> >> The regression is below:
>> >> >> fit.adj0708 <- lm( adj0708$MARGIN~adj0708$P235 + adj0708$P247 +
>> >> >> adj0708$P703 + adj0708$P430 + adj0708$P489 + adj0708$P218 +
>> >> >> adj0708$P605 + adj0708$P337 + .... +
>> >> >> adj0708$P510,weights=adj0708$Poss)
>> >> >>
>> >> >> I have two questions:
>> >> >>
>> >> >> 1. Is there a way to condense how I write the independent variables
>> >> >> in the lm(), instead of having such a long line of code (I have 339
>> >> >> independent variables to be exact)?
>> >> >> 2. I would like to pair the data to look at a regression of the
>> >> >> interactions between two independent variables. I think it would look
>> >> >> something like this....
>> >> >> fit.adj0708 <- lm( adj0708$MARGIN~adj0708$P235:adj0708$P247 +
>> >> >> adj0708$P703:adj0708$P430 + adj0708$P489:adj0708$P218 +
>> >> >> adj0708$P605:adj0708$P337 + ....,weights=adj0708$Poss)
>> >> >> but there will be 339 Choose 2 combinations, so a lot of independent
>> >> >> variables! Is there a more efficient way of writing this code? Is
>> >> >> there a way I can do this?
>> >> >> >> >> >> Thanks, >> >> >> Matt >> >> >> >> >> >> ______________________________________________ >> >> >> R-help at r-project.org mailing list >> >> >> https://stat.ethz.ch/mailman/listinfo/r-help >> >> >> PLEASE do read the posting guide http://www.R- >> project.org/posting- >> >> >> guide.html >> >> >> and provide commented, minimal, self-contained, reproducible >> code. >> >> > >> > > From jwiley.psych at gmail.com Thu Mar 3 23:43:24 2011 From: jwiley.psych at gmail.com (Joshua Wiley) Date: Thu, 3 Mar 2011 14:43:24 -0800 Subject: [R] Scatter plot with multiple data sets In-Reply-To: <1299182629792-3334096.post@n4.nabble.com> References: <1299182629792-3334096.post@n4.nabble.com> Message-ID: Hi Joe, The easiest option will be to combine all 6 datasets (at least the variables you want to use in your scatter plot), and then create another variable that indicates to which group the observations belong. Here is a small example of what you might do once your data are all together (obviously replace "mtcars" with your dataset name and the variables with your variables): with(mtcars, plot(x = hp, y = mpg, pch = carb)) I am also fond using the ggplot2 package for graphs. require(ggplot2) ggplot(mtcars, aes(x = hp, y = mpg, shape = factor(carb))) + geom_point() Cheers, Josh On Thu, Mar 3, 2011 at 12:03 PM, Jorseff wrote: > Hi, I have multiple (6) data sets which I would like to plot together on one > scatter graph. The reason they are all separate is that I require a > different symbol to be plotted for each set. Could somebody advise on how to > do this? > > Many thanks, > > Joe > > -- > View this message in context: http://r.789695.n4.nabble.com/Scatter-plot-with-multiple-data-sets-tp3334096p3334096.html > Sent from the R help mailing list archive at Nabble.com. 
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

From dwinsemius at comcast.net Thu Mar 3 23:45:49 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Thu, 3 Mar 2011 16:45:49 -0600
Subject: [R] Help center
In-Reply-To:
References:
Message-ID:

You need to unsubscribe using the same method you used to subscribe: go to the webpage for the list and log in with the password you set. If all you want is to stop the mailing, you can do so without unsubscribing.

On Mar 3, 2011, at 3:09 PM, Bobby Lee wrote:

> Could you please take my email off the help center? Thank you very
> much.
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD
Heritage Laboratories
West Hartford, CT

From wdunlap at tibco.com Thu Mar 3 23:49:47 2011
From: wdunlap at tibco.com (William Dunlap)
Date: Thu, 3 Mar 2011 14:49:47 -0800
Subject: [R] creating a count variable in R
In-Reply-To: <1299189508645-3334288.post@n4.nabble.com>
References: <1299189508645-3334288.post@n4.nabble.com>
Message-ID: <77EB52C6DD32BA4D87471DCD70C8D70003F74D3F@NA-PA-VBE03.na.tibco.com>

Use cumsum() to count the change points:

> Date_var <- as.Date(rep(c("2011-02-04","2011-02-07","2011-01-29"), c(2,3,1)))
> data.frame(Date_var, count=cumsum(c(TRUE, Date_var[-1]!=Date_var[-length(Date_var)])))
    Date_var count
1 2011-02-04     1
2 2011-02-04     1
3 2011-02-07     2
4 2011-02-07     2
5 2011-02-07     2
6 2011-01-29     3

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of JonC
> Sent: Thursday, March 03, 2011 1:58 PM
> To: r-help at r-project.org
> Subject: [R] creating a count variable in R
>
> Hi R helpers,
>
> I'm trying to create a count in R , but as there is no retain
> function like in SAS I'm running into difficulties.
>
> I have the following :
>
> Date_var      and wish to obtain    Date_var      Count_var
> 01/01/2011                          01/01/2011    1
> 01/01/2011                          01/01/2011    2
> 02/01/2011                          02/01/2011    1
> 02/01/2011                          02/01/2011    2
> 02/01/2011                          02/01/2011    3
> 02/01/2011                          02/01/2011    4
> 03/01/2011                          03/01/2011    1
> 03/01/2011                          03/01/2011    2
> 03/01/2011                          03/01/2011    3
> 03/01/2011                          03/01/2011    4
> 03/01/2011                          03/01/2011    5
> 03/01/2011                          03/01/2011    6
> 03/01/2011                          03/01/2011    7
>
> As can be seen above the count var is re initialised every
> time a new date is found. I hope this is easy.
>
> Many thanks in advance for assistance. It is appreciated.
>
> Cheers
>
> Jon
>
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/creating-a-count-variable-in-R-tp3334288p3334288.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

From dwinsemius at comcast.net Thu Mar 3 23:59:31 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Thu, 3 Mar 2011 16:59:31 -0600
Subject: [R] creating a count variable in R
In-Reply-To: <1299189508645-3334288.post@n4.nabble.com>
References: <1299189508645-3334288.post@n4.nabble.com>
Message-ID: <0E66173B-22EC-404C-B1E8-220DEA7EAAAE@comcast.net>

On Mar 3, 2011, at 3:58 PM, JonC wrote:

> Hi R helpers,
>
> I'm trying to create a count in R , but as there is no retain
> function like in SAS I'm running into difficulties.

Your data is not cut-pastable as presented but this should work:

> dfrm$count_var <- ave(as.numeric(dfrm$Date_var), dfrm$Date_var, FUN=seq_along)
> dfrm
     Date_var count_var
1  01/01/2011         1
2  01/01/2011         2
3  02/01/2011         1
4  02/01/2011         2
5  02/01/2011         3
6  02/01/2011         4
7  03/01/2011         1
8  03/01/2011         2
9  03/01/2011         3
10 03/01/2011         4
11 03/01/2011         5
12 03/01/2011         6
13 03/01/2011         7

>
> I have the following :
>
> Date_var      and wish to obtain    Date_var      Count_var
> 01/01/2011                          01/01/2011    1
> 01/01/2011                          01/01/2011    2
> 02/01/2011                          02/01/2011    1
> 02/01/2011                          02/01/2011    2
> 02/01/2011                          02/01/2011    3
> 02/01/2011                          02/01/2011    4
> 03/01/2011                          03/01/2011    1
> 03/01/2011                          03/01/2011    2
> 03/01/2011                          03/01/2011    3
> 03/01/2011                          03/01/2011    4
> 03/01/2011                          03/01/2011    5
> 03/01/2011                          03/01/2011    6
> 03/01/2011                          03/01/2011    7
>
> As can be seen above the count var is re initialised every time a
> new date is found. I hope this is easy.
>
> Many thanks in advance for assistance. It is appreciated.
>
> Cheers
>
> Jon
>
>
> --
> View this message in context: http://r.789695.n4.nabble.com/creating-a-count-variable-in-R-tp3334288p3334288.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

From Bill.Venables at csiro.au Fri Mar 4 00:03:20 2011
From: Bill.Venables at csiro.au (Bill.Venables at csiro.au)
Date: Fri, 4 Mar 2011 10:03:20 +1100
Subject: [R] creating a count variable in R
In-Reply-To: <1299189508645-3334288.post@n4.nabble.com>
References: <1299189508645-3334288.post@n4.nabble.com>
Message-ID: <1BDAE2969943D540934EE8B4EF68F95FB27F0D4A94@EXNSW-MBX03.nexus.csiro.au>

You can probably simplify this if you can assume that the dates are in sorted order. Here is a way of doing it even if the days are in arbitrary order. The count refers to the number of times that this date has appeared so far in the sequence.

con <- textConnection("
01/01/2011
01/01/2011
02/01/2011
02/01/2011
02/01/2011
02/01/2011
03/01/2011
03/01/2011
03/01/2011
03/01/2011
03/01/2011
03/01/2011
03/01/2011
")
days <- scan(con, what = "")
close(con)

X <- model.matrix(~days-1)
XX <- apply(X, 2, cumsum)
dat <- data.frame(days = days, count = rowSums(X*XX))
dat

### this uses days as a character string vector. If they are actual dates, then convert them to character strings for this operation.

Bill Venables.
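
For comparison, here is a minimal base-R sketch of the same "count so far" idea using ave() with seq_along(), which also copes with dates in arbitrary order. The name `days` follows the code above; the sample values are a shortened, deliberately unsorted illustration and are an assumption, not JonC's actual data.

```r
# Shortened, deliberately unsorted sample (illustrative values only)
days <- c("01/01/2011", "02/01/2011", "01/01/2011",
          "02/01/2011", "02/01/2011", "03/01/2011")

# ave() applies FUN within each group defined by `days` and returns the
# results in the original row order; seq_along() numbers the occurrences
# 1, 2, 3, ... within each date.
count <- ave(seq_along(days), days, FUN = seq_along)

dat <- data.frame(days = days, count = count)
dat
```

With this sample the count column should come out as 1, 1, 2, 2, 3, 1. If the dates are stored as Date objects, converting them with as.character() first keeps the grouping step simple.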
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of JonC
Sent: Friday, 4 March 2011 7:58 AM
To: r-help at r-project.org
Subject: [R] creating a count variable in R

Hi R helpers,

I'm trying to create a count in R , but as there is no retain function like in SAS I'm running into difficulties.

I have the following :

Date_var      and wish to obtain    Date_var      Count_var
01/01/2011                          01/01/2011    1
01/01/2011                          01/01/2011    2
02/01/2011                          02/01/2011    1
02/01/2011                          02/01/2011    2
02/01/2011                          02/01/2011    3
02/01/2011                          02/01/2011    4
03/01/2011                          03/01/2011    1
03/01/2011                          03/01/2011    2
03/01/2011                          03/01/2011    3
03/01/2011                          03/01/2011    4
03/01/2011                          03/01/2011    5
03/01/2011                          03/01/2011    6
03/01/2011                          03/01/2011    7

As can be seen above the count var is re initialised every time a new date is found. I hope this is easy.

Many thanks in advance for assistance. It is appreciated.

Cheers

Jon

--
View this message in context: http://r.789695.n4.nabble.com/creating-a-count-variable-in-R-tp3334288p3334288.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

From kparamas at asu.edu Thu Mar 3 23:43:37 2011
From: kparamas at asu.edu (kparamas)
Date: Thu, 3 Mar 2011 14:43:37 -0800 (PST)
Subject: [R] Plotting Mean in plotting degree distribution
Message-ID: <1299192217936-3334375.post@n4.nabble.com>

Hi,

I am plotting degree distribution of a graph using the function,

library(igraph)
dd1 = degree.distribution(G)
plot(dd1, xlab = "degree", ylab="frequency")

I would like to plot the mean of the distribution as a vertical line in the attached plot. Please let me know how to do this.

Thanks,
Kumar

http://r.789695.n4.nabble.com/file/n3334375/cdata3_dd.png cdata3_dd.png

--
View this message in context: http://r.789695.n4.nabble.com/Plotting-Mean-in-plotting-degree-distribution-tp3334375p3334375.html
Sent from the R help mailing list archive at Nabble.com.

From l.mittempergher at nki.nl Fri Mar 4 00:10:31 2011
From: l.mittempergher at nki.nl (l.mittempergher at nki.nl)
Date: Fri, 4 Mar 2011 00:10:31 +0100
Subject: [R] R: Help center
In-Reply-To:
References: ,
Message-ID: <2E815C4B1C03DC4C8EDB7DE167ECC8562C449687@mstr-2.nki.nl>

I also would like to stop the mailing, without unsubscribing myself from the help center.

How can I proceed? Thanks

Lorenza
________________________________________
From: r-help-bounces at r-project.org [r-help-bounces at r-project.org] on behalf of David Winsemius [dwinsemius at comcast.net]
Sent: Thursday, 3 March 2011 23:45
To: Bobby Lee
Cc: R-help at r-project.org
Subject: Re: [R] Help center

You need to unsubscribe using the same method you used to subscribe: go to the webpage for the list and log in with the password you set. If all you want is to stop the mailing, you can do so without unsubscribing.

On Mar 3, 2011, at 3:09 PM, Bobby Lee wrote:

> Could you please take my email off the help center? Thank you very
> much.
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
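
On Kumar's degree-distribution question above, a minimal sketch of one way to add the mean as a vertical line. It assumes dd1 is the vector returned by degree.distribution(G), whose first element is the relative frequency of degree 0; the numbers below are a made-up stand-in so the example runs without igraph or the poster's graph G.

```r
# Made-up stand-in for dd1 <- degree.distribution(G); element i is the
# relative frequency of degree i - 1 (hypothetical values that sum to 1)
dd1 <- c(0.05, 0.20, 0.35, 0.25, 0.10, 0.05)

deg <- seq_along(dd1) - 1        # degrees 0, 1, 2, ...
mean_deg <- sum(deg * dd1)       # mean of the distribution

# Plot against the actual degrees so the line lands at the right x value
plot(deg, dd1, xlab = "degree", ylab = "frequency")
abline(v = mean_deg, lty = 2, col = "red")
```

If the plot is instead drawn with plot(dd1, ...) as in the original post, the x axis is the vector index (starting at 1), so the line would need v = mean_deg + 1.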
From hilton at meteo.psu.edu Fri Mar 4 01:19:53 2011 From: hilton at meteo.psu.edu (Timothy W. Hilton) Date: Thu, 3 Mar 2011 16:19:53 -0800 Subject: [R] lattice custom axis function -- right side margins In-Reply-To: References: <20110303185022.GB17009@Tim.local> Message-ID: <20110304001953.GA531@Tim.local> To clarify the trouble I'm having with ylab.right, I am not getting an error message; the right-side label just does not appear on the plot. -Tim > On Thu, Mar 3, 2011 at 1:50 PM, Timothy W. Hilton wrote: > > > Many thanks, Richard -- the position argument does exactly what I > > needed. I'm not having any luck with the ylab.right argument. My R and > > lattice are up to date (below); is there something else I should check? > > > > Thanks for the help, > > Tim > > > > > sessionInfo() > > R version 2.12.2 (2011-02-25) > > Platform: i386-apple-darwin9.8.0/i386 (32-bit) > > > > locale: > > [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8 > > > > attached base packages: > > [1] stats graphics grDevices utils datasets methods base > > > > other attached packages: > > [1] lattice_0.19-17 > > > > loaded via a namespace (and not attached): > > [1] grid_2.12.2 tools_2.12.2 > > > > On Thu, Mar 2011, 03 at 01:05:49PM -0500, Richard M. Heiberger wrote: > > > print(my_plot(example_data, ylab.right=expression(e==mc^2)), > > > position=c(0,0,.95,1)) > > > > > > You will need a recent R version for the ylab.right argument. > > > > > > On Thu, Mar 3, 2011 at 12:52 PM, Timothy W. Hilton > >wrote: > > > > > > > Dear R help list, > > > > > > From jholtman at gmail.com Fri Mar 4 01:26:12 2011 From: jholtman at gmail.com (Jim Holtman) Date: Thu, 3 Mar 2011 19:26:12 -0500 Subject: [R] error in saved .csv In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From ehlers at ucalgary.ca Fri Mar 4 01:32:18 2011 From: ehlers at ucalgary.ca (P Ehlers) Date: Thu, 3 Mar 2011 16:32:18 -0800 Subject: [R] lattice: How to increase space between ticks and labels of z-axis? In-Reply-To: References: Message-ID: <4D703312.1090201@ucalgary.ca> Marius Hofert wrote: > Dear expeRts, > > How can I increase the space between the ticks and the labels in the wireframe plot > below? I tried some variations with par.settings=list(..) but it just didn't work. Marius, I tried setting the 'distance' parameter, but that was less than satisfactory. One way is to modify the labels appropriately: z_at <- seq(2000,10000,2000) z_labs <- paste(z_at, " ", sep="") which tacks on some spaces, and then plot: wireframe(z~grid[,1]*grid[,2], aspect=1, scales = list(arrows = FALSE, z = list(at = z_at, lab = z_labs) ), zlab = list("z", hjust = 3), ylab = list(rot = -40), xlab = list(rot = 30) ) Peter Ehlers > > Many thanks, > > Marius > > > > library(lattice) > > u <- seq(0, 1, length.out=20) > grid <- expand.grid(x=u, y=u) > z <- apply(grid, 1, function(x) 1/(x[1]*x[2]+0.0001)) > > wireframe(z~grid[,1]*grid[,2], aspect=1, scales=list(col=1, arrows=FALSE)) > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From ehlers at ucalgary.ca Fri Mar 4 02:01:58 2011 From: ehlers at ucalgary.ca (P Ehlers) Date: Thu, 03 Mar 2011 17:01:58 -0800 Subject: [R] lattice custom axis function -- right side margins In-Reply-To: <20110304001953.GA531@Tim.local> References: <20110303185022.GB17009@Tim.local> <20110304001953.GA531@Tim.local> Message-ID: <4D703A06.9090806@ucalgary.ca> Timothy W. 
Hilton wrote: > To clarify the trouble I'm having with ylab.right, I am not getting an > error message; the right-side label just does not appear on the plot. Maybe this is mac-specific. On Windows, the label shows up just fine. You might be able to make it appear by adjusting the 'vjust' argument to ylab.right: print(my_plot(example_data, ylab.right = list(expression(e==mc^2), vjust = -2)), position = c(0,0,.95,1)) Try playing with 'vjust'. Peter Ehlers > > -Tim > >> On Thu, Mar 3, 2011 at 1:50 PM, Timothy W. Hilton wrote: >> >>> Many thanks, Richard -- the position argument does exactly what I >>> needed. I'm not having any luck with the ylab.right argument. My R and >>> lattice are up to date (below); is there something else I should check? >>> >>> Thanks for the help, >>> Tim >>> >>>> sessionInfo() >>> R version 2.12.2 (2011-02-25) >>> Platform: i386-apple-darwin9.8.0/i386 (32-bit) >>> >>> locale: >>> [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8 >>> >>> attached base packages: >>> [1] stats graphics grDevices utils datasets methods base >>> >>> other attached packages: >>> [1] lattice_0.19-17 >>> >>> loaded via a namespace (and not attached): >>> [1] grid_2.12.2 tools_2.12.2 >>> >>> On Thu, Mar 2011, 03 at 01:05:49PM -0500, Richard M. Heiberger wrote: >>>> print(my_plot(example_data, ylab.right=expression(e==mc^2)), >>>> position=c(0,0,.95,1)) >>>> >>>> You will need a recent R version for the ylab.right argument. >>>> >>>> On Thu, Mar 3, 2011 at 12:52 PM, Timothy W. Hilton >>> wrote: >>>> >>>>> Dear R help list, >>>>> > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
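
A self-contained sketch of the ylab.right and position arguments discussed in this thread. The thread's my_plot() and example_data are not shown here, so xyplot() on the built-in mtcars data stands in for them; that substitution is an assumption made purely for illustration.

```r
library(lattice)

# Stand-in for my_plot(example_data, ...): a plain xyplot() call with a
# right-hand axis label (needs a lattice version that has ylab.right)
p <- xyplot(mpg ~ hp, data = mtcars,
            ylab.right = expression(e == mc^2))

# position = c(xmin, ymin, xmax, ymax) as fractions of the device;
# stopping at 0.95 leaves a strip free on the right-hand side
print(p, position = c(0, 0, 0.95, 1))
```

If the label still does not show, passing it as list(expression(e == mc^2), vjust = -2) and varying vjust, as suggested above, is worth a try.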
From jwiley.psych at gmail.com Fri Mar 4 02:15:34 2011
From: jwiley.psych at gmail.com (Joshua Wiley)
Date: Thu, 3 Mar 2011 17:15:34 -0800
Subject: [R] PCA - scores
In-Reply-To: <131E1151-2802-4793-9901-5F8E7EBB3FB8@ualberta.ca>
References: <131E1151-2802-4793-9901-5F8E7EBB3FB8@ualberta.ca>
Message-ID:

Hi Shari,

Yes, please look at the documentation for principal. You can access this (assuming you have loaded psych) by typing at the console:

?principal

note the logical argument "scores". Here is a small example:

##############################
require(psych)
require(GPArotation)
dat <- principal(mtcars[, c("mpg", "hp", "wt")], nfactors = 1,
                 rotate = "oblimin", scores = TRUE)

dat$scores
##############################

Cheerio,

Josh

On Thu, Mar 3, 2011 at 1:02 PM, Shari Clare wrote:
> I am running a PCA, but would like to rotate my data and limit the
> number of factors that are analyzed. I can do this using the
> "principal" command from the psych package [principal(my.data,
> nfactors=3,rotate="varimax")], but the issue is that this does not
> report scores for the Principal Components the way "princomp" does.
>
> My question is:
>
> Can you get an output of scores using "principal" OR, is there a way
> to limit the number of factors that are included when you use
> "princomp"?
>
> Thanks,
> Shari Clare
>
> PhD Candidate
> Department of Renewable Resources
> University of Alberta
> sclare at ualberta.ca
> 780-492-2540
>
>
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

From hilton at meteo.psu.edu Fri Mar 4 03:06:45 2011
From: hilton at meteo.psu.edu (Timothy W.
Hilton) Date: Thu, 3 Mar 2011 18:06:45 -0800 Subject: [R] lattice custom axis function -- right side margins In-Reply-To: <4D703A06.9090806@ucalgary.ca> References: <20110303185022.GB17009@Tim.local> <20110304001953.GA531@Tim.local> <4D703A06.9090806@ucalgary.ca> Message-ID: <20110304020644.GA846@Tim.local> I went to try your suggestion, and the label appeared without the vjust argument. I usually run R within emacs using ESS; I happened to restart my emacs earlier today. That's the only thing I can think of that I changed. Next time I'll try running R outside of emacs before asking for help. Many thanks to all who responded! -Tim On Thu, Mar 2011, 03 at 05:01:58PM -0800, P Ehlers wrote: > Timothy W. Hilton wrote: > >To clarify the trouble I'm having with ylab.right, I am not getting an > >error message; the right-side label just does not appear on the plot. > > Maybe this is mac-specific. On Windows, the label shows up > just fine. You might be able to make it appear by adjusting > the 'vjust' argument to ylab.right: > > print(my_plot(example_data, > ylab.right = list(expression(e==mc^2), vjust = -2)), > position = c(0,0,.95,1)) > > Try playing with 'vjust'. > > Peter Ehlers > > > > >-Tim > > > >>On Thu, Mar 3, 2011 at 1:50 PM, Timothy W. Hilton wrote: > >> > >>>Many thanks, Richard -- the position argument does exactly what I > >>>needed. I'm not having any luck with the ylab.right argument. My R and > >>>lattice are up to date (below); is there something else I should check? 
> >>> > >>>Thanks for the help, > >>>Tim > >>> > >>>>sessionInfo() > >>>R version 2.12.2 (2011-02-25) > >>>Platform: i386-apple-darwin9.8.0/i386 (32-bit) > >>> > >>>locale: > >>>[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8 > >>> > >>>attached base packages: > >>>[1] stats graphics grDevices utils datasets methods base > >>> > >>>other attached packages: > >>>[1] lattice_0.19-17 > >>> > >>>loaded via a namespace (and not attached): > >>>[1] grid_2.12.2 tools_2.12.2 > >>> > >>>On Thu, Mar 2011, 03 at 01:05:49PM -0500, Richard M. Heiberger wrote: > >>>>print(my_plot(example_data, ylab.right=expression(e==mc^2)), > >>>>position=c(0,0,.95,1)) > >>>> > >>>>You will need a recent R version for the ylab.right argument. > >>>> > >>>>On Thu, Mar 3, 2011 at 12:52 PM, Timothy W. Hilton >>>>wrote: > >>>> > >>>>>Dear R help list, > >>>>> > > > >______________________________________________ > >R-help at r-project.org mailing list > >https://stat.ethz.ch/mailman/listinfo/r-help > >PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > >and provide commented, minimal, self-contained, reproducible code. > From Bill.Venables at csiro.au Fri Mar 4 03:11:11 2011 From: Bill.Venables at csiro.au (Bill.Venables at csiro.au) Date: Fri, 4 Mar 2011 13:11:11 +1100 Subject: [R] R usage survey In-Reply-To: References: <36180405F8418449918AD20618D110FC095BFA6545@USETCMSXMB02.NAFTA.SYNGENTA.ORG> Message-ID: <1BDAE2969943D540934EE8B4EF68F95FB27F0D4A95@EXNSW-MBX03.nexus.csiro.au> No. That's not answering the question. ALL surveys are for collecting information. The substantive issue is what purpose do you have in seeking this information in the first place and what are you going to do with it when you get it? Do you have some commercial purpose in mind? If so, what is it? 
-----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Harsh Sent: Friday, 4 March 2011 1:13 AM To: rex.dwyer at syngenta.com Cc: r-help at r-project.org Subject: Re: [R] R usage survey Hi Rex and useRs, The purpose of the survey has been mentioned on the survey link goo.gl/jw1ig but I will also reproduce it here. - Geographical distribution of R users - Application areas where R is being used - Supporting technology being used along with R - Academic background distribution of R users The potential personally identifiable information such as name and employer name are optional fields. Actually all the fields in the survey are optional. Some of the analysis output(s) could be along the lines of :- - Usage statistics of various R packages - Distribution of R users across countries/cities - Mapping various applications to packages - Text Mining of the responses to create informative word clouds Personally, I am excited about the kind of data I will receive through this survey and the various insights that could be derived. As already mentioned, the results will be shared with the community. Thank you Rex for raising an important point. It is indeed necessary for me to personally assure the user community that the results will be shared in a manner that will not contain any personally identifiable information. Those who wish to gain access to the raw data will be provided with all the fields but not the name and employer name fields. Just out of curiosity : It is possible to get name, employer name, location, usage information and academic background details when searching for R users on LinkedIn and the many R related groups there. Does this also provide potential opportunities for misuse and "outrageous" analyses, since almost anyone can get onto LinkedIn and access user profiles ? Thank you for your interest and support. 
Regards, Harsh On Thu, Mar 3, 2011 at 8:02 PM, wrote: > Harsh, "Suitably analyzed" for whose purposes? One man's "suitable" is > another's "outrageous". That's why people want to see the gowns at the > Oscars. Under what auspices are you conducting this survey? What do you > intend to do with it? You don't give any assurance that the results you > post won't have personally identifiable information. I don't get the > impression that you know much about survey design. > > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] > On Behalf Of Harsh > Sent: Thursday, March 03, 2011 5:53 AM > To: r-help at r-project.org > Subject: [R] R usage survey > > Hi R users, > I request members of the R community to consider filling a short survey > regarding the use of R. > The survey can be found at http://goo.gl/jw1ig > > Please accept my apologies for posting here for a non-technical reason. > > The data collected will be suitably analyzed and I'll post a link to the > results in the coming weeks. > > Thank you all for your interest and for sharing your R usage information. > > Regards, > Harsh Singhal > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > > > > message may contain confidential information. If you are not the designated > recipient, please notify the sender immediately, and delete the original and > any copies. Any use of the message by you is prohibited. 
> > [[alternative HTML version deleted]] ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. From Michael.Folkes at dfo-mpo.gc.ca Fri Mar 4 03:23:36 2011 From: Michael.Folkes at dfo-mpo.gc.ca (Folkes, Michael) Date: Thu, 3 Mar 2011 18:23:36 -0800 Subject: [R] Floating points and floor() ? Message-ID: <63F107BCC37AEA49A75FD94AA3E07CB007A8BBE1@pacpbsex01.pac.dfo-mpo.ca> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From ericstrom at aol.com Fri Mar 4 03:21:20 2011 From: ericstrom at aol.com (eric) Date: Thu, 3 Mar 2011 18:21:20 -0800 (PST) Subject: [R] What am I doing wrong with this loop ? In-Reply-To: <1BDAE2969943D540934EE8B4EF68F95FB27F0D4A89@EXNSW-MBX03.nexus.csiro.au> References: <1299122362375-3332703.post@n4.nabble.com> <1BDAE2969943D540934EE8B4EF68F95FB27F0D4A89@EXNSW-MBX03.nexus.csiro.au> Message-ID: <1299205280433-3334591.post@n4.nabble.com> Bill, I addressed the first issue with the data frames and length(x). But my loops still isn't working. More importantly, you commented that I should be using if(...) ... else ... rather than ifelse(.,.,). Please help me understand the difference. I thought ifelse was just a faster way of doing if(...)...else(.,.,). What is the difference between these two methods ? -- View this message in context: http://r.789695.n4.nabble.com/What-am-I-doing-wrong-with-this-loop-tp3332703p3334591.html Sent from the R help mailing list archive at Nabble.com. From ericstrom at aol.com Fri Mar 4 03:49:51 2011 From: ericstrom at aol.com (eric) Date: Thu, 3 Mar 2011 18:49:51 -0800 (PST) Subject: [R] What am I doing wrong with this loop ? 
In-Reply-To: <1299122362375-3332703.post@n4.nabble.com> References: <1299122362375-3332703.post@n4.nabble.com> Message-ID: <1299206991679-3334609.post@n4.nabble.com> Never mind Bill....got it. Always seems to happen this way. Can't figure something out. Post to the site and wham, 5 min later (after posting), it's all clear. Oh well, thanks for the tips -- View this message in context: http://r.789695.n4.nabble.com/What-am-I-doing-wrong-with-this-loop-tp3332703p3334609.html Sent from the R help mailing list archive at Nabble.com. From scttchamberlain4 at gmail.com Fri Mar 4 02:04:26 2011 From: scttchamberlain4 at gmail.com (Scott Chamberlain) Date: Thu, 3 Mar 2011 19:04:26 -0600 Subject: [R] Plotting Mean in plotting degree distribution In-Reply-To: <1299192217936-3334375.post@n4.nabble.com> References: <1299192217936-3334375.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From felipe.parra at quantil.com.co Fri Mar 4 04:18:21 2011 From: felipe.parra at quantil.com.co (Luis Felipe Parra) Date: Fri, 4 Mar 2011 11:18:21 +0800 Subject: [R] Problems with a function warning Message-ID: An embedded and charset-unspecified text was scrubbed... Name: no disponible URL: From jwiley.psych at gmail.com Fri Mar 4 04:30:06 2011 From: jwiley.psych at gmail.com (Joshua Wiley) Date: Thu, 3 Mar 2011 19:30:06 -0800 Subject: [R] Problems with a function warning In-Reply-To: References: Message-ID: Dear Felipe, I did not have any difficulty with it using: R version 2.12.1 (2010-12-16) Platform: i386-pc-mingw32/i386 (32-bit) locale: [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] tcltk stats graphics grDevices utils datasets methods [8] base I am wondering if possibly you did not load package tcltk before trying to use your function? 
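
A sketch of how the check could be written so that it does not depend on tcltk having been attached earlier in the session. It assumes the intended condition is "any delivery date earlier than the valuation date" (the condition is garbled in the quoted post); the function name check_delivery_dates and the English messages are hypothetical and only for illustration.

```r
# Hypothetical rewrite of the poster's check; name and messages are illustrative
check_delivery_dates <- function(delivery_dates, valuation_date) {
  if (any(delivery_dates < valuation_date)) {
    msg <- "A delivery date is earlier than the valuation date."
    # Show the dialog only in an interactive session where tcltk can be
    # loaded; requireNamespace() removes the need for a prior library(tcltk)
    if (interactive() && requireNamespace("tcltk", quietly = TRUE)) {
      tcltk::tkmessageBox(title = "Valuation date error", message = msg,
                          icon = "error", type = "ok")
    }
    stop(msg)
  }
  invisible(TRUE)
}

# All delivery dates fall after the valuation date, so this returns TRUE
ok <- check_delivery_dates(as.Date(c("2011-04-01", "2011-05-01")),
                           as.Date("2011-03-04"))
```

Calling the function with a delivery date before the valuation date raises the error on the first call, with the message box shown first when a GUI session is available.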
Cheers,

Josh

On Thu, Mar 3, 2011 at 7:18 PM, Luis Felipe Parra wrote:
> Hello. I have the following function:
>
> fechasEntrega = function(FechasEntrega,fecha){
>   if(length(which(FechasEntrega<fecha))>0){
>     tkmessageBox(title = "Error en Fecha de Valoracion",
>       message="Hay una fecha de entrega anterior a la fecha de valoracion. Todas las fechas de entrega deben ser posteriores a la fecha de valoración para el correcto funcionamiento del programa.",
>       icon="error", type="ok")
>     stop("Hay una fecha de entrega anterior a la fecha de valoracion. Todas las fechas de entrega deben ser \n posteriores a la fecha de valoración para el correcto funcionamiento del programa.")
>   }
> }
>
> which takes two arguments. The first one is a vector of dates and the second
> one is a date. It verifies a condition and gives a warning and an error
> message in the R GUI if the condition is satisfied. I am having trouble
> because I have to run the function twice before the warning or the error
> message appears. I don't know why neither of them appears if I run the
> function just once. Does anybody know what can be going on?
>
> Thank you
>
> Felipe Parra
>

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

From dwinsemius at comcast.net Fri Mar 4 04:33:43 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Thu, 3 Mar 2011 21:33:43 -0600
Subject: [R] Floating points and floor() ?
In-Reply-To: <63F107BCC37AEA49A75FD94AA3E07CB007A8BBE1@pacpbsex01.pac.dfo-mpo.ca> References: <63F107BCC37AEA49A75FD94AA3E07CB007A8BBE1@pacpbsex01.pac.dfo-mpo.ca> Message-ID: <43B070A1-216F-465A-AC8D-0DF5C98A5D6A@comcast.net> On Mar 3, 2011, at 8:23 PM, Folkes, Michael wrote: > Perhaps somebody could clarify for me if the following is a floating > point matter or otherwise, and how am I to correct for it? > >> floor(100*.1) > [1] 10 > >> 100*(1.0-.9) > [1] 10 > >> floor(100*(1-0.9)) > [1] 9 > > Yes. It's a "floating point matter". What do you mean by "correct for it"? What result would be "correct"? David Winsemius, MD Heritage Laboratories West Hartford, CT From dwinsemius at comcast.net Fri Mar 4 04:41:18 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Thu, 3 Mar 2011 21:41:18 -0600 Subject: [R] R: Help center In-Reply-To: <2E815C4B1C03DC4C8EDB7DE167ECC8562C449687@mstr-2.nki.nl> References: , <2E815C4B1C03DC4C8EDB7DE167ECC8562C449687@mstr-2.nki.nl> Message-ID: <767AE22D-7997-4BB7-A575-6B736C780688@comcast.net> On Mar 3, 2011, at 5:10 PM, wrote: > I also would like to stop the mailing, without unsubscribing myself > from the help center. > > How can I proceed?Thanks All of the options for your subscription are changed on the same webpage. -- David. > > Lorenza > ________________________________________ > Da: r-help-bounces at r-project.org [r-help-bounces at r-project.org] per > conto di David Winsemius [dwinsemius at comcast.net] > Inviato: gioved? 3 marzo 2011 23.45 > A: Bobby Lee > Cc: R-help at r-project.org > Oggetto: Re: [R] Help center > > You need to unsubscribe using the same method you use to subscribe.... > go to the webpage for the list and log in with the password you set. > If all you wnat to stop is the mailing, you can do so without > unsubscribing. > > > On Mar 3, 2011, at 3:09 PM, Bobby Lee wrote: > >> Could you please take my email off the help center? Thank you very >> much. 
>> >> [[alternative HTML version deleted]] >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > David Winsemius, MD > Heritage Laboratories > West Hartford, CT > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT From jdnewmil at dcn.davis.ca.us Fri Mar 4 04:33:29 2011 From: jdnewmil at dcn.davis.ca.us (Jeff Newmiller) Date: Thu, 03 Mar 2011 19:33:29 -0800 Subject: [R] Floating points and floor() ? In-Reply-To: <63F107BCC37AEA49A75FD94AA3E07CB007A8BBE1@pacpbsex01.pac.dfo-mpo.ca> References: <63F107BCC37AEA49A75FD94AA3E07CB007A8BBE1@pacpbsex01.pac.dfo-mpo.ca> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From rex.dwyer at syngenta.com Fri Mar 4 05:24:23 2011 From: rex.dwyer at syngenta.com (rex.dwyer at syngenta.com) Date: Thu, 3 Mar 2011 23:24:23 -0500 Subject: [R] Floating points and floor() ? In-Reply-To: <63F107BCC37AEA49A75FD94AA3E07CB007A8BBE1@pacpbsex01.pac.dfo-mpo.ca> References: <63F107BCC37AEA49A75FD94AA3E07CB007A8BBE1@pacpbsex01.pac.dfo-mpo.ca> Message-ID: <36180405F8418449918AD20618D110FC095BFA6E02@USETCMSXMB02.NAFTA.SYNGENTA.ORG> Hi Michael, In floating point calculation, 1.0-.9 is not exactly 0.1. This is easily seen by subtracting. > (1.0-.9)-0.1 [1] -2.775558e-17 > (1.0-.9)==0.1 [1] FALSE David is right, you can't "correct" this. You can only compensate by taking care that you never, ever test whether 2 FP numbers are equal, because they almost never are. 
You must always ask whether the difference is small.

> round(1.0-.9-.1,15)==0
[1] TRUE

Unfortunately, most of us forget this rule once in a while and write a loop like "while (x!=0)..." that won't terminate.

HTH
Rex

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Folkes, Michael
Sent: Thursday, March 03, 2011 9:24 PM
To: r-help at r-project.org
Subject: [R] Floating points and floor() ?

Perhaps somebody could clarify for me if the following is a floating point matter or otherwise, and how am I to correct for it?

> floor(100*.1)
[1] 10

> 100*(1.0-.9)
[1] 10

> floor(100*(1-0.9))
[1] 9

Thanks!
Michael

_______________________________________________________
Michael Folkes
Salmon Stock Assessment
Canadian Dept. of Fisheries & Oceans
Pacific Biological Station
3190 Hammond Bay Rd.
Nanaimo, B.C., Canada V9T-6N7
Ph (250) 756-7264
Fax (250) 756-7053
Michael.Folkes at dfo-mpo.gc.ca

From tingting.zhan at jefferson.edu Fri Mar 4 05:29:22 2011
From: tingting.zhan at jefferson.edu (tingtingzhan)
Date: Thu, 3 Mar 2011 20:29:22 -0800 (PST)
Subject: [R] how to store lme/lmer fit result
In-Reply-To: <40e66e0b0811030851p32a3fd53gdebe3b44a6647d4e@mail.gmail.com>
References: <1299212962335-869777.post@n4.nabble.com> <40e66e0b0811030851p32a3fd53gdebe3b44a6647d4e@mail.gmail.com>
Message-ID: <1299212962327-3334663.post@n4.nabble.com>

Hi All,

I'm experiencing difficulties in saving a model fit by gls().
Basically if I just save() a gls object "gls.fit" in my workspace into an .RData file and later reload this .RData file, I get error when running script such as summary(gls.fit)$coefficients Dr. Bates's reply (quoted below) is the latest info I could find by Google. Would any one suggest if there is any modern way to save an gls object? Many thanks, Tingting Re: how to store lme/lmer fit result Nov 03, 2008; 11:51am ? by Douglas Bates-2 On Thu, Oct 9, 2008 at 8:51 PM, liujb <[hidden email]> wrote: > Dear R users, > I am building a hierarchical model on a large data set. It can take quite > some time to finish one fit, I was just wondering whether it is possible > to > store the fit object (the result) to a file for later (offline) analysis. As others have suggested, you can use save/load to save and restore the fitted model. However, there is a problem with trying to save and load a model fit by lmer. These are S4 classed objects and can only be reloaded in a version of the lme4 package with the same definition of the S4 class. Unfortunately, as I develop the computing methods I change the class definition to reflect the newer approach. This can mean that you can restore a model fit with a previous version of the lme4 package but you can't do anything with it. I regret that this happens but I do feel that the changes ultimately are beneficial. The best advice is that, in addition to saving the fitted model, you should also save the original data and a copy of the code that was used to fit the model. An alternative is to save a copy of the lme4 package and the Matrix package along with the saved model so you can downdate those packages if necessary. This may be tricky because of dependencies between packages that may prevent downdating some but not all packages. -- View this message in context: http://r.789695.n4.nabble.com/how-to-store-lme-lmer-fit-result-tp869777p3334663.html Sent from the R help mailing list archive at Nabble.com. 
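
The save-and-reload workflow discussed in this thread can be sketched as follows. This is only a minimal illustration under stated assumptions: the fit is a gls object from the nlme package, the Ovary data set shipped with nlme stands in for the poster's data, and the temporary file name is hypothetical. The original post does not show the actual error, so attaching nlme before calling summary() is a likely remedy rather than a confirmed diagnosis.

```r
library(nlme)  # attach nlme so that summary() and coef() find the gls methods

# Illustrative fit on a data set shipped with nlme (not the poster's data).
gls.fit <- gls(follicles ~ sin(2 * pi * Time) + cos(2 * pi * Time),
               data = Ovary,
               correlation = corAR1(form = ~ 1 | Mare))

# Following the advice quoted above, keep the data next to the fitted model.
fit.file <- tempfile(fileext = ".RData")  # hypothetical file name
fit.data <- Ovary
save(gls.fit, fit.data, file = fit.file)

# Simulate a later session: remember the coefficients, drop the objects, reload.
coef.before <- coef(gls.fit)
rm(gls.fit, fit.data)
load(fit.file)

# With nlme attached, the usual accessors work on the reloaded object.
coef.after <- coef(gls.fit)
print(summary(gls.fit)$tTable)
```

In a genuinely new R session the important step is calling library(nlme) before working with the reloaded object; keeping the fitting script alongside the .RData file makes it possible to refit if a later package version can no longer read the saved object.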
From lists at revelle.net Fri Mar 4 05:42:18 2011 From: lists at revelle.net (William Revelle) Date: Thu, 3 Mar 2011 22:42:18 -0600 Subject: [R] PCA - scores In-Reply-To: References: <131E1151-2802-4793-9901-5F8E7EBB3FB8@ualberta.ca> Message-ID: Shari, Josh partly answered your question, but his example did not include rotation because he took out just one factor. Try: require(psych) mt.pc <- principal(mtcars,3,scores=TRUE) #this gives you the varimax rotated first 3 principal components #pc.scores <- mt.pc$scores #here are the scores biplot(mt.pc) #show the data as well as the principal components in a biplot Bill At 5:15 PM -0800 3/3/11, Joshua Wiley wrote: >Hi Shari, > >Yes, please look at the documentation for principal. You can access >this (assuming you have loaded psych) by typing at the console: > >?principal > >note the logical argument "scores". > >Here is a small example: > >############################## >require(psych) >require(GPArotation) > >dat <- principal(mtcars[, c("mpg", "hp", "wt")], nfactors = 1, > rotate = "oblimin", scores = TRUE) > >dat$scores >############################## > >Cheerio, > >Josh > >On Thu, Mar 3, 2011 at 1:02 PM, Shari Clare wrote: >> I am running a PCA, but would like to rotate my data and limit the >> number of factors that are analyzed. I can do this using the >> "principal" command from the psych package [principal(my.data, >> nfactors=3,rotate="varimax")], but the issue is that this does not >> report scores for the Principal Components the way "princomp" does. >> >> My question is: >> >> Can you get an output of scores using "principal" OR, is there a way >> to limit the number of factors that are included when you use >> "princomp"? 
>> >> Thanks, >> Shari Clare >> >> PhD Candidate >> Department of Renewable Resources >> University of Alberta >> sclare at ualberta.ca >> 780-492-2540 >> >> >> >> >> >> >> >> >> [[alternative HTML version deleted]] >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > > >-- >Joshua Wiley >Ph.D. Student, Health Psychology >University of California, Los Angeles >http://www.joshuawiley.com/ > >______________________________________________ >R-help at r-project.org mailing list >https://stat.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >and provide commented, minimal, self-contained, reproducible code. From carrieandstat at gmail.com Fri Mar 4 06:09:32 2011 From: carrieandstat at gmail.com (Carrie Li) Date: Fri, 4 Mar 2011 00:09:32 -0500 Subject: [R] questions about using loop, while and next Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From erinm.hodgess at gmail.com Fri Mar 4 06:47:36 2011 From: erinm.hodgess at gmail.com (Erin Hodgess) Date: Thu, 3 Mar 2011 23:47:36 -0600 Subject: [R] advice on classes/methods/extending classes Message-ID: Dear R People: What is the best way to learn about classes, methods, extending classes, and namespaces, please? I know a bit about classes, but would like to learn much more. Thanks in advance for any advice! 
Sincerely, Erin -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodgess at gmail.com From jwiley.psych at gmail.com Fri Mar 4 08:02:47 2011 From: jwiley.psych at gmail.com (Joshua Wiley) Date: Thu, 3 Mar 2011 23:02:47 -0800 Subject: [R] advice on classes/methods/extending classes In-Reply-To: References: Message-ID: Hi Erin, One good option would be the official manual: http://cran.r-project.org/doc/manuals/R-exts.html It depends to an extent, I think, on what types of methods you would like to work with and use. FWIW, I have and really enjoy both S Programming by Venables & Ripley (mostly S3 methods and general programming, though I am sure that poor summary does not do it justice) and Software for Data Analysis by John Chambers (S4 methods). HTH, Josh On Thu, Mar 3, 2011 at 9:47 PM, Erin Hodgess wrote: > Dear R People: > > What is the best way to learn about classes, methods, extending > classes, and namespaces, please? > > I know a bit about classes, but would like to learn much more. > > Thanks in advance for any advice! > > Sincerely, > Erin > > > -- > Erin Hodgess > Associate Professor > Department of Computer and Mathematical Sciences > University of Houston - Downtown > mailto: erinm.hodgess at gmail.com > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Joshua Wiley Ph.D. 
Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

From alaios at yahoo.com Fri Mar 4 08:42:20 2011
From: alaios at yahoo.com (Alaios)
Date: Thu, 3 Mar 2011 23:42:20 -0800 (PST)
Subject: [R] How two compare two matrixes
Message-ID: <85081.47838.qm@web120107.mail.ne1.yahoo.com>

Dear all,

I have two 10*10 matrices and I would like to compare their contents. By "contents" I mean that I want to check visually (not with any mathematical formulation) how similar the contents are.

I also know about edit(), which displays my matrix on the screen, but one edit() call blocks the prompt, so I cannot launch a second edit() window.

What is the best way to compare these two matrices?

I would like to thank you in advance for your help

Regards
Alex

From p.pagel at wzw.tum.de Fri Mar 4 09:04:41 2011
From: p.pagel at wzw.tum.de (Philipp Pagel)
Date: Fri, 4 Mar 2011 09:04:41 +0100
Subject: [R] How two compare two matrixes
In-Reply-To: <85081.47838.qm@web120107.mail.ne1.yahoo.com>
References: <85081.47838.qm@web120107.mail.ne1.yahoo.com>
Message-ID: <20110304080441.GA3897@maker>

> Dear all I have two 10*10 matrices and I would like to compare
> their contents. By "contents" I mean that I want to check visually (not
> with any mathematical formulation) how similar the contents are.

If they are really only 10x10 you can simply print them both to the screen and look at them. I'm not sure what else you could do if you are not interested in a specific distance measure etc.

cu
Philipp

--
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
Maximus-von-Imhof-Forum 3
85354 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/

From savicky at praha1.ff.cuni.cz Fri Mar 4 09:16:12 2011
From: savicky at praha1.ff.cuni.cz (Petr Savicky)
Date: Fri, 4 Mar 2011 09:16:12 +0100
Subject: [R] Floating points and floor() ?
In-Reply-To: <63F107BCC37AEA49A75FD94AA3E07CB007A8BBE1@pacpbsex01.pac.dfo-mpo.ca>
References: <63F107BCC37AEA49A75FD94AA3E07CB007A8BBE1@pacpbsex01.pac.dfo-mpo.ca>
Message-ID: <20110304081612.GA15715@praha1.ff.cuni.cz>

On Thu, Mar 03, 2011 at 06:23:36PM -0800, Folkes, Michael wrote:
> Perhaps somebody could clarify for me if the following is a floating
> point matter or otherwise, and how am I to correct for it?
>
> > floor(100*.1)
> [1] 10
>
> > 100*(1.0-.9)
> [1] 10
>
> > floor(100*(1-0.9))
> [1] 9

As others pointed out, 0.1 is not exactly representable in base 2, so we get

  formatC(0.1, digits=20)
  [1] "0.10000000000000000555"

  formatC(100*0.1, digits=20, width=-1)
  [1] "10"

  formatC(100*(1 - 0.9), digits=20)
  [1] "9.9999999999999982236"

A correct result may be obtained if you reorganize your calculation so that all intermediate results are integers and the inaccurate division is only the last operation. Then floor(n/10) will be correct, and n %/% 10 may also be used.

Alternatively, if you work with decimal numbers with 1 or 2 decimal digits, then floor(round(x, 1)) or floor(round(x, 2)) also work correctly, if x is not too large.

See FAQ 7.31
http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f
and
http://rwiki.sciviews.org/doku.php?id=misc:r_accuracy:decimal_numbers
for further examples and some hints.

Hope this helps.

Petr Savicky.

From m_hofert at web.de Fri Mar 4 09:17:47 2011
From: m_hofert at web.de (Marius Hofert)
Date: Fri, 4 Mar 2011 09:17:47 +0100
Subject: [R] lattice: wireframe "eats up" points; how to make points on wireframe visible?
Message-ID: <1D3CE86E-61F3-4D79-B3CB-BCDB79FFEFC1@web.de>

Dear expeRts,

I would like to add two points to a wireframe plot. The points have (x,y,z) coordinates where z is determined to be on the wireframe [same z-value]. Now something strange happens. One point is perfectly plotted, the other isn't shown at all.
It only appears if I move it upwards in z-direction by adding a positive number. So somehow it disappears in the wireframe-surface *although* the plot symbol [the cross] has a positive length in each dimension [I also chose cex=5 to make it large enough so that it should (theoretically) be visible]. My wireframe plot is a complicated function which I cannot post here. Below is a minimal example, however, it didn't show the same problem [the surface is too nice I guess]. I therefore *artifically* create the problem in the example below so that you know what I mean. For one of the points, I subtract an epsilon [=0.25] in z-direction and suddenly the point completely disappears. The strange thing is that the point is not even "under" the surface [use the screen-argument to rotate the wireframe plot to check this], it's simply gone, eaten up by the surface. How can I make the two points visible? I also tried to use the alpha-argument to make the wireframe transparent, but I couldn't solve the problem. Cheers, Marius PS: One also faces this problem for example if one wants to make points visible that are on "opposite sides" of the wireframe. library(lattice) f <- function(x) 1/((1-x[1])*(1-x[2])+1) u <- seq(0, 1, length.out=20) grid <- expand.grid(x=u, y=u) x <- grid[,1] y <- grid[,2] z <- apply(grid, 1, f) pt.x <- c(0.2, 0.5) pt.y <- c(0.6, 0.8) eps <- 0.25 pts <- rbind(c(pt.x, f(pt.x)-eps), c(pt.y, f(pt.y))) # points to add to the wireframe wireframe(z~x*y, pts=pts, aspect=1, scales=list(col=1, arrows=FALSE), panel.3d.wireframe = function(x,y,z,xlim,ylim,zlim,xlim.scaled, ylim.scaled,zlim.scaled,pts,...){ panel.3dwire(x=x, y=y, z=z, xlim=xlim, ylim=ylim, zlim=zlim, xlim.scaled=xlim.scaled, ylim.scaled=ylim.scaled, zlim.scaled=zlim.scaled, ...) panel.3dscatter(x=pts[,1], y=pts[,2], z=pts[,3], xlim=xlim, ylim=ylim, zlim=zlim, xlim.scaled=xlim.scaled, ylim.scaled=ylim.scaled, zlim.scaled=zlim.scaled, type="p", col=c(2,3), cex=1.8, .scale=TRUE, ...) 
}, key=list(x=0.5, y=0.95, points=list(col=c(2,3)), text=list(c("Point 1", "Point 2")), cex=1, align=TRUE, transparent=TRUE)) From m_hofert at web.de Fri Mar 4 09:19:24 2011 From: m_hofert at web.de (Marius Hofert) Date: Fri, 4 Mar 2011 09:19:24 +0100 Subject: [R] lattice: How to increase space between ticks and labels of z-axis? In-Reply-To: <4D703312.1090201@ucalgary.ca> References: <4D703312.1090201@ucalgary.ca> Message-ID: Dear Peter, nice approach! Of course it's a bit tedious because you have to specify where the ticks are drawn yourself. But it solves the problem. Thanks! Marius On 2011-03-04, at 01:32 , P Ehlers wrote: > Marius Hofert wrote: >> Dear expeRts, >> How can I increase the space between the ticks and the labels in the wireframe plot >> below? I tried some variations with par.settings=list(..) but it just didn't work. > > Marius, > > I tried setting the 'distance' parameter, but that was less > than satisfactory. One way is to modify the labels appropriately: > > z_at <- seq(2000,10000,2000) > z_labs <- paste(z_at, " ", sep="") > > which tacks on some spaces, and then plot: > > wireframe(z~grid[,1]*grid[,2], > aspect=1, > scales = list(arrows = FALSE, > z = list(at = z_at, lab = z_labs) > ), > zlab = list("z", hjust = 3), > ylab = list(rot = -40), > xlab = list(rot = 30) > ) > > Peter Ehlers > >> Many thanks, >> Marius >> library(lattice) >> u <- seq(0, 1, length.out=20) >> grid <- expand.grid(x=u, y=u) >> z <- apply(grid, 1, function(x) 1/(x[1]*x[2]+0.0001)) >> wireframe(z~grid[,1]*grid[,2], aspect=1, scales=list(col=1, arrows=FALSE)) >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. 
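
The label-padding approach quoted above requires choosing the tick positions by hand. A possible variation, sketched here as an assumption rather than as anything proposed in the thread, is to let pretty() pick the positions and then pad the labels in the same way. The objects u, grid and z are taken from the quoted example; the three trailing spaces are an arbitrary amount of padding.

```r
library(lattice)

# Surface from the quoted example.
u <- seq(0, 1, length.out = 20)
grid <- expand.grid(x = u, y = u)
z <- apply(grid, 1, function(x) 1 / (x[1] * x[2] + 0.0001))

# Let pretty() choose the tick positions, keeping those inside the data range.
z_at <- pretty(z)
z_at <- z_at[z_at >= min(z) & z_at <= max(z)]

# Pad each label with trailing spaces to push it away from the tick marks.
z_labs <- paste(z_at, "   ", sep = "")

p <- wireframe(z ~ grid[, 1] * grid[, 2],
               aspect = 1,
               scales = list(col = 1, arrows = FALSE,
                             z = list(at = z_at, labels = z_labs)))
print(p)
```

The amount of padding still has to be tuned by eye, so this only removes the need to type the tick positions manually.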
From bhh at xs4all.nl Fri Mar 4 10:00:23 2011
From: bhh at xs4all.nl (Berend Hasselman)
Date: Fri, 4 Mar 2011 01:00:23 -0800 (PST)
Subject: [R] questions about using loop, while and next
In-Reply-To:
References:
Message-ID: <1299229223477-3334880.post@n4.nabble.com>

Carrie Li wrote:
>
> ...
> In my loop, I have some random generation of data, but if the data doesn't
> meet some condition, then I want it to go next, and generate data again for
> next round.
>
> # just an example..
> # i want to generate the data again, if the sum is smaller than 25
> temp=rep(NA, 10)
> for(i in 1:10)
> {
> dt=sum(rbinom(10, 5, 0.5))
> while (dt<25) next
> temp[i]=dt
> }
>
> I also tried while(dt<25) {i=i+1}
> But it doesn't seem right to me, since it is running nonstop. Any solutions ?
> ...
>

You don't need next.
I think you mean this

temp <- rep(NA, 10)
for(i in 1:10)
{
    dt <- 0
    # redraw until the sum is at least 25
    while (dt<25) dt <- sum(rbinom(10, 5, 0.5))
    temp[i] <- dt
}

/Berend

--
View this message in context: http://r.789695.n4.nabble.com/questions-about-using-loop-while-and-next-tp3334692p3334880.html
Sent from the R help mailing list archive at Nabble.com.

From singhalblr at gmail.com Fri Mar 4 10:19:12 2011
From: singhalblr at gmail.com (Harsh)
Date: Fri, 4 Mar 2011 14:49:12 +0530
Subject: [R] R usage survey
In-Reply-To: <1BDAE2969943D540934EE8B4EF68F95FB27F0D4A95@EXNSW-MBX03.nexus.csiro.au>
References: <36180405F8418449918AD20618D110FC095BFA6545@USETCMSXMB02.NAFTA.SYNGENTA.ORG> <1BDAE2969943D540934EE8B4EF68F95FB27F0D4A95@EXNSW-MBX03.nexus.csiro.au>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From crosspide at hotmail.com Fri Mar 4 10:11:15 2011
From: crosspide at hotmail.com (agent dunham)
Date: Fri, 4 Mar 2011 01:11:15 -0800 (PST)
Subject: [R] cv.lm syntax error
Message-ID: <1299229875043-3334889.post@n4.nabble.com>

Dear all,

I've tried a multiple regression, and now I want to try a cross-validation.
I obtain this error (it must be sth related to df) that I don't understand, any help would be appreciated. cv.lm(df= dat, lm2.52f, m=3) Error en `[.data.frame`(df, , ynam) : undefined columns selected lm2.52f is my lm object, dat is a dataframe where the variables involved in .lm are I tried CVlm also but the same error Thanks, user at host.com -- View this message in context: http://r.789695.n4.nabble.com/cv-lm-syntax-error-tp3334889p3334889.html Sent from the R help mailing list archive at Nabble.com. From dimitrij.kudriavcev at ntsg.lt Fri Mar 4 07:08:48 2011 From: dimitrij.kudriavcev at ntsg.lt (Dmitrij Kudriavcev) Date: Fri, 4 Mar 2011 17:08:48 +1100 Subject: [R] How to copy data from data.frame to matrix Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From fangxiaofeng at gmail.com Fri Mar 4 07:24:23 2011 From: fangxiaofeng at gmail.com (Jeff Fang) Date: Fri, 4 Mar 2011 14:24:23 +0800 Subject: [R] Question about Chi-squared test Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From hiemstra at knmi.nl Fri Mar 4 09:48:00 2011 From: hiemstra at knmi.nl (hiemstra) Date: Fri, 04 Mar 2011 09:48:00 +0100 Subject: [R] parallel bootstrap linear model on multicore mac (re-post) In-Reply-To: <4D6EC6F9.60202@fiu.edu> References: <4D6EC6F9.60202@fiu.edu> Message-ID: <4D70A740.5070405@knmi.nl> On 03/02/2011 11:38 PM, Anthony Dick wrote: > Hello all, > > I am re-posting my previous question with a simpler, more transparent, > commented code. > > I have been ramming my head against this problem, and I wondered if > anyone could lend a hand. I want to make parallel a bootstrap of a > linear mixed model on my 8-core mac. Below is the process that I want to > make parallel (namely, the boot.out<-boot(dat.res,boot.fun, R = nboot) > command). This is an extension to lmer of the bootstrapping linear > models example in Venables and Ripley. 
Please excuse my rather terrible > programming skills. I am always open to suggestions. Below the example I > describe what methods I have tried. > > library(boot) > library(lme4) > dat<-read.table("http://www2.fiu.edu/~adick/downloads/toy2.dat ", header = T) > nboot<-1000 # number of bootstraps > attach(dat) > x<-dat[,2] # IV number 1 > y<-dat[,4] # DV > z<-dat[,3] # IV number 2 > subj<-dat[,1] # random factor > boot.fun<-function(data,i) { # function to resample residuals > d<-data > d$y<- d$fitted+d$res[i] # populate new y values based on > resampled residuals > as.numeric(coef(update(m2.fit,data=d))[1][[1]][1,c(1:4)]) > # update the linear model and output the coefficients > } > fit<-lmer(y~x*z + (1|(subj))) # the linear model > dat.res<-data.frame(y,x,z,subj, res=resid(fit), fitted=fitted(fit)) # > add residuals and fitted values to dat > boot.out<-boot(dat.res,boot.fun, R = nboot) # run the bootstrap using > the boot.fun > boot.out > > Methods attempted: > > Using the multicore package, I tried > boot.out<-collect(parallel(boot(dat.res,boot.fun, R = nboot))). This > returned a correct result, but did not speed things up. Not sure why... Hi Anthony, When the individual calls passed on to the cluster are very short (which might be the case for your bootstrap), the overhead of running them parallel becomes very large, negating the positive effect of running the processes parallel. This could be an explanation for the lack of speed improvement. A solution could be to not send individual bootstrap calls to the cluster, but sets of calls. This decrease the overhead for parallel running. cheers, Paul > I also tried snowfall and snow. While I can create a cluster and run > simple processes (e.g., provided example from literature), I can't get > the bootstrap to run. 
For example, using snow: > > cl<- makeCluster(8) > clusterSetupRNG(cl) > clusterEvalQ(cl,library(boot)) > clusterEvalQ(cl,library(lme4)) > boot.out<-clusterCall(cl,boot(dat.res,boot.fun, R = nboot)) > stopCluster() > > returns the following error: > > Error in checkForRemoteErrors(lapply(cl, recvResult)) : > 8 nodes produced errors; first error: could not find function "fun" > > I am stuck and at the limit of my programming knowledge and am punting > to the R-help list. I need to run this process thousands of times, which > is the reason to make it parallel. Any suggestions are much appreciated. > > > Anthony > -- Paul Hiemstra, MSc Global Climate Division Royal Netherlands Meteorological Institute (KNMI) Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39 P.O. Box 201 | 3730 AE | De Bilt tel: +31 30 2206 494 http://intamap.geo.uu.nl/~paul http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770 From vioravis at gmail.com Fri Mar 4 09:39:50 2011 From: vioravis at gmail.com (vioravis) Date: Fri, 4 Mar 2011 00:39:50 -0800 (PST) Subject: [R] Zero Inflated Distributions Message-ID: <1299227990001-3334861.post@n4.nabble.com> I am currently fitting the following distributions using JMP and looking for ways to fit the same distributions in R: Zero Inflated Lognormal Zero Inflated Loglogistic Zero Inflated Frechet Zero Inflated Weibull Threshold Frechet Threshold Loglogistic Threshold Lognormal Log Generalized Gamma Threshold Weibull LEV Logistic Normal SEV Are there any packages that contain these distributions??? I am specifically interested in the zero inflated distributions since the data I have contains quite a bit of zeros. Thank you. Ravi -- View this message in context: http://r.789695.n4.nabble.com/Zero-Inflated-Distributions-tp3334861p3334861.html Sent from the R help mailing list archive at Nabble.com. 
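
For the zero-inflated families asked about above, one simple case can be fitted in base R without any extra package. The sketch below assumes that "zero inflated lognormal" means a point mass at zero mixed with a lognormal distribution for the positive values; under that assumption the likelihood factorizes, so the zero probability and the lognormal parameters can be estimated separately. The function name fit.zi.lnorm and the simulated data are hypothetical and used only for illustration.

```r
# Simulate zero-inflated lognormal data: 30% exact zeros, the rest lognormal.
set.seed(1)
n <- 5000
is.zero <- rbinom(n, 1, 0.3) == 1
x <- ifelse(is.zero, 0, rlnorm(n, meanlog = 1, sdlog = 0.5))

# Maximum likelihood estimates for the two-part model (hypothetical helper):
#   P(X = 0) = p.zero,  X | X > 0 ~ Lognormal(meanlog, sdlog)
fit.zi.lnorm <- function(x) {
  stopifnot(all(x >= 0))
  log.pos <- log(x[x > 0])           # logs of the strictly positive values
  mu <- mean(log.pos)
  list(p.zero  = mean(x == 0),       # share of exact zeros
       meanlog = mu,
       sdlog   = sqrt(mean((log.pos - mu)^2)))  # ML (not unbiased) estimate
}

est <- fit.zi.lnorm(x)
print(unlist(est))
```

The same two-part idea should carry over to other positive families: estimate the share of zeros directly and fit the chosen distribution to the positive values only, for example with fitdistr() from the MASS package.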
From muralidharan.somasundaram at tcs.com Fri Mar 4 06:12:32 2011 From: muralidharan.somasundaram at tcs.com (Muralidharan Somasundaram) Date: Fri, 4 Mar 2011 10:42:32 +0530 Subject: [R] Help required for rpart package Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From matthew.finkbeiner at mq.edu.au Fri Mar 4 10:43:55 2011 From: matthew.finkbeiner at mq.edu.au (Matthew Finkbeiner) Date: Fri, 4 Mar 2011 20:43:55 +1100 Subject: [R] scramble items Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From alaios at yahoo.com Fri Mar 4 10:49:29 2011 From: alaios at yahoo.com (Alaios) Date: Fri, 4 Mar 2011 01:49:29 -0800 (PST) Subject: [R] How two compare two matrixes In-Reply-To: <20110304080441.GA3897@maker> Message-ID: <219195.1782.qm@web120115.mail.ne1.yahoo.com> That's the problem Even a 10*10 matrix does not fit to the screen (10 columns do not fit in one screen's row) and thus I do not get a well aligned matrix printed. This is that makes comparisons not that easy to the eye. From the other hand with edit(mymatrix) I get scrolls so I can scroll to one row and see only the area I want to focus in. Problem with edit is that it blocks cli and thus I can not have two edits running at the same time. I would like to thank you in advacne for your help Regards Alex --- On Fri, 3/4/11, Philipp Pagel wrote: > From: Philipp Pagel > Subject: Re: [R] How two compare two matrixes > To: r-help at r-project.org > Date: Friday, March 4, 2011, 8:04 AM > > Dear all I have two 10*10 > matrixes and I would like to compare > > theirs contents. By the word content I mean to check > visually (not > > with any mathematical formulation) how similar are the > contents. > > If they are really only 10x10 you can simply print them > both to the > screen and look at them. I'm not sure what else you could > do if you > are not interested in a specific distance emasure etc. > > cu > ??? Philipp > > -- > Dr. 
Philipp Pagel > Lehrstuhl f?r Genomorientierte Bioinformatik > Technische Universit?t M?nchen > Wissenschaftszentrum Weihenstephan > Maximus-von-Imhof-Forum 3 > 85354 Freising, Germany > http://webclu.bio.wzw.tum.de/~pagel/ > > ______________________________________________ > R-help at r-project.org > mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, > reproducible code. > From ligges at statistik.tu-dortmund.de Fri Mar 4 10:51:14 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Fri, 04 Mar 2011 10:51:14 +0100 Subject: [R] vector("integer", length) : vector size specified is too large In-Reply-To: <006001cbd9a9$2706d250$751476f0$@up.ac.za> References: <006001cbd9a9$2706d250$751476f0$@up.ac.za> Message-ID: <4D70B612.3030003@statistik.tu-dortmund.de> Please ask the author of parts() in the partitions package or the author of the function that calls the former: your function calls generate a call parts(J) where J is 1272. Internally, a J*P(J) (1272 * 1.514126e+19) vector is generated (and that one is too large for R). Uwe Ligges On 03.03.2011 14:44, Robert Guldemond wrote: > Good day to the R community, > > > > I am interested to run the plot.count() function in the "untb" package. 
> > My script is as follows:- > > > >> library(untb) > >> Community1<- > >> structure(c(371,167,119,78,74,53,50,31,28,25,20,19,19,17,13,12,12,10, > >> 9,9,8,8,7,7,7,7,6,6,6,6,5,5,5,5,4,4,4,3,3,3,2,2,2,2,2,2,2,1,1,1,1,1, > >> 1,1,1,1,1,1,1,1), .Dim = 60, .Dimnames = > list(c("Spp.80","Spp.111","Spp.129", > >> > "Spp.101","Spp.40","Spp.11","Spp.14","Spp.128","Spp.58","Spp.103","Spp.112", > >> "Spp.50","Spp.115","Spp.31","Spp.86","Spp.92","Spp.108","Spp.79","Spp.81", > >> "Spp.110","Spp.75","Spp.83","Spp.30","Spp.62","Spp.63","Spp.76","Spp.27", > >> "Spp.87","Spp.102","Spp.121","Spp.22","Spp.33","Spp.67","Spp.109","Spp.1", > >> "Spp.10","Spp.18","Spp.12","Spp.47","Spp.114","Spp.8","Spp.42","Spp.65", > >> > "Spp.69","Spp.100","Spp.106","Spp.130","Spp.38","Spp.43","Spp.56","Spp.82", > >> > "Spp.93","Spp.95","Spp.107","Spp.116","Spp.117","Spp.118","Spp.119","Spp.136 > ", > >> "Spp.144")), class = c("count", "table")) > >> Community1 > >> summary(unphi(phi(Community1))) > >> plot.count(Community1,uncertainty=TRUE,expectation=TRUE,theta=NULL,n=10) > > > > When I run this I get the following message:- > > > > Error in vector("integer", length) : vector size specified is too large > > In addition: Warning message: > > In parts(J) : NAs introduced by coercion > > > > Does anyone have any ideas for me why I get these error messages and what I > should > > do to overcome this challenge? 
> > > > Thank you in advance > > > > Rob Guldemond > > > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > Research Scientist > Conservation Ecology Research Unit > Department of Zoology and Entomology > University of Pretoria > Pretoria > 0002 > South Africa > tel: (+27) 12 420 3231 > fax: (+27) 12 420 4523 > cell: (+27) 83 770 9694 > rguldemond at zoology.up.ac.za > http://www.ceru.up.ac.za > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From nicolas.berkowitsch at unibas.ch Fri Mar 4 11:04:57 2011 From: nicolas.berkowitsch at unibas.ch (Nicolas Berkowitsch) Date: Fri, 04 Mar 2011 11:04:57 +0100 Subject: [R] overleap an iteration within a for-loop when error message produced Message-ID: <4D70B949.6000003@unibas.ch> Dear R-list member, I'm using the function pmnorm() (-->library(mnormt)) within a for-loop. Certain parameter values leads to an error message: "(In sqrt(diag(S)) : NaNs produced, In sqrt(1/diag(V)) : NaNs produced, In cov2cor(S) : diag(.) had 0 or NA entries; non-finite result is doubtful)" obviously because "NaNs" were produced. Is it possible to tell R that it should overleap the iteration which produce the error message? Here is an example code (does not lead to an error message): for (subject in 1:10) { p[subject] = pmnorm(x = subject*c(-.3,1), varcov = diag(2)) } Assume that the 5th iteration (subject=5) leads to the error message. How can I tell R to continue with the 6th iteration? Thanks a lot for you help and input, Nicolas ____________ lic. phil. Nicolas A. J. Berkowitsch Universit?t Basel Fakult?t f?r Psychologie Economic Psychology Missionsstrasse 62a CH-4055 Basel Tel. 
+41 61 267 05 75 E-Mail nicolas.berkowitsch at unibas.ch Web http://psycho.unibas.ch/abteilungen/abteilung-details/home/abteilung/economic-psychology/ From ligges at statistik.tu-dortmund.de Fri Mar 4 11:11:30 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Fri, 04 Mar 2011 11:11:30 +0100 Subject: [R] How two compare two matrixes In-Reply-To: <85081.47838.qm@web120107.mail.ne1.yahoo.com> References: <85081.47838.qm@web120107.mail.ne1.yahoo.com> Message-ID: <4D70BAD2.9020808@statistik.tu-dortmund.de> On 04.03.2011 08:42, Alaios wrote: > Dear all I have two 10*10 matrixes and I would like to compare theirs contents. By the word content I mean to check visually (not with any mathematical formulation) how similar are the contents. > > I also know edit that prints my matrix in the scree but still one edit blocks the prompt to launch a second edit() screen. > > What is the best way to compare these two matrices? > > I would like to thank you in avdance for your help See ?image. Uwe Ligges > Regards > Alex > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From p.pagel at wzw.tum.de Fri Mar 4 11:12:53 2011 From: p.pagel at wzw.tum.de (Philipp Pagel) Date: Fri, 4 Mar 2011 11:12:53 +0100 Subject: [R] How two compare two matrixes In-Reply-To: <219195.1782.qm@web120115.mail.ne1.yahoo.com> References: <20110304080441.GA3897@maker> <219195.1782.qm@web120115.mail.ne1.yahoo.com> Message-ID: <20110304101253.GA5075@maker> On Fri, Mar 04, 2011 at 01:49:29AM -0800, Alaios wrote: > That's the problem > Even a 10*10 matrix does not fit to the screen (10 columns do not > fit in one screen's row) and thus I do not get a well aligned matrix > printed. > > This is that makes comparisons not that easy to the eye. 
From the > other hand with edit(mymatrix) I get scrolls so I can scroll to one > row and see only the area I want to focus in. Problem with edit is > that it blocks cli and thus I can not have two edits running at the > same time. Hm - it does fit on my screen but if you're on a laptop... Maybe you could write both matrices to files and compare them in an external viewer (Excel, less, ...). If I remember correctly, the object browser/data viewer of JGR allows editing several objects at once. cu Philipp -- Dr. Philipp Pagel Lehrstuhl f?r Genomorientierte Bioinformatik Technische Universit?t M?nchen Wissenschaftszentrum Weihenstephan Maximus-von-Imhof-Forum 3 85354 Freising, Germany http://webclu.bio.wzw.tum.de/~pagel/ From nick.sabbe at ugent.be Fri Mar 4 11:14:18 2011 From: nick.sabbe at ugent.be (Nick Sabbe) Date: Fri, 4 Mar 2011 11:14:18 +0100 Subject: [R] Generic mixup? Message-ID: <041101cbda54$ec4f2930$c4ed7b90$@sabbe@ugent.be> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From p.pagel at wzw.tum.de Fri Mar 4 11:15:39 2011 From: p.pagel at wzw.tum.de (Philipp Pagel) Date: Fri, 4 Mar 2011 11:15:39 +0100 Subject: [R] overleap an iteration within a for-loop when error message produced In-Reply-To: <4D70B949.6000003@unibas.ch> References: <4D70B949.6000003@unibas.ch> Message-ID: <20110304101539.GA5174@maker> > Assume that the 5th iteration (subject=5) leads to the error > message. How can I tell R to continue with the 6th iteration? try or tryCatch are probably what you want. cu Philipp -- Dr. 
Philipp Pagel Lehrstuhl f?r Genomorientierte Bioinformatik Technische Universit?t M?nchen Wissenschaftszentrum Weihenstephan Maximus-von-Imhof-Forum 3 85354 Freising, Germany http://webclu.bio.wzw.tum.de/~pagel/ From wesleycmathew at gmail.com Fri Mar 4 11:21:01 2011 From: wesleycmathew at gmail.com (wesley mathew) Date: Fri, 4 Mar 2011 10:21:01 +0000 Subject: [R] Cannot find JRI native library Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From ivan.calandra at uni-hamburg.de Fri Mar 4 11:28:10 2011 From: ivan.calandra at uni-hamburg.de (Ivan Calandra) Date: Fri, 04 Mar 2011 11:28:10 +0100 Subject: [R] How to copy data from data.frame to matrix In-Reply-To: References: Message-ID: <4D70BEBA.7030402@uni-hamburg.de> Hi, Let's say your data.frame is called df: df <- data.frame(a=rnorm(10), b=rnorm(10)) data.matrix <- as.matrix(df) This should work, but be careful with coercion if you have different modes in your data.frame HTH, Ivan PS: next time, provide a reproducible example, using dput() for example Le 3/4/2011 07:08, Dmitrij Kudriavcev a ?crit : > Hello > > I'm a new in R > I have a large data.frame "s" (this is actualy just a table in mysql) : > >> names(s) > [1] "symbols", "day", "value" > > I need to convert it to simple matrix. I have define this matrix like this: > >> data.matrix<- matrix(nrow=nDays, ncol=nSymbols, dimnames=list(days, > symbols)) > > then i just copy values to the matrix using for() loop, but it seems to take > very long time. Is is a more fast way to do it in R? I know, what i can just > gyve s$value as source data to the matrix, but problem is, what for some > symbols couple days could be just missed. 
> > Cheers, > Dima > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Ivan CALANDRA PhD Student University of Hamburg Biozentrum Grindel und Zoologisches Museum Abt. S?ugetiere Martin-Luther-King-Platz 3 D-20146 Hamburg, GERMANY +49(0)40 42838 6231 ivan.calandra at uni-hamburg.de ********** http://www.for771.uni-bonn.de http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php From Thierry.ONKELINX at inbo.be Fri Mar 4 11:28:26 2011 From: Thierry.ONKELINX at inbo.be (ONKELINX, Thierry) Date: Fri, 4 Mar 2011 10:28:26 +0000 Subject: [R] Zero Inflated Distributions In-Reply-To: <1299227990001-3334861.post@n4.nabble.com> References: <1299227990001-3334861.post@n4.nabble.com> Message-ID: library(sos) findFn("Zero Inflated Lognormal") ---------------------------------------------------------------------------- ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek team Biometrie & Kwaliteitszorg Gaverstraat 4 9500 Geraardsbergen Belgium Research Institute for Nature and Forest team Biometrics & Quality Assurance Gaverstraat 4 9500 Geraardsbergen Belgium tel. + 32 54/436 185 Thierry.Onkelinx at inbo.be www.inbo.be To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. 
~ John Tukey > -----Oorspronkelijk bericht----- > Van: r-help-bounces at r-project.org > [mailto:r-help-bounces at r-project.org] Namens vioravis > Verzonden: vrijdag 4 maart 2011 9:40 > Aan: r-help at r-project.org > Onderwerp: [R] Zero Inflated Distributions > > I am currently fitting the following distributions using JMP > and looking for ways to fit the same distributions in R: > > Zero Inflated Lognormal > Zero Inflated Loglogistic > Zero Inflated Frechet > Zero Inflated Weibull > Threshold Frechet > Threshold Loglogistic > Threshold Lognormal > Log Generalized Gamma > Threshold Weibull > LEV > Logistic > Normal > SEV > > Are there any packages that contain these distributions??? I > am specifically interested in the zero inflated distributions > since the data I have contains quite a bit of zeros. > > Thank you. > > Ravi > > -- > View this message in context: > http://r.789695.n4.nabble.com/Zero-Inflated-Distributions-tp33 > 34861p3334861.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From clare.embling at plymouth.ac.uk Fri Mar 4 12:08:14 2011 From: clare.embling at plymouth.ac.uk (Clare Embling) Date: Fri, 4 Mar 2011 11:08:14 +0000 Subject: [R] Anyone know a forum for stats advice? Message-ID: <590E5A500AD0F943948B2F36C92E98299B9BCB53CE@ILS133.uopnet.plymouth.ac.uk> Hi, I know this forum is for R-related issues, but the question I have is a statistical question & I was wondering if anyone could recommend a good statistics forum where I can ask the question? 
My question is relating to bootstrapping of binary data (ecology data) - I can give more detail, but wasn't sure I could address the question here as it is more statistical based than R based (though all the analysis is done in R). Thanks in advance Clare From Stephan.Kolassa at gmx.de Fri Mar 4 12:26:19 2011 From: Stephan.Kolassa at gmx.de (Stephan Kolassa) Date: Fri, 04 Mar 2011 12:26:19 +0100 Subject: [R] Anyone know a forum for stats advice? In-Reply-To: <590E5A500AD0F943948B2F36C92E98299B9BCB53CE@ILS133.uopnet.plymouth.ac.uk> References: <590E5A500AD0F943948B2F36C92E98299B9BCB53CE@ILS133.uopnet.plymouth.ac.uk> Message-ID: <4D70CC5B.4090108@gmx.de> Hi Clare, you want to go here: http://stats.stackexchange.com/questions HTH Stephan Am 04.03.2011 12:08, schrieb Clare Embling: > Hi, > > I know this forum is for R-related issues, but the question I have is a statistical question& I was wondering if anyone could recommend a good statistics forum where I can ask the question? My question is relating to bootstrapping of binary data (ecology data) - I can give more detail, but wasn't sure I could address the question here as it is more statistical based than R based (though all the analysis is done in R). > > Thanks in advance > Clare > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
>

From nicolas.berkowitsch at unibas.ch Fri Mar 4 12:28:23 2011
From: nicolas.berkowitsch at unibas.ch (Nicolas Berkowitsch)
Date: Fri, 04 Mar 2011 12:28:23 +0100
Subject: [R] overleap an iteration within a for-loop when error message produced
In-Reply-To: <041601cbda55$57c84890$0758d9b0$@sabbe@ugent.be>
References: <4D70B949.6000003@unibas.ch> <041601cbda55$57c84890$0758d9b0$@sabbe@ugent.be>
Message-ID: <4D70CCD7.40507@unibas.ch>

Dear Nick, Dear Philipp,

Thanks for the quick responses - it worked!
Below is the implemented solution - in case others are interested:

## This stops at the first iteration that raises an error
library(mnormt)
p = matrix(NA,10,1)   # one row for every subject index used below
for (subject in 2:10) {
  p[subject] = pmnorm(x = subject*c(-.3,1), varcov = matrix(c((-4)^subject,2,2,2),2,2))
  print(subject)
}

## This does NOT stop: the error is still printed (silent=FALSE),
## but the loop continues with the next iteration
library(mnormt)
p = matrix(NA,10,1)
for (subject in 2:10) {
  res = try(pmnorm(x = subject*c(-.3,1), varcov = matrix(c((-4)^subject,2,2,2),2,2)), silent=FALSE)
  ## store the value only on success, otherwise p would be coerced to character
  if (!inherits(res, "try-error")) p[subject] = res
  print(subject)
}

Am 04.03.2011 11:17, schrieb Nick Sabbe:
> Check ?tryCatch.
> In most languages, adding try-catch blocks may seriously affect performance
> - I don't know what the impact is in R.
> But perhaps that is not an issue to you.
>
> HTH,
>
>
> Nick Sabbe
> --
> ping: nick.sabbe at ugent.be
> link: http://biomath.ugent.be
> wink: A1.056, Coupure Links 653, 9000 Gent
> ring: 09/264.59.36
>
> -- Do Not Disapprove
>
>
>
>
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
> Behalf Of Nicolas Berkowitsch
> Sent: vrijdag 4 maart 2011 11:05
> To: r-help at r-project.org
> Subject: [R] overleap an iteration within a for-loop when error message
> produced
>
> Dear R-list member,
>
> I'm using the function pmnorm() (-->library(mnormt)) within a for-loop.
> Certain parameter values leads to an error message:
> "(In sqrt(diag(S)) : NaNs produced, In sqrt(1/diag(V)) : NaNs
> produced, In cov2cor(S) : diag(.)
had 0 or NA entries; non-finite result > is doubtful)" > obviously because "NaNs" were produced. > Is it possible to tell R that it should overleap the iteration which > produce the error message? > > Here is an example code (does not lead to an error message): > > for (subject in 1:10) { > > p[subject] = pmnorm(x = subject*c(-.3,1), varcov = diag(2)) > > } > > Assume that the 5th iteration (subject=5) leads to the error message. > How can I tell R to continue with the 6th iteration? > > Thanks a lot for you help and input, > Nicolas > > ____________ > > > lic. phil. Nicolas A. J. Berkowitsch > Universit?t Basel > Fakult?t f?r Psychologie > Economic Psychology > Missionsstrasse 62a > CH-4055 Basel > > Tel. +41 61 267 05 75 > E-Mail nicolas.berkowitsch at unibas.ch > Web > http://psycho.unibas.ch/abteilungen/abteilung-details/home/abteilung/economi > c-psychology/ > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- ____________ lic. phil. Nicolas A. J. Berkowitsch Universit?t Basel Fakult?t f?r Psychologie Economic Psychology Missionsstrasse 62a CH-4055 Basel Tel. +41 61 267 05 75 E-Mail nicolas.berkowitsch at unibas.ch Web http://psycho.unibas.ch/abteilungen/abteilung-details/home/abteilung/economic-psychology/ From r.m.krug at gmail.com Fri Mar 4 12:31:41 2011 From: r.m.krug at gmail.com (Rainer M Krug) Date: Fri, 4 Mar 2011 12:31:41 +0100 Subject: [R] Anyone know a forum for stats advice? In-Reply-To: <590E5A500AD0F943948B2F36C92E98299B9BCB53CE@ILS133.uopnet.plymouth.ac.uk> References: <590E5A500AD0F943948B2F36C92E98299B9BCB53CE@ILS133.uopnet.plymouth.ac.uk> Message-ID: r-sig-ecology (https://stat.ethz.ch/mailman/listinfo/r-sig-ecology) is also a good source when the stats is related to R, they are usually quite open. 
Rainer On Fri, Mar 4, 2011 at 12:08 PM, Clare Embling wrote: > Hi, > > I know this forum is for R-related issues, but the question I have is a statistical question & I was wondering if anyone could recommend a good statistics forum where I can ask the question? ?My question is relating to bootstrapping of binary data (ecology data) - I can give more detail, but wasn't sure I could address the question here as it is more statistical based than R based (though all the analysis is done in R). > > Thanks in advance > Clare > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- NEW GERMAN FAX NUMBER!!! Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, UCT), Dipl. Phys. (Germany) Centre of Excellence for Invasion Biology Natural Sciences Building Office Suite 2039 Stellenbosch University Main Campus, Merriman Avenue Stellenbosch South Africa Cell:? ? ? ? ?? +27 - (0)83 9479 042 Fax:? ? ? ? ? ? +27 - (0)86 516 2782 Fax:? ? ? ? ? ? +49 - (0)321 2125 2244 email:? ? ? ? ? Rainer at krugs.de Skype:? ? ? ? ? RMkrug Google:? ? ? ?? 
R.M.Krug at gmail.com From marchywka at hotmail.com Fri Mar 4 13:40:49 2011 From: marchywka at hotmail.com (Mike Marchywka) Date: Fri, 4 Mar 2011 07:40:49 -0500 Subject: [R] How two compare two matrixes In-Reply-To: <20110304080441.GA3897@maker> References: <85081.47838.qm@web120107.mail.ne1.yahoo.com>, <20110304080441.GA3897@maker> Message-ID: > ?image > ?matrix > z<-matrix(rnorm(100),nrow=10) > image(1:10,1:10,z) > heatmap(z) > ---------------------------------------- > Date: Fri, 4 Mar 2011 09:04:41 +0100 > From: p.pagel at wzw.tum.de > To: r-help at r-project.org > Subject: Re: [R] How two compare two matrixes > > > Dear all I have two 10*10 matrixes and I would like to compare > > theirs contents. By the word content I mean to check visually (not > > with any mathematical formulation) how similar are the contents. > > If they are really only 10x10 you can simply print them both to the > screen and look at them. I'm not sure what else you could do if you > are not interested in a specific distance emasure etc. > > cu > Philipp > > -- > Dr. Philipp Pagel > Lehrstuhl f?r Genomorientierte Bioinformatik > Technische Universit?t M?nchen > Wissenschaftszentrum Weihenstephan > Maximus-von-Imhof-Forum 3 > 85354 Freising, Germany > http://webclu.bio.wzw.tum.de/~pagel/ > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From drflxms at googlemail.com Fri Mar 4 14:18:03 2011 From: drflxms at googlemail.com (drflxms) Date: Fri, 04 Mar 2011 14:18:03 +0100 Subject: [R] sum of digits or how to slice a number into its digits Message-ID: <4D70E68B.2010703@googlemail.com> Dear R colleagues, I face a seemingly simple problem I couldn't find a solution for myself so far: I have to sum the digits of numbers. 
Example: 1010 ->2 100100110 -> 4 Unfortunately there seems not to be a function for this task. So my idea was to use sum(x) for it. But I did not figure out how to slice a number to a vector of its digits. Example (continued from above): 1010 -> c(1,0,1,0) 100100110 -> (1,0,0,1,0,0,1,1,0). Does anyone know either a function for calculating the sum of the digits of a bumber, or how to slice a number into a vector of its digits as described above? I'd appreciate any kind of help very much! Thanx in advance and greetings from cloudy Munich, Felix From izahn at psych.rochester.edu Fri Mar 4 14:24:59 2011 From: izahn at psych.rochester.edu (Ista Zahn) Date: Fri, 4 Mar 2011 13:24:59 +0000 Subject: [R] Generic mixup? In-Reply-To: <-6525824186037005978@unknownmsgid> References: <-6525824186037005978@unknownmsgid> Message-ID: Hi Nick, I think showMethods is for s4 classes (which I know nothing about). I think you want methods(print) Best, Ista On Fri, Mar 4, 2011 at 10:14 AM, Nick Sabbe wrote: > Hello list. > > > > This is from an R session (admittedly, I'm still using R 2.11.1): > >> print > > function (x, ...) > > UseMethod("print") > > > >> showMethods("print") > > > > Function "print": > > ? > > > > Don't the two results contradict each other? Or do I have a terrible > misunderstanding of what comprises a generic function? > > > > Thx, > > > > Nick Sabbe > > -- > > ping: nick.sabbe at ugent.be > > link: ? http://biomath.ugent.be > > wink: A1.056, Coupure Links 653, 9000 Gent > > ring: 09/264.59.36 > > > > -- Do Not Disapprove > > > > > ? ? ? ?[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
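The distinction drawn in the reply above (print() is an S3 generic that dispatches through UseMethod(), whereas showMethods() reports methods of S4 generics) can be sketched with two toy generics. The names area, perimeter, square and Square below are hypothetical and exist only for illustration; this is a minimal sketch, not a description of how print() itself is implemented.

```r
library(methods)

# S3: a generic is an ordinary function that calls UseMethod()
area <- function(shape, ...) UseMethod("area")
area.square <- function(shape, ...) shape$side^2

sq <- structure(list(side = 2), class = "square")
s3_area <- area(sq)            # dispatches to area.square

# isGeneric() asks about S4 generics only, so it is FALSE for an S3 generic
s3_is_s4 <- isGeneric("area")

# S4: classes, generics and methods are registered explicitly
setClass("Square", representation(side = "numeric"))
setGeneric("perimeter", function(shape) standardGeneric("perimeter"))
setMethod("perimeter", "Square", function(shape) 4 * shape@side)

s4_perimeter <- perimeter(new("Square", side = 2))
showMethods("perimeter")       # lists the method defined for class "Square"
```

For an S3 generic such as area (or print), methods(area) is the listing tool to reach for, while showMethods() is only informative for generics created with setGeneric().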
> -- Ista Zahn Graduate student University of Rochester Department of Clinical and Social Psychology http://yourpsyche.org From d.rizopoulos at erasmusmc.nl Fri Mar 4 14:25:21 2011 From: d.rizopoulos at erasmusmc.nl (Dimitris Rizopoulos) Date: Fri, 04 Mar 2011 14:25:21 +0100 Subject: [R] sum of digits or how to slice a number into its digits In-Reply-To: <4D70E68B.2010703@googlemail.com> References: <4D70E68B.2010703@googlemail.com> Message-ID: <4D70E841.80804@erasmusmc.nl> one way is using function strsplit(), e.g., x <- c("100100110", "1001001", "1101", "00101") sapply(strsplit(x, ""), function (x) sum(x == 1)) I hope it helps. Best, Dimitris On 3/4/2011 2:18 PM, drflxms wrote: > Dear R colleagues, > > I face a seemingly simple problem I couldn't find a solution for myself > so far: > > I have to sum the digits of numbers. Example: 1010 ->2 100100110 -> 4 > Unfortunately there seems not to be a function for this task. So my idea > was to use sum(x) for it. But I did not figure out how to slice a number > to a vector of its digits. Example (continued from above): 1010 -> > c(1,0,1,0) 100100110 -> (1,0,0,1,0,0,1,1,0). > > Does anyone know either a function for calculating the sum of the digits > of a bumber, or how to slice a number into a vector of its digits as > described above? > > I'd appreciate any kind of help very much! > Thanx in advance and greetings from cloudy Munich, > Felix > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
> -- Dimitris Rizopoulos Assistant Professor Department of Biostatistics Erasmus University Medical Center Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands Tel: +31/(0)10/7043478 Fax: +31/(0)10/7043014 Web: http://www.erasmusmc.nl/biostatistiek/ From ivan.calandra at uni-hamburg.de Fri Mar 4 14:27:40 2011 From: ivan.calandra at uni-hamburg.de (Ivan Calandra) Date: Fri, 04 Mar 2011 14:27:40 +0100 Subject: [R] sum of digits or how to slice a number into its digits In-Reply-To: <4D70E68B.2010703@googlemail.com> References: <4D70E68B.2010703@googlemail.com> Message-ID: <4D70E8CC.20602@uni-hamburg.de> Hi, Here is the best I've found: x <- 100100110 sum(as.numeric(unlist(strsplit(as.character(x), split="")))) It first converts x to character, then splits every character, unlist()s the results, then reconverts to numeric and sums it. HTH, Ivan Le 3/4/2011 14:18, drflxms a ?crit : > Dear R colleagues, > > I face a seemingly simple problem I couldn't find a solution for myself > so far: > > I have to sum the digits of numbers. Example: 1010 ->2 100100110 -> 4 > Unfortunately there seems not to be a function for this task. So my idea > was to use sum(x) for it. But I did not figure out how to slice a number > to a vector of its digits. Example (continued from above): 1010 -> > c(1,0,1,0) 100100110 -> (1,0,0,1,0,0,1,1,0). > > Does anyone know either a function for calculating the sum of the digits > of a bumber, or how to slice a number into a vector of its digits as > described above? > > I'd appreciate any kind of help very much! > Thanx in advance and greetings from cloudy Munich, > Felix > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
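The strsplit() approach from this thread can be wrapped into a small vectorised helper. The sketch below assumes non-negative whole numbers stored as ordinary numerics; the name digit_sum is hypothetical. It uses format(..., scientific = FALSE) rather than as.character() because as.character(1e8) yields "1e+08", which would not split into digits.

```r
# Sum the decimal digits of non-negative whole numbers (vectorised).
# Assumes values small enough to be represented exactly as doubles.
digit_sum <- function(x) {
  stopifnot(is.numeric(x), all(x >= 0), all(x == floor(x)))
  # fixed notation, no padding: 1e8 becomes "100000000", not "1e+08"
  chars <- format(x, scientific = FALSE, trim = TRUE)
  # split each string into single characters and add up the digits
  vapply(strsplit(chars, ""), function(d) sum(as.integer(d)), integer(1))
}

digit_sum(c(1010, 100100110, 1e8))
```

If the numbers are already held as character strings, as in the first reply, the format() step can be skipped and strsplit() applied directly.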
>

-- 
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calandra at uni-hamburg.de

**********
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php

From uwwo at in-chemnitz.de Fri Mar 4 14:42:24 2011
From: uwwo at in-chemnitz.de (Uwe Wolfram)
Date: Fri, 04 Mar 2011 14:42:24 +0100
Subject: [R] Coefficient of Determination for nonlinear function
Message-ID: <1299246144.1764.18.camel@pollux>

Dear Subscribers,

I fitted an equation of the form 1 = f(x1,x2,x3) using a minimization scheme. Now I want to compute the coefficient of determination. Normally I would compute it as

r_square = 1 - sserr/sstot

with sserr = sum_i (y_i - f_i)^2 and sstot = sum_i (y_i - mean(y))^2.

sserr is clear to me, but how can I compute sstot when the y_i do not differ? They are all one. Thus mean(y) = 1 and, therefore, sstot is 0.

Thank you very much for your efforts,

Uwe
-- 
Uwe Wolfram
Dipl.-Ing. (Ph.D Student)
__________________________________________________
Institute of Orthopaedic Research and Biomechanics
Director and Chair: Prof. Dr. Anita Ignatius
Center of Musculoskeletal Research Ulm
University Hospital Ulm
Helmholtzstr. 14
89081 Ulm, Germany
Phone: +49 731 500-55301
Fax: +49 731 500-55302
http://www.biomechanics.de

From drflxms at googlemail.com Fri Mar 4 14:52:22 2011
From: drflxms at googlemail.com (drflxms)
Date: Fri, 04 Mar 2011 14:52:22 +0100
Subject: [R] sum of digits or how to slice a number into its digits
In-Reply-To: <4D70E841.80804@erasmusmc.nl>
References: <4D70E68B.2010703@googlemail.com> <4D70E841.80804@erasmusmc.nl>
Message-ID: <4D70EE96.2010206@googlemail.com>

Hi Dimitris,

thank you very much for your quick and efficient help! Your solution is perfect for me. It does exactly what I was looking for if combined with unlist and as.numeric before using sum.

Now I can keep on with my real problem ;)...
Thanx Again!!! Best, Felix Am 04.03.2011 14:25, schrieb Dimitris Rizopoulos: > one way is using function strsplit(), e.g., > > x <- c("100100110", "1001001", "1101", "00101") > sapply(strsplit(x, ""), function (x) sum(x == 1)) > > > I hope it helps. > > Best, > Dimitris > > > On 3/4/2011 2:18 PM, drflxms wrote: >> Dear R colleagues, >> >> I face a seemingly simple problem I couldn't find a solution for myself >> so far: >> >> I have to sum the digits of numbers. Example: 1010 ->2 100100110 -> 4 >> Unfortunately there seems not to be a function for this task. So my idea >> was to use sum(x) for it. But I did not figure out how to slice a number >> to a vector of its digits. Example (continued from above): 1010 -> >> c(1,0,1,0) 100100110 -> (1,0,0,1,0,0,1,1,0). >> >> Does anyone know either a function for calculating the sum of the digits >> of a bumber, or how to slice a number into a vector of its digits as >> described above? >> >> I'd appreciate any kind of help very much! >> Thanx in advance and greetings from cloudy Munich, >> Felix >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > From a.strelniece at eurotransplant.org Fri Mar 4 10:52:13 2011 From: a.strelniece at eurotransplant.org (Aggita) Date: Fri, 4 Mar 2011 01:52:13 -0800 (PST) Subject: [R] message: please select CRAN mirror In-Reply-To: References: <1299080842948-3331711.post@n4.nabble.com> Message-ID: <1299232333285-3334943.post@n4.nabble.com> > str(getCRANmirrors(all=FALSE,local.only=FALSE)) gives --> chr(0) -- View this message in context: http://r.789695.n4.nabble.com/message-please-select-CRAN-mirror-tp3331711p3334943.html Sent from the R help mailing list archive at Nabble.com. 
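When the mirror list comes back empty, as reported above, one possible workaround is to name a repository explicitly so that install.packages() has no need to show the mirror chooser. This is only a sketch under the assumption that the machine can reach the chosen URL; https://cloud.r-project.org is used here as an example, and any CRAN mirror URL could take its place.

```r
# Set the CRAN repository explicitly for this session
options(repos = c(CRAN = "https://cloud.r-project.org"))

# Confirm what install.packages() will use from now on
cran_repo <- getOption("repos")[["CRAN"]]
print(cran_repo)
```

Placing the options() call in a startup file such as .Rprofile would make the setting persistent across sessions.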
From albertonegron at gmail.com Fri Mar 4 10:33:23 2011 From: albertonegron at gmail.com (Alberto Negron) Date: Fri, 4 Mar 2011 09:33:23 +0000 Subject: [R] How to copy data from data.frame to matrix In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From Bengt.Walerud at capacent.se Fri Mar 4 13:41:50 2011 From: Bengt.Walerud at capacent.se (Bengt Walerud) Date: Fri, 4 Mar 2011 12:41:50 +0000 Subject: [R] Problem w/ function Message-ID: <3398205D1214B643809955049A69DE84D32143@capexch01.cap.loc> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From bsmith030465 at gmail.com Fri Mar 4 14:54:42 2011 From: bsmith030465 at gmail.com (Brian Smith) Date: Fri, 4 Mar 2011 08:54:42 -0500 Subject: [R] linear model - lm (Adjusted R-squared)? Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dimitrij.kudriavcev at ntsg.lt Fri Mar 4 11:33:23 2011 From: dimitrij.kudriavcev at ntsg.lt (Dmitrij Kudriavcev) Date: Fri, 4 Mar 2011 21:33:23 +1100 Subject: [R] How to copy data from data.frame to matrix In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dimitrij.kudriavcev at ntsg.lt Fri Mar 4 11:38:19 2011 From: dimitrij.kudriavcev at ntsg.lt (Dmitrij Kudriavcev) Date: Fri, 4 Mar 2011 21:38:19 +1100 Subject: [R] How to copy data from data.frame to matrix In-Reply-To: <4D70BEBA.7030402@uni-hamburg.de> References: <4D70BEBA.7030402@uni-hamburg.de> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From fjgochez at googlemail.com Fri Mar 4 14:12:18 2011 From: fjgochez at googlemail.com (Francisco Gochez) Date: Fri, 4 Mar 2011 13:12:18 +0000 Subject: [R] Generic mixup? 
In-Reply-To: <-6525824186037005978@unknownmsgid>
References: <-6525824186037005978@unknownmsgid>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From grant.j.gillis at gmail.com Fri Mar 4 12:20:04 2011
From: grant.j.gillis at gmail.com (Grant Gillis)
Date: Fri, 4 Mar 2011 11:20:04 +0000
Subject: [R] tricky (for me) merging of data...more clarity
In-Reply-To:
References:
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From joanna.lewis at ucl.ac.uk Fri Mar 4 12:48:06 2011
From: joanna.lewis at ucl.ac.uk (Joanna Lewis)
Date: Fri, 04 Mar 2011 11:48:06 +0000
Subject: [R] Multi-line input to rsympy
Message-ID: <4D70D176.6020608@ucl.ac.uk>

Dear R users,

I have been using rsympy to solve a set of simultaneous equations from
R. There are two solutions for the variable I'm interested in, xx[0] and
xx[1], which are in terms of symbols called lam and conc. I'd like to
pick out the one which is positive at (lam=0, conc=0) and call it mysol.
In python I could write:

if (xx[0].subs(lam,0)).subs(conc,0)>0:
    mysol=xx[0]
else:
    mysol=xx[1]

but I'm not sure how to do it from R via rsympy. The various
combinations of \t and \n characters and spaces I've tried haven't
worked, and I haven't been able to find any examples online or in the
help file.

Do you know whether it is possible to enter multi-line input using
rsympy, and if so how?

Thank you in advance,

Joanna

From Martin.Scheuringer at hvb.sozvers.at Fri Mar 4 11:11:02 2011
From: Martin.Scheuringer at hvb.sozvers.at (Scheuringer Martin)
Date: Fri, 4 Mar 2011 11:11:02 +0100
Subject: [R] column removing under certain conditions
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From mijony at live.se Fri Mar 4 14:50:34 2011
From: mijony at live.se (purna)
Date: Fri, 4 Mar 2011 05:50:34 -0800 (PST)
Subject: [R] delete rows whose sum is X
Message-ID: <1299246634668-3335254.post@n4.nabble.com>

Rnoob here.
I have a matrix of zeroes and ones. I want to delete the rows whose sum of
values is not =5, alternatively extract the rows who sum up to 5.

Thank you/Mikael

--
View this message in context: http://r.789695.n4.nabble.com/delete-rows-whose-sum-is-X-tp3335254p3335254.html
Sent from the R help mailing list archive at Nabble.com.

From mikrowelle1234 at gmx.de Fri Mar 4 14:31:34 2011
From: mikrowelle1234 at gmx.de (Alexx Hardt)
Date: Fri, 04 Mar 2011 14:31:34 +0100
Subject: [R] Creating a .png with just an expression() in it
Message-ID: <4D70E9B6.8040800@gmx.de>

Hey,
I'm trying to create an image file with the results of a regression
analysis. In TeX, the line would be something like:

$ size = 0.34 + 4.3 var_1 $

Can I create a plot window with just this line in it? I tried playing
around with plot.new() or dev.new(), but didn't really find something
that worked.

Thanks in advance,
Alex

--
alexx at alexx-fett:~$ vi .emacs

From paco at ceam.es Fri Mar 4 12:33:04 2011
From: paco at ceam.es (Paco Pastor)
Date: Fri, 04 Mar 2011 12:33:04 +0100
Subject: [R] Time series analysis for a daily series
Message-ID: <4D70CDF0.5040700@ceam.es>

Hi everyone

I am trying to do some time series analysis with daily temperature data
(40 years). I have created a zoo object and a ts object but can't apply
the stl function. It says the series is not periodic or has less than
two periods. I've searched through google and found a lot of messages
about this problem but not a solution/example to look for the trend and
seasonal component of a daily series.

Is there any guide/document to perform this analysis? I suppose there
are other choices besides stl, which should I try for daily series?
Thanks in advance

Paco

--
-----------
Francisco Pastor
Meteorology department, Instituto Universitario CEAM-UMH
http://www.ceam.es
-----------
mail: paco at ceam.es
skype: paco.pastor.guzman
Researcher ID: http://www.researcherid.com/rid/B-8331-2008
Cosis profile: http://www.cosis.net/profile/francisco.pastor
-----------
Parque Tecnologico, C/ Charles R. Darwin, 14
46980 PATERNA (Valencia), Spain
Tlf. 96 131 82 27 - Fax. 96 131 81 90

From schmitzh at uni-bremen.de Fri Mar 4 10:47:18 2011
From: schmitzh at uni-bremen.de (Heike Schmitz)
Date: Fri, 04 Mar 2011 10:47:18 +0100
Subject: [R] Error in model.frame.default
In-Reply-To:
References: <4D6FB72B.5010409@uni-bremen.de>
Message-ID: <4D70B526.4090004@uni-bremen.de>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From uwe.wolfram at uni-ulm.de Fri Mar 4 14:40:01 2011
From: uwe.wolfram at uni-ulm.de (Uwe Wolfram)
Date: Fri, 04 Mar 2011 14:40:01 +0100
Subject: [R] Coefficient of Determination for nonlinear function
Message-ID: <1299246001.1764.17.camel@pollux>

Dear Subscribers,

I did fit an equation of the form 1 = f(x1,x2,x3) using a minimization
scheme. Now I want to compute the coefficient of determination. Normally
I would compute it as

r_square = 1 - sserr/sstot

with sserr = sum_i (y_i - f_i)^2 and sstot = sum_i (y_i - mean(y))^2

sserr is clear to me but how can I compute sstot when there is no such
thing as differing y_i. These are all one. Thus mean(y)=1. Therefore,
sstot is 0.

Thank you very much for your efforts,

Uwe

--
Uwe Wolfram
Dipl.-Ing. (Ph.D Student)
__________________________________________________
Institute of Orthopaedic Research and Biomechanics
Director and Chair: Prof. Dr. Anita Ignatius
Center of Musculoskeletal Research Ulm
University Hospital Ulm
Helmholtzstr.
14 89081 Ulm, Germany
Phone: +49 731 500-55301
Fax: +49 731 500-55302
http://www.biomechanics.de

From vioravis at gmail.com Fri Mar 4 13:11:19 2011
From: vioravis at gmail.com (vioravis)
Date: Fri, 4 Mar 2011 04:11:19 -0800 (PST)
Subject: [R] Zero Inflated Distributions
In-Reply-To:
References: <1299227990001-3334861.post@n4.nabble.com>
Message-ID: <1299240679055-3335122.post@n4.nabble.com>

Thanks, Thierry.

Has anyone used the "bayescount" for estimating zero inflated
distributions? It states that it is a "crude function". Does that mean
the estimates are only approximate???

The example they have given seems to work only with Gamma Poisson.

data <- rpois(100, rgamma(100, shape=1, scale=8))
data[1:15] <- 0
maximise.likelihood(data, "ZIGP")

However, when I tried fitting Gamma/LogNormal/Weibull (assuming that data
is continuous), it throws out the following error:

shape scale    zi
9.532     4    21
Error in optim(c(shape, scale, zi), f6, control = list(fnscale = -1)) :
  function cannot be evaluated at initial parameters

What is this error about???

Moreover, the function seems extremely slow. For the 100 data point
example considered, it takes around 8 seconds for the estimation.

Please let me know your opinions on this package and alternative
packages, if any.

Thank you.

Ravi

--
View this message in context: http://r.789695.n4.nabble.com/Zero-Inflated-Distributions-tp3334861p3335122.html
Sent from the R help mailing list archive at Nabble.com.

From KINLEY_ROBERT at lilly.com Fri Mar 4 15:05:28 2011
From: KINLEY_ROBERT at lilly.com (Robert Kinley)
Date: Fri, 4 Mar 2011 14:05:28 +0000
Subject: [R] Rstudio question
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From shigesong at gmail.com Fri Mar 4 15:14:20 2011
From: shigesong at gmail.com (Shige Song)
Date: Fri, 4 Mar 2011 09:14:20 -0500
Subject: [R] Rstudio question
In-Reply-To:
References:
Message-ID:

Why don't you post the question to the RStudio support forum?
The folks there are quite responsive and very helpful.

Shige

On Fri, Mar 4, 2011 at 9:05 AM, Robert Kinley wrote:
> I really like RStudio ...
>
> ... but I wish it wouldn't automatically reload the last .RData it had.
>
> Anyone know how to fix this ... ?
>
> Also - does anyone know is there an Rstudio-user email-list forum thingy
> out there ?
>
> ta.
>
> Robert Kinley
>

From fisher at plessthan.com Fri Mar 4 15:16:53 2011
From: fisher at plessthan.com (Dennis Fisher)
Date: Fri, 4 Mar 2011 06:16:53 -0800
Subject: [R] Environment variable PATH in Windows
Message-ID: <5D67331F-8332-4628-9D80-4AA542D3D5FC@plessthan.com>

Colleagues,

I am trying to understand how R (2.12.1) obtains the PATH environment
variable in Windows (7 or Vista). Startup {base} directs one to:
    "R_ENVIRON" -- which equals "" in my systems
    R_HOME/etc/Renviron.site -- which does not exist
Next, it directs to:
    R_HOME/etc/Rprofile.site -- which also does not exist (the expected
    behavior in a "factory-fresh" installation)

I found the following files in R_HOME/etc:
    Makeconf
    Rcmd_environ
    Rconsole
    Rdevga
    respositories
    rgb.text
    Rprofile.txt
none of which refer to PATH (except Makeconf, which refers to JAVA path,
but not to PATH itself)

Today's R-SIG-Mac Digest addresses a similar issue in OS X -- that R.app
gets the values from /.MacOSX/environment.plist whereas a session started
in Terminal uses .bashrc or /etc/profile. What are the corresponding
sources in Windows?

Thanks in advance.
Dennis

Dennis Fisher MD
P < (The "P Less Than" Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-866-PLessThan (1-866-753-7784)
www.PLessThan.com

From aikidasgupta at gmail.com Fri Mar 4 15:17:01 2011
From: aikidasgupta at gmail.com (Abhijit Dasgupta)
Date: Fri, 04 Mar 2011 09:17:01 -0500
Subject: [R] Rstudio question
In-Reply-To:
References:
Message-ID: <4D70F45D.3030804@araastat.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From fangxiaofeng at gmail.com Fri Mar 4 15:16:10 2011
From: fangxiaofeng at gmail.com (Jeff Fang)
Date: Fri, 4 Mar 2011 22:16:10 +0800
Subject: [R] Question in Chi-squared test, can I do it with percentage data?
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From dwinsemius at comcast.net Fri Mar 4 15:26:00 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Fri, 4 Mar 2011 09:26:00 -0500
Subject: [R] Question about Chi-squared test
In-Reply-To:
References:
Message-ID:

On Mar 4, 2011, at 1:24 AM, Jeff Fang wrote:

> Hi all,
>
> I know Chi-squared test can be done with the frequency data by R
> function
> "chisq.test()", but I am not sure if it can be applied to the
> percentage
> data ? The example of my data is as follow:
>
> #############################################
>
>                   KSL   MHL   MWS  CLGC  LYGC
> independent (%) 96.22 92.18 68.54 93.80 85.74
>
> #############################################

Surely, the source of this information must have included the N from
which these proportions arose?

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

From sarah.goslee at gmail.com Fri Mar 4 15:30:28 2011
From: sarah.goslee at gmail.com (Sarah Goslee)
Date: Fri, 4 Mar 2011 09:30:28 -0500
Subject: [R] delete rows whose sum is X
In-Reply-To: <1299246634668-3335254.post@n4.nabble.com>
References: <1299246634668-3335254.post@n4.nabble.com>
Message-ID:

On Fri, Mar 4, 2011 at 8:50 AM, purna wrote:
> Rnoob here.
> I have a matrix of zeroes and ones.
I want to delete the rows whose sum of
> values is not =5, alternatively extract the rows who sum up to 5.
>
> Thank you/Mikael

I think you would greatly benefit from reading some of the intro to R
materials that are widely available.

> mymat <- matrix(sample(c(1,0), 100, r=TRUE), ncol=10)
> rowSums(mymat)
 [1] 3 4 3 6 3 4 5 7 7 3
> mymat[rowSums(mymat) == 5, , drop=FALSE]
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    0    0    0    1    1    1    0    0    1     1
> mymat[rowSums(mymat) != 5, , drop=FALSE]
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    0    1    1    0    0    0    0    0    1     0
[2,]    0    1    0    1    0    0    0    0    1     1
[3,]    0    0    0    0    1    0    0    1    1     0
[4,]    0    1    0    1    1    1    1    0    0     1
[5,]    1    0    1    0    0    1    0    0    0     0
[6,]    1    1    1    0    0    0    0    1    0     0
[7,]    1    1    0    1    1    1    1    0    1     0
[8,]    1    0    1    1    1    1    1    1    0     0
[9,]    0    0    0    0    0    0    1    0    1     1

Sarah

--
Sarah Goslee
http://www.functionaldiversity.org

From ivan.calandra at uni-hamburg.de Fri Mar 4 15:33:13 2011
From: ivan.calandra at uni-hamburg.de (Ivan Calandra)
Date: Fri, 04 Mar 2011 15:33:13 +0100
Subject: [R] column removing under certain conditions
In-Reply-To:
References:
Message-ID: <4D70F829.9040509@uni-hamburg.de>

Dear Martin,

I'm not sure I understood you well, because you basically have the
answer already...
What about this?

A[,apply(A, 2, function(x) median(x)>0), drop=FALSE]

(drop=FALSE ensures that you keep it as column even if only one column
is selected)

HTH,
Ivan

On 3/4/2011 11:11, Scheuringer Martin wrote:
> Dear colleagues!
>
>
> Given a matrix, I would like to remove columns, that do not fulfill a certain condition. The condition is, that the median of the column is higher than a certain value.
>
> I've seen the help on removing NA columns, but I cannot figure out how to change the function part of the statement, so that the function is only TRUE if the median of the column is higher than x;
>
> A[,apply(A, 2, function(x) all(x>=0))]
>
>
> Thank you very much,
> Regards!
>
>
>
> Martin Scheuringer
>
> ---------------------------------
> Mag.
Martin Scheuringer
> Abteilung für Evidenzbasierte Wirtschaftliche Gesundheitsversorgung (EWG)
> Bereich Gesundheitsökonomie
>
> Evidence Based Economic Health Care
> Health Economics
>
>
> Hauptverband der österreichischen Sozialversicherungsträger
> Main Association of Austrian Social Insurance Institutions
> Kundmanngasse 21
> 1031 Wien
> Tel.: +43-1-71132-3624
> Fax.: +43-1-71132-3786
> http://www.hauptverband.at
> Save paper, do you really need to print this e-mail?
>

--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calandra at uni-hamburg.de

**********
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php

From csardi at rmki.kfki.hu Fri Mar 4 15:30:35 2011
From: csardi at rmki.kfki.hu (Gábor Csárdi)
Date: Fri, 4 Mar 2011 09:30:35 -0500
Subject: [R] Plotting Mean in plotting degree distribution
In-Reply-To:
References: <1299192217936-3334375.post@n4.nabble.com>
Message-ID:

I think this would be rather something like

abline(v=mean(degree(G)))

Best,
Gabor

On Thu, Mar 3, 2011 at 8:04 PM, Scott Chamberlain wrote:
> library(igraph)
> G <- erdos.renyi.game(1000, 1/1000) # a random graph
>
> dd1 = degree.distribution(G)
>
> plot(dd1, xlab = "degree", ylab="frequency")
> abline(h = mean(dd1)) # the mean would be a horizontal line
>
> On Thursday, March 3, 2011 at 4:43 PM, kparamas wrote:
>> Hi,
>>
>> I am plotting degree distribution of a graph using the function,
>>
>> library(igraph)
>> dd1 = degree.distribution(G)
>>
>> plot(dd1, xlab = "degree",
ylab="frequency")
>>
>> I would like to plot the mean of the distribution as a vertical line in the
>> attached plot.
>> Please let me know how to do this.
>>
>> Thanks,
>> Kumar http://r.789695.n4.nabble.com/file/n3334375/cdata3_dd.png
>> cdata3_dd.png
>>
>> --
>> View this message in context: http://r.789695.n4.nabble.com/Plotting-Mean-in-plotting-degree-distribution-tp3334375p3334375.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>

--
Gabor Csardi UNIL DGM

From ivan.calandra at uni-hamburg.de Fri Mar 4 15:36:46 2011
From: ivan.calandra at uni-hamburg.de (Ivan Calandra)
Date: Fri, 04 Mar 2011 15:36:46 +0100
Subject: [R] delete rows whose sum is X
In-Reply-To: <1299246634668-3335254.post@n4.nabble.com>
References: <1299246634668-3335254.post@n4.nabble.com>
Message-ID: <4D70F8FE.1090601@uni-hamburg.de>

Hi Mikael

You really need to provide a reproducible example in the future, it will
help people to better understand what you want to do and help you, and
help you better understand the answers as well.

Try something like this:
mat[apply(mat, 1, FUN=function(x) sum(x)=5),]

HTH,
Ivan

On 3/4/2011 14:50, purna wrote:
> Rnoob here.
> I have a matrix of zeroes and ones. I want to delete the rows whose sum of
> values is not =5, alternatively extract the rows who sum up to 5.
>
> Thank you/Mikael
>
> --
> View this message in context: http://r.789695.n4.nabble.com/delete-rows-whose-sum-is-X-tp3335254p3335254.html
> Sent from the R help mailing list archive at Nabble.com.
>

--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calandra at uni-hamburg.de

**********
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php

From ivan.calandra at uni-hamburg.de Fri Mar 4 15:39:30 2011
From: ivan.calandra at uni-hamburg.de (Ivan Calandra)
Date: Fri, 04 Mar 2011 15:39:30 +0100
Subject: [R] How to copy data from data.frame to matrix
In-Reply-To:
References:
Message-ID: <4D70F9A2.9010700@uni-hamburg.de>

I have never used it, but I think the reshape and/or reshape2 packages
are designed for it. Check the melt() and cast() functions in these
packages...
I guess...

Ivan

On 3/4/2011 11:33, Dmitrij Kudriavcev wrote:
> Hello, no. I need to change data format, so i can build covariance matrix on
> it
>
> Cheers,
> Dima
>
> 2011/3/4 Alberto Negron
>
>> Can't you just convert your df as follows: matrix<- as.matrix(s) ?
>>
>> Double check it as I am a newbie too. :-)
>>
>> Regards,
>>
>> Alberto
>>
>> On 4 March 2011 06:08, Dmitrij Kudriavcev wrote:
>>
>>> Hello
>>>
>>> I'm new to R
>>> I have a large data.frame "s" (this is actually just a table in mysql) :
>>>
>>>> names(s)
>>> [1] "symbols", "day", "value"
>>>
>>> I need to convert it to simple matrix.
I have defined this matrix like
>>> this:
>>>
>>>> data.matrix<- matrix(nrow=nDays, ncol=nSymbols, dimnames=list(days,
>>> symbols))
>>>
>>> then i just copy values to the matrix using for() loop, but it seems to
>>> take
>>> very long time. Is there a faster way to do it in R? I know that i can
>>> just
>>> give s$value as source data to the matrix, but the problem is that for some
>>> symbols a couple of days could just be missing.
>>>
>>> Cheers,
>>> Dima
>>>
>>
>

--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt.
Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calandra at uni-hamburg.de

**********
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php

From ligges at statistik.tu-dortmund.de Fri Mar 4 15:41:20 2011
From: ligges at statistik.tu-dortmund.de (Uwe Ligges)
Date: Fri, 4 Mar 2011 15:41:20 +0100
Subject: [R] Environment variable PATH in Windows
In-Reply-To: <5D67331F-8332-4628-9D80-4AA542D3D5FC@plessthan.com>
References: <5D67331F-8332-4628-9D80-4AA542D3D5FC@plessthan.com>
Message-ID: <4D70FA10.2040007@statistik.tu-dortmund.de>

On 04.03.2011 15:16, Dennis Fisher wrote:
> Colleagues,
>
> I am trying to understand how R (2.12.1) obtains the PATH environment variable in Windows (7 or Vista). Startup {base} directs one to:
> "R_ENVIRON" -- which equals "" in my systems
> R_HOME/etc/Renviron.site -- which does not exist
> Next, it directs to:
> R_HOME/etc/Rprofile.site -- which also does not exist (the expected behavior in a "factory-fresh" installation)

You can create any of these files and set env variables therein.
If you want to change PATH, you can do so as well in the control panel
(system).
The current value can be shown by Sys.getenv("PATH") in R.

Uwe Ligges

> I found the following files in R_HOME/etc:
> Makeconf
> Rcmd_environ
> Rconsole
> Rdevga
> respositories
> rgb.text
> Rprofile.txt
> none of which refer to PATH (except Makeconf, which refers to JAVA path, but not to PATH itself)
>
> Today's R-SIG-Mac Digest addresses a similar issue in OS X -- that R.app gets the values from /.MacOSX/environment.plist whereas a session started in Terminal uses .bashrc or /etc/profile. What are the corresponding sources in Windows?
>
> Thanks in advance.
>
> Dennis
>
> Dennis Fisher MD
> P < (The "P Less Than" Company)
> Phone: 1-866-PLessThan (1-866-753-7784)
> Fax: 1-866-PLessThan (1-866-753-7784)
> www.PLessThan.com
>

From ivan.calandra at uni-hamburg.de Fri Mar 4 15:41:55 2011
From: ivan.calandra at uni-hamburg.de (Ivan Calandra)
Date: Fri, 04 Mar 2011 15:41:55 +0100
Subject: [R] delete rows whose sum is X
In-Reply-To: <4D70F8FE.1090601@uni-hamburg.de>
References: <1299246634668-3335254.post@n4.nabble.com> <4D70F8FE.1090601@uni-hamburg.de>
Message-ID: <4D70FA33.5090702@uni-hamburg.de>

Oops, forgot one "=":
mat[apply(mat, 1, FUN=function(x) sum(x)==5),]

On 3/4/2011 15:36, Ivan Calandra wrote:
> Hi Mikael
>
> You really need to provide a reproducible example in the future, it
> will help people to better understand what you want to do and help
> you, and help you better understand the answers as well.
>
> Try something like this:
> mat[apply(mat, 1, FUN=function(x) sum(x)=5),]
>
> HTH,
> Ivan
>
> On 3/4/2011 14:50, purna wrote:
>> Rnoob here.
>> I have a matrix of zeroes and ones. I want to delete the rows whose
>> sum of
>> values is not =5, alternatively extract the rows who sum up to 5.
>>
>> Thank you/Mikael
>>
>> --
>> View this message in context:
>> http://r.789695.n4.nabble.com/delete-rows-whose-sum-is-X-tp3335254p3335254.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>

--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calandra at uni-hamburg.de

**********
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php

From dwinsemius at comcast.net Fri Mar 4 15:42:41 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Fri, 4 Mar 2011 09:42:41 -0500
Subject: [R] Error in model.frame.default
In-Reply-To: <4D70B526.4090004@uni-bremen.de>
References: <4D6FB72B.5010409@uni-bremen.de> <4D70B526.4090004@uni-bremen.de>
Message-ID:

On Mar 4, 2011, at 4:47 AM, Heike Schmitz wrote:

> Hi again, dear Dennis,
>
> i checked the spelling in Zuur et al. and they wrote it like i did.
> I tried your suggestion but now i have another warning message:
>
>
>> D1<- data.frame(L.AREA= Loyn$L.AREA[Loyn$fGRAZE==1], fGraze = "1")
>
> Error in data.frame(L.AREA = Loyn$L.AREA[Loyn$fGRAZE == 1], fGraze =
> "1") :
> arguments imply differing number of rows: 0, 1

In the code you originally offered, 'fGRAZE' was just a vector in the
global environment and not in the 'Loyn' dataframe.

--
David.

>
> My aim is to display the predicted/fitted values of the different
> slopes
> for a factor in a plot.
> I tried different ways for my own data and when i found this
> solution i
> was happy, but even the provided solution with the provided dataset does
> not work...
> aaaahhhhh.
>
> Any ideas?? First i thought i had a problem with missing values, but
> in
> the Zuur data are no missing values.
>
> Heike
>
>
>
> On 3/3/2011 18:51, Dennis Murphy wrote:
>> Hi:
>>
>> You need the second variable in D1 to be named fGRAZE - the variable
>> names in the newdata data frame (D1) have to be the same as the
>> variable names on the RHS of the model formula, in this case L.AREA
>> and fGRAZE.
>>
>> HTH,
>> Dennis
>>
>> On Thu, Mar 3, 2011 at 7:43 AM, Heike Schmitz wrote:
>>
>> Dear R- Community,
>>
>> to learn i reanalysed some data provided and analysed by Zuur et.
>> al. in their book "Mixed effect models and Extensions in Ecology
>> with R". When i run the last command i get a warning message i
>> dont understand.
>>
>>
>> Loyn<- read.table(file = "loyn.txt",header = TRUE)
>> Loyn$L.AREA<- log10(Loyn$AREA)
>> fGRAZE <-factor(Loyn$GRAZE)
>>
>> M0<- lm(ABUND~ L.AREA + fGRAZE, data = Loyn)
>> summary(M0)
>>
>> plot(x = Loyn$L.AREA, y = Loyn$ABUND,
>> xlab = "Log transformed AREA",
>> ylab = "Bird Abundance")
>>
>> D1<- data.frame(L.AREA= Loyn$L.AREA[Loyn$GRAZE==1], fGraze = "1")
>> P1<- predict(M0,newdata = D1)
>>
>> Warning message:
>> Error in model.frame.default(Terms, newdata, na.action =
>> na.action, xlev = object$xlevels) :
>> variable lengths differ (found for 'fGRAZE')
>> In addition: Warning message:
>> 'newdata' had 13 rows but variable(s) found have 56 rows
>>
>> I hope anyone has an idea.
>> Thank you in advance.
>> Heike
>>
>> --
>> Heike Schmitz- Diaspero
>> Population Ecology and Evolutionary Ecology Lab, FB2
>> University of Bremen
>> Leobener Strasse, Nw2, Room B4050
>> D-28359 Bremen
>> Germany
>> fon ++49-421-218-62937
>> email: heike.schmitz at uni-bremen.de
>>
>>
>> http://www.popecol.uni-bremen.de
>>
>>
>
> --
> Heike Schmitz- Diaspero
> Population Ecology and Evolutionary Ecology Lab, FB2
> University of Bremen
> Leobener Strasse, Nw2, Room B4050
> D-28359 Bremen
> Germany
> fon ++49-421-218-62937
> email: heike.schmitz at uni-bremen.de
>
> http://www.popecol.uni-bremen.de
>

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

From ligges at statistik.tu-dortmund.de Fri Mar 4 15:45:34 2011
From: ligges at statistik.tu-dortmund.de (Uwe Ligges)
Date: Fri, 04 Mar 2011 15:45:34 +0100
Subject: [R] How to copy data from data.frame to matrix
In-Reply-To:
References: <4D70BEBA.7030402@uni-hamburg.de>
Message-ID: <4D70FB0E.7070305@statistik.tu-dortmund.de>

On 04.03.2011 11:38, Dmitrij Kudriavcev wrote:
> Hello
>
> Let's say, my data.frame is
>
> symbol,day,value
> A, 2010-01-01, 0.8888
> A, 2010-01-02, 0.6666
> B, 2010-01-01, 0.7777
>
> i need to get matrix as

See ?reshape, in this case if your data.frame is in dat:

reshape(dat, v.names="value", direction="wide", idvar="day",
timevar="symbol")

Uwe Ligges

> , A, B
> 2010-01-01, 0.8888, 0.7777
> 2010-01-02, 0.6666, NA
>
> where A and B are the column names and the date is used as row name
>
> I found a way to do it with the tapply function; is it the best way (i will
> need to do this pretty often and wish to save some time)
>
> Cheers,
> Dima
>
>
> 2011/3/4 Ivan Calandra
>
>> Hi,
>>
>> Let's say your data.frame is called df:
>> df<- data.frame(a=rnorm(10), b=rnorm(10))
>> data.matrix<- as.matrix(df)
>>
>> This should work, but be careful with coercion if you have different modes
>> in your data.frame
>>
>> HTH,
>> Ivan
>>
>> PS: next time, provide a reproducible example, using dput() for example
>>
>> Le
3/4/2011 07:08, Dmitrij Kudriavcev a écrit :
>>
>>> Hello
>>>
>>> I'm a new in R
>>> I have a large data.frame "s" (this is actualy just a table in mysql) :
>>>
>>> names(s)
>>>>
>>> [1] "symbols", "day", "value"
>>>
>>> I need to convert it to simple matrix. I have define this matrix like
>>> this:
>>>
>>> data.matrix<- matrix(nrow=nDays, ncol=nSymbols, dimnames=list(days,
>>>>
>>> symbols))
>>>
>>> then i just copy values to the matrix using for() loop, but it seems to
>>> take
>>> very long time. Is is a more fast way to do it in R? I know, what i can
>>> just
>>> gyve s$value as source data to the matrix, but problem is, what for some
>>> symbols couple days could be just missed.
>>>
>>> Cheers,
>>> Dima
>>>
>>>
>> --
>> Ivan CALANDRA
>> PhD Student
>> University of Hamburg
>> Biozentrum Grindel und Zoologisches Museum
>> Abt. Säugetiere
>> Martin-Luther-King-Platz 3
>> D-20146 Hamburg, GERMANY
>> +49(0)40 42838 6231
>> ivan.calandra at uni-hamburg.de
>>
>> **********
>> http://www.for771.uni-bonn.de
>> http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php
>>
>

From ligges at statistik.tu-dortmund.de Fri Mar 4 15:47:50 2011
From: ligges at statistik.tu-dortmund.de (Uwe Ligges)
Date: Fri, 04 Mar 2011 15:47:50 +0100
Subject: [R] delete rows whose sum is X
In-Reply-To: <4D70FA33.5090702@uni-hamburg.de>
References: <1299246634668-3335254.post@n4.nabble.com> <4D70F8FE.1090601@uni-hamburg.de> <4D70FA33.5090702@uni-hamburg.de>
Message-ID: <4D70FB96.4080401@statistik.tu-dortmund.de>

On 04.03.2011 15:41, Ivan Calandra wrote:
> Oops, forgot one "=":
> mat[apply(mat, 1, FUN=function(x) sum(x)==5),]

Yes, but since floating point issues may be apparent in the end, I'd
vote for:

mat[apply(mat, 1, FUN = function(x) isTRUE(all.equal(sum(x), 5))),]

>
> On 3/4/2011 15:36, Ivan Calandra wrote:
>> Hi Mikael
>>
>> You really need to provide a reproducible example in the future, it
>> will help people to better understand what you want to do and help
>> you, and help you better understand the answers as well.
>>
>> Try something like this:
>> mat[apply(mat, 1, FUN=function(x) sum(x)=5),]
>>
>> HTH,
>> Ivan
>>
>> On 3/4/2011 14:50, purna wrote:
>>> Rnoob here.
>>> I have a matrix of zeroes and ones. I want to delete the rows whose
>>> sum of
>>> values is not =5, alternatively extract the rows who sum up to 5.
>>>
>>> Thank you/Mikael
>>>
>>> --
>>> View this message in context:
>>> http://r.789695.n4.nabble.com/delete-rows-whose-sum-is-X-tp3335254p3335254.html
>>>
>>> Sent from the R help mailing list archive at Nabble.com.
>>> >>> ______________________________________________ >>> R-help at r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide >>> http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >>> >> > From ivan.calandra at uni-hamburg.de Fri Mar 4 15:52:17 2011 From: ivan.calandra at uni-hamburg.de (Ivan Calandra) Date: Fri, 04 Mar 2011 15:52:17 +0100 Subject: [R] delete rows whose sum is X In-Reply-To: <4D70FB96.4080401@statistik.tu-dortmund.de> References: <1299246634668-3335254.post@n4.nabble.com> <4D70F8FE.1090601@uni-hamburg.de> <4D70FA33.5090702@uni-hamburg.de> <4D70FB96.4080401@statistik.tu-dortmund.de> Message-ID: <4D70FCA1.4080106@uni-hamburg.de> True, I didn't think about it because the matrix is supposed to be filled with 0 and 1, and I automatically thought about integers. It wouldn't be a problem with integers, right? Le 3/4/2011 15:47, Uwe Ligges a ?crit : > > > On 04.03.2011 15:41, Ivan Calandra wrote: >> Oops, forgot one "=": >> mat[apply(mat, 1, FUN=function(x) sum(x)==5),] > > > Yes, but since floating point issues may ba apparent in the end, I'd > vote for: > > mat[apply(mat, 1, FUN = function(x) isTRUE(all.equal(sum(x), 5))),] > > >> >> Le 3/4/2011 15:36, Ivan Calandra a ?crit : >>> Hi Mikael >>> >>> You really need to provide a reproducible example in the future, it >>> will help people to better understand what you want to do and help >>> you, and help you better understand the answers as well. >>> >>> Try something like this: >>> mat[apply(mat, 1, FUN=function(x) sum(x)=5),] >>> >>> HTH, >>> Ivan >>> >>> Le 3/4/2011 14:50, purna a ?crit : >>>> Rnoob here. >>>> I have a matrix of zeroes ond ones. I want to delete the rows whose >>>> sum of >>>> values is not =5, alternatively extract the rows who sum up to 5. 
>>>> >>>> Thank you/Mikael >>>> >>>> -- >>>> View this message in context: >>>> http://r.789695.n4.nabble.com/delete-rows-whose-sum-is-X-tp3335254p3335254.html >>>> >>>> >>>> Sent from the R help mailing list archive at Nabble.com. >>>> >>>> ______________________________________________ >>>> R-help at r-project.org mailing list >>>> https://stat.ethz.ch/mailman/listinfo/r-help >>>> PLEASE do read the posting guide >>>> http://www.R-project.org/posting-guide.html >>>> and provide commented, minimal, self-contained, reproducible code. >>>> >>> >> > -- Ivan CALANDRA PhD Student University of Hamburg Biozentrum Grindel und Zoologisches Museum Abt. S?ugetiere Martin-Luther-King-Platz 3 D-20146 Hamburg, GERMANY +49(0)40 42838 6231 ivan.calandra at uni-hamburg.de ********** http://www.for771.uni-bonn.de http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php From ligges at statistik.tu-dortmund.de Fri Mar 4 15:53:51 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Fri, 04 Mar 2011 15:53:51 +0100 Subject: [R] delete rows whose sum is X In-Reply-To: <4D70FCA1.4080106@uni-hamburg.de> References: <1299246634668-3335254.post@n4.nabble.com> <4D70F8FE.1090601@uni-hamburg.de> <4D70FA33.5090702@uni-hamburg.de> <4D70FB96.4080401@statistik.tu-dortmund.de> <4D70FCA1.4080106@uni-hamburg.de> Message-ID: <4D70FCFF.6010701@statistik.tu-dortmund.de> On 04.03.2011 15:52, Ivan Calandra wrote: > True, I didn't think about it because the matrix is supposed to be > filled with 0 and 1, and I automatically thought about integers. It > wouldn't be a problem with integers, right? If the matrix is really an integer matrix, right, otherwise not. 
Best, Uwe > > Le 3/4/2011 15:47, Uwe Ligges a ?crit : >> >> >> On 04.03.2011 15:41, Ivan Calandra wrote: >>> Oops, forgot one "=": >>> mat[apply(mat, 1, FUN=function(x) sum(x)==5),] >> >> >> Yes, but since floating point issues may ba apparent in the end, I'd >> vote for: >> >> mat[apply(mat, 1, FUN = function(x) isTRUE(all.equal(sum(x), 5))),] >> >> >>> >>> Le 3/4/2011 15:36, Ivan Calandra a ?crit : >>>> Hi Mikael >>>> >>>> You really need to provide a reproducible example in the future, it >>>> will help people to better understand what you want to do and help >>>> you, and help you better understand the answers as well. >>>> >>>> Try something like this: >>>> mat[apply(mat, 1, FUN=function(x) sum(x)=5),] >>>> >>>> HTH, >>>> Ivan >>>> >>>> Le 3/4/2011 14:50, purna a ?crit : >>>>> Rnoob here. >>>>> I have a matrix of zeroes ond ones. I want to delete the rows whose >>>>> sum of >>>>> values is not =5, alternatively extract the rows who sum up to 5. >>>>> >>>>> Thank you/Mikael >>>>> >>>>> -- >>>>> View this message in context: >>>>> http://r.789695.n4.nabble.com/delete-rows-whose-sum-is-X-tp3335254p3335254.html >>>>> >>>>> >>>>> Sent from the R help mailing list archive at Nabble.com. >>>>> >>>>> ______________________________________________ >>>>> R-help at r-project.org mailing list >>>>> https://stat.ethz.ch/mailman/listinfo/r-help >>>>> PLEASE do read the posting guide >>>>> http://www.R-project.org/posting-guide.html >>>>> and provide commented, minimal, self-contained, reproducible code. 
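The row-selection idioms discussed in this thread can be compared on a small example. This is only a sketch: the matrix `mat` is made up for illustration, and `rowSums()` is swapped in for the `apply(mat, 1, sum)` form used above because it computes the same row totals in one vectorised call.

```r
# Hypothetical 0/1 matrix (not from the thread), stored as doubles
mat <- matrix(c(1, 1, 1, 1, 1, 0,
                1, 0, 1, 0, 1, 0,
                1, 1, 1, 0, 1, 1),
              nrow = 3, byrow = TRUE)

# Exact comparison: works when the entries really are whole numbers
keep_exact <- rowSums(mat) == 5

# Tolerant comparison, in the spirit of isTRUE(all.equal(sum(x), 5))
keep_tol <- abs(rowSums(mat) - 5) < 1e-8

# drop = FALSE keeps the result a matrix even if only one row matches
rows_with_sum_5 <- mat[keep_exact, , drop = FALSE]

# Why the tolerant form matters once fractional values are involved
exact_fraction_test    <- (0.1 + 0.2) == 0.3
tolerant_fraction_test <- isTRUE(all.equal(0.1 + 0.2, 0.3))
```

For a matrix that is known to hold only 0 and 1 both tests select the same rows; the tolerant form is the safer habit when the values may come out of earlier floating point arithmetic.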
>>>>>
>>>>
>>>
>>
>
From dwinsemius at comcast.net Fri Mar 4 16:12:33 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Fri, 4 Mar 2011 10:12:33 -0500
Subject: [R] Creating a .png with just an expression() in it
In-Reply-To: <4D70E9B6.8040800@gmx.de>
References: <4D70E9B6.8040800@gmx.de>
Message-ID: <40F723C4-B626-447C-B724-079594C2B3A5@comcast.net>

On Mar 4, 2011, at 8:31 AM, Alexx Hardt wrote:

> Hey,
> I'm trying to create an image file with the results of a regression
> analysis. In TeX, the line would be something like:
> $ size = 0.34 + 4.3 var_1 $
>
> Can I create a plot window with just this line in it? I tried
> playing around with plot.new() or dev.new(), but didn't really find
> something that worked.

plot(NULL, xlim=c(0,1), ylim=c(0,1), ylab="")
abline(0.34, 4.3)

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT

From r.m.krug at gmail.com Fri Mar 4 16:20:01 2011
From: r.m.krug at gmail.com (Rainer M Krug)
Date: Fri, 4 Mar 2011 16:20:01 +0100
Subject: [R] Time series analysis for a daily series
In-Reply-To: <4D70CDF0.5040700@ceam.es>
References: <4D70CDF0.5040700@ceam.es>
Message-ID:

On Fri, Mar 4, 2011 at 12:33 PM, Paco Pastor wrote:
> Hi everyone
>
> I am trying to do some time series analysis with daily temperature data (40
> years). I have created a zoo object and ts object but can't apply stl
> function. It says the series is not periodic or has less than two periods.
> I've searched through google and found a lot of messages about this problem
> but not a solution/example to look for trend and seasonal component of a
> daily series.

Difficult to say without your code... But have you looked into the
s.window argument of stl?

Rainer

>
> Is there any guide/document to perform this analysis? I suppose there are
> another choices but stl, which should I try for daily series?
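For the stl() error quoted in this thread, a minimal sketch of one likely cause, assuming the ts object was built without a seasonal period. The temperature values below are simulated and all variable names are made up for illustration; leap days are ignored.

```r
set.seed(1)
n_years <- 4                          # hypothetical: four years of daily values
day  <- seq_len(365 * n_years)
temp <- 15 + 10 * sin(2 * pi * day / 365) + 0.002 * day + rnorm(length(day))

# frequency = 365 tells R that one seasonal cycle spans 365 observations
temp_ts <- ts(temp, frequency = 365)

# s.window = "periodic" forces an identical seasonal pattern in every year
fit <- stl(temp_ts, s.window = "periodic")
head(fit$time.series)                 # columns: seasonal, trend, remainder

# With the default frequency of 1, stl() refuses the same data
no_period <- try(stl(ts(temp), s.window = "periodic"), silent = TRUE)
```

With a real zoo object the numeric values would first have to be moved into a ts with a suitable frequency; how best to do that depends on how the dates are stored, which the post does not show.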
>
> Thanks in advance
>
> Paco
>
>
> --
> -----------
> Francisco Pastor
> Meteorology department, Instituto Universitario CEAM-UMH
> http://www.ceam.es
> -----------
> mail: paco at ceam.es
> skype: paco.pastor.guzman
> Researcher ID: http://www.researcherid.com/rid/B-8331-2008
> Cosis profile: http://www.cosis.net/profile/francisco.pastor
> -----------
> Parque Tecnologico, C/ Charles R. Darwin, 14
> 46980 PATERNA (Valencia), Spain
> Tlf. 96 131 82 27 - Fax. 96 131 81 90
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
NEW GERMAN FAX NUMBER!!!

Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, UCT), Dipl. Phys. (Germany)

Centre of Excellence for Invasion Biology
Natural Sciences Building
Office Suite 2039
Stellenbosch University
Main Campus, Merriman Avenue
Stellenbosch
South Africa

Cell:   +27 - (0)83 9479 042
Fax:    +27 - (0)86 516 2782
Fax:    +49 - (0)321 2125 2244
email:  Rainer at krugs.de
Skype:  RMkrug
Google: R.M.Krug at gmail.com

From ligges at statistik.tu-dortmund.de Fri Mar 4 16:29:14 2011
From: ligges at statistik.tu-dortmund.de (Uwe Ligges)
Date: Fri, 04 Mar 2011 16:29:14 +0100
Subject: [R] Creating a .png with just an expression() in it
In-Reply-To: <40F723C4-B626-447C-B724-079594C2B3A5@comcast.net>
References: <4D70E9B6.8040800@gmx.de> <40F723C4-B626-447C-B724-079594C2B3A5@comcast.net>
Message-ID: <4D71054A.7040104@statistik.tu-dortmund.de>

On 04.03.2011 16:12, David Winsemius wrote:
>
> On Mar 4, 2011, at 8:31 AM, Alexx Hardt wrote:
>
>> Hey,
>> I'm trying to create an image file with the results of a regression
>> analysis.
>> In TeX, the line would be something like:
>> $ size = 0.34 + 4.3 var_1 $
>>
>> Can I create a plot window with just this line in it? I tried playing
>> around with plot.new() or dev.new(), but didn't really find something
>> that worked.
>
> plot(NULL, xlim=c(0,1), ylim=c(0,1), ylab="")
> abline(0.34, 4.3)
>

I thought the question was to say

plot.new()
plot.window(xlim=c(0,1), ylim=c(0,1))
text(0.5, 0.5, expression(size == 0.34 + 4.3 * var[1]))

Anyway, this shows that the question was not too precise.

Best,
Uwe Ligges

From dwinsemius at comcast.net Fri Mar 4 16:31:05 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Fri, 4 Mar 2011 10:31:05 -0500
Subject: [R] Creating a .png with just an expression() in it
In-Reply-To: <40F723C4-B626-447C-B724-079594C2B3A5@comcast.net>
References: <4D70E9B6.8040800@gmx.de> <40F723C4-B626-447C-B724-079594C2B3A5@comcast.net>
Message-ID: <2FD6071E-DD7C-425D-A84A-4ABE3509F49B@comcast.net>

On Mar 4, 2011, at 10:12 AM, David Winsemius wrote:

>
> On Mar 4, 2011, at 8:31 AM, Alexx Hardt wrote:
>
>> Hey,
>> I'm trying to create an image file with the results of a regression
>> analysis. In TeX, the line would be something like:
>> $ size = 0.34 + 4.3 var_1 $
>>
>> Can I create a plot window with just this line in it? I tried
>> playing around with plot.new() or dev.new(), but didn't really find
>> something that worked.
>
> plot(NULL, xlim=c(0,1), ylim=c(0,1), ylab="")
> abline(0.34, 4.3)

After looking at the subject line I suspect I may have misinterpreted your hopes.
Here is a different interpretation of what you requested: plot(0,0, type="n", frame.plot=F, axes=F, ylab="", xlab='') text(0,0, "size = 0.34 + 4.3 var_1") -- David Winsemius, MD Heritage Laboratories West Hartford, CT From asanramzan at yahoo.com Fri Mar 4 15:50:51 2011 From: asanramzan at yahoo.com (Asan Ramzan) Date: Fri, 4 Mar 2011 06:50:51 -0800 (PST) Subject: [R] a simple problem Message-ID: <369091.18570.qm@web44711.mail.sp1.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From hugoazina at gmail.com Fri Mar 4 15:54:27 2011 From: hugoazina at gmail.com (Caribu) Date: Fri, 4 Mar 2011 06:54:27 -0800 (PST) Subject: [R] AIC on GLMM pscl package Message-ID: <1299250467152-3335371.post@n4.nabble.com> Hello, I'm using GLMM on the pscl package and i'm not getting the AIC on the summary. The code i'm using is (example) : mmall3 <-glmmPQL(allclues ~ cycloc + male, data=dados, family=poisson, random=~1|animal/idfid) and the results: Linear mixed-effects model fit by maximum likelihood Data: dados AIC BIC logLik NA NA NA Random effects: Formula: ~1 | animal (Intercept) StdDev: 0.4235518 Formula: ~1 | idfid %in% animal (Intercept) Residual StdDev: 0.947683 1.752526 Variance function: Structure: fixed weights Formula: ~invwt Fixed effects: allclues ~ cycloc + male Value Std.Error DF t-value p-value (Intercept) 1.8050720 0.2653779 333 6.801892 0.0000 cycloc 0.0718826 0.0128099 181 5.611469 0.0000 male1 -0.6254748 0.3552453 5 -1.760684 0.1386 Correlation: (Intr) cycloc cycloc -0.060 male1 -0.744 -0.008 Standardized Within-Group Residuals: Min Q1 Med Q3 Max -2.8968532 -0.6758671 -0.2687695 0.3338190 4.5704000 Number of Observations: 522 Number of Groups: Am i doing something wrong? Or theres a code to extract the AIC of the model? Thanks, Best regards Hugo Caribu -- View this message in context: http://r.789695.n4.nabble.com/AIC-on-GLMM-pscl-package-tp3335371p3335371.html Sent from the R help mailing list archive at Nabble.com. 
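A note on the glmmPQL question above, offered as a sketch rather than a definitive answer. glmmPQL() is provided by the MASS package, not pscl. The model and the `bacteria` data below are taken from the glmmPQL help page and stand in for the poster's `dados`, which is not available here.

```r
library(MASS)   # glmmPQL() lives in MASS and fits via nlme::lme internally

# Stand-in example from the glmmPQL help page, not the poster's model
fit <- glmmPQL(y ~ trt + I(week > 2), random = ~ 1 | ID,
               family = binomial, data = bacteria, verbose = FALSE)

# PQL works with a penalised quasi-likelihood, so no true log-likelihood
# is stored with the fit; this is the value that summary() reports
fit$logLik
```

Since AIC is defined through the log-likelihood, the NA entries in the summary are expected for this fitting method rather than a sign of a coding mistake. A likelihood-based fit (for example with lme4, as an assumption about what would suit these data) would be needed if AIC comparisons are the goal.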
From wwl_mok at yahoo.co.uk Fri Mar 4 16:47:22 2011 From: wwl_mok at yahoo.co.uk (William Mok) Date: Fri, 4 Mar 2011 15:47:22 +0000 (GMT) Subject: [R] apply.rolling() to a multi column timeSeries Message-ID: <137183.14703.qm@web27905.mail.ukl.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From a.mosnier at gmail.com Fri Mar 4 17:02:21 2011 From: a.mosnier at gmail.com (Arnaud Mosnier) Date: Fri, 4 Mar 2011 11:02:21 -0500 Subject: [R] Problem with tcltk Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dwinsemius at comcast.net Fri Mar 4 17:03:58 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Fri, 4 Mar 2011 11:03:58 -0500 Subject: [R] a simple problem In-Reply-To: <369091.18570.qm@web44711.mail.sp1.yahoo.com> References: <369091.18570.qm@web44711.mail.sp1.yahoo.com> Message-ID: <92CDCBD8-4E99-4355-8A28-069DF450DFA1@comcast.net> On Mar 4, 2011, at 9:50 AM, Asan Ramzan wrote: > Hello R-help > > I am working with large data table that have the occasional label, > a particular time point in an experiment. 
E.g: > > "Time (min)", "R1 R1", "R2 R1", "R3 R1", "R4 R1" > .909, 1.117, 1.225, 1.048, 1.258 > 3.942, 1.113, 1.230, 1.049, 1.262 > 3.976, 1.105, 1.226, 1.051, 1.259 > 4.009, 1.114, 1.231, 1.053, 1.259 > 4.042, 1.107, 1.230, 1.048, 1.262 > 4.076, 1.108, 1.226, 1.045, 1.257 > 4.109, 1.109, 1.227, 1.047, 1.259 > 4.142, 1.108, 1.225, 1.052, 1.260 > 4.176, 1.105, 1.222, 1.046, 1.260 > 4.209, 1.106, 1.226, 1.050, 1.258 > 4.242, 1.105, 1.224, 1.047, 1.258 > 4.276, 1.104, 1.223, 1.048, 1.259 > 4.309, 1.106, 1.228, 1.050, 1.260 > 4.342, 1.103, 1.219, 1.049, 1.260 > 4.376, 1.107, 1.225, 1.052, 1.259 > 4.409, 1.105, 1.222, 1.047, 1.258 > 4.442, 1.106, 1.227, 1.048, 1.262 > 4.476, 1.105, 1.222, 1.049, 1.261 > 4.509, 1.102, 1.222, 1.047, 1.259 > 4.555, "Gly sar" > 4.555, 1.107, 1.224, 1.048, 1.261 > 4.576, 1.109, 1.228, 1.053, 1.259 > 4.609, 1.103, 1.218, 1.046, 1.258 > 4.642, 1.105, 1.223, 1.048, 1.256 > 4.676, 1.108, 1.217, 1.048, 1.260 > 4.709, 1.124, 1.222, 1.047, 1.258 > When I try to read in the table, I get: >> try<-read.table("200810_01.R",header=T,sep=",") > Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, > na.strings, : > line 136 did not have 5 elements > > Is there any way to tell R to ignore these labels or better > still interpret them as being label for particular time > points, so when it comes to draw a line graph it is annotated > with these labels. Option 1: Prepare your data properly with an editor: Option 2: You could read the file with readLines, identify the offending lines with grep or grepl, then separate the offenders and non-offenders. 
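Option 2 can be sketched on a few lines copied from the post. This is a minimal sketch under assumptions: the column names `time`, `R1` to `R4` and the helper variables are invented here, the comma is passed as the separator so the values arrive as numbers, and the label rows are kept rather than discarded so that they can be used for annotation later.

```r
# A few lines copied from the post; the full file would come from readLines()
txt <- c('"Time (min)", "R1 R1", "R2 R1", "R3 R1", "R4 R1"',
         '4.509, 1.102, 1.222, 1.047, 1.259',
         '4.555, "Gly sar"',
         '4.555, 1.107, 1.224, 1.048, 1.261',
         '4.576, 1.109, 1.228, 1.053, 1.259')

# Label rows contain letters; the header (line 1) is excluded from the test
is_label <- c(FALSE, grepl("[[:alpha:]]", txt[-1]))

# Numeric rows: drop header and label rows, split on commas,
# and supply syntactically valid column names by hand
dat <- read.table(text = txt[!is_label][-1], sep = ",", strip.white = TRUE,
                  col.names = c("time", "R1", "R2", "R3", "R4"))

# Label rows: the time is everything before the first comma,
# the label is the quoted text after it
label_time <- as.numeric(sub(",.*$", "", txt[is_label]))
label_text <- gsub('^[^,]*,[[:space:]]*"|"[[:space:]]*$', "", txt[is_label])
```

`label_time` and `label_text` could then be handed to abline(v = ...) and text() to mark the time points on a line graph of `dat`.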
lines <- readLines(textConnection('"Time (min)", "R1 R1", "R2 R1", "R3 R1", "R4 R1" .909, 1.117, 1.225, 1.048, 1.258 3.942, 1.113, 1.230, 1.049, 1.262 3.976, 1.105, 1.226, 1.051, 1.259 4.009, 1.114, 1.231, 1.053, 1.259 4.042, 1.107, 1.230, 1.048, 1.262 4.076, 1.108, 1.226, 1.045, 1.257 4.109, 1.109, 1.227, 1.047, 1.259 4.142, 1.108, 1.225, 1.052, 1.260 4.176, 1.105, 1.222, 1.046, 1.260 4.209, 1.106, 1.226, 1.050, 1.258 4.242, 1.105, 1.224, 1.047, 1.258 4.276, 1.104, 1.223, 1.048, 1.259 4.309, 1.106, 1.228, 1.050, 1.260 4.342, 1.103, 1.219, 1.049, 1.260 4.376, 1.107, 1.225, 1.052, 1.259 4.409, 1.105, 1.222, 1.047, 1.258 4.442, 1.106, 1.227, 1.048, 1.262 4.476, 1.105, 1.222, 1.049, 1.261 4.509, 1.102, 1.222, 1.047, 1.259 4.555, "Gly sar" 4.555, 1.107, 1.224, 1.048, 1.261 4.576, 1.109, 1.228, 1.053, 1.259 4.609, 1.103, 1.218, 1.046, 1.258 4.642, 1.105, 1.223, 1.048, 1.256 4.676, 1.108, 1.217, 1.048, 1.260 4.709, 1.124, 1.222, 1.047, 1.258')) read.table(textConnection( lines[ c(TRUE, !grepl("[[:alpha:]]", lines)[-1]) ]), skip=1) # the quotes and spaces don't work well with R column naming conventions V1 V2 V3 V4 V5 1 .909, 1.117, 1.225, 1.048, 1.258 2 3.942, 1.113, 1.230, 1.049, 1.262 3 3.976, 1.105, 1.226, 1.051, 1.259 snipped 23 4.642, 1.105, 1.223, 1.048, 1.256 24 4.676, 1.108, 1.217, 1.048, 1.260 25 4.709, 1.124, 1.222, 1.047, 1.258 So even more compact would be: read.table(textConnection( lines[ !grepl("[[:alpha:]]", lines) ] ) ) Using the non-negated grepl expression should get you all the "labels" lines David Winsemius, MD Heritage Laboratories West Hartford, CT From eriki at ccbr.umn.edu Fri Mar 4 17:05:22 2011 From: eriki at ccbr.umn.edu (Erik Iverson) Date: Fri, 04 Mar 2011 10:05:22 -0600 Subject: [R] linear model - lm (Adjusted R-squared)? 
In-Reply-To: References: Message-ID: <4D710DC2.6060003@ccbr.umn.edu> See: http://en.wikipedia.org/wiki/Coefficient_of_determination#Adjusted_R2 and the implementation in summary.lm : ans$adj.r.squared <- 1 - (1 - ans$r.squared) * ((n - df.int)/rdf) Brian Smith wrote: > Hi, > > Sorry for the naive question, but what exactly does the 'Adjusted R-squared' > coefficient in the summary of linear model adjust for? > > Sample code: > >> x <- rnorm(15) >> y <- rnorm(15) >> lmr <- lm(y~x) >> summary(lmr) > > Call: > lm(formula = y ~ x) > > Residuals: > Min 1Q Median 3Q Max > -1.7828 -0.7379 -0.4485 0.7563 2.1570 > > Coefficients: > Estimate Std. Error t value Pr(>|t|) > (Intercept) -0.13084 0.28845 -0.454 0.658 > x 0.01923 0.25961 0.074 0.942 > > Residual standard error: 1.106 on 13 degrees of freedom > Multiple R-squared: 0.0004217, Adjusted R-squared: -0.07647 > F-statistic: 0.005485 on 1 and 13 DF, p-value: 0.942 > >> cor(x,y) > [1] 0.02053617 > > > - What factors are included in the adjustment? > > many thanks! > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From gunter.berton at gene.com Fri Mar 4 17:20:36 2011 From: gunter.berton at gene.com (Bert Gunter) Date: Fri, 4 Mar 2011 08:20:36 -0800 Subject: [R] Coefficient of Determination for nonlinear function In-Reply-To: <1299246001.1764.17.camel@pollux> References: <1299246001.1764.17.camel@pollux> Message-ID: The coefficient of determination, R^2, is a measure of how well your model fits versus a "NULL" model, which is that the data are constant. In nonlinear models, as opposed to linear models, such a null model rarely makes sense. Therefore the coefficient of determination is generally not meaningful in nonlinear modeling. 
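Both R-squared points above can be checked numerically. The sketch below uses simulated data and made-up variable names; it recomputes R-squared as 1 - SSE/SST with squared deviations, applies the adjustment quoted from summary.lm for a model with an intercept (df.int = 1), and then shows what happens to SST when every response equals 1.

```r
set.seed(42)
x <- rnorm(15)
y <- rnorm(15)
fit <- lm(y ~ x)

n   <- length(y)
sse <- sum(residuals(fit)^2)        # residual sum of squares
sst <- sum((y - mean(y))^2)         # total sum of squares around the mean
r2  <- 1 - sse / sst

# Adjustment as in summary.lm: penalise for the residual degrees of freedom
rdf    <- df.residual(fit)
adj_r2 <- 1 - (1 - r2) * ((n - 1) / rdf)

# A response that is constant (all ones) has no variation to explain
y_const   <- rep(1, 15)
sst_const <- sum((y_const - mean(y_const))^2)
```

`r2` and `adj_r2` agree with the values reported by summary(fit), while `sst_const` is zero, so the ratio SSE/SST is undefined for a response that never varies.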
Yet another way in which linear and nonlinear models fundamentally differ. -- Bert On Fri, Mar 4, 2011 at 5:40 AM, Uwe Wolfram wrote: > Dear Subscribers, > > I did fit an equation of the form 1 = f(x1,x2,x3) using a minimization > scheme. Now I want to compute the coefficient of determination. Normally > I would compute it as > > r_square = 1- sserr/sstot with sserr = sum_i (y_i - f_i) and sstot = > sum_i (y_i - mean(y)) > > sserr is clear to me but how can I compute sstot when there is no such > thing than differing y_i. These are all one. Thus mean(y)=1. Therefore, > sstot is 0. > > Thank you very much for your efforts, > > Uwe > -- > Uwe Wolfram > Dipl.-Ing. (Ph.D Student) > __________________________________________________ > Institute of Orthopaedic Research and Biomechanics > Director and Chair: Prof. Dr. Anita Ignatius > Center of Musculoskeletal Research Ulm > University Hospital Ulm > Helmholtzstr. 14 > 89081 Ulm, Germany > Phone: +49 731 500-55301 > Fax: +49 731 500-55302 > http://www.biomechanics.de > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Bert Gunter Genentech Nonclinical Biostatistics 467-7374 http://devo.gene.com/groups/devo/depts/ncb/home.shtml From m.r.nixon at ex.ac.uk Fri Mar 4 17:18:45 2011 From: m.r.nixon at ex.ac.uk (mattnixon) Date: Fri, 4 Mar 2011 08:18:45 -0800 (PST) Subject: [R] Reading in and manipulating multiple data sets from the same input file Message-ID: <1299255525698-3335523.post@n4.nabble.com> Hi, I am attempting to write code which will read in my data which is of this form: X1 Y1 X2 Y2 .... Xn Yn 0 0 0 0 0 0 1 0 1 255 1 0 2 255 2 0 2 255 3 0 3 0 3 0 4 0 4 0 4 0 5 0 5 0 5 255 6 125 6 125 6 0 7 0 7 0 7 0 8 0 8 0 8 125 . . . With n~100. 
My current code deals with only 1 data set, n~1 (below): profile<-read.table("datav1.txt",header=T) attach(profile) lines<-profile[Y>100,] d<-lines$X i<-1 l<-1:1:i while(i<30){ l[i]<-(d[(i+1)]-d[i]) temp<-i+1 i<-temp } L<-l[l>22] I want to extend this to accept n data sets to see how L varies between each data set. The way I have been trying to do this is as follows: profile<-read.table("datav2.txt",header=T) j<-1 lines[j]<-profile[profile$Y[(2*j)]>100,] etc. However this returns the message "Error in profile$Y : object of type 'closure' is not subsettable". Does anybody know if there is any way I can read in a file containing many data sets and save each data set as an element of some matrix before performing the calculations (above) on it? Or some other method to achieve the same thing? Any help or suggestions would be great! Thank you. -- View this message in context: http://r.789695.n4.nabble.com/Reading-in-and-manipulating-multiple-data-sets-from-the-same-input-file-tp3335523p3335523.html Sent from the R help mailing list archive at Nabble.com. From mails4me at gmx.at Fri Mar 4 17:21:12 2011 From: mails4me at gmx.at (Marcel J.) Date: Fri, 04 Mar 2011 17:21:12 +0100 Subject: [R] make an own (different) color legend with spplot() Message-ID: <4D711178.9080101@gmx.at> Hi! Is there a way to manually costumize the color legend in an spplot() - especially where to draw ticks and labels for the ticks? The reason I'm asking: Usually spplot() automatically divides the data into fitting slices and makes a color legend (also automatically). I want to assign the slices myself and have a fixed scale instead of an automatic/dynamic scale. I think what I want gets clear in this example: library(sp) data(meuse.grid) gridded(meuse.grid) = ~x+y ## DATA GENERATION meuse.grid$random <- rnorm(nrow(meuse.grid), 7, 2) # generate random data meuse.grid$random[meuse.grid$random < 0] <- 0 # make sure there is no value is smaller than zero ... 
meuse.grid$random[meuse.grid$random > 10] <- 10 # and bigger than ten ## DATA GENERATION FINISHED ## making a factor out of meuse.grid$ random to have absolute values plotted meuse.grid$random <- cut(meuse.grid$random, seq(0, 10, 0.1)) # here I assign the levels I want to use in my plot!!! spplot(meuse.grid, c("random"), col.regions = rainbow(100, start = 4/6, end = 1)) # look at the color-legend - not so good. The graphic itself is like I want it, but the legend doesn't look too good. Although I assign 100 factors, I want just a few ticks in the legend (and also just a few labels). How can this be achieved? Thank you! Marcel From rex.dwyer at syngenta.com Fri Mar 4 17:33:14 2011 From: rex.dwyer at syngenta.com (rex.dwyer at syngenta.com) Date: Fri, 4 Mar 2011 11:33:14 -0500 Subject: [R] questions about using loop, while and next In-Reply-To: References: Message-ID: <36180405F8418449918AD20618D110FC095BFA7212@USETCMSXMB02.NAFTA.SYNGENTA.ORG> Carrie, If your while-loop condition depends only on dt, and you don't change dt in your loop, your loop won't terminate. The only thing inside your loop is "next". Perhaps you mean to write: temp=rep(NA, 10) for(i in 1:10) { dt=sum(rbinom(10, 5, 0.5)) while (dt<25) { dt=sum(rbinom(10, 5, 0.5)) } temp[i]=dt } It doesn't look like you understand "next". Try reading the help with ?"next" -- the quotes are necessary in this case. If you still don't understand next, you should be able to program without it with appropriate if's. HTH Rex -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Carrie Li Sent: Friday, March 04, 2011 12:10 AM To: r-help at r-project.org Subject: [R] questions about using loop, while and next Hello R helpers, I have a quick question about loop and next In my loop, I have some random generation of data, but if the data doesn't meet some condition, then I want it to go next, and generate data again for next round. # just an example.. 
# i want to generate the data again, if the sum is smaller than 25 temp=rep(NA, 10) for(i in 1:10) { dt=sum(rbinom(10, 5, 0.5)) while (dt<25) next temp[i]=dt } I also tried while(dt<25) {i=i+1} But it doesn't seem right to me, since it running nonstop. Any solutions ? Thanks for helps! Carrie-- [[alternative HTML version deleted]] ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. message may contain confidential information. If you are not the designated recipient, please notify the sender immediately, and delete the original and any copies. Any use of the message by you is prohibited. From mails4me at gmx.at Fri Mar 4 17:39:40 2011 From: mails4me at gmx.at (Marcel J.) Date: Fri, 04 Mar 2011 17:39:40 +0100 Subject: [R] make an own (different) color legend with spplot() Message-ID: <4D7115CC.20109@gmx.at> Hi! Is there a way to manually costumize the color legend in an spplot() - especially where to draw ticks and labels for the ticks? The reason I'm asking: Usually spplot() automatically divides the data into fitting slices and makes a color legend (also automatically). I want to assign the slices myself and have a fixed scale instead of an automatic/dynamic scale. I think what I want gets clear in this example: library(sp) data(meuse.grid) gridded(meuse.grid) = ~x+y ## DATA GENERATION meuse.grid$random <- rnorm(nrow(meuse.grid), 7, 2) # generate random data meuse.grid$random[meuse.grid$random < 0] <- 0 # make sure there is no value is smaller than zero ... meuse.grid$random[meuse.grid$random > 10] <- 10 # and bigger than ten ## DATA GENERATION FINISHED ## making a factor out of meuse.grid$ random to have absolute values plotted meuse.grid$random <- cut(meuse.grid$random, seq(0, 10, 0.1)) # here I assign the levels I want to use in my plot!!! 
spplot(meuse.grid, c("random"), col.regions = rainbow(100, start = 4/6, end = 1)) # look at the color-legend - not so good. The graphic itself is like I want it, but the legend doesn't look too good. Although I assign 100 factors, I want just a few ticks in the legend (and also just a few labels). How can this be achieved? Thank you! Marcel From andy_liaw at merck.com Fri Mar 4 17:44:37 2011 From: andy_liaw at merck.com (Liaw, Andy) Date: Fri, 4 Mar 2011 11:44:37 -0500 Subject: [R] Coefficient of Determination for nonlinear function In-Reply-To: References: <1299246001.1764.17.camel@pollux> Message-ID: As far as I can tell, Uwe is not even fitting a model, but instead just solving a nonlinear equation, so I don't know why he wants a R^2. I don't see a statistical model here, so I don't know why one would want a statistical measure. Andy > -----Original Message----- > From: r-help-bounces at r-project.org > [mailto:r-help-bounces at r-project.org] On Behalf Of Bert Gunter > Sent: Friday, March 04, 2011 11:21 AM > To: uwe.wolfram at uni-ulm.de; r-help at r-project.org > Subject: Re: [R] Coefficient of Determination for nonlinear function > > The coefficient of determination, R^2, is a measure of how well your > model fits versus a "NULL" model, which is that the data are constant. > In nonlinear models, as opposed to linear models, such a null model > rarely makes sense. Therefore the coefficient of determination is > generally not meaningful in nonlinear modeling. > > Yet another way in which linear and nonlinear models > fundamentally differ. > > -- Bert > > On Fri, Mar 4, 2011 at 5:40 AM, Uwe Wolfram > wrote: > > Dear Subscribers, > > > > I did fit an equation of the form 1 = f(x1,x2,x3) using a > minimization > > scheme. Now I want to compute the coefficient of > determination. 
Normally > > I would compute it as > > > > r_square = 1- sserr/sstot with sserr = sum_i (y_i - f_i) and sstot = > > sum_i (y_i - mean(y)) > > > > sserr is clear to me but how can I compute sstot when there > is no such > > thing than differing y_i. These are all one. Thus > mean(y)=1. Therefore, > > sstot is 0. > > > > Thank you very much for your efforts, > > > > Uwe > > -- > > Uwe Wolfram > > Dipl.-Ing. (Ph.D Student) > > __________________________________________________ > > Institute of Orthopaedic Research and Biomechanics > > Director and Chair: Prof. Dr. Anita Ignatius > > Center of Musculoskeletal Research Ulm > > University Hospital Ulm > > Helmholtzstr. 14 > > 89081 Ulm, Germany > > Phone: +49 731 500-55301 > > Fax: +49 731 500-55302 > > http://www.biomechanics.de > > > > ______________________________________________ > > R-help at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > > > > -- > Bert Gunter > Genentech Nonclinical Biostatistics > 467-7374 > http://devo.gene.com/groups/devo/depts/ncb/home.shtml > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > Notice: This e-mail message, together with any attachme...{{dropped:11}} From izahn at psych.rochester.edu Fri Mar 4 17:58:08 2011 From: izahn at psych.rochester.edu (Ista Zahn) Date: Fri, 4 Mar 2011 11:58:08 -0500 Subject: [R] Reading in and manipulating multiple data sets from the same input file In-Reply-To: <1299255525698-3335523.post@n4.nabble.com> References: <1299255525698-3335523.post@n4.nabble.com> Message-ID: Hi, I'm afraid it's not clear to me what you are trying to do. 
Can you clarify what result you are trying to achieve?

Best,
Ista

On Fri, Mar 4, 2011 at 11:18 AM, mattnixon wrote:
> Hi,
>
> I am attempting to write code which will read in my data which is of this
> form:
>
> X1    Y1     X2    Y2     ....    Xn    Yn
> 0     0      0     0              0     0
> 1     0      1     255            1     0
> 2     255    2     0              2     255
> 3     0      3     0              3     0
> 4     0      4     0              4     0
> 5     0      5     0              5     255
> 6     125    6     125            6     0
> 7     0      7     0              7     0
> 8     0      8     0              8     125
> .
> .
> .
>
> With n~100. My current code deals with only 1 data set, n~1 (below):
>
>
> profile<-read.table("datav1.txt",header=T)
> attach(profile)
>
> lines<-profile[Y>100,]
> d<-lines$X
> i<-1
> l<-1:1:i
>
> while(i<30){
> l[i]<-(d[(i+1)]-d[i])
> temp<-i+1
> i<-temp
> }
>
> L<-l[l>22]
>
>
> I want to extend this to accept n data sets to see how L varies between each
> data set. The way I have been trying to do this is as follows:
>
> profile<-read.table("datav2.txt",header=T)
> j<-1
> lines[j]<-profile[profile$Y[(2*j)]>100,]
>
> etc.
>
> However this returns the message "Error in profile$Y : object of type
> 'closure' is not subsettable". Does anybody know if there is any way I can
> read in a file containing many data sets and save each data set as an
> element of some matrix before performing the calculations (above) on it? Or
> some other method to achieve the same thing?
>
> Any help or suggestions would be great! Thank you.
> > > > > > -- > View this message in context: http://r.789695.n4.nabble.com/Reading-in-and-manipulating-multiple-data-sets-from-the-same-input-file-tp3335523p3335523.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Ista Zahn Graduate student University of Rochester Department of Clinical and Social Psychology http://yourpsyche.org From rex.dwyer at syngenta.com Fri Mar 4 18:03:31 2011 From: rex.dwyer at syngenta.com (rex.dwyer at syngenta.com) Date: Fri, 4 Mar 2011 12:03:31 -0500 Subject: [R] R usage survey In-Reply-To: References: <36180405F8418449918AD20618D110FC095BFA6545@USETCMSXMB02.NAFTA.SYNGENTA.ORG> <1BDAE2969943D540934EE8B4EF68F95FB27F0D4A95@EXNSW-MBX03.nexus.csiro.au> Message-ID: <36180405F8418449918AD20618D110FC095BFA7290@USETCMSXMB02.NAFTA.SYNGENTA.ORG> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From ligges at statistik.tu-dortmund.de Fri Mar 4 18:14:47 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Fri, 04 Mar 2011 18:14:47 +0100 Subject: [R] Generic mixup? In-Reply-To: <041101cbda54$ec4f2930$c4ed7b90$@sabbe@ugent.be> References: <041101cbda54$ec4f2930$c4ed7b90$@sabbe@ugent.be> Message-ID: <4D711E07.1060004@statistik.tu-dortmund.de> On 04.03.2011 11:14, Nick Sabbe wrote: > Hello list. > > > > This is from an R session (admittedly, I'm still using R 2.11.1): > >> print > > function (x, ...) > > UseMethod("print") > > > >> showMethods("print") > > > > Function "print": > > > > > > Don't the two results contradict each other? Or do I have a terrible > misunderstanding of what comprises a generic function? print() is an S3 generic while showMethods() shows S4 generics. 
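To illustrate the S3/S4 distinction, here is a minimal sketch. The class names myS3 and MyS4 are made up for the example and do not come from the original question; the point is only that an S3 method is an ordinary function found by UseMethod(), while an S4 method is registered with setMethod() and is what the S4 tools report.

```r
library(methods)  # normally attached already; loaded here to be explicit

# S3: print() dispatches through UseMethod() on class(x).
print.myS3 <- function(x, ...) {
  cat("<myS3 object>\n")
  invisible(x)
}
obj3 <- structure(list(), class = "myS3")  # hypothetical S3 class
out3 <- capture.output(print(obj3))

# S4: show() is the generic that displays S4 objects.
setClass("MyS4", representation(value = "numeric"))  # hypothetical S4 class
setMethod("show", "MyS4", function(object) {
  cat("<MyS4 object, value =", object@value, ">\n")
})
obj4 <- new("MyS4", value = 1)
out4 <- capture.output(show(obj4))

# The S3 method is a plain function named generic.class;
# the S4 method lives in a method table that existsMethod() consults.
is.function(print.myS3)
existsMethod("show", "MyS4")
```

With this in place, print(obj3) goes through the S3 method and typing obj4 at the prompt goes through the S4 show() method.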
To get a list of possible S3 generics, use methods("print"). The S4 generic corresponding to print() is called show(). Uwe Ligges > > > Thx, > > > > Nick Sabbe > > -- > > ping: nick.sabbe at ugent.be > > link: http://biomath.ugent.be > > wink: A1.056, Coupure Links 653, 9000 Gent > > ring: 09/264.59.36 > > > > -- Do Not Disapprove > > > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From wesleycmathew at gmail.com Fri Mar 4 18:15:43 2011 From: wesleycmathew at gmail.com (wesley mathew) Date: Fri, 4 Mar 2011 17:15:43 +0000 Subject: [R] r.dll Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From gunter.berton at gene.com Fri Mar 4 18:16:46 2011 From: gunter.berton at gene.com (Bert Gunter) Date: Fri, 4 Mar 2011 09:16:46 -0800 Subject: [R] Coefficient of Determination for nonlinear function In-Reply-To: References: <1299246001.1764.17.camel@pollux> Message-ID: Andy, You may well be right. I assumed "fitting an equation" means that he had data to which the equation was being fitted. Maybe that's wrong -- re-reading the post still does not clarify the point for me. In any case, either way, fitting R^2 makes no sense. -- Bert On Fri, Mar 4, 2011 at 8:44 AM, Liaw, Andy wrote: > As far as I can tell, Uwe is not even fitting a model, but instead just > solving a nonlinear equation, so I don't know why he wants a R^2. ?I > don't see a statistical model here, so I don't know why one would want a > statistical measure. 
> > Andy > > -- Bert Gunter Genentech Nonclinical Biostatistics From ligges at statistik.tu-dortmund.de Fri Mar 4 18:17:42 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Fri, 04 Mar 2011 18:17:42 +0100 Subject: [R] r.dll In-Reply-To: References: Message-ID: <4D711EB6.2070601@statistik.tu-dortmund.de> On 04.03.2011 18:15, wesley mathew wrote: > Dear all > I have some problem to execute jri package. R.dll file has to copped to jri > directory for the execution of jar file in eclips.
But R.dll file is not > available in the R version 2.12.1 . It is, at least in the Windows binary distribution. Uwe Ligges > Is there any chance to get this > file. Thanks in advanced > > Kind regards > W. Mathew > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From ligges at statistik.tu-dortmund.de Fri Mar 4 18:19:52 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Fri, 04 Mar 2011 18:19:52 +0100 Subject: [R] cv.lm syntax error In-Reply-To: <1299229875043-3334889.post@n4.nabble.com> References: <1299229875043-3334889.post@n4.nabble.com> Message-ID: <4D711F38.4080005@statistik.tu-dortmund.de> On 04.03.2011 10:11, agent dunham wrote: > Dear all, > > I've tried a multiple regression, and now I want to try a cross-validation. > > I obtain this error (it must be sth related to df) that I don't understand, > any help would be appreciated. > > cv.lm(df= dat, lm2.52f, m=3) > > Error en `[.data.frame`(df, , ynam) : undefined columns selected > > lm2.52f is my lm object, If this is the cv.lm from the DAAG package (unstated), then please read its help page and find that you need to specify a formula as the second argument rather than an already fitted lm object. Uwe Ligges > dat is a dataframe where the variables involved in > .lm are > > I tried CVlm also but the same error > > Thanks, user at host.com > > > > > -- > View this message in context: http://r.789695.n4.nabble.com/cv-lm-syntax-error-tp3334889p3334889.html > Sent from the R help mailing list archive at Nabble.com. 
> > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From saschaview at gmail.com Fri Mar 4 18:22:49 2011 From: saschaview at gmail.com (Sascha Vieweg) Date: Fri, 4 Mar 2011 18:22:49 +0100 (CET) Subject: [R] S. function calculating x +- y Message-ID: Hello, I am looking for an elegant one-liner for the following operation: x <- rnorm(10) y <- runif(10) c(mean(x)-mean(y), mean(x)+mean(y)) I thought about apply(data.frame(x, y), 2, mean) but I don't know how to apply the +- operation on the result of apply. Thanks, *S* -- Sascha Vieweg, saschaview at gmail.com From bt_jannis at yahoo.de Fri Mar 4 18:24:30 2011 From: bt_jannis at yahoo.de (Jannis) Date: Fri, 04 Mar 2011 18:24:30 +0100 Subject: [R] retrieve x y coordinates of points in current plot Message-ID: <4D71204E.2030109@yahoo.de> Dear list, is it somehow possible to retrieve the x and y coordinates of points in a scatterplot after it has been plotted? identify() somehow seems to manage this, so I was wondering whether it is possible? I am asking as I wrote a function that identifies points inside a polygon and I would like it to work without supplying the x and y coordinates again but by retrieving them from the device. Cheers Jannis From ligges at statistik.tu-dortmund.de Fri Mar 4 18:34:49 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Fri, 04 Mar 2011 18:34:49 +0100 Subject: [R] S. 
function calculating x +- y
In-Reply-To:
References:
Message-ID: <4D7122B9.4040701@statistik.tu-dortmund.de>

On 04.03.2011 18:22, Sascha Vieweg wrote:
> Hello, I am looking for an elegant one-liner for the following operation:
>
> x <- rnorm(10)
> y <- runif(10)
> c(mean(x)-mean(y), mean(x)+mean(y))
>
> I thought about
>
> apply(data.frame(x, y), 2, mean)
>
> but I don't know how to apply the +- operation on the result of apply.
> Thanks, *S*
>

The most elegant way probably is the way you had above for this setting,
otherwise you could do, e.g.:

df <- data.frame(x, y)
sapply(c("-", "+"),
       function(f, dat) do.call(get(f), as.list(colMeans(dat))),
       dat = df)

Uwe Ligges

From lists at revelle.net Fri Mar 4 18:33:11 2011
From: lists at revelle.net (William Revelle)
Date: Fri, 4 Mar 2011 11:33:11 -0600
Subject: [R] PCA - scores
In-Reply-To: <419BB6F5-F409-4880-8E3D-DA932AE4789C@ualberta.ca>
References: <131E1151-2802-4793-9901-5F8E7EBB3FB8@ualberta.ca> <419BB6F5-F409-4880-8E3D-DA932AE4789C@ualberta.ca>
Message-ID:

At 9:52 AM -0700 3/4/11, Shari Clare wrote:
>Hi Bill and Josh:
>
>When I run any "principal" code with scores=TRUE, I get the following Error:
>
>Error in principal (my.data,3,scores=TRUE) : unused argument (scores=TRUE)
>
>Thoughts?

What version of psych are you using? Does it work on the example I
sent (see below)?

>
>Thanks,
>Shari
>
>On 3-Mar-11, at 9:42 PM, William Revelle wrote:
>
>>Shari,
>>  Josh partly answered your question, but his example did not
>>include rotation because he took out just one factor.
>>
>>Try:
>>
>>require(psych)
>>mt.pc <- principal(mtcars,3,scores=TRUE)  #this gives you the varimax rotated first 3 principal components
>>#pc.scores <- mt.pc$scores  #here are the scores
>>
>>biplot(mt.pc)  #show the data as well as the principal components in a biplot
>>
>>Bill
>>
>>At 5:15 PM -0800 3/3/11, Joshua Wiley wrote:
>>
>>>Hi Shari,
>>>
>>>Yes, please look at the documentation for principal.
You can access >>> >>>this (assuming you have loaded psych) by typing at the console: >>> >>> >>>?principal >>> >>> >>>note the logical argument "scores". >>> >>> >>>Here is a small example: >>> >>> >>>############################## >>> >>>require(psych) >>> >>>require(GPArotation) >>> >>> >>>dat <- principal(mtcars[, c("mpg", "hp", "wt")], nfactors = 1, >>> >>> rotate = "oblimin", scores = TRUE) >>> >>> >>>dat$scores >>> >>>############################## >>> >>> >>>Cheerio, >>> >>> >>>Josh >>> >>> >>>On Thu, Mar 3, 2011 at 1:02 PM, Shari Clare >>><sclare at ualberta.ca> wrote: >>> >>>>I am running a PCA, but would like to rotate my data and limit the >>>> >>>>number of factors that are analyzed. I can do this using the >>>> >>>>"principal" command from the psych package [principal(my.data, >>>> >>>>nfactors=3,rotate="varimax")], but the issue is that this does not >>>> >>>>report scores for the Principal Components the way "princomp" does. >>>> >>>> >>>>My question is: >>>> >>>> >>>>Can you get an output of scores using "principal" OR, is there a way >>>> >>>>to limit the number of factors that are included when you use >>>> >>>>"princomp"? >>>> >>>> >>>>Thanks, >>>> >>>>Shari Clare >>>> >>>> >>>>PhD Candidate >>>> >>>>Department of Renewable Resources >>>> >>>>University of Alberta >>>> >>>>sclare at ualberta.ca >>>> >>>>780-492-2540 >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> [[alternative HTML version deleted]] >>>> >>>> >>>>______________________________________________ >>>> >>>>R-help at r-project.org mailing list >>>> >>>>https://stat.ethz.ch/mailman/listinfo/r-help >>>> >>>>PLEASE do read the posting guide >>>>http://www.R-project.org/posting-guide.html >>>> >>>>and provide commented, minimal, self-contained, reproducible code. >>>> >>>> >>> >>> >>> >>>-- >>> >>>Joshua Wiley >>> >>>Ph.D. 
Student, Health Psychology >>> >>>University of California, Los Angeles >>> >>>http://www.joshuawiley.com/ >>> >>> >>>______________________________________________ >>> >>>R-help at r-project.org mailing list >>> >>>https://stat.ethz.ch/mailman/listinfo/r-help >>> >>>PLEASE do read the posting guide >>>http://www.R-project.org/posting-guide.html >>> >>>and provide commented, minimal, self-contained, reproducible code. From ehlers at ucalgary.ca Fri Mar 4 18:37:22 2011 From: ehlers at ucalgary.ca (P Ehlers) Date: Fri, 04 Mar 2011 09:37:22 -0800 Subject: [R] How two compare two matrixes In-Reply-To: <219195.1782.qm@web120115.mail.ne1.yahoo.com> References: <219195.1782.qm@web120115.mail.ne1.yahoo.com> Message-ID: <4D712352.5090809@ucalgary.ca> Alaios wrote: > That's the problem > Even a 10*10 matrix does not fit to the screen (10 columns do not fit in one screen's row) and thus I do not get a well aligned matrix printed. > I don't see why you would want to do this, but you could always invoke two instances of R and create one matrix in one and the other in the second. Peter Ehlers > This is that makes comparisons not that easy to the eye. > From the other hand with edit(mymatrix) I get scrolls so I can scroll to one row and see only the area I want to focus in. Problem with edit is that it blocks cli and thus I can not have two edits running at the same time. > > I would like to thank you in advacne for your help > > Regards > Alex > > --- On Fri, 3/4/11, Philipp Pagel wrote: > >> From: Philipp Pagel >> Subject: Re: [R] How two compare two matrixes >> To: r-help at r-project.org >> Date: Friday, March 4, 2011, 8:04 AM >>> Dear all I have two 10*10 >> matrixes and I would like to compare >>> theirs contents. By the word content I mean to check >> visually (not >>> with any mathematical formulation) how similar are the >> contents. >> >> If they are really only 10x10 you can simply print them >> both to the >> screen and look at them. 
I'm not sure what else you could
>> do if you
>> are not interested in a specific distance measure etc.
>>
>> cu
>> Philipp
>>
>> --
>> Dr. Philipp Pagel
>> Lehrstuhl für Genomorientierte Bioinformatik
>> Technische Universität München
>> Wissenschaftszentrum Weihenstephan
>> Maximus-von-Imhof-Forum 3
>> 85354 Freising, Germany
>> http://webclu.bio.wzw.tum.de/~pagel/
>>

From jwiley.psych at gmail.com Fri Mar 4 18:43:19 2011
From: jwiley.psych at gmail.com (Joshua Wiley)
Date: Fri, 4 Mar 2011 09:43:19 -0800
Subject: [R] S. function calculating x +- y
In-Reply-To:
References:
Message-ID:

Hi Sascha,

As Uwe said, I am not sure you will get more elegant. If you want it
to be simple because you do it a lot and the typing is a burden,
consider writing a function. Here is a little example:

##########
f <- function(x, y, ...) {
  mx <- mean(x, ...)
  my <- mean(y, ...) * c(-1, 1)
  mx + my
}

f(rnorm(10), runif(10))
##########

On Fri, Mar 4, 2011 at 9:22 AM, Sascha Vieweg wrote:
> Hello, I am looking for an elegant one-liner for the following operation:
>
> x <- rnorm(10)
> y <- runif(10)
> c(mean(x)-mean(y), mean(x)+mean(y))
>
> I thought about
>
> apply(data.frame(x, y), 2, mean)
>
> but I don't know how to apply the +- operation on the result of apply.
> Thanks, *S* > > -- > Sascha Vieweg, saschaview at gmail.com > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ From ggrothendieck at gmail.com Fri Mar 4 18:43:14 2011 From: ggrothendieck at gmail.com (Gabor Grothendieck) Date: Fri, 4 Mar 2011 12:43:14 -0500 Subject: [R] S. function calculating x +- y In-Reply-To: References: Message-ID: On Fri, Mar 4, 2011 at 12:22 PM, Sascha Vieweg wrote: > Hello, I am looking for an elegant one-liner for the following operation: > > x <- rnorm(10) > y <- runif(10) > c(mean(x)-mean(y), mean(x)+mean(y)) > > I thought about > > apply(data.frame(x, y), 2, mean) > > but I don't know how to apply the +- operation on the result of apply. Try: mean(x) + c(-1, 1) * mean(y) -- Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com From dieter.menne at menne-biomed.de Fri Mar 4 18:55:17 2011 From: dieter.menne at menne-biomed.de (Dieter Menne) Date: Fri, 4 Mar 2011 09:55:17 -0800 (PST) Subject: [R] retrieve x y coordinates of points in current plot In-Reply-To: <4D71204E.2030109@yahoo.de> References: <4D71204E.2030109@yahoo.de> Message-ID: <1299261317820-3335692.post@n4.nabble.com> jannis-2 wrote: > > > is it somehow possible to retrieve the x and y coordinates of points in > a scatterplot after it has been plotted? identify() somehow seems to > manage this, so I was wondering whether it is possible? > locator might be the more basic function you are looking for. 
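Note that locator() returns the user coordinates of mouse clicks, so it needs an interactive device. If the goal is only to avoid passing x and y a second time, one possible workaround is to remember the data at plotting time. The following is a minimal sketch under that assumption; the helper names plot_remember() and last_points() are hypothetical and not part of base R.

```r
# Hypothetical helpers, not part of base R: cache the data passed to plot().
.plot_cache <- new.env()

plot_remember <- function(x, y, ...) {
  assign("xy", list(x = x, y = y), envir = .plot_cache)
  plot(x, y, ...)
  invisible(NULL)
}

last_points <- function() get("xy", envir = .plot_cache)

pdf(NULL)                       # throwaway device so the sketch runs non-interactively
plot_remember(1:5, c(2, 4, 1, 5, 3))
pts <- last_points()            # list with components x and y
dev.off()

# Interactive alternative:
# clicks <- locator(n = 2)      # list(x = ..., y = ...) of the clicked positions
```

A points-in-polygon function could then call last_points() instead of taking x and y as arguments.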
Dieter -- View this message in context: http://r.789695.n4.nabble.com/retrieve-x-y-coordinates-of-points-in-current-plot-tp3335642p3335692.html Sent from the R help mailing list archive at Nabble.com. From rmh at temple.edu Fri Mar 4 18:55:27 2011 From: rmh at temple.edu (Richard M. Heiberger) Date: Fri, 4 Mar 2011 12:55:27 -0500 Subject: [R] Reading in and manipulating multiple data sets from the same input file In-Reply-To: <1299255525698-3335523.post@n4.nabble.com> References: <1299255525698-3335523.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From wdunlap at tibco.com Fri Mar 4 19:02:05 2011 From: wdunlap at tibco.com (William Dunlap) Date: Fri, 4 Mar 2011 10:02:05 -0800 Subject: [R] How two compare two matrixes In-Reply-To: <85081.47838.qm@web120107.mail.ne1.yahoo.com> References: <85081.47838.qm@web120107.mail.ne1.yahoo.com> Message-ID: <77EB52C6DD32BA4D87471DCD70C8D70003F74E8F@NA-PA-VBE03.na.tibco.com> I sometimes use the enclosed sideBySide() function to look at two printouts (of any sort of objects) in parallel. Perhaps that would help. 
sideBySide <- function (a, b, argNames) {
    # Halve the print width so two printouts fit next to each other;
    # the old setting is restored when the function exits.
    oldWidth <- options(width = getOption("width")/2 - 4)
    on.exit(options(oldWidth))
    if (missing(argNames)) {
        argNames <- c(deparse(substitute(a))[1], deparse(substitute(b))[1])
    }
    # Capture the printed form of each object as character lines.
    pa <- capture.output(print(a))
    pb <- capture.output(print(b))
    # Pad the shorter printout with empty lines.
    nlines <- max(length(pa), length(pb))
    length(pa) <- nlines
    length(pb) <- nlines
    pb[is.na(pb)] <- ""
    pa[is.na(pa)] <- ""
    retval <- cbind(pa, pb, deparse.level = 0)
    dimnames(retval) <- list(rep("", nrow(retval)), argNames)
    noquote(retval)
}

Try:
> x1 <- matrix(sort(rnorm(100)),10,10)
> x2 <- matrix(sort(rnorm(100)),10,10)
> sideBySide(x1,x2)

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Alaios
> Sent: Thursday, March 03, 2011 11:42 PM
> To: R-help at r-project.org
> Subject: [R] How two compare two matrixes
>
> Dear all I have two 10*10 matrixes and I would like to
> compare their contents. By the word content I mean to check
> visually (not with any mathematical formulation) how similar
> are the contents.
>
> I also know edit that prints my matrix in the screen but still
> one edit blocks the prompt to launch a second edit() screen.
>
> What is the best way to compare these two matrices?
>
> I would like to thank you in advance for your help
>
> Regards
> Alex
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

From dieter.menne at menne-biomed.de Fri Mar 4 19:03:27 2011
From: dieter.menne at menne-biomed.de (Dieter Menne)
Date: Fri, 4 Mar 2011 10:03:27 -0800 (PST)
Subject: [R] Question in Chi-squared test, can I do it with percentage data?
In-Reply-To:
References:
Message-ID: <1299261807772-3335708.post@n4.nabble.com>

Jeff wrote:
>
> I know Chi-squared test can be done with the frequency data by R function
> "chisq.test()", but I am not sure if it can be applied to the percentage
> data ? The example of my data is as follows:
>
> #############################################
>
>                   KSL    MHL    MWS    CLGC   LYGC
> independent (%)   96.22  92.18  68.54  93.80  85.74
>
> #############################################
>
>

No. If these are "measured" numbers, e.g. percent concentration, the test
has no idea if your data are measured with 0.05 or 10% standard error
(which anyway is tricky, because we are so close to 100%). If these are
counts, it makes a difference if these are 96 out of 100 or 960000 out of
1000000.

Dieter

--
View this message in context: http://r.789695.n4.nabble.com/Question-in-Chi-squared-test-can-I-do-it-with-percentage-data-tp3335312p3335708.html
Sent from the R help mailing list archive at Nabble.com.

From dieter.menne at menne-biomed.de Fri Mar 4 19:09:27 2011
From: dieter.menne at menne-biomed.de (Dieter Menne)
Date: Fri, 4 Mar 2011 10:09:27 -0800 (PST)
Subject: [R] Coefficient of Determination for nonlinear function
In-Reply-To: <1299246144.1764.18.camel@pollux>
References: <1299246144.1764.18.camel@pollux>
Message-ID: <1299262167353-3335719.post@n4.nabble.com>

Uwe Wolfram wrote:
>
>
> I did fit an equation of the form 1 = f(x1,x2,x3) using a minimization
> scheme. Now I want to compute the coefficient of determination. Normally
> I would compute it as
>
> r_square = 1- sserr/sstot with sserr = sum_i (y_i - f_i) and sstot =
> sum_i (y_i - mean(y))
>
> sserr is clear to me but how can I compute sstot when there is no such
> thing than differing y_i. These are all one. Thus mean(y)=1. Therefore,
> sstot is 0.
>
>

Try http://r-project.markmail.org/search/?q=r+square+nonlinear to find
heated debates on this subject. But I fear your supervisor or the reviewer
wants it anyway.
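The degenerate case described in the question can be checked numerically. This is only a small sketch: it uses the usual squared-error definitions of the two sums, and the fitted values in f are made up for illustration.

```r
y <- rep(1, 5)                       # every observed response equals 1
f <- c(0.9, 1.1, 1.0, 0.95, 1.05)    # made-up fitted values, for illustration only

sserr <- sum((y - f)^2)              # residual sum of squares
sstot <- sum((y - mean(y))^2)        # total sum of squares: 0 when all y are equal
r2 <- 1 - sserr / sstot              # division by zero, so not a finite number

sstot
is.finite(r2)
```

Because sstot is zero whenever all y_i are identical, the ratio has no usable value here, whatever the quality of the fit.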
Dieter

--
View this message in context: http://r.789695.n4.nabble.com/Coefficient-of-Determination-for-nonlinear-function-tp3335236p3335719.html
Sent from the R help mailing list archive at Nabble.com.

From Greg.Snow at imail.org Fri Mar 4 19:22:14 2011
From: Greg.Snow at imail.org (Greg Snow)
Date: Fri, 4 Mar 2011 11:22:14 -0700
Subject: [R] Regression with many independent variables
In-Reply-To:
References:
Message-ID:

Here is one possible way (you will need to change the dataset and
condition, etc.):

tmp1 <- combn(names(iris)[1:4], 2, function(x) {
    if (any(iris[[ x[1] ]] * iris[[ x[2] ]] < .25)) {
        NA
    } else {
        paste(x, collapse=':')
    }
})
tmp1 <- tmp1[ !is.na(tmp1) ]
paste(tmp1, collapse=' + ')

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111

> -----Original Message-----
> From: Matthew Douglas [mailto:matt.douglas01 at gmail.com]
> Sent: Thursday, March 03, 2011 3:43 PM
> To: Greg Snow
> Cc: r-help at r-project.org
> Subject: Re: [R] Regression with many independent variables
>
> Thanks for getting back to me so quickly Greg. I'm not quite sure how
> to do what you just said, is there an example that you can show?
>
> I understand how to create the string with a formula in it but I'm not
> sure how to loop through the pairs of variables? How do I first get
> these 2way interaction variables, I can no longer use the "^" right?
>
> Sorry for so many questions,
>
> Matt
> On Thu, Mar 3, 2011 at 4:16 PM, Greg Snow wrote:
> > What you might need to do is create a character string with your
> > formula in it (looping through pairs of variables and using paste or
> > sprintf) then convert that to a formula using the as.formula function.
> >
> > --
> > Gregory (Greg) L. Snow Ph.D.
> > Statistical Data Center > > Intermountain Healthcare > > greg.snow at imail.org > > 801.408.8111 > > > > > >> -----Original Message----- > >> From: Matthew Douglas [mailto:matt.douglas01 at gmail.com] > >> Sent: Thursday, March 03, 2011 2:09 PM > >> To: Greg Snow > >> Cc: r-help at r-project.org > >> Subject: Re: [R] Regression with many independent variables > >> > >> Thanks greg, > >> > >> ?that formula was exactly what I was looking for. Except now when I > >> run it on my data I get the following error: > >> > >> "Error in model.matrix.default(mt, mf, contrasts) : cannot allocate > >> vector of length 2043479998" > >> > >> I know there are probably many 2-way interactions that are zero so I > >> thought I could save space by removing these. Is there some way that > >> can just delete all the two way interactions that are zero and keep > >> the columns that have non-zero entries? I think that will > >> significantly cut down the memory needed. Or is there just another > way > >> to get around this? > >> > >> thanks, > >> Matt > >> > >> On Tue, Mar 1, 2011 at 3:56 PM, Greg Snow > wrote: > >> > You can use ^2 to get all 2 way interactions and ^3 to get all 3 > way > >> interactions, e.g.: > >> > > >> > lm(Sepal.Width ~ (. - Sepal.Length)^2, data=iris) > >> > > >> > The lm.fit function is what actually does the fitting, so you > could > >> go directly there, but then you lose the benefits of using . and ^. > >> ?The Matrix package has ways of dealing with sparse matricies, but I > >> don't know if ?that would help here or not. > >> > > >> > You could also just create x'x and x'y matricies directly since > the > >> variables are 0/1 then use solve. ?A lot depends on what you are > doing > >> and what questions you are trying to answer. > >> > > >> > -- > >> > Gregory (Greg) L. Snow Ph.D. 
> >> > Statistical Data Center > >> > Intermountain Healthcare > >> > greg.snow at imail.org > >> > 801.408.8111 > >> > > >> > > >> >> -----Original Message----- > >> >> From: Matthew Douglas [mailto:matt.douglas01 at gmail.com] > >> >> Sent: Tuesday, March 01, 2011 1:09 PM > >> >> To: Greg Snow > >> >> Cc: r-help at r-project.org > >> >> Subject: Re: [R] Regression with many independent variables > >> >> > >> >> Hi Greg, > >> >> > >> >> Thanks for the help, it works perfectly. To answer your question, > >> >> there are 339 independent variables but only 10 will be used at > one > >> >> time . So at any given line of the data set there will be 10 non > >> zero > >> >> entries for the independent variables and the rest will be zeros. > >> >> > >> >> One more question: > >> >> > >> >> 1. I still want to find a way to look at the interactions of the > >> >> independent variables. > >> >> > >> >> the regression would look like this: > >> >> > >> >> y = b12*X1X2 + b23*X2X3 +...+ bk-1k*Xk-1Xk > >> >> > >> >> so I think the regression in R would look like this: > >> >> > >> >> lm(MARGIN, P235:P236+P236:P237+....,weights = Poss, data = > adj0708), > >> >> > >> >> my problem is that since I have technically 339 independent > >> variables, > >> >> when I do this regression I would have 339 Choose 2 = approx > 57000 > >> >> independent variables (a vast majority will be 0s though) so I > dont > >> >> want to have to write all of these out. Is there a way to do this > >> >> quickly in R? > >> >> > >> >> Also just a curious question that I cant seem to find to online: > >> >> is there a more efficient model other than lm() that is better > for > >> >> very sparse data sets like mine? > >> >> > >> >> Thanks, > >> >> Matt > >> >> > >> >> > >> >> On Mon, Feb 28, 2011 at 4:30 PM, Greg Snow > >> wrote: > >> >> > Don't put the name of the dataset in the formula, use the data > >> >> argument to lm to provide that. 
> >> >> > A single period (".") on the right hand side of the formula will
> >> >> > represent all the columns in the data set that are not on the left
> >> >> > hand side (you can then use "-" to remove any other columns that you
> >> >> > don't want included on the RHS).
> >> >> >
> >> >> > For example:
> >> >> >
> >> >> > > lm(Sepal.Width ~ . - Sepal.Length, data=iris)
> >> >> >
> >> >> > Call:
> >> >> > lm(formula = Sepal.Width ~ . - Sepal.Length, data = iris)
> >> >> >
> >> >> > Coefficients:
> >> >> >       (Intercept)       Petal.Length        Petal.Width  Speciesversicolor
> >> >> >            3.0485             0.1547             0.6234            -1.7641
> >> >> >  Speciesvirginica
> >> >> >           -2.1964
> >> >> >
> >> >> >
> >> >> > But, are you sure that a regression model with 339 predictors will be
> >> >> > meaningful?
> >> >> >
> >> >> > --
> >> >> > Gregory (Greg) L. Snow Ph.D.
> >> >> > Statistical Data Center
> >> >> > Intermountain Healthcare
> >> >> > greg.snow at imail.org
> >> >> > 801.408.8111
> >> >> >
> >> >> >
> >> >> >> -----Original Message-----
> >> >> >> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Matthew Douglas
> >> >> >> Sent: Monday, February 28, 2011 1:32 PM
> >> >> >> To: r-help at r-project.org
> >> >> >> Subject: [R] Regression with many independent variables
> >> >> >>
> >> >> >> Hi,
> >> >> >>
> >> >> >> I am trying to use lm() on some data, the code works fine but I would
> >> >> >> like to use a more efficient way to do this.
> >> >> >>
> >> >> >> The data looks like this (the data is very sparse with a few 1s, -1s
> >> >> >> and the rest 0s):
> >> >> >>
> >> >> >> > head(adj0708)
> >> >> >>       MARGIN Poss P235 P247 P703 P218 P430 P489 P83 P307 P337....
> >> >> >> 1   64.28571   29    0    0    0    0    0    0   0    0    0    0    0    0    0
> >> >> >> 2 -100.00000    6    0    0    0    0    0    0   0    1    0    0    0    0    0
> >> >> >> 3  100.00000    4    0    0    0    0    0    0   0    1    0    0    0    0    0
> >> >> >> 4  -33.33333    7    0    0    0    0    0    0   0    0    0    0    0    0    0
> >> >> >> 5  200.00000    2    0    0    0    0    0    0   0    0    0    0   -1    0    0
> >> >> >> 6  -83.33333   12    0   -1    0    0    0    0   0    0    0    0    0    0    0
> >> >> >>
> >> >> >> adj0708 is actually a 35657x341 data set. Each column after "Poss" is
> >> >> >> an independent variable, the dependent variable is "MARGIN" and it is
> >> >> >> weighted by "Poss"
> >> >> >>
> >> >> >>
> >> >> >> The regression is below:
> >> >> >> fit.adj0708 <- lm( adj0708$MARGIN~adj0708$P235 + adj0708$P247 +
> >> >> >> adj0708$P703 + adj0708$P430 + adj0708$P489 + adj0708$P218 +
> >> >> >> adj0708$P605 + adj0708$P337 + .... +
> >> >> >> adj0708$P510,weights=adj0708$Poss)
> >> >> >>
> >> >> >> I have two questions:
> >> >> >>
> >> >> >> 1. Is there a way to condense how I write the independent variables
> >> >> >> in the lm(), instead of having such a long line of code (I have 339
> >> >> >> independent variables to be exact)?
> >> >> >> 2. I would like to pair the data to look at a regression of the
> >> >> >> interactions between two independent variables. I think it would look
> >> >> >> something like this....
> >> >> >> fit.adj0708 <- lm( adj0708$MARGIN~adj0708$P235:adj0708$P247 +
> >> >> >> adj0708$P703:adj0708$P430 + adj0708$P489:adj0708$P218 +
> >> >> >> adj0708$P605:adj0708$P337 + ....,weights=adj0708$Poss)
> >> >> >> but there will be 339 Choose 2 combinations, so a lot of independent
> >> >> >> variables! Is there a more efficient way of writing this code? Is
> >> >> >> there a way I can do this?
> >> >> >>
> >> >> >> Thanks,
> >> >> >> Matt
> >> >> >>
> >> >> >> ______________________________________________
> >> >> >> R-help at r-project.org mailing list
> >> >> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> >> >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >> >> >> and provide commented, minimal, self-contained, reproducible code.
> >> >> >
> >> >
> >

From Greg.Snow at imail.org Fri Mar 4 19:39:21 2011
From: Greg.Snow at imail.org (Greg Snow)
Date: Fri, 4 Mar 2011 11:39:21 -0700
Subject: [R] sum of digits or how to slice a number into its digits
In-Reply-To: <4D70EE96.2010206@googlemail.com>
References: <4D70E68B.2010703@googlemail.com> <4D70E841.80804@erasmusmc.nl> <4D70EE96.2010206@googlemail.com>
Message-ID:

Here is another way to do it without converting back and forth to character strings:

digits <- function(x) {
  if(length(x) > 1 ) {
    lapply(x, digits)
  } else {
    n <- nchar(x)
    rev( x %/% 10^seq(0, length.out=n) %% 10 )
  }
}

> digits(10010)
[1] 1 0 0 1 0
> digits( c(123, 4567, 273) )
[[1]]
[1] 1 2 3

[[2]]
[1] 4 5 6 7

[[3]]
[1] 2 7 3

This version only works on integers, so modification would be needed if you
want to pass floating point numbers (but you would first need to decide what
you would want the output to be).

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of drflxms
> Sent: Friday, March 04, 2011 6:52 AM
> To: d.rizopoulos at erasmusmc.nl
> Cc: r-help at r-project.org
> Subject: Re: [R] sum of digits or how to slice a number into its digits
>
> Hi Dimitris,
>
> thank you very much for your quick and efficient help! Your solution is
> perfect for me. Does exactly what I was looking for if combined with
> unlist and as.numeric before using sum.
>
> Now I can keep on with my real problem ;)...
> Thanx Again!!!
>
> Best, Felix
>
> On 04.03.2011 14:25, Dimitris Rizopoulos wrote:
> > one way is using function strsplit(), e.g.,
> >
> > x <- c("100100110", "1001001", "1101", "00101")
> > sapply(strsplit(x, ""), function (x) sum(x == 1))
> >
> >
> > I hope it helps.
> >
> > Best,
> > Dimitris
> >
> >
> > On 3/4/2011 2:18 PM, drflxms wrote:
> >> Dear R colleagues,
> >>
> >> I face a seemingly simple problem I couldn't find a solution for myself
> >> so far:
> >>
> >> I have to sum the digits of numbers. Example: 1010 -> 2, 100100110 -> 4.
> >> Unfortunately there seems not to be a function for this task. So my idea
> >> was to use sum(x) for it. But I did not figure out how to slice a number
> >> to a vector of its digits. Example (continued from above): 1010 ->
> >> c(1,0,1,0), 100100110 -> c(1,0,0,1,0,0,1,1,0).
> >>
> >> Does anyone know either a function for calculating the sum of the digits
> >> of a number, or how to slice a number into a vector of its digits as
> >> described above?
> >>
> >> I'd appreciate any kind of help very much!
> >> Thanx in advance and greetings from cloudy Munich,
> >> Felix
> >>
> >> ______________________________________________
> >> R-help at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
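
The two ideas in this thread (split a character representation, or peel digits off arithmetically) can be combined into a vectorised digit sum. What follows is only a minimal sketch, assuming non-negative whole numbers that a double can represent exactly; the names digit_sum_chr and digit_sum_num are illustrative and do not come from the thread or from any package.

```r
# Digit sum via character splitting, in the spirit of the strsplit() reply.
# sprintf("%.0f", x) avoids scientific notation such as "1e+05".
digit_sum_chr <- function(x) {
  s <- sprintf("%.0f", x)
  vapply(strsplit(s, "", fixed = TRUE),
         function(d) sum(as.integer(d)),
         integer(1))
}

# Digit sum via integer arithmetic, in the spirit of the %/% and %% reply.
# Assumption: x holds non-negative whole numbers.
digit_sum_num <- function(x) {
  stopifnot(all(x >= 0), all(x == floor(x)))
  total <- numeric(length(x))
  while (any(x > 0)) {
    total <- total + x %% 10   # add the last digit of every element
    x <- x %/% 10              # drop the last digit
  }
  total
}

digit_sum_chr(c(1010, 100100110))  # 2 4
digit_sum_num(c(1010, 100100110))  # 2 4
```

The character version returns integers and the arithmetic version returns doubles; either could be wrapped in as.integer() or as.numeric() if a particular type is needed.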
From wesleycmathew at gmail.com Fri Mar 4 19:40:25 2011 From: wesleycmathew at gmail.com (wesley mathew) Date: Fri, 4 Mar 2011 18:40:25 +0000 Subject: [R] Fwd: r.dll In-Reply-To: <4D711EB6.2070601@statistik.tu-dortmund.de> References: <4D711EB6.2070601@statistik.tu-dortmund.de> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From Michael.Folkes at dfo-mpo.gc.ca Fri Mar 4 19:36:42 2011 From: Michael.Folkes at dfo-mpo.gc.ca (Folkes, Michael) Date: Fri, 4 Mar 2011 10:36:42 -0800 Subject: [R] Floating points and floor() ? References: <63F107BCC37AEA49A75FD94AA3E07CB007A8BBE1@pacpbsex01.pac.dfo-mpo.ca> <36180405F8418449918AD20618D110FC095BFA6E02@USETCMSXMB02.NAFTA.SYNGENTA.ORG> Message-ID: <63F107BCC37AEA49A75FD94AA3E07CB004AFD6FD@pacpbsex01.pac.dfo-mpo.ca> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From sclare at ualberta.ca Fri Mar 4 17:52:48 2011 From: sclare at ualberta.ca (Shari Clare) Date: Fri, 4 Mar 2011 09:52:48 -0700 Subject: [R] PCA - scores In-Reply-To: References: <131E1151-2802-4793-9901-5F8E7EBB3FB8@ualberta.ca> Message-ID: <419BB6F5-F409-4880-8E3D-DA932AE4789C@ualberta.ca> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From bbolker at gmail.com Fri Mar 4 17:40:17 2011 From: bbolker at gmail.com (Ben Bolker) Date: Fri, 4 Mar 2011 16:40:17 +0000 Subject: [R] AIC on GLMM pscl package References: <1299250467152-3335371.post@n4.nabble.com> Message-ID: Caribu gmail.com> writes: > > Hello, > > I'm using GLMM on the pscl package and i'm not getting the AIC on the > summary. > [snip] glmmPQL is in the MASS package, not the pscl package. Because it uses a quasi-likelihood approach, it does not provide an AIC value (which technically does not exist in this case). It is in principle possible to get a "quasi-AIC", but it is (a) delicate to define and (b) moderately challenging to extract the information from glmmPQL. 
Ben Bolker

From dieter.menne at menne-biomed.de Fri Mar 4 18:52:56 2011
From: dieter.menne at menne-biomed.de (Dieter Menne)
Date: Fri, 4 Mar 2011 09:52:56 -0800 (PST)
Subject: [R] make an own (different) color legend with spplot()
In-Reply-To: <4D711178.9080101@gmx.at>
References: <4D711178.9080101@gmx.at>
Message-ID: <1299261176124-3335690.post@n4.nabble.com>

Marcel J. wrote:
>
> Is there a way to manually customize the color legend in an spplot() -
> especially where to draw ticks and labels for the ticks?
>

spplot calls levelplot, so the documentation there gives some help. In
theory, the simplest approach would be to add a colorkey/ticknumber
combination as shown below, but it's always a bit of trial-and-error what
works. In similar cases, I have been bitten several times in the past by
believing the whole approach is wrong, when only certain combinations don't
work.

In the example, setting the width of the legend works, but trying to
position it with space="right" fails, and tick.number alone fails as well.
Setting the items by hand works. You will probably find a better labeling,
I only added the "Hx" to show you how to do it.

Dieter

library(sp)
data(meuse.grid)
gridded(meuse.grid) = ~x+y
meuse.grid$random <- rnorm(nrow(meuse.grid), 7, 2) # generate random data
meuse.grid$random[meuse.grid$random < 0] <- 0 # make sure there is no
meuse.grid$random[meuse.grid$random > 10] <- 10 # and bigger than ten
meuse.grid$random <- cut(meuse.grid$random, seq(0, 10, 0.1)) # here I

labelat = seq(0,100,by=10)
labeltext = paste("Hx",labelat)
spplot(meuse.grid,
       c("random"),
       col.regions = rainbow(100, start = 4/6, end = 1),
       colorkey=list(width=0.3,     # works
                     space="right", # not honoured
                     tick.number=5, # not honoured, can be left out
                     labels=list(   # so we must do it by hand
                       at=labelat,
                       labels=labeltext )))

--
View this message in context: http://r.789695.n4.nabble.com/make-an-own-different-color-legend-with-spplot-tp3335552p3335690.html
Sent from the R help mailing list archive at Nabble.com.
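
Since the reply above points to levelplot for the colorkey documentation, the same hand-made labelling can be tried directly in lattice without the sp classes. This is only a sketch under assumptions: it uses the built-in volcano matrix instead of meuse.grid, the names label_at and label_text are made up for the illustration, and whether spplot passes every colorkey component through in the same way is not verified here.

```r
library(lattice)

# Tick positions chosen by hand; volcano heights run from roughly 94 to 195.
label_at   <- seq(100, 180, by = 20)
label_text <- paste("Hx", label_at)

p <- levelplot(volcano,
               col.regions = rainbow(100, start = 4/6, end = 1),
               colorkey = list(width  = 0.8,           # width of the key
                               labels = list(          # ticks set by hand
                                 at     = label_at,
                                 labels = label_text)))
print(p)  # lattice objects are drawn when printed
```

If this draws the expected key, the same labels=list(at=, labels=) component can then be moved into the spplot() call, as in the message above.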
From adam at adamlilith.net Fri Mar 4 19:54:35 2011
From: adam at adamlilith.net (Adam B. Smith)
Date: Fri, 04 Mar 2011 11:54:35 -0700
Subject: [R] Probabilities outside [0, 1] using Support Vector Machines (SVM) in e1071
Message-ID: <20110304115435.8373b8ee7c8fe46182fc6f67acb65638.52c49e3afe.wbe@email10.secureserver.net>

Hi All,

I'm attempting to use eps-regression or nu-regression SVM to compute
probabilities but the predict function applied to an svm model object
returns values outside [0, 1]:

Variable Data looks like:

  Present  X02 X03      X05        X06      X07       X13      X14 X15   X18
1       0 1634  48 2245.469 -1122.0750 3367.544 11105.013 2017.306  40 23227
2       0 1402  40 2611.519  -811.2500 3422.769 10499.425 1800.475  40 13822
3       0 1379  40 2576.150  -842.8500 3419.000 10166.237 2328.756  37 14200
4       0 1869  51 2645.794  -982.2938 3628.088  9610.037 1699.656  43 20762
...

and bgEnv looks similar:

   X02 X03      X05       X06      X07       X13       X14 X15   X18
1 1001  39 2521.406  -38.0875 2559.494 48507.312 3925.7563  63 20616
2 1587  39 3148.056 -895.0187 4043.075  5937.669  910.9062  55 15156
3 1610  40 4116.918  172.6812 3944.237  2287.431  196.0312  51  2739
4 1495  43 3678.381  236.9250 3441.456  3298.625   23.9875  86   281
5 1564  43 3010.988 -623.6063 3634.594  3416.350  819.6375  34  3848
...

modelFormula <- as.formula(Present ~ X02 + X03 + X05 + X06 + X07 + X13 + X14 + X15 + X18)

Model <- svm(
  modelFormula,
  data=Data,
  gamma=0.25,
  cost=4,
  nu=0.10,
  kernel='radial',
  scale=TRUE,
  type='nu-regression',
  na.action=na.omit,
  probability=TRUE
)

bgPreds <- predict(
  Model,
  newdata=bgEnv,
  type='nu-regression',
  probability=TRUE
)

bgPreds looks like:

         11          12          13          14          15          16          17          18
 0.54675813  0.37587560  0.39526542  0.67043587 -0.03079247  0.16696996  0.04714134  0.06989950
         19          20
 0.07615735  0.14923408

Notice the negative value. I can also get values >1. I had thought argument
probability=TRUE would give probabilities.

Any help would be greatly appreciated!

Adam

Adam B.
Smith University of California, Berkeley From Greg.Snow at imail.org Fri Mar 4 19:57:51 2011 From: Greg.Snow at imail.org (Greg Snow) Date: Fri, 4 Mar 2011 11:57:51 -0700 Subject: [R] R usage survey In-Reply-To: References: Message-ID: > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r- > project.org] On Behalf Of Harsh > Sent: Thursday, March 03, 2011 3:53 AM > To: r-help at r-project.org > Subject: [R] R usage survey > > Hi R users, > I request members of the R community to consider filling a short survey > regarding the use of R. > The survey can be found at http://goo.gl/jw1ig > > Please accept my apologies for posting here for a non-technical reason. > > The data collected will be suitably analyzed and I'll post a link to > the > results in the coming weeks. Ok, I am very interested in what methods you plan to use that would be fit under the description "suitably analyzed" for voluntary response data. From my training and experience the only suitable thing to do with voluntary response data is to put it through the shredder, into the recycle bin, or use as an example of what not to do in introductory textbooks. Treating voluntary response data (especially given the responses to your post you have seen so far) as if it came from a proper random probability sample does not fit the idea of suitable analysis. > Thank you all for your interest and for sharing your R usage > information. > > Regards, > Harsh Singhal > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. -- Gregory (Greg) L. Snow Ph.D. 
Statistical Data Center Intermountain Healthcare greg.snow at imail.org 801.408.8111 From Ulrich.Halekoh at agrsci.dk Fri Mar 4 20:03:42 2011 From: Ulrich.Halekoh at agrsci.dk (Ulrich Halekoh) Date: Fri, 4 Mar 2011 20:03:42 +0100 Subject: [R] glht: Problem with symbolic contrast for factors with number-levels Message-ID: <9F0721FDD4F12D4B95AD894274F388EC020C6410EF88@DJFEXMBX01.djf.agrsci.dk> Using a factor with 'number' levels the straightforward symbolic formulation of a contrast in 'glht' of the 'multcomp' package fails. How can this problem be resolved without having to redefine the factor levels? Example: #A is a factor with 'number' levels #B similar factor with 'letter' levels dat<-data.frame(y=1:4,A=factor(c(1,1,2,2)), B=factor(c('e','e','f','f')) ) motA<-lm(y~A,data=dat) motB<-lm(y~B,data=dat) library(multcomp) #does not work glht(motA,linfct=mcp(A=c("2 - 1 = 0 "))) #the error message is # Error in coefs(ex[[3]]) : # cannot interpret expression '1' as linear function #works glht(motB,linfct=mcp(B=c("f - e = 0"))) regards Ulrich Halekoh Aarhus University e-mail: Ulrich.Halekoh at agrsci.dk I use R.2.12.2, package version of multcomp : 1.2-5 From Greg.Snow at imail.org Fri Mar 4 20:05:54 2011 From: Greg.Snow at imail.org (Greg Snow) Date: Fri, 4 Mar 2011 12:05:54 -0700 Subject: [R] How two compare two matrixes In-Reply-To: <219195.1782.qm@web120115.mail.ne1.yahoo.com> References: <20110304080441.GA3897@maker> <219195.1782.qm@web120115.mail.ne1.yahoo.com> Message-ID: ?View -- Gregory (Greg) L. Snow Ph.D. 
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Alaios
> Sent: Friday, March 04, 2011 2:49 AM
> To: r-help at r-project.org; Philipp Pagel
> Subject: Re: [R] How two compare two matrixes
>
> That's the problem
> Even a 10*10 matrix does not fit to the screen (10 columns do not fit
> in one screen's row) and thus I do not get a well aligned matrix
> printed.
>
> This is what makes comparisons not that easy to the eye.
> On the other hand with edit(mymatrix) I get scrolls so I can scroll
> to one row and see only the area I want to focus in. Problem with edit
> is that it blocks cli and thus I can not have two edits running at the
> same time.
>
> I would like to thank you in advance for your help
>
> Regards
> Alex
>
> --- On Fri, 3/4/11, Philipp Pagel wrote:
>
> > From: Philipp Pagel
> > Subject: Re: [R] How two compare two matrixes
> > To: r-help at r-project.org
> > Date: Friday, March 4, 2011, 8:04 AM
> > > Dear all I have two 10*10 matrixes and I would like to compare
> > > their contents. By the word content I mean to check visually (not
> > > with any mathematical formulation) how similar are the contents.
> >
> > If they are really only 10x10 you can simply print them both to the
> > screen and look at them. I'm not sure what else you could do if you
> > are not interested in a specific distance measure etc.
> >
> > cu
> >     Philipp
> >
> > --
> > Dr.
Philipp Pagel
> > Lehrstuhl für Genomorientierte Bioinformatik
> > Technische Universität München
> > Wissenschaftszentrum Weihenstephan
> > Maximus-von-Imhof-Forum 3
> > 85354 Freising, Germany
> > http://webclu.bio.wzw.tum.de/~pagel/
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> >
>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

From singhalblr at gmail.com Fri Mar 4 20:20:20 2011
From: singhalblr at gmail.com (Harsh)
Date: Sat, 5 Mar 2011 00:50:20 +0530
Subject: [R] R usage survey
In-Reply-To: <36180405F8418449918AD20618D110FC095BFA7290@USETCMSXMB02.NAFTA.SYNGENTA.ORG>
References: <36180405F8418449918AD20618D110FC095BFA6545@USETCMSXMB02.NAFTA.SYNGENTA.ORG> <1BDAE2969943D540934EE8B4EF68F95FB27F0D4A95@EXNSW-MBX03.nexus.csiro.au> <36180405F8418449918AD20618D110FC095BFA7290@USETCMSXMB02.NAFTA.SYNGENTA.ORG>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From rmh at temple.edu Fri Mar 4 20:33:33 2011
From: rmh at temple.edu (Richard M. Heiberger)
Date: Fri, 4 Mar 2011 14:33:33 -0500
Subject: [R] glht: Problem with symbolic contrast for factors with number-levels
In-Reply-To: <9F0721FDD4F12D4B95AD894274F388EC020C6410EF88@DJFEXMBX01.djf.agrsci.dk>
References: <9F0721FDD4F12D4B95AD894274F388EC020C6410EF88@DJFEXMBX01.djf.agrsci.dk>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available URL: From wdunlap at tibco.com Fri Mar 4 20:33:45 2011 From: wdunlap at tibco.com (William Dunlap) Date: Fri, 4 Mar 2011 11:33:45 -0800 Subject: [R] glht: Problem with symbolic contrast for factors withnumber-levels In-Reply-To: <9F0721FDD4F12D4B95AD894274F388EC020C6410EF88@DJFEXMBX01.djf.agrsci.dk> References: <9F0721FDD4F12D4B95AD894274F388EC020C6410EF88@DJFEXMBX01.djf.agrsci.dk> Message-ID: <77EB52C6DD32BA4D87471DCD70C8D70003F74EED@NA-PA-VBE03.na.tibco.com> > -----Original Message----- > From: r-help-bounces at r-project.org > [mailto:r-help-bounces at r-project.org] On Behalf Of Ulrich Halekoh > Sent: Friday, March 04, 2011 11:04 AM > To: r-help at r-project.org > Subject: [R] glht: Problem with symbolic contrast for factors > withnumber-levels > > Using a factor with 'number' levels the straightforward > symbolic formulation of a contrast in 'glht' of > the 'multcomp' package fails. > > How can this problem be resolved without having to redefine > the factor levels? > > Example: > #A is a factor with 'number' levels > #B similar factor with 'letter' levels > dat<-data.frame(y=1:4,A=factor(c(1,1,2,2)), > B=factor(c('e','e','f','f')) ) > motA<-lm(y~A,data=dat) > motB<-lm(y~B,data=dat) > library(multcomp) > #does not work > glht(motA,linfct=mcp(A=c("2 - 1 = 0 "))) > > #the error message is > # Error in coefs(ex[[3]]) : > # cannot interpret expression '1' as linear function Try putting backquotes around the factor levels so the R parser makes names out of them. 
> glht(motA,linfct=mcp(A=c("`2` - `1` = 0 ")))

         General Linear Hypotheses

Multiple Comparisons of Means: User-defined Contrasts


Linear Hypotheses:
               Estimate
`2` - `1` == 0        2

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

>
> #works
> glht(motB,linfct=mcp(B=c("f - e = 0")))
>
> regards
>
> Ulrich Halekoh
> Aarhus University
> e-mail: Ulrich.Halekoh at agrsci.dk
>
> I use R.2.12.2, package version of multcomp : 1.2-5
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

From izahn at psych.rochester.edu Fri Mar 4 20:37:57 2011
From: izahn at psych.rochester.edu (Ista Zahn)
Date: Fri, 4 Mar 2011 14:37:57 -0500
Subject: [R] R usage survey
In-Reply-To:
References: <36180405F8418449918AD20618D110FC095BFA6545@USETCMSXMB02.NAFTA.SYNGENTA.ORG> <1BDAE2969943D540934EE8B4EF68F95FB27F0D4A95@EXNSW-MBX03.nexus.csiro.au> <36180405F8418449918AD20618D110FC095BFA7290@USETCMSXMB02.NAFTA.SYNGENTA.ORG>
Message-ID:

Now hold on a second Harsh! I was fairly neutral up to this point, but
this response is totally uncalled for. The problem is that despite
repeated requests you never clarified the purpose of your research!
That is all you were asked to do, but rather than responding to this
inquiry in a straightforward and honest manner you kept dodging the
question. The most charitable explanation is that you just don't
understand what information you were being asked to provide, which is
frustrating but understandable; your last response on the other hand
is completely out of line. Research participants have a right to know
the purpose for which their data is being collected, and as a
researcher you have a responsibility to tell them.

Rex, thank you for generating this discussion.
When I first saw Harsh's original email I was just getting ready to fill
out the survey. When I saw your response I delayed. Boy am I glad I did!

Best,
Ista

On Fri, Mar 4, 2011 at 2:20 PM, Harsh wrote:
> Rex,
> You're just paranoid and I'm in no way answerable to you. Your constant name
> calling presupposes your own naivete.
>
> The survey has a disclaimer and those who wish to respond can do so at their
> own discretion.
>
> Judging by the nature (and number) of respondents, there seem to be a lot of
> highly qualified people who have no qualms about sharing information
> regarding their R usage patterns.
>
> You can believe what you want and can continue to spin your imaginative
> tales of "industrial espionage" while assuming a position of apparent
> authority on survey design, Oscar gowns and data security AND my apparent
> ulterior and "outrageous" motives.
>
> You also seem to be an ignorant and misinformed person. Google forms, using
> which the survey was created DOES NOT log IP addresses of the respondents.
>
> And exactly which question in the survey would contribute to endangering the
> professional or personal safety and security of people responding to the
> survey. The information sought is freely available on LinkedIn. I merely
> want to get more descriptive information directly from R users.
>
> If you haven't looked at the Survey questions, then refrain from making
> misconstrued remarks.
>
> I apologize to the other users of this list for prolonging this frivolous
> debate here. This will be my last response on the list regarding this topic.
>
>
> If anyone has an issue pertaining to the Survey, its outcome and my motives,
> they can get in touch with me independently and off the list. All forms of
> constructive comments are also welcome.
>
> For those interested in sharing their R usage information please visit
> goo.gl/jw1ig
>
> - H
>
>
>
>
> On 04-Mar-2011 10:34 PM, wrote:
>> You still don't say what organization you are associated with.
>> Your domain name and e-mail address give no hint. How do we know that
>> "Harsh Singhal" is even a real person? An e-mail address at a university
>> (for example) would go a long way to establish that. Gmail doesn't cut it
>> for me.
>> The preponderance of evidence is that you're just a naïve person who would
>> give your own information to anyone who asked. On the other hand, it's
>> possible that you are conducting industrial espionage by recording IP
>> addresses and associating "use cases" with companies. In my opinion, the
>> onus is on you to show your bona fides, and you haven't done it.
>> That's all I have to say...
>>
>>
>> From: Harsh [mailto:singhalblr at gmail.com]
>> Sent: Friday, March 04, 2011 4:19 AM
>> To: Bill.Venables at csiro.au
>> Cc: Dwyer Rex USRE; r-help at r-project.org
>> Subject: Re: [R] R usage survey
>>
>> The R usage survey goo.gl/jw1ig has been updated with the following changes:
>>
>> Addition of -
>> Disclaimer :
>> This data will not be used for any commercial purposes
>> Do not include any personally identifiable information
>> Contact: Harsh Singhal (singhalblr AT gmail DOT com) for any queries
>>
>> Removal of -
>> Name field
>>
>> My primary purpose in conducting this survey is -
>> - Find multiple use cases for various R packages
>> - Understand the nature of work when R is being used in Academia /
>>   Commercial settings
>> - The kind of technologies that are being used in conjunction with R
>>   (popularity of usage of Python with R, and what purpose does using Python
>>   solve)
>>
>> The outcome of this analysis will be published on my blog (in the process
>> of being created).
>> There is absolutely no commercial purpose behind collecting this
>> information and as earlier stated, this information will not be shared with
>> personally identifiable information.
>>
>> Thank you once again Mr. Dwyer and Mr. Venables for raising very important
>> questions.
>> >> I thank the R users who have already filled in the survey goo.gl/jw1ig< > http://goo.gl/jw1ig> and request more to do so. >> >> Regards, >> Harsh Singhal >> >> >> >> >> >> >> >> On Fri, Mar 4, 2011 at 7:41 AM, wrote: >> No. That's not answering the question. ALL surveys are for collecting > information. >> >> The substantive issue is what purpose do you have in seeking this > information in the first place and what are you going to do with it when you > get it? >> >> Do you have some commercial purpose in mind? If so, what is it? >> >> -----Original Message----- >> From: r-help-bounces at r-project.org > [mailto:r-help-bounces at r-project.org] > On Behalf Of Harsh >> Sent: Friday, 4 March 2011 1:13 AM >> To: rex.dwyer at syngenta.com >> Cc: r-help at r-project.org >> Subject: Re: [R] R usage survey >> >> Hi Rex and useRs, >> >> The purpose of the survey has been mentioned on the survey link > goo.gl/jw1ig >> but I will also reproduce it here. >> - Geographical distribution of R users >> - Application areas where R is being used >> - Supporting technology being used along with R >> - Academic background distribution of R users >> >> The potential personally identifiable information such as name and > employer >> name are optional fields. Actually all the fields in the survey are >> optional. >> >> Some of the analysis output(s) could be along the lines of :- >> - Usage statistics of various R packages >> - Distribution of R users across countries/cities >> - Mapping various applications to packages >> - Text Mining of the responses to create informative word clouds >> >> Personally, I am excited about the kind of data I will receive through > this >> survey and the various insights that could be derived. As already > mentioned, >> the results will be shared with the community. >> >> Thank you Rex for raising an important point. 
It is indeed necessary for > me >> to personally assure the user community that the results will be shared in > a >> manner that will not contain any personally identifiable information. >> >> Those who wish to gain access to the raw data will be provided with all > the >> fields but not the name and employer name fields. >> >> Just out of curiosity : It is possible to get name, employer name, > location, >> usage information and academic background details when searching for R > users >> on LinkedIn and the many R related groups there. >> Does this also provide potential opportunities for misuse and "outrageous" >> analyses, since almost anyone can get onto LinkedIn and access user > profiles >> ? >> >> Thank you for your interest and support. >> Regards, >> Harsh >> >> >> >> >> >> >> >> >> >> >> >> >> On Thu, Mar 3, 2011 at 8:02 PM, rex.dwyer at syngenta.com>> wrote: >> >>> Harsh, "Suitably analyzed" for whose purposes? One man's "suitable" is >>> another's "outrageous". That's why people want to see the gowns at the >>> Oscars. Under what auspices are you conducting this survey? What do you >>> intend to do with it? You don't give any assurance that the results you >>> post won't have personally identifiable information. I don't get the >>> impression that you know much about survey design. >>> >>> -----Original Message----- >>> From: r-help-bounces at r-project.org > [mailto:r-help-bounces at r-project.org] >>> On Behalf Of Harsh >>> Sent: Thursday, March 03, 2011 5:53 AM >>> To: r-help at r-project.org >>> Subject: [R] R usage survey >>> >>> Hi R users, >>> I request members of the R community to consider filling a short survey >>> regarding the use of R. >>> The survey can be found at http://goo.gl/jw1ig >>> >>> Please accept my apologies for posting here for a non-technical reason. >>> >>> The data collected will be suitably analyzed and I'll post a link to the >>> results in the coming weeks. 
>>> >>> Thank you all for your interest and for sharing your R usage information. >>> >>> Regards, >>> Harsh Singhal >>> >>> [[alternative HTML version deleted]] >>> >>> ______________________________________________ >>> R-help at r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide >>> http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >>> >>> >>> >>> >>> message may contain confidential information. If you are not the > designated >>> recipient, please notify the sender immediately, and delete the original > and >>> any copies. Any use of the message by you is prohibited. >>> >>> >> >> [[alternative HTML version deleted]] >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> >> >> >> >> message may contain confidential information. If you are not the > designated recipient, please notify the sender immediately, and delete the > original and any copies. Any use of the message by you is prohibited. > > ? ? ? ?[[alternative HTML version deleted]] > > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
>
>

--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org

From Greg.Snow at imail.org Fri Mar 4 20:48:50 2011
From: Greg.Snow at imail.org (Greg Snow)
Date: Fri, 4 Mar 2011 12:48:50 -0700
Subject: [R] embed latex beamer sans serif default font into R plot
In-Reply-To: <454776AB56045147BB90DDFC184509F901A636B7@hawkeye-2.rei.edu>
References: <454776AB56045147BB90DDFC184509F901A636B7@hawkeye-2.rei.edu>
Message-ID:

Probably the best way to do this is by using the tikzDevice package to create your graphs. This creates TeX commands to create the graph using the same fonts and settings as the rest of the document. The pgfSweave package may also be of interest (it uses tikzDevice to do sweaving).

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111

> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Kwok, Heemun
> Sent: Thursday, March 03, 2011 12:00 PM
> To: r-help at r-project.org
> Subject: [R] embed latex beamer sans serif default font into R plot
>
> Hello,
> I have seen instructions on how to embed LaTeX Computer Modern fonts
> into R, but these are the default serif fonts. I am trying to embed the
> default font used for LaTeX beamer (theme Warsaw), which is a sans
> serif font and may be the default LaTeX Computer Modern sans serif
> font. Does anyone know the names of the font files?
>
> Thanks
> Heemun
>
>
> -------------------------------------------------
> Heemun Kwok, M.D.
> Research Fellow
> Harbor-UCLA Department of Emergency Medicine
> 1000 West Carson Street, Box 21
> Torrance, CA 90509-2910
> office 310-222-3501, fax 310-212-6101
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

From peter.langfelder at gmail.com Fri Mar 4 20:50:42 2011
From: peter.langfelder at gmail.com (Peter Langfelder)
Date: Fri, 4 Mar 2011 11:50:42 -0800
Subject: [R] Problem with tcltk
In-Reply-To:
References:
Message-ID:

On Fri, Mar 4, 2011 at 8:02 AM, Arnaud Mosnier wrote:
> Dear all,
>
> Since I installed the x64 version of R (v2.12.1), I have had a problem with
> tcltk that I have not managed to resolve.
> When loading the library, it gives me the following error message:
>
> Loading Tcl/Tk interface ...Error : .onLoad failed in loadNamespace() for
> 'tcltk', details:
>  call: inDL(x, as.logical(local), as.logical(now), ...)
>  error: unable to load shared object 'C:/PROGRA~1/R/R-212~1.1/
> library/tcltk/libs/x64/tcltk.dll':
>  LoadLibrary failure:  This application has failed to start because the
> application configuration is incorrect. Reinstalling the application may fix
> this problem.
>
> Error: package/namespace load failed for 'tcltk'
>
> Any suggestions ?

As the hint said, try installing or re-installing Tcl/Tk (outside of R). Note that you may need the 64-bit version of it. The R package tcltk is only an interface to the actual Tcl/Tk, which is a toolkit (library) that lives outside of R.
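
A minimal diagnostic sketch, not part of the original exchange: before reinstalling anything, it may help to confirm which architecture the running R session uses and whether that build reports Tcl/Tk support. The variable names (arch_bits, has_tcltk, loaded) are made up for illustration; capabilities(), .Machine and requireNamespace() are base R, and requireNamespace() assumes a more recent R than the 2.12.1 mentioned above.

```r
# Diagnostic sketch only; nothing here is taken from the original thread.
arch_bits <- 8 * .Machine$sizeof.pointer   # 64 on a 64-bit build of R, 32 otherwise
has_tcltk <- capabilities("tcltk")         # does this build of R report Tcl/Tk support?

cat("R architecture:", R.version$arch, "\n")
cat("Pointer size in bits:", arch_bits, "\n")
cat("Tcl/Tk capability:", has_tcltk, "\n")

# Try to load the tcltk namespace without stopping the script on failure.
loaded <- tryCatch(
  requireNamespace("tcltk", quietly = TRUE),
  error = function(e) FALSE
)
cat("tcltk namespace loaded:", loaded, "\n")
```

If the load fails only under the 64-bit build, that would suggest looking at the 64-bit Tcl/Tk files rather than at the R package itself, although this sketch cannot confirm the cause.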
HTH,

Peter

From spencer.graves at structuremonitoring.com Fri Mar 4 20:57:00 2011
From: spencer.graves at structuremonitoring.com (Spencer Graves)
Date: Fri, 04 Mar 2011 11:57:00 -0800
Subject: [R] R usage survey
In-Reply-To:
References: <36180405F8418449918AD20618D110FC095BFA6545@USETCMSXMB02.NAFTA.SYNGENTA.ORG>
 <1BDAE2969943D540934EE8B4EF68F95FB27F0D4A95@EXNSW-MBX03.nexus.csiro.au>
 <36180405F8418449918AD20618D110FC095BFA7290@USETCMSXMB02.NAFTA.SYNGENTA.ORG>
Message-ID: <4D71440C.6010306@structuremonitoring.com>

Most surveys done in the US today are done during election season, to determine how to package candidates to attract votes. Officials elected under such circumstances spend half their time in office servicing the bribes that they accepted to pay for the surveys and the resulting advertising (and the other half soliciting more bribes er contributions for their next campaign). The best reference on this I know is Thomas Ferguson (1995) Golden Rule (U. Chicago Pr.). It's by now somewhat old but is still cited by leading researchers.

People have a right to be cautious of surveys, because too rarely today are surveys used for legitimate scientific purposes. Most often, they are used to defraud the public into doing things that are contrary to their best interests.

Spencer Graves

On 3/4/2011 11:37 AM, Ista Zahn wrote:
> Now hold on a second Harsh! I was fairly neutral up to this point, but
> this response is totally uncalled for. The problem is that despite
> repeated requests you never clarified the purpose of your research!
> That is all you were asked to do, but rather than responding to this
> inquiry in a straightforward and honest manner you kept dodging the
> question. The most charitable explanation is that you just don't
> understand what information you were being asked to provide, which is
> frustrating but understandable; your last response on the other hand
> is completely out of line.
Research participants have a right to know > the purpose for which their data is being collected, and as a > researcher you have a responsibility to tell them. > > Rex, thank you for generating this discussion. When I first say > Harsh's original email I was just getting ready to fill out the > survey. When I saw your response I delayed. Boy am I glad I did! > > Best, > Ista > > On Fri, Mar 4, 2011 at 2:20 PM, Harsh wrote: >> Rex, >> You're just paranoid and I'm in no way answerable to you. Your constant name >> calling presupposes your own naivete. >> >> The survey has a disclaimer and those who wish to respond can do so at their >> own discretion. >> >> Judging by the nature (and number) of respondents, there seem to be a lot of >> highly qualified people who have no qualms about sharing information >> regarding their R usage patterns. >> >> You can believe what you want and can continue to spin your imaginative >> tales of "industrial espionage" while assuming a position of apparent >> authority on survey design, Oscar gowns and data security AND my apparent >> ulterior and "outrageous" motives. >> >> You also seem to be an ignorant and misinformed person. Google forms, using >> which the survey was created DOES NOT log IP addresses of the respondents. >> >> And exactly which question in the survey would contribute to endangering the >> professional or personal safety and security of people responding to the >> survey. The information sought is freely available on LinkedIn. I merely >> want to get more descriptive information directly from R users. >> >> If you haven't looked at the Survey questions, then refrain from making >> misconstrued remarks. >> >> I apologize to the other users of this list for prolonging this frivolous >> debate here. This will be my last response on the list regarding this topic. >> >> >> If anyone has an issue pertaining to the Survey, its outcome and my motives, >> they can get in touch with me independently and off the list. 
All forms of >> constructive comments are also welcome. >> >> For those interested in sharing their R usage information please visit >> goo.gl/jw1ig >> >> - H >> >> >> >> >> On 04-Mar-2011 10:34 PM, wrote: >>> You still don't say what organization you are associated with. Your domain >> name and e-mail address give no hint. How do we know that "Harsh Singhal" is >> even a real person? An e-mail address at a university (for example) would go >> a long way to establish that. Gmail doesn't cut it for me. >>> The preponderance of evidence is that you're just a na?ve person who would >> give your own information to anyone who asked. On the other hand, it's >> possible that you are conducting industrial espionage by recording IP >> addresses and associating "use cases" with companies. In my opinion, the >> onus is on you to show your bona fides, and you haven't done it. >>> That's all I have to say... >>> >>> >>> From: Harsh [mailto:singhalblr at gmail.com] >>> Sent: Friday, March 04, 2011 4:19 AM >>> To: Bill.Venables at csiro.au >>> Cc: Dwyer Rex USRE; r-help at r-project.org >>> Subject: Re: [R] R usage survey >>> >>> The R usage survey goo.gl/jw1ig has been updated with >> the following changes: >>> Addition of - >>> Disclaimer : >>> This data will not be used for any commercial purposes >>> Do not include any personally identifiable information >>> Contact: Harsh Singhal (singhalblr AT gmail DOT com) for any queries >>> >>> Removal of - >>> Name field >>> >>> My primary purpose in conducting this survey is - >>> - Find multiple use cases for various R packages >>> - Understand the nature of work when R is being used in Academia / >> Commercial settings >>> - The kind of technologies that are being used in conjunction with R >> (popularity of usage of Python with R, and what purpose does using Python >> solve) >>> The outcome of this analysis will be published on my blog (in the process >> of being created). 
>>> There is absolutely no commercial purpose behind collecting this >> information and as earlier stated, this information will not be shared with >> personally identifiable information. >>> Thank you once again Mr. Dwyer and Mr. Venables for raising very import >> questions. >>> I thank the R users who have already filled in the survey goo.gl/jw1ig< >> http://goo.gl/jw1ig> and request more to do so. >>> Regards, >>> Harsh Singhal >>> >>> >>> >>> >>> >>> >>> >>> On Fri, Mar 4, 2011 at 7:41 AM, wrote: >>> No. That's not answering the question. ALL surveys are for collecting >> information. >>> The substantive issue is what purpose do you have in seeking this >> information in the first place and what are you going to do with it when you >> get it? >>> Do you have some commercial purpose in mind? If so, what is it? >>> >>> -----Original Message----- >>> From: r-help-bounces at r-project.org >> [mailto:r-help-bounces at r-project.org] >> On Behalf Of Harsh >>> Sent: Friday, 4 March 2011 1:13 AM >>> To: rex.dwyer at syngenta.com >>> Cc: r-help at r-project.org >>> Subject: Re: [R] R usage survey >>> >>> Hi Rex and useRs, >>> >>> The purpose of the survey has been mentioned on the survey link >> goo.gl/jw1ig >>> but I will also reproduce it here. >>> - Geographical distribution of R users >>> - Application areas where R is being used >>> - Supporting technology being used along with R >>> - Academic background distribution of R users >>> >>> The potential personally identifiable information such as name and >> employer >>> name are optional fields. Actually all the fields in the survey are >>> optional. 
>>> >>> Some of the analysis output(s) could be along the lines of :- >>> - Usage statistics of various R packages >>> - Distribution of R users across countries/cities >>> - Mapping various applications to packages >>> - Text Mining of the responses to create informative word clouds >>> >>> Personally, I am excited about the kind of data I will receive through >> this >>> survey and the various insights that could be derived. As already >> mentioned, >>> the results will be shared with the community. >>> >>> Thank you Rex for raising an important point. It is indeed necessary for >> me >>> to personally assure the user community that the results will be shared in >> a >>> manner that will not contain any personally identifiable information. >>> >>> Those who wish to gain access to the raw data will be provided with all >> the >>> fields but not the name and employer name fields. >>> >>> Just out of curiosity : It is possible to get name, employer name, >> location, >>> usage information and academic background details when searching for R >> users >>> on LinkedIn and the many R related groups there. >>> Does this also provide potential opportunities for misuse and "outrageous" >>> analyses, since almost anyone can get onto LinkedIn and access user >> profiles >>> ? >>> >>> Thank you for your interest and support. >>> Regards, >>> Harsh >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> On Thu, Mar 3, 2011 at 8:02 PM,> rex.dwyer at syngenta.com>> wrote: >>>> Harsh, "Suitably analyzed" for whose purposes? One man's "suitable" is >>>> another's "outrageous". That's why people want to see the gowns at the >>>> Oscars. Under what auspices are you conducting this survey? What do you >>>> intend to do with it? You don't give any assurance that the results you >>>> post won't have personally identifiable information. I don't get the >>>> impression that you know much about survey design. 
>>>> >>>> -----Original Message----- >>>> From: r-help-bounces at r-project.org >> [mailto:r-help-bounces at r-project.org] >>>> On Behalf Of Harsh >>>> Sent: Thursday, March 03, 2011 5:53 AM >>>> To: r-help at r-project.org >>>> Subject: [R] R usage survey >>>> >>>> Hi R users, >>>> I request members of the R community to consider filling a short survey >>>> regarding the use of R. >>>> The survey can be found at http://goo.gl/jw1ig >>>> >>>> Please accept my apologies for posting here for a non-technical reason. >>>> >>>> The data collected will be suitably analyzed and I'll post a link to the >>>> results in the coming weeks. >>>> >>>> Thank you all for your interest and for sharing your R usage information. >>>> >>>> Regards, >>>> Harsh Singhal >>>> >>>> [[alternative HTML version deleted]] >>>> >>>> ______________________________________________ >>>> R-help at r-project.org mailing list >>>> https://stat.ethz.ch/mailman/listinfo/r-help >>>> PLEASE do read the posting guide >>>> http://www.R-project.org/posting-guide.html >>>> and provide commented, minimal, self-contained, reproducible code. >>>> >>>> >>>> >>>> >>>> message may contain confidential information. If you are not the >> designated >>>> recipient, please notify the sender immediately, and delete the original >> and >>>> any copies. Any use of the message by you is prohibited. >>>> >>>> >>> [[alternative HTML version deleted]] >>> >>> ______________________________________________ >>> R-help at r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >>> >>> >>> >>> >>> message may contain confidential information. If you are not the >> designated recipient, please notify the sender immediately, and delete the >> original and any copies. Any use of the message by you is prohibited. 
>>
>> [[alternative HTML version deleted]]
>>
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>

From singhalblr at gmail.com Fri Mar 4 21:20:24 2011
From: singhalblr at gmail.com (Harsh)
Date: Sat, 5 Mar 2011 01:50:24 +0530
Subject: [R] R usage survey
In-Reply-To: <4D71440C.6010306@structuremonitoring.com>
References: <36180405F8418449918AD20618D110FC095BFA6545@USETCMSXMB02.NAFTA.SYNGENTA.ORG>
 <1BDAE2969943D540934EE8B4EF68F95FB27F0D4A95@EXNSW-MBX03.nexus.csiro.au>
 <36180405F8418449918AD20618D110FC095BFA7290@USETCMSXMB02.NAFTA.SYNGENTA.ORG>
 <4D71440C.6010306@structuremonitoring.com>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From hadley at rice.edu Fri Mar 4 21:27:30 2011
From: hadley at rice.edu (Hadley Wickham)
Date: Fri, 4 Mar 2011 14:27:30 -0600
Subject: [R] R usage survey
In-Reply-To:
References:
Message-ID:

> Ok, I am very interested in what methods you plan to use that would fit under the description "suitably analyzed" for voluntary response data. From my training and experience the only suitable thing to do with voluntary response data is to put it through the shredder, into the recycle bin, or use as an example of what not to do in introductory textbooks. Treating voluntary response data (especially given the responses to your post you have seen so far) as if it came from a proper random probability sample does not fit the idea of suitable analysis.

Come on, that's a bit strong. In real life, it's not always possible to take a perfectly random sample and assume (at best) that missing responses are completely at random. Even descriptive analysis on a flawed sample is better than nothing at all. Of course you need to be extremely careful about making inferences about the wider population, but it's not true that the only thing you can do with survey data is to throw it in the trash.

Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

From pomchip at free.fr Fri Mar 4 21:37:28 2011
From: pomchip at free.fr (Sébastien Bihorel)
Date: Fri, 4 Mar 2011 15:37:28 -0500
Subject: [R] Units and dimensions in grid object
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From izahn at psych.rochester.edu Fri Mar 4 21:41:14 2011
From: izahn at psych.rochester.edu (Ista Zahn)
Date: Fri, 4 Mar 2011 15:41:14 -0500
Subject: [R] R usage survey
In-Reply-To: <29387_1299270029_4D71498C_29387_162627_1_AANLkTin_j-n3HnO6ie7jHgjF-QKK5=AFehZWJAfgnrce@mail.gmail.com>
References: <36180405F8418449918AD20618D110FC095BFA6545@USETCMSXMB02.NAFTA.SYNGENTA.ORG>
 <1BDAE2969943D540934EE8B4EF68F95FB27F0D4A95@EXNSW-MBX03.nexus.csiro.au>
 <36180405F8418449918AD20618D110FC095BFA7290@USETCMSXMB02.NAFTA.SYNGENTA.ORG>
 <4D71440C.6010306@structuremonitoring.com>
 <29387_1299270029_4D71498C_29387_162627_1_AANLkTin_j-n3HnO6ie7jHgjF-QKK5=AFehZWJAfgnrce@mail.gmail.com>
Message-ID:

On Fri, Mar 4, 2011 at 3:20 PM, Harsh wrote:
> Hi Ista, Spencer and Greg,
> The information being collected is purely out of personal interest and I
> have mentioned this earlier.

No, I don't think you did actually. This is the key thing we wanted to know up-front, and it's a shame that it took the better part of the day before we finally understood why you are conducting the survey.

There is no commercial interest involved.
>
> Is it possible that I am interested in this sort of information to better
> understand R's usage patterns ? In doing so, the survey I am conducting
> would seem an appropriate way for my requirements.
>
> And how does belittling someone on a mailing list help ?
> > If anyone wants the kind of information I am collecting, are there > suggestions of better ways of finding it besides the method that I have > adopted ? Sure I could scrape the data of LinkedIn pages, or find other ways > of doing it, but I found this suitable. > > > > On Sat, Mar 5, 2011 at 1:27 AM, Spencer Graves > wrote: >> >> ? ? ?Most surveys done in the US today are done during election season, to >> determine how to package candidates to attract votes. ?Officials elected >> under such circumstances spend half their time in office servicing the >> bribes that they accepted to pay for the surveys and the resulting >> advertising (and the other half soliciting more bribes er contributions for >> their next campaign). ?The best reference on this I know is Thomas Ferguson >> (1995) Golden Rule (U. Chicago Pr.). ?It's by now somewhat old but is still >> cited by leading researchers. >> >> >> ? ? ?People have a right to be cautious of surveys, because too rarely >> today are surveys used for legitimate scientific purposes. ?Most often, they >> are used to defraud the public into doing things that are contrary to their >> best interests. >> >> >> ? ? ?Spencer Graves >> >> >> On 3/4/2011 11:37 AM, Ista Zahn wrote: >>> >>> Now hold on a second Harsh! I was fairly neutral up to this point, but >>> this response is totally uncalled for. The problem is that despite >>> repeated requests you never clarified the purpose of your research! >>> That is all you were asked to do, but rather than responding to this >>> inquirly in a straightforward and honest manner you kept dodging the >>> question. The most charitable explanation is that you just don't >>> understand what information you were being asked to provide, which is >>> frustrating but understandable; your last response on the other hand >>> is completly out of line. 
>>>>
>
>

--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org

From dwinsemius at comcast.net Fri Mar 4 21:42:38 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Fri, 4 Mar 2011 15:42:38 -0500
Subject: [R] Rstudio question
In-Reply-To:
References:
Message-ID:

On Mar 4, 2011, at 9:05 AM, Robert Kinley wrote:

> I really like RStudio ...
>
> ... but I wish it wouldn't automatically reload the last .RData it
> had.
>
> Anyone know how to fix this ... ?
>

That is the default behavior of the GUIs provided for both Mac and Windows (and I think also for an R session started simply as `R` from a command line invocation) and in those situations, the answer is first delete .RData and then do not save when quitting.

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT

From singhalblr at gmail.com Fri Mar 4 22:13:15 2011
From: singhalblr at gmail.com (Harsh)
Date: Sat, 5 Mar 2011 02:43:15 +0530
Subject: [R] R usage survey
In-Reply-To:
References: <36180405F8418449918AD20618D110FC095BFA6545@USETCMSXMB02.NAFTA.SYNGENTA.ORG>
 <1BDAE2969943D540934EE8B4EF68F95FB27F0D4A95@EXNSW-MBX03.nexus.csiro.au>
 <36180405F8418449918AD20618D110FC095BFA7290@USETCMSXMB02.NAFTA.SYNGENTA.ORG>
 <4D71440C.6010306@structuremonitoring.com>
 <29387_1299270029_4D71498C_29387_162627_1_AANLkTin_j-n3HnO6ie7jHgjF-QKK5=AFehZWJAfgnrce@mail.gmail.com>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From baptiste.auguie at googlemail.com Fri Mar 4 22:15:26 2011
From: baptiste.auguie at googlemail.com (baptiste auguie)
Date: Sat, 5 Mar 2011 10:15:26 +1300
Subject: [R] Creating a .png with just an expression() in it
In-Reply-To: <4D70E9B6.8040800@gmx.de>
References: <4D70E9B6.8040800@gmx.de>
Message-ID:

Hi,

I think the easiest way is to use grid graphics,

library(grid)

a = 0.3
b = pi
e = bquote(y[alpha] == .(a) * x[beta] + .(round(b, 2)))

grid.newpage()
grid.text(e)

## if you wanted a snug fit with the device window
e2 = expression(integral(frac(1, alpha + x[beta])*dx, -infinity, +infinity))
g = textGrob(e2)
w = grobWidth(g)
h = grobHeight(g)
dev.new(width = convertUnit(w, "in", valueOnly=TRUE),
        height = convertUnit(h, "in", "y", valueOnly=TRUE))
grid.draw(g)

HTH,

baptiste

On 5 March 2011 02:31, Alexx Hardt wrote:
> Hey,
> I'm trying to create an image file with the results of a regression
> analysis. In TeX, the line would be something like:
> $ size = 0.34 + 4.3 var_1 $
>
> Can I create a plot window with just this line in it? I tried playing around
> with plot.new() or dev.new(), but didn't really find something that worked.
>
> Thanks in advance,
>  Alex
>
> --
> alexx at alexx-fett:~$ vi .emacs
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

From vassalosmichael at uky.edu Fri Mar 4 21:00:47 2011
From: vassalosmichael at uky.edu (mike)
Date: Fri, 4 Mar 2011 12:00:47 -0800 (PST)
Subject: [R] Lepage Test
Message-ID: <1299268847806-3335861.post@n4.nabble.com>

Hey everyone,

I am interested in running a Lepage multi-sample test and I would like to ask if there is any code available for that.
Thank you in advance

--
View this message in context: http://r.789695.n4.nabble.com/Lepage-Test-tp3335861p3335861.html
Sent from the R help mailing list archive at Nabble.com.

From ryan.steven.garner at gmail.com Fri Mar 4 21:55:13 2011
From: ryan.steven.garner at gmail.com (Ryan Garner)
Date: Fri, 4 Mar 2011 12:55:13 -0800 (PST)
Subject: [R] X11 graphics windows under CMD BATCH
In-Reply-To: <723CD82D-59E0-41AF-B8E0-AFC0B2D673B1@gmail.com>
References: <723CD82D-59E0-41AF-B8E0-AFC0B2D673B1@gmail.com>
Message-ID: <1299272113015-3335922.post@n4.nabble.com>

I had this same issue. My quick and dirty solution was to create an infinite
loop at the end of my R plotting script and then manually kill the job with
Ctrl+C once I was done looking at the plot.

--
View this message in context: http://r.789695.n4.nabble.com/X11-graphics-windows-under-CMD-BATCH-tp838015p3335922.html
Sent from the R help mailing list archive at Nabble.com.

From jfox at mcmaster.ca Fri Mar 4 22:20:03 2011
From: jfox at mcmaster.ca (John Fox)
Date: Fri, 4 Mar 2011 16:20:03 -0500
Subject: [R] Rstudio question
In-Reply-To:
References:
Message-ID: <00cc01cbdab1$edefbf90$c9cf3eb0$@mcmaster.ca>

Dear David and Robert,

I noticed the same behaviour from RStudio (with Windows 7, R 2.12.2, and
RStudio 0.92.38). At the start of the session:

[Workspace restored from ~/.RData]

> ls(all.names=TRUE)
[1] ".Random.seed"
> getwd()
[1] "C:/Users/John Fox/Documents"

Oddly, I never saved a workspace in C:/Users/John Fox/Documents and there is
no .RData file in that directory. As well, the random seed, which is the only
object in the current workspace at the beginning of the session, does not
appear to be preserved from session to session when I exit without saving the
workspace.

It's not clear to me what's going on, but it does seem to be innocuous.
Best, John -------------------------------- John Fox Senator William McMaster Professor of Social Statistics Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] > On Behalf Of David Winsemius > Sent: March-04-11 3:43 PM > To: Robert Kinley > Cc: r-help at r-project.org > Subject: Re: [R] Rstudio question > > > On Mar 4, 2011, at 9:05 AM, Robert Kinley wrote: > > > I really like RStudio ... > > > > ... but I wish it wouldn't automatically reload the last .RData it > > had. > > > > Anyone know how to fix this ... ? > > > > That is the default behavior of the GUIs provided for both Mac and > Windows (and I think also for an R session started simply as `R` from a > command line invocation) and in those situation, the answer is first > delete .Rdata and then do not save when quitting. > > -- > > David Winsemius, MD > Heritage Laboratories > West Hartford, CT > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. From izahn at psych.rochester.edu Fri Mar 4 22:23:26 2011 From: izahn at psych.rochester.edu (Ista Zahn) Date: Fri, 4 Mar 2011 21:23:26 +0000 Subject: [R] Lepage Test In-Reply-To: <1299268847806-3335861.post@n4.nabble.com> References: <1299268847806-3335861.post@n4.nabble.com> Message-ID: Hi Mike, RSiteSearch("Lepage") looks promising. Best, Ista On Fri, Mar 4, 2011 at 8:00 PM, mike wrote: > Hey everyone, > > I am interest in running a Lepage multi sample test and i would like to ask > if there is any code availabe for that. 
> > Thank you in advance > > -- > View this message in context: http://r.789695.n4.nabble.com/Lepage-Test-tp3335861p3335861.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Ista Zahn Graduate student University of Rochester Department of Clinical and Social Psychology http://yourpsyche.org From djmuser at gmail.com Fri Mar 4 23:11:49 2011 From: djmuser at gmail.com (Dennis Murphy) Date: Fri, 4 Mar 2011 14:11:49 -0800 Subject: [R] Lepage Test In-Reply-To: <1299268847806-3335861.post@n4.nabble.com> References: <1299268847806-3335861.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From Greg.Snow at imail.org Fri Mar 4 23:21:08 2011 From: Greg.Snow at imail.org (Greg Snow) Date: Fri, 4 Mar 2011 15:21:08 -0700 Subject: [R] R usage survey In-Reply-To: References: Message-ID: Thanks Hadley, Your response here and some I received offline made me look back at what I said and others and I could have phrased things better. My phrase of "voluntary response" was a bit vague (but is what is used in the course materials I teach from, so that is what came to mind). I specifically meant surveys without random selection where the respondents go to some effort to select themselves into the sample. I feel that this survey fits that category, though definitely not as bad as those where people have to call a phone number and pay a fee to respond. Hash's survey still looks like it is going to suffer from undercoverage and there could be serious bias from that. 
There are methods for adjusting for undercoverage, but I don't see how Harsh
will have the information needed to do those kinds of corrections (however, I
am still learning and would be interested to hear if that type of info and
those methods are available for this).

Also, looking back, I mistakenly assumed that he was planning on doing
inference, but I don't see anywhere in his posts that he stated that, so that
was my fault and I came on a bit strong based on that mistake. I apologize for
that. Offline he told me that he is planning on just doing descriptives, and as
long as he is up front about the limitations of the data and limits himself to
descriptives, then the survey could be reasonable. Statements like "at least 3
people from region X use R" don't require probability samples (just
assumptions that those 3 people were honest about being from region X);
inference about how many others in region X use R would need more/better
information.

Hopefully that clarifies my position and people can start liking me again,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111

> -----Original Message-----
> From: h.wickham at gmail.com [mailto:h.wickham at gmail.com] On Behalf Of
> Hadley Wickham
> Sent: Friday, March 04, 2011 1:28 PM
> To: Greg Snow
> Cc: Harsh; r-help at r-project.org
> Subject: Re: [R] R usage survey
>
> > Ok, I am very interested in what methods you plan to use that would
> be fit under the description "suitably analyzed" for voluntary response
> data. From my training and experience the only suitable thing to do
> with voluntary response data is to put it through the shredder, into
> the recycle bin, or use as an example of what not to do in introductory
> textbooks. Treating voluntary response data (especially given the
> responses to your post you have seen so far) as if it came from a
> proper random probability sample does not fit the idea of suitable
> analysis.
> > Come on, that's a bit strong. In real life, it's not always possible > to take a perfectly random sample and assume (at best) that missing > responses are completely at random. Even descriptive analysis on a > flawed sample is better than nothing at all. Of course you need to be > extremely careful about making inferences about the wider population, > but it's not true that the only thing you can do with survey data is > to throw it in the trash. > > Hadley > > -- > Assistant Professor / Dobelman Family Junior Chair > Department of Statistics / Rice University > http://had.co.nz/ From singhalblr at gmail.com Fri Mar 4 23:42:53 2011 From: singhalblr at gmail.com (Harsh) Date: Sat, 5 Mar 2011 04:12:53 +0530 Subject: [R] R usage survey In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From rex.dwyer at syngenta.com Sat Mar 5 00:19:49 2011 From: rex.dwyer at syngenta.com (rex.dwyer at syngenta.com) Date: Fri, 4 Mar 2011 18:19:49 -0500 Subject: [R] R usage survey In-Reply-To: References: <36180405F8418449918AD20618D110FC095BFA6545@USETCMSXMB02.NAFTA.SYNGENTA.ORG> <1BDAE2969943D540934EE8B4EF68F95FB27F0D4A95@EXNSW-MBX03.nexus.csiro.au> <36180405F8418449918AD20618D110FC095BFA7290@USETCMSXMB02.NAFTA.SYNGENTA.ORG> <4D71440C.6010306@structuremonitoring.com> <29387_1299270029_4D71498C_29387_162627_1_AANLkTin_j-n3HnO6ie7jHgjF-QKK5=AFehZWJAfgnrce@mail.gmail.com> Message-ID: <36180405F8418449918AD20618D110FC095C0036BA@USETCMSXMB02.NAFTA.SYNGENTA.ORG> Harsh, not to worry, but you were wrong to assert that I engaged in any name calling, let alone constant name calling. I also didn't and don't claim to be an authority on survey design. 
-----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Harsh Sent: Friday, March 04, 2011 4:13 PM To: Ista Zahn Cc: r-help at r-project.org Subject: Re: [R] R usage survey Rex, Please accept my apologies for my inappropriate and utterly juvenile remarks. I got carried away by what I thought was criticism and was quick to respond in a scathing manner. I do accept and apologize for my inability in understanding what was essentially being asked of me. Thanks to Ista and other members for clarifying what I failed to understand. I'm now aware that I must submit to appropriately answering questions from potential respondents of the survey. I must reiterate that "This survey is not sponsored or approved by any organization or company. The purpose of the survey is to satisfy my personal curiosity regarding R usage patterns. Results will be posted to a publicly available weblog; the data will not be used for any other purpose". (Thanks Ista for wording this out. I couldn't have done it better) Regards, Harsh Singhal http://in.linkedin.com/in/harshsinghal On Sat, Mar 5, 2011 at 2:11 AM, Ista Zahn wrote: > On Fri, Mar 4, 2011 at 3:20 PM, Harsh wrote: > > Hi Ista, Spencer and Greg, > > > The information being collected is purely out of personal interest > > and I have mentioned this earlier. > > No, I don't think you did actually. This is the key thing we wanted to > know up-front, and it's a shame that it took the better part of the > day before we finally understand why you are conducting the survey. > > There is no commercial interest involved. > > > > Is it possible that I am interested in this sort of information to > > better understand R's usage patterns ? In doing so, the survey I am > > conducting would seem an appropriate way for my requirements. > > > > And how does belittling someone on a mailing list help ? 
> > > > If anyone wants the kind of information I am collecting, are there > > suggestions of better ways of finding it besides the method that I > > have adopted ? Sure I could scrape the data of LinkedIn pages, or > > find other > ways > > of doing it, but I found this suitable. > > > > > > > > On Sat, Mar 5, 2011 at 1:27 AM, Spencer Graves > > wrote: > >> > >> Most surveys done in the US today are done during election > >> season, > to > >> determine how to package candidates to attract votes. Officials > >> elected under such circumstances spend half their time in office > >> servicing the bribes that they accepted to pay for the surveys and > >> the resulting advertising (and the other half soliciting more > >> bribes er contributions > for > >> their next campaign). The best reference on this I know is Thomas > Ferguson > >> (1995) Golden Rule (U. Chicago Pr.). It's by now somewhat old but > >> is > still > >> cited by leading researchers. > >> > >> > >> People have a right to be cautious of surveys, because too > >> rarely today are surveys used for legitimate scientific purposes. > >> Most often, > they > >> are used to defraud the public into doing things that are contrary > >> to > their > >> best interests. > >> > >> > >> Spencer Graves > >> > >> > >> On 3/4/2011 11:37 AM, Ista Zahn wrote: > >>> > >>> Now hold on a second Harsh! I was fairly neutral up to this point, > >>> but this response is totally uncalled for. The problem is that > >>> despite repeated requests you never clarified the purpose of your research! > >>> That is all you were asked to do, but rather than responding to > >>> this inquirly in a straightforward and honest manner you kept > >>> dodging the question. The most charitable explanation is that you > >>> just don't understand what information you were being asked to > >>> provide, which is frustrating but understandable; your last > >>> response on the other hand is completly out of line. 
Research > >>> participants have a right to know the purpose for which their data > >>> is being collected, and as a researcher you have a responsibility to tell them. > >>> > >>> Rex, thank you for generating this discussion. When I first say > >>> Harsh's original email I was just getting ready to fill out the > >>> survey. When I saw your response I delayed. Boy am I glad I did! > >>> > >>> Best, > >>> Ista > >>> > >>> On Fri, Mar 4, 2011 at 2:20 PM, Harsh wrote: > >>>> > >>>> Rex, > >>>> You're just paranoid and I'm in no way answerable to you. Your > constant > >>>> name > >>>> calling presupposes your own naivete. > >>>> > >>>> The survey has a disclaimer and those who wish to respond can do > >>>> so at their own discretion. > >>>> > >>>> Judging by the nature (and number) of respondents, there seem to > >>>> be a lot of highly qualified people who have no qualms about > >>>> sharing information regarding their R usage patterns. > >>>> > >>>> You can believe what you want and can continue to spin your > imaginative > >>>> tales of "industrial espionage" while assuming a position of > >>>> apparent authority on survey design, Oscar gowns and data > >>>> security AND my apparent ulterior and "outrageous" motives. > >>>> > >>>> You also seem to be an ignorant and misinformed person. Google > >>>> forms, using which the survey was created DOES NOT log IP > >>>> addresses of the respondents. > >>>> > >>>> And exactly which question in the survey would contribute to > endangering > >>>> the > >>>> professional or personal safety and security of people responding > >>>> to > the > >>>> survey. The information sought is freely available on LinkedIn. I > merely > >>>> want to get more descriptive information directly from R users. > >>>> > >>>> If you haven't looked at the Survey questions, then refrain from > making > >>>> misconstrued remarks. > >>>> > >>>> I apologize to the other users of this list for prolonging this > >>>> frivolous debate here. 
This will be my last response on the list > >>>> regarding this topic. > >>>> > >>>> > >>>> If anyone has an issue pertaining to the Survey, its outcome and > >>>> my motives, they can get in touch with me independently and off > >>>> the list. All > forms > >>>> of > >>>> constructive comments are also welcome. > >>>> > >>>> For those interested in sharing their R usage information please > >>>> visit goo.gl/jw1ig > >>>> > >>>> - H > >>>> > >>>> > >>>> > >>>> > >>>> On 04-Mar-2011 10:34 PM, wrote: > >>>>> > >>>>> You still don't say what organization you are associated with. > >>>>> Your domain > >>>> > >>>> name and e-mail address give no hint. How do we know that "Harsh > >>>> Singhal" is even a real person? An e-mail address at a university > >>>> (for example) would go a long way to establish that. Gmail > >>>> doesn't cut it for me. > >>>>> > >>>>> The preponderance of evidence is that you're just a na?ve person > >>>>> who would > >>>> > >>>> give your own information to anyone who asked. On the other hand, > >>>> it's possible that you are conducting industrial espionage by > >>>> recording IP addresses and associating "use cases" with > >>>> companies. In my opinion, > the > >>>> onus is on you to show your bona fides, and you haven't done it. > >>>>> > >>>>> That's all I have to say... 
> >>>>> > >>>>> > >>>>> From: Harsh [mailto:singhalblr at gmail.com] > >>>>> Sent: Friday, March 04, 2011 4:19 AM > >>>>> To: Bill.Venables at csiro.au > >>>>> Cc: Dwyer Rex USRE; r-help at r-project.org > >>>>> Subject: Re: [R] R usage survey > >>>>> > >>>>> The R usage survey goo.gl/jw1ig has been > updated > >>>>> with > >>>> > >>>> the following changes: > >>>>> > >>>>> Addition of - > >>>>> Disclaimer : > >>>>> This data will not be used for any commercial purposes Do not > >>>>> include any personally identifiable information > >>>>> Contact: Harsh Singhal (singhalblr AT gmail DOT com) for any > >>>>> queries > >>>>> > >>>>> Removal of - > >>>>> Name field > >>>>> > >>>>> My primary purpose in conducting this survey is - > >>>>> - Find multiple use cases for various R packages > >>>>> - Understand the nature of work when R is being used in Academia > >>>>> / > >>>> > >>>> Commercial settings > >>>>> > >>>>> - The kind of technologies that are being used in conjunction > >>>>> with R > >>>> > >>>> (popularity of usage of Python with R, and what purpose does > >>>> using Python > >>>> solve) > >>>>> > >>>>> The outcome of this analysis will be published on my blog (in > >>>>> the process > >>>> > >>>> of being created). > >>>>> > >>>>> There is absolutely no commercial purpose behind collecting this > >>>> > >>>> information and as earlier stated, this information will not be > >>>> shared with personally identifiable information. > >>>>> > >>>>> Thank you once again Mr. Dwyer and Mr. Venables for raising very > import > >>>> > >>>> questions. > >>>>> > >>>>> I thank the R users who have already filled in the survey > goo.gl/jw1ig< > >>>> > >>>> http://goo.gl/jw1ig> and request more to do so. > >>>>> > >>>>> Regards, > >>>>> Harsh Singhal > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> On Fri, Mar 4, 2011 at 7:41 AM, wrote: > >>>>> No. That's not answering the question. ALL surveys are for > >>>>> collecting > >>>> > >>>> information. 
> >>>>> > >>>>> The substantive issue is what purpose do you have in seeking > >>>>> this > >>>> > >>>> information in the first place and what are you going to do with > >>>> it > when > >>>> you > >>>> get it? > >>>>> > >>>>> Do you have some commercial purpose in mind? If so, what is it? > >>>>> > >>>>> -----Original Message----- > >>>>> From: r-help-bounces at r-project.org r-help-bounces at r-project.org> > >>>> > >>>> > >>>> [mailto:r-help-bounces at r-project.org r-help-bounces at r-project.org>] > >>>> On Behalf Of Harsh > >>>>> > >>>>> Sent: Friday, 4 March 2011 1:13 AM > >>>>> To: rex.dwyer at syngenta.com > >>>>> Cc: r-help at r-project.org > >>>>> Subject: Re: [R] R usage survey > >>>>> > >>>>> Hi Rex and useRs, > >>>>> > >>>>> The purpose of the survey has been mentioned on the survey link > >>>> > >>>> goo.gl/jw1ig > >>>>> > >>>>> but I will also reproduce it here. > >>>>> - Geographical distribution of R users > >>>>> - Application areas where R is being used > >>>>> - Supporting technology being used along with R > >>>>> - Academic background distribution of R users > >>>>> > >>>>> The potential personally identifiable information such as name > >>>>> and > >>>> > >>>> employer > >>>>> > >>>>> name are optional fields. Actually all the fields in the survey > >>>>> are optional. > >>>>> > >>>>> Some of the analysis output(s) could be along the lines of :- > >>>>> - Usage statistics of various R packages > >>>>> - Distribution of R users across countries/cities > >>>>> - Mapping various applications to packages > >>>>> - Text Mining of the responses to create informative word clouds > >>>>> > >>>>> Personally, I am excited about the kind of data I will receive > through > >>>> > >>>> this > >>>>> > >>>>> survey and the various insights that could be derived. As > >>>>> already > >>>> > >>>> mentioned, > >>>>> > >>>>> the results will be shared with the community. > >>>>> > >>>>> Thank you Rex for raising an important point. 
It is indeed > >>>>> necessary for > >>>> > >>>> me > >>>>> > >>>>> to personally assure the user community that the results will be > shared > >>>>> in > >>>> > >>>> a > >>>>> > >>>>> manner that will not contain any personally identifiable information. > >>>>> > >>>>> Those who wish to gain access to the raw data will be provided > >>>>> with > all > >>>> > >>>> the > >>>>> > >>>>> fields but not the name and employer name fields. > >>>>> > >>>>> Just out of curiosity : It is possible to get name, employer > >>>>> name, > >>>> > >>>> location, > >>>>> > >>>>> usage information and academic background details when searching > >>>>> for > R > >>>> > >>>> users > >>>>> > >>>>> on LinkedIn and the many R related groups there. > >>>>> Does this also provide potential opportunities for misuse and > >>>>> "outrageous" > >>>>> analyses, since almost anyone can get onto LinkedIn and access > >>>>> user > >>>> > >>>> profiles > >>>>> > >>>>> ? > >>>>> > >>>>> Thank you for your interest and support. > >>>>> Regards, > >>>>> Harsh > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> On Thu, Mar 3, 2011 at 8:02 PM, >>>> > >>>> rex.dwyer at syngenta.com>> wrote: > >>>>>> > >>>>>> Harsh, "Suitably analyzed" for whose purposes? One man's "suitable" > is > >>>>>> another's "outrageous". That's why people want to see the gowns > >>>>>> at > the > >>>>>> Oscars. Under what auspices are you conducting this survey? > >>>>>> What do you intend to do with it? You don't give any assurance > >>>>>> that the results you post won't have personally identifiable > >>>>>> information. I don't get the impression that you know much > >>>>>> about survey design. 
> >>>>>> > >>>>>> -----Original Message----- > >>>>>> From: > >>>>>> r-help-bounces at r-project.org >>>>>> g> > >>>> > >>>> > >>>> [mailto:r-help-bounces at r-project.org r-help-bounces at r-project.org>] > >>>>>> > >>>>>> On Behalf Of Harsh > >>>>>> Sent: Thursday, March 03, 2011 5:53 AM > >>>>>> To: r-help at r-project.org > >>>>>> Subject: [R] R usage survey > >>>>>> > >>>>>> Hi R users, > >>>>>> I request members of the R community to consider filling a > >>>>>> short survey regarding the use of R. > >>>>>> The survey can be found at http://goo.gl/jw1ig > >>>>>> > >>>>>> Please accept my apologies for posting here for a non-technical > >>>>>> reason. > >>>>>> > >>>>>> The data collected will be suitably analyzed and I'll post a > >>>>>> link to the results in the coming weeks. > >>>>>> > >>>>>> Thank you all for your interest and for sharing your R usage > >>>>>> information. > >>>>>> > >>>>>> Regards, > >>>>>> Harsh Singhal > >>>>>> > >>>>>> [[alternative HTML version deleted]] > >>>>>> > >>>>>> ______________________________________________ > >>>>>> R-help at r-project.org mailing list > >>>>>> https://stat.ethz.ch/mailman/listinfo/r-help > >>>>>> PLEASE do read the posting guide > >>>>>> http://www.R-project.org/posting-guide.html > >>>>>> and provide commented, minimal, self-contained, reproducible code. > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> message may contain confidential information. If you are not > >>>>>> the > >>>> > >>>> designated > >>>>>> > >>>>>> recipient, please notify the sender immediately, and delete the > >>>>>> original > >>>> > >>>> and > >>>>>> > >>>>>> any copies. Any use of the message by you is prohibited. 
> >>>>>> > >>>>>> > >>>>> [[alternative HTML version deleted]] > >>>>> > >>>>> ______________________________________________ > >>>>> R-help at r-project.org mailing list > >>>>> https://stat.ethz.ch/mailman/listinfo/r-help > >>>>> PLEASE do read the posting guide > >>>> > >>>> http://www.R-project.org/posting-guide.html > >>>>> > >>>>> and provide commented, minimal, self-contained, reproducible code. > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> message may contain confidential information. If you are not the > >>>> > >>>> designated recipient, please notify the sender immediately, and > >>>> delete the original and any copies. Any use of the message by you > >>>> is prohibited. > >>>> > >>>> [[alternative HTML version deleted]] > >>>> > >>>> > >>>> ______________________________________________ > >>>> R-help at r-project.org mailing list > >>>> https://stat.ethz.ch/mailman/listinfo/r-help > >>>> PLEASE do read the posting guide > >>>> http://www.R-project.org/posting-guide.html > >>>> and provide commented, minimal, self-contained, reproducible code. > >>>> > > > > > > > > -- > Ista Zahn > Graduate student > University of Rochester > Department of Clinical and Social Psychology http://yourpsyche.org > [[alternative HTML version deleted]] message may contain confidential information. If you are not the designated recipient, please notify the sender immediately, and delete the original and any copies. Any use of the message by you is prohibited. 
From singhalblr at gmail.com Sat Mar 5 00:25:35 2011
From: singhalblr at gmail.com (Harsh)
Date: Sat, 5 Mar 2011 04:55:35 +0530
Subject: [R] R usage survey
In-Reply-To: <36180405F8418449918AD20618D110FC095C0036BA@USETCMSXMB02.NAFTA.SYNGENTA.ORG>
References: <36180405F8418449918AD20618D110FC095BFA6545@USETCMSXMB02.NAFTA.SYNGENTA.ORG>
 <1BDAE2969943D540934EE8B4EF68F95FB27F0D4A95@EXNSW-MBX03.nexus.csiro.au>
 <36180405F8418449918AD20618D110FC095BFA7290@USETCMSXMB02.NAFTA.SYNGENTA.ORG>
 <4D71440C.6010306@structuremonitoring.com>
 <29387_1299270029_4D71498C_29387_162627_1_AANLkTin_j-n3HnO6ie7jHgjF-QKK5=AFehZWJAfgnrce@mail.gmail.com>
 <36180405F8418449918AD20618D110FC095C0036BA@USETCMSXMB02.NAFTA.SYNGENTA.ORG>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From jwiley.psych at gmail.com Sat Mar 5 01:29:08 2011
From: jwiley.psych at gmail.com (Joshua Wiley)
Date: Fri, 4 Mar 2011 16:29:08 -0800
Subject: [R] apply.rolling() to a multi column timeSeries
In-Reply-To: <137183.14703.qm@web27905.mail.ukl.yahoo.com>
References: <137183.14703.qm@web27905.mail.ukl.yahoo.com>
Message-ID:

Hi Will,

Time series are not my strength, but it seems like this is a case where it is
easiest to use rollapply() directly rather than the wrapper in
PerformanceAnalytics. Here is an example using one of the provided datasets.
The first element of the output list is created using regular apply() to go
through the columns and apply.rolling(); the second is made using just
rollapply().

############################
require(PerformanceAnalytics)
data(managers)

f <- function(xIn) {prod(1 + xIn)}

out <- list(
  AppRoll = apply(managers[, c(1, 3, 4), drop = FALSE], 2,
    FUN = function(x) {apply.rolling(x, FUN = f, width = 36)})[36:132, ],
  RollApp = rollapply(data = managers[, c(1, 3, 4), drop = FALSE],
    width = 36, FUN = f, align = "right"))

head(out$AppRoll)
head(out$RollApp)
############################

Note that rollapply() is in package "zoo".
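
For comparison, the same per-column rolling product can be sketched in base R alone. This is only a minimal illustration under stated assumptions: the helper name roll_compound() is hypothetical (it is not part of PerformanceAnalytics or zoo), it expects a plain numeric matrix rather than a timeSeries object, and the input below is just the first four rows of returns posted in the question.

```r
# Hypothetical helper (not from any package): rolling compounded return,
# prod(1 + r) - 1 over `width` consecutive rows, computed for every column.
roll_compound <- function(x, width) {
  x <- as.matrix(x)
  stopifnot(nrow(x) >= width)
  one_col <- function(col) {
    # embed() returns one row per window of `width` consecutive values
    windows <- embed(1 + col, width)
    # pad the first width - 1 positions, which have no complete window
    c(rep(NA_real_, width - 1), apply(windows, 1, prod) - 1)
  }
  out <- apply(x, 2, one_col)
  rownames(out) <- rownames(x)
  out
}

# First four rows of returns from the question, one column per series
ret <- matrix(c( 0.15159065,  -0.00692633,   0.06511157,  -0.04803468,
                -0.133391894, -0.004488850, -0.106763636, -0.007653477,
                 0.10602243,   0.03986648,   0.07930919,   0.09490285),
              ncol = 3,
              dimnames = list(c("1996-01-31", "1996-02-29",
                                "1996-03-29", "1996-04-30"),
                              c("MS.US", "AAPL.US", "CA.FP")))

mom <- roll_compound(ret, width = 3)
print(round(mom, 7))
```

The first two rows come out as NA because no complete 3-row window exists yet; the MS.US column should agree with the single "calcs" column reported in the question, while the other two columns are the ones apply.rolling() did not produce.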
HTH,

Josh

On Fri, Mar 4, 2011 at 7:47 AM, William Mok wrote:
> Hello there,
>
>
> I am trying to compute the 3 months return momentum with the timeSeries x.ts,
> which is just a subset of simple returns from a much bigger series,
>
>> class(x.ts)
> [1] "timeSeries"
> attr(,"package")
> [1] "timeSeries"
>
>> dim(x.ts)
> [1] 20  3
>
>> x.ts[1:8,]
> GMT
>                  MS.US      AAPL.US       CA.FP
> 1996-01-31  0.15159065 -0.133391894  0.10602243
> 1996-02-29 -0.00692633 -0.004488850  0.03986648
> 1996-03-29  0.06511157 -0.106763636  0.07930919
> 1996-04-30 -0.04803468 -0.007653477  0.09490285
> 1996-05-31  0.08715949  0.071709879  0.05126406
> 1996-06-28 -0.03586503 -0.196141479  0.01908068
> 1996-07-31 -0.10941283  0.047619048 -0.04993095
> 1996-08-30 -0.01720023  0.102363636 -0.06605725
>
> Then, I ran the following,
>
> f <- function(xIn) {prod(1 + xIn)}
> tmp.ts <- apply.rolling(x.ts[,, drop = FALSE], FUN=f, width=3)
> xMom.ts <- tmp.ts - 1
>
> where,
>
>> xMom.ts[1:8,]
> GMT
>                   calcs
> 1996-01-31           NA
> 1996-02-29           NA
> 1996-03-29  0.218076872
> 1996-04-30  0.006926330
> 1996-05-31  0.102324581
> 1996-06-28 -0.002179951
> 1996-07-31 -0.066514593
> 1996-08-30 -0.156122673
>
> It seems that apply.rolling() only executed for the first column "MS.US" but not
> column 2 nor 3.
>
> Q: Apart from looping through the column index manually via a for loop, which is
> not ideal in R, is there any other way to execute the function for every column
> in this setup?
>
> Many thx.
>
> Will
>
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Joshua Wiley
Ph.D.
Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

From ggrothendieck at gmail.com Sat Mar 5 01:43:48 2011
From: ggrothendieck at gmail.com (Gabor Grothendieck)
Date: Fri, 4 Mar 2011 19:43:48 -0500
Subject: [R] apply.rolling() to a multi column timeSeries
In-Reply-To: <137183.14703.qm@web27905.mail.ukl.yahoo.com>
References: <137183.14703.qm@web27905.mail.ukl.yahoo.com>
Message-ID:

On Fri, Mar 4, 2011 at 10:47 AM, William Mok wrote:
> Hello there,
>
>
> I am trying to compute the 3 months return momentum with the timeSeries x.ts,
> which is just a subset of simple returns from a much bigger series,
>
>> class(x.ts)
> [1] "timeSeries"
> attr(,"package")
> [1] "timeSeries"
>
>> dim(x.ts)
> [1] 20  3
>
>> x.ts[1:8,]
> GMT
>                  MS.US      AAPL.US       CA.FP
> 1996-01-31  0.15159065 -0.133391894  0.10602243
> 1996-02-29 -0.00692633 -0.004488850  0.03986648
> 1996-03-29  0.06511157 -0.106763636  0.07930919
> 1996-04-30 -0.04803468 -0.007653477  0.09490285
> 1996-05-31  0.08715949  0.071709879  0.05126406
> 1996-06-28 -0.03586503 -0.196141479  0.01908068
> 1996-07-31 -0.10941283  0.047619048 -0.04993095
> 1996-08-30 -0.01720023  0.102363636 -0.06605725
>
> Then, I ran the following,
>
> f <- function(xIn) {prod(1 + xIn)}
> tmp.ts <- apply.rolling(x.ts[,, drop = FALSE], FUN=f, width=3)
> xMom.ts <- tmp.ts - 1
>
> where,
>
>> xMom.ts[1:8,]
> GMT
>                   calcs
> 1996-01-31           NA
> 1996-02-29           NA
> 1996-03-29  0.218076872
> 1996-04-30  0.006926330
> 1996-05-31  0.102324581
> 1996-06-28 -0.002179951
> 1996-07-31 -0.066514593
> 1996-08-30 -0.156122673
>
> It seems that apply.rolling() only executed for the first column "MS.US" but not
> column 2 nor 3.
>
> Q: Apart from looping through the column index manually via a for loop, which is
> not ideal in R, is there any other way to execute the function for every column
> in this setup?
>

The rollapply function in zoo works by column by default:

Lines <- "Date MS.US AAPL.US CA.FP
1996-01-31 0.15159065 -0.133391894 0.10602243
1996-02-29 -0.00692633 -0.004488850 0.03986648
1996-03-29 0.06511157 -0.106763636 0.07930919
1996-04-30 -0.04803468 -0.007653477 0.09490285
1996-05-31 0.08715949 0.071709879 0.05126406
1996-06-28 -0.03586503 -0.196141479 0.01908068
1996-07-31 -0.10941283 0.047619048 -0.04993095
1996-08-30 -0.01720023 0.102363636 -0.06605725"

library(zoo)
z <- read.zoo(textConnection(Lines), header = TRUE)
f <- function(xIn) prod(1 + xIn)
rollapply(z, 3, f, na.pad = TRUE, align = "right")

The last line produces:

> rollapply(z, 3, f, na.pad = TRUE, align = "right")
               MS.US   AAPL.US     CA.FP
1996-01-31        NA        NA        NA
1996-02-29        NA        NA        NA
1996-03-29 1.2180769 0.7706111 1.2413304
1996-04-30 1.0069263 0.8824211 1.2288505
1996-05-31 1.1023246 0.9499636 1.2423194
1996-06-28 0.9978200 0.8549096 1.1729945
1996-07-31 0.9334854 0.9025271 1.0178307
1996-08-30 0.8438773 0.9283418 0.9042406

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

From mxkuhn at gmail.com Sat Mar 5 02:46:20 2011
From: mxkuhn at gmail.com (Max Kuhn)
Date: Fri, 4 Mar 2011 20:46:20 -0500
Subject: [R] Course: R for Predictive Modeling: A Hands-On Introduction
Message-ID:

R for Predictive Modeling: A Hands-On Introduction

Predictive Analytics World in San Francisco
Sunday March 13, 9am to 4:30pm

This one-day session provides a hands-on introduction to R, the
well-known open-source platform for data analysis. Real examples are
employed in order to methodically expose attendees to best practices
driving R and its rich set of predictive modeling packages, providing
hands-on experience and know-how. R is compared to other data analysis
platforms, and common pitfalls in using R are addressed.
From aquanyc at gmail.com Sat Mar 5 00:57:56 2011 From: aquanyc at gmail.com (rivercode) Date: Fri, 4 Mar 2011 15:57:56 -0800 (PST) Subject: [R] xts POSIXct index format Message-ID: <1299283076151-3336136.post@n4.nabble.com> Hi, I cannot figure out how to change the index format when displaying POSIXct objects. Would like the xts index to display as %H:%M:%OS3 when doing viewing the xts object. Think I am missing the obvious. Cheers, Chris -- View this message in context: http://r.789695.n4.nabble.com/xts-POSIXct-index-format-tp3336136p3336136.html Sent from the R help mailing list archive at Nabble.com. From emptican at gmail.com Fri Mar 4 22:39:40 2011 From: emptican at gmail.com (Steve Hong) Date: Fri, 4 Mar 2011 15:39:40 -0600 Subject: [R] grouping data Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From hrbuilder at hotmail.com Sat Mar 5 01:37:41 2011 From: hrbuilder at hotmail.com (Al Roark) Date: Fri, 4 Mar 2011 18:37:41 -0600 Subject: [R] Repeating the same calculation across multiple pairs of variables Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From pgauthi1 at lakeheadu.ca Sat Mar 5 00:11:26 2011 From: pgauthi1 at lakeheadu.ca (PatGauthier) Date: Fri, 4 Mar 2011 15:11:26 -0800 (PST) Subject: [R] Trimmed Spearman-Karber Message-ID: <1299280286268-3336075.post@n4.nabble.com> Hello All, I am searching for a package which allows me to run an LC50 (LD50) estimate using the trimmer spearman-karber method. Has any one developed code for this analysis? In my case, probit or logit analyses are less accurate. Please help! I will post code if I end up finding/writing it. Thank you, Pat -- View this message in context: http://r.789695.n4.nabble.com/Trimmed-Spearman-Karber-tp3336075p3336075.html Sent from the R help mailing list archive at Nabble.com. 
From sravassi at gmail.com Sat Mar 5 00:01:19 2011 From: sravassi at gmail.com (santiagorf) Date: Fri, 4 Mar 2011 15:01:19 -0800 (PST) Subject: [R] integrate a fuction Message-ID: <1299279679529-3336066.post@n4.nabble.com> I'm having a function of the form 1> f<-function(x){ 1+ 1+ return(x^p) 1+ 1+ } ,and I would like to integrate it with respect to x, where p should be any constant. One way would be to set a value for p globally and then call integrate function: p=2 integrate(f, lower = -1, upper = 1) However, I would like to use 'integrate' inside a function, so I could call it passing p as a parameter. I tried something like this: 1> p=1 1> integral<-function(p){ 1+ integrate(f, lower = -1, upper = 1) 1+ 1+ } 1> 1> integral(2) 0 with absolute error < 1.1e-14 ,but it doesn't work as the integral of f is evaluated with p=1 (the value of the global variable p) and not with the value of p=2 when the function integral is called. Does anyone knows how can I solve this problem? Thanks in advance santiagorf -- View this message in context: http://r.789695.n4.nabble.com/integrate-a-fuction-tp3336066p3336066.html Sent from the R help mailing list archive at Nabble.com. From vassalosmichael at uky.edu Sat Mar 5 02:04:41 2011 From: vassalosmichael at uky.edu (mike) Date: Fri, 4 Mar 2011 17:04:41 -0800 (PST) Subject: [R] Lepage Test In-Reply-To: References: <1299268847806-3335861.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From jwiley.psych at gmail.com Sat Mar 5 03:52:39 2011 From: jwiley.psych at gmail.com (Joshua Wiley) Date: Fri, 4 Mar 2011 18:52:39 -0800 Subject: [R] grouping data In-Reply-To: References: Message-ID: Hi Steve, Just test whether y is greater than the predicted y (i.e., your line). 
## function using the model coefficients*
f <- function(x) {82.9996 + (.5589 * x)}

## Find group membership
group <- ifelse(y > f(x), "A", "B")

*Note that depending how accurate this needs to be, you will probably
want to use the model itself rather than just reading from the
printout like I did. If you need to do that, take a look at ?predict

For future reference, it would be easier for readers if you provided
your data via something like: dput(x) that can be copied directly into
the R console. Also, if you are generating random data (rnorm()), you
can use set.seed() so that we can replicate exactly what you get.

HTH,

Josh

On Fri, Mar 4, 2011 at 1:39 PM, Steve Hong wrote:
> Hi R-list,
>
> I have a data set with plot locations and observations and want to label
> them based on locations.  For example, I have GPS information (x and y) as
> follows:

[snip]

>> (fm1 <- lm(ysim~xsim))
> Call:
> lm(formula = ysim ~ xsim)
> Coefficients:
> (Intercept)         xsim
>     82.9996       0.5589
>
> I overlapped fitted line on the plot.
>
>> abline(fm1)
> My question is:
> As you can see in the plot, how can I label (or re-group) those in upper
> diagonal as (say) 'A' and the others in lower diagonal as 'B'?
>
> Thanks a lot in advance!!!
>
> Steve
>
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

From rhelpacc at gmail.com Sat Mar 5 03:52:41 2011
From: rhelpacc at gmail.com (Robert A'gata)
Date: Fri, 4 Mar 2011 21:52:41 -0500
Subject: [R] loess function takes long to estimate
Message-ID:

Hi,

I have 2 questions regarding using loess function in stats package.

1.
I have about 1 million points. I did loess with alpha=0.5 and degree=1 and 2 regressors. It has run for more than 5 hrs on 64bit 16CPU and 64GB server (with no other process running). I am wondering if this is usual? And if there is anyway to make it run faster? 2. I noticed a difference in RAM consumption when specifying loess function differently. The first case is I ran loess(z ~ x+y,data=X,degree=1,alpha=.5) where my X has about 50 columns. The second case I trimmed X to only contain x,y and z. Then run loess(z ~ x+y, data=SmallX, degree=1, alpha=.5). I find that the second case consumes only 4% of RAM whereas the first case uses up to 33%. I'd like to know how loess handles input data under the hood? It seems to me it attaches the whole data into memory and hence results in what I observed. Is my understanding correct? Thank you in advance. Regards, Robert From jwiley.psych at gmail.com Sat Mar 5 04:00:16 2011 From: jwiley.psych at gmail.com (Joshua Wiley) Date: Fri, 4 Mar 2011 19:00:16 -0800 Subject: [R] Repeating the same calculation across multiple pairs of variables In-Reply-To: References: Message-ID: Hi Al, Assuming that the order of the matrices resulting from selecting "avars" and "bvars" is identical (it is at least in the example you gave), then you can do: dat <- data.frame(a1=1:10, a2=11:20, a3=21:30, b1=101:110, b2=111:120, b3=121:130) avars <- paste("a", 1:3, sep = '') bvars <- paste("b", 1:3, sep = '') cvars <- paste("c", 1:3, sep = '') dat[, cvars] <- dat[, avars] / dat[, bvars] If you are using character strings for the names, you need to use [ rather than $. For documentation, see ?"[" Hope this helps, Josh On Fri, Mar 4, 2011 at 4:37 PM, Al Roark wrote: > > Hi all, > > I frequently encounter datasets that require me to repeat the same calculation across many variables. 
For example, given a dataset with total employment variables and manufacturing employment variables for the years 1990-2010, I might have to calculate manufacturing's share of total employment in each year. I find it cumbersome to have to manually define a share for each year and would like to know how others might handle this kind of task. > > For example, given the data frame: > > df<-data.frame(a1=1:10, a2=11:20, a3=21:30, b1=101:110, b2=111:120, b3=121:130) > > I'd like to append new variables--c1, c2, and c3--to the data frame that are the result of a1/b1, a2/b2, and a3/b3, respectively. > > When there are only a few of these variables, I don't really have a problem, but it becomes a chore when the number of variables increases. Is there a way I can do this kind of processing using a loop? I tried defining a vector to hold the names for the "c variables" (e.g. c1,c2, ... cn) and creating new variables in a loop using code like: > > avars<-c("a1","a2","a3") > bvars<-c("b1","b2","b3") > cvars<-c("c1","c2","c3") > for(i in 1:3){ > ?df$cvars[i]<-df$avars[i]/df$bvars[i] > } > > But the variable references don't resolve properly with this particular syntax. > > Any help would be much appreciated. Cheers. > ? ? ? ?[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Joshua Wiley Ph.D. 
Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

From dwinsemius at comcast.net Sat Mar 5 04:14:47 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Fri, 4 Mar 2011 22:14:47 -0500
Subject: [R] integrate a fuction
In-Reply-To: <1299279679529-3336066.post@n4.nabble.com>
References: <1299279679529-3336066.post@n4.nabble.com>
Message-ID:

On Mar 4, 2011, at 6:01 PM, santiagorf wrote:

> I'm having a function of the form
>
> 1> f<-function(x){
> 1+
> 1+ return(x^p)
> 1+
> 1+ }
>
> ,and I would like to integrate it with respect to x, where p should be any
> constant.
>
> One way would be to set a value for p globally and then call integrate
> function:
> p=2
> integrate(f, lower = -1, upper = 1)
>
> However, I would like to use 'integrate' inside a function, so I could call
> it passing p as a parameter. I tried something like this:
>
> 1> p=1
> 1> integral<-function(p){
> 1+ integrate(f, lower = -1, upper = 1)
> 1+
> 1+ }
> 1>
> 1> integral(2)
> 0 with absolute error < 1.1e-14
>
> ,but it doesn't work as the integral of f is evaluated with p=1 (the value
> of the global variable p) and not with the value of p=2 when the function
> integral is called.

Functions carry with them the environments in which they are defined.

>
> Does anyone knows how can I solve this problem?

Build `f` to have two parameters.

-- 
David

> Thanks in advance
> santiagorf
>
>
>
> --
> View this message in context: http://r.789695.n4.nabble.com/integrate-a-fuction-tp3336066p3336066.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
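
To make the two-parameter suggestion concrete, here is a minimal sketch. It relies only on base R's integrate(), which forwards extra named arguments to the integrand. The wrapper name integral is taken from the question; the rest is illustrative and is one possible way to write it, not necessarily what the reply had in mind.

```r
## Integrand with p as an explicit second argument, so it no longer
## looks up whatever global `p` happens to exist.
f <- function(x, p) x^p

## Wrapper: integrate() passes the named argument `p` through to f.
integral <- function(p) {
  integrate(f, lower = -1, upper = 1, p = p)$value
}

p <- 1                 # a global p no longer affects the result
print(integral(2))     # numerical value of the integral of x^2 on [-1, 1]
```

Because p is now an argument of f rather than a free variable, the value used is the one handed to integral(), whatever is defined in the global workspace.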
David Winsemius, MD Heritage Laboratories West Hartford, CT From stgries at gmail.com Sat Mar 5 05:15:10 2011 From: stgries at gmail.com (Stefan Th. Gries) Date: Fri, 4 Mar 2011 20:15:10 -0800 Subject: [R] pvclust crashing R on Ubuntu 10.10 Message-ID: Hi all I am writing to you with a question regarding the pvclust package. And yes, before the usual people produce their usual contact-the-package-maintainers line, ye, I tried that but the emails one can find on the web either bounce or are not responded to. Also, yes, this error has already been reported as a bug but been shot down as not reproducible (). Thus, here's now the version that I can reproduce. When I run this: ### library(pvclust); data(iris) x <- pvclust(iris[,1:4], method.dist="canberra", method.hclust="ward", nboot=100) plot(x) ### I get this: ### address 0x7fffa22b5000, cause 'memory not mapped' ### Here is some info that might help: ### Ubuntu 10.10 Linux kernel: 2.6.35-27 generic > R.version platform x86_64-pc-linux-gnu arch x86_64 os linux-gnu system x86_64, linux-gnu status major 2 minor 12.2 year 2011 month 02 day 25 svn rev 54585 language R version.string R version 2.12.2 (2011-02-25) ### Paradoxically enough, R for Windows installed on this machine (running with Wine) causes no problems, but on Linux ... Any ideas what that could be? Thanks a lot, STG -- Stefan Th. Gries ----------------------------------------------- University of California, Santa Barbara http://www.linguistics.ucsb.edu/faculty/stgries From erinm.hodgess at gmail.com Sat Mar 5 05:34:50 2011 From: erinm.hodgess at gmail.com (Erin Hodgess) Date: Fri, 4 Mar 2011 22:34:50 -0600 Subject: [R] generating factors from the edit function Message-ID: Dear R People: If I use the fix or edit function for a new data frame, I would like to have my character data as factors. Is there a "built in" way to do this, please? 
Here is what I did: > test2.df <- data.frame() > test2.df <- fix(test2.df,factor.mode="character") > str(test2.df) 'data.frame': 5 obs. of 2 variables: $ var1: num 1 2 3 4 5 $ var2: chr "a" "a" "g" "g" ... > The character data is simply that. I would like create factors when I enter character data. Thank you! Sincerely, Erin -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodgess at gmail.com From izahn at psych.rochester.edu Sat Mar 5 05:42:21 2011 From: izahn at psych.rochester.edu (Ista Zahn) Date: Sat, 5 Mar 2011 04:42:21 +0000 Subject: [R] generating factors from the edit function In-Reply-To: References: Message-ID: Hi Erin, I would set up the data.frame the way you want it before calling fix(). Something like test2df <- data.frame(v1=numeric(), v2=factor()) test2df <- fix(test2df) Best, Ista On Sat, Mar 5, 2011 at 4:34 AM, Erin Hodgess wrote: > Dear R People: > > If I use the fix or edit function for a new data frame, I would like > to have my character data as factors. > > Is there a "built in" way to do this, please? > > Here is what I did: > >> test2.df <- data.frame() >> test2.df <- fix(test2.df,factor.mode="character") >> str(test2.df) > 'data.frame': ? 5 obs. of ?2 variables: > ?$ var1: num ?1 2 3 4 5 > ?$ var2: chr ?"a" "a" "g" "g" ... >> > The character data is simply that. ?I would like create factors when I > enter character data. > > Thank you! > > Sincerely, > > Erin > > > -- > Erin Hodgess > Associate Professor > Department of Computer and Mathematical Sciences > University of Houston - Downtown > mailto: erinm.hodgess at gmail.com > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
> -- Ista Zahn Graduate student University of Rochester Department of Clinical and Social Psychology http://yourpsyche.org From erinm.hodgess at gmail.com Sat Mar 5 05:45:02 2011 From: erinm.hodgess at gmail.com (Erin Hodgess) Date: Fri, 4 Mar 2011 22:45:02 -0600 Subject: [R] generating factors from the edit function In-Reply-To: References: Message-ID: Here is a little function that I put together: > fact1 function(x) { n <- ncol(x) for(i in 1:n) { if(mode(x[,i])=="character")x[,i] <- factor(x[,i]) } return(x) } > It does the trick. I'm sure that there are better ways, but this seems ok. Thank you!!! Sincerely, Erin On Fri, Mar 4, 2011 at 10:42 PM, Ista Zahn wrote: > Hi Erin, > I would set up the data.frame the way you want it before calling > fix(). Something like > > test2df <- data.frame(v1=numeric(), v2=factor()) > test2df <- fix(test2df) > > Best, > Ista > > On Sat, Mar 5, 2011 at 4:34 AM, Erin Hodgess wrote: >> Dear R People: >> >> If I use the fix or edit function for a new data frame, I would like >> to have my character data as factors. >> >> Is there a "built in" way to do this, please? >> >> Here is what I did: >> >>> test2.df <- data.frame() >>> test2.df <- fix(test2.df,factor.mode="character") >>> str(test2.df) >> 'data.frame': ? 5 obs. of ?2 variables: >> ?$ var1: num ?1 2 3 4 5 >> ?$ var2: chr ?"a" "a" "g" "g" ... >>> >> The character data is simply that. ?I would like create factors when I >> enter character data. >> >> Thank you! >> >> Sincerely, >> >> Erin >> >> >> -- >> Erin Hodgess >> Associate Professor >> Department of Computer and Mathematical Sciences >> University of Houston - Downtown >> mailto: erinm.hodgess at gmail.com >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. 
>> > > > > -- > Ista Zahn > Graduate student > University of Rochester > Department of Clinical and Social Psychology > http://yourpsyche.org > -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodgess at gmail.com From djmuser at gmail.com Sat Mar 5 06:33:28 2011 From: djmuser at gmail.com (Dennis Murphy) Date: Fri, 4 Mar 2011 21:33:28 -0800 Subject: [R] Repeating the same calculation across multiple pairs of variables In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From jwiley.psych at gmail.com Sat Mar 5 07:25:33 2011 From: jwiley.psych at gmail.com (Joshua Wiley) Date: Fri, 4 Mar 2011 22:25:33 -0800 Subject: [R] generating factors from the edit function In-Reply-To: References: Message-ID: Hi Erin, I am assuming this is for pedagogical purposes (i.e., make R less intimidating by not reading data in). If that is true, you may just want to automate the whole process. I am typically twitchy about using assign(), but fix does it and again if only for introducing students to R..... ## mostly automated, still needs assignment fact1 <- function() { dat <- data.frame() dat <- edit(dat) as.data.frame(as.list(dat)) } test2.df <- fact1() ## assigns to users' workspace too sfix <- function(x) { dat <- data.frame() dat <- edit(dat) assign(x, as.data.frame(as.list(dat)), envir = .GlobalEnv) } sfix("test3.df") Cheers, Josh On Fri, Mar 4, 2011 at 8:45 PM, Erin Hodgess wrote: > Here is a little function that I put together: > >> fact1 > function(x) { > n <- ncol(x) > for(i in 1:n) { > if(mode(x[,i])=="character")x[,i] <- factor(x[,i]) > } > return(x) > } >> > > It does the trick. ?I'm sure that there are better ways, but this seems ok. > > Thank you!!! > > Sincerely, > Erin > > > On Fri, Mar 4, 2011 at 10:42 PM, Ista Zahn wrote: >> Hi Erin, >> I would set up the data.frame the way you want it before calling >> fix(). 
Something like >> >> test2df <- data.frame(v1=numeric(), v2=factor()) >> test2df <- fix(test2df) >> >> Best, >> Ista >> >> On Sat, Mar 5, 2011 at 4:34 AM, Erin Hodgess wrote: >>> Dear R People: >>> >>> If I use the fix or edit function for a new data frame, I would like >>> to have my character data as factors. >>> >>> Is there a "built in" way to do this, please? >>> >>> Here is what I did: >>> >>>> test2.df <- data.frame() >>>> test2.df <- fix(test2.df,factor.mode="character") >>>> str(test2.df) >>> 'data.frame': ? 5 obs. of ?2 variables: >>> ?$ var1: num ?1 2 3 4 5 >>> ?$ var2: chr ?"a" "a" "g" "g" ... >>>> >>> The character data is simply that. ?I would like create factors when I >>> enter character data. >>> >>> Thank you! >>> >>> Sincerely, >>> >>> Erin >>> >>> >>> -- >>> Erin Hodgess >>> Associate Professor >>> Department of Computer and Mathematical Sciences >>> University of Houston - Downtown >>> mailto: erinm.hodgess at gmail.com >>> >>> ______________________________________________ >>> R-help at r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >>> >> >> >> >> -- >> Ista Zahn >> Graduate student >> University of Rochester >> Department of Clinical and Social Psychology >> http://yourpsyche.org >> > > > > -- > Erin Hodgess > Associate Professor > Department of Computer and Mathematical Sciences > University of Houston - Downtown > mailto: erinm.hodgess at gmail.com > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Joshua Wiley Ph.D. 
Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ From dimitrij.kudriavcev at ntsg.lt Sat Mar 5 06:02:17 2011 From: dimitrij.kudriavcev at ntsg.lt (Dmitrij Kudriavcev) Date: Sat, 5 Mar 2011 16:02:17 +1100 Subject: [R] How to copy data from data.frame to matrix In-Reply-To: <4D70FB0E.7070305@statistik.tu-dortmund.de> References: <4D70BEBA.7030402@uni-hamburg.de> <4D70FB0E.7070305@statistik.tu-dortmund.de> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From john.bullock at yale.edu Sat Mar 5 06:28:40 2011 From: john.bullock at yale.edu (John G. Bullock) Date: Sat, 05 Mar 2011 00:28:40 -0500 Subject: [R] lattice: drawing strips for single-panel plots Message-ID: <4D71CA08.5010309@yale.edu> The strip argument to panel.xyplot seems to be ignored for single-panel plots. Here is an example: data(Chem97, package = "mlmRev") myStrip <- function(...) { ltext(.5, .5, 'strip text') } densityplot(~ gcsescore, data = Chem97, strip=myStrip) The figure is printed with no strip. The strip.default documentation suggests that Deepayan intended this behavior. Still, it would help to be able to use the strip argument for single-panel plots. Is there a simple way to do this? Thank you, John From dieter.menne at menne-biomed.de Sat Mar 5 10:48:13 2011 From: dieter.menne at menne-biomed.de (Dieter Menne) Date: Sat, 5 Mar 2011 01:48:13 -0800 (PST) Subject: [R] lattice: drawing strips for single-panel plots In-Reply-To: <4D71CA08.5010309@yale.edu> References: <4D71CA08.5010309@yale.edu> Message-ID: <1299318493316-3336450.post@n4.nabble.com> John G. Bullock-2 wrote: > > The strip argument to panel.xyplot seems to be ignored for single-panel > plots. Here is an example: > > data(Chem97, package = "mlmRev") > myStrip <- function(...) { ltext(.5, .5, 'strip text') } > densityplot(~ gcsescore, data = Chem97, strip=myStrip) > > The figure is printed with no strip. 
> The workaround is to give it a one-level factor: library(lattice) data(Chem97, package = "mlmRev") Chem97$what = as.factor("strip text") densityplot(~ gcsescore|what, data = Chem97) -- View this message in context: http://r.789695.n4.nabble.com/lattice-drawing-strips-for-single-panel-plots-tp3336364p3336450.html Sent from the R help mailing list archive at Nabble.com. From dieter.menne at menne-biomed.de Sat Mar 5 10:56:44 2011 From: dieter.menne at menne-biomed.de (Dieter Menne) Date: Sat, 5 Mar 2011 01:56:44 -0800 (PST) Subject: [R] generating factors from the edit function In-Reply-To: References: Message-ID: <1299319004359-3336455.post@n4.nabble.com> Erin Hodgess-2 wrote: > > Here is a little function that I put together: > >> fact1 > function(x) { > n <- ncol(x) > for(i in 1:n) { > if(mode(x[,i])=="character")x[,i] <- factor(x[,i]) > } > return(x) > } >> > > See http://www.mail-archive.com/r-help at stat.math.ethz.ch/msg22459.html for a more R-ish approach. I find the d[] <- an FFQ (frequently forgotten solution) Dieter -- View this message in context: http://r.789695.n4.nabble.com/generating-factors-from-the-edit-function-tp3336273p3336455.html Sent from the R help mailing list archive at Nabble.com. From mails4me at gmx.at Sat Mar 5 13:06:22 2011 From: mails4me at gmx.at (Marcel J.) Date: Sat, 05 Mar 2011 13:06:22 +0100 Subject: [R] Change panel background color in spplot() Message-ID: <4D72273E.60704@gmx.at> Hi! How does one change the background color of the map-panel in spplot()? Example: library(sp) data(meuse.grid) gridded(meuse.grid) = ~x+y spplot(meuse.grid, "part.a") How would I get another background-color for the map-panel (but not for the whole plot) here? Thank you! 
Marcel From izahn at psych.rochester.edu Sat Mar 5 15:07:04 2011 From: izahn at psych.rochester.edu (Ista Zahn) Date: Sat, 5 Mar 2011 14:07:04 +0000 Subject: [R] pvclust crashing R on Ubuntu 10.10 In-Reply-To: References: Message-ID: I confirm this bug exists and is 100% replicable on R version 2.12.2 (2011-02-25) Platform: i686-pc-linux-gnu (32-bit) locale: [1] LC_CTYPE=en_US.utf8 LC_NUMERIC=C [3] LC_TIME=en_US.utf8 LC_COLLATE=en_US.utf8 [5] LC_MONETARY=C LC_MESSAGES=en_US.utf8 [7] LC_PAPER=en_US.utf8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] pvclust_1.2-1 loaded via a namespace (and not attached): [1] tools_2.12.2 interestingly, plot(x$hclust) works fine, which is strange because this is the main function called by plot.pvclust (everything else just looks like setting up labels and such). HTH, Ista On Sat, Mar 5, 2011 at 4:15 AM, Stefan Th. Gries wrote: > Hi all > > I am writing to you with a question regarding the pvclust package. And > yes, before the usual people produce their usual > contact-the-package-maintainers line, ye, I tried that but the emails > one can find on the web either bounce or are not responded to. Also, > yes, this error has already been reported as a bug but been shot down > as not reproducible > (). Thus, > here's now the version that I can reproduce. 
When I run this: > > ### > library(pvclust); data(iris) > x <- pvclust(iris[,1:4], method.dist="canberra", method.hclust="ward", > nboot=100) > plot(x) > ### > > I get this: > > ### > address 0x7fffa22b5000, cause 'memory not mapped' > ### > > Here is some info that might help: > > ### > Ubuntu 10.10 > Linux kernel: 2.6.35-27 generic > >> R.version > platform x86_64-pc-linux-gnu > arch x86_64 > os linux-gnu > system x86_64, linux-gnu > status > major 2 > minor 12.2 > year 2011 > month 02 > day 25 > svn rev 54585 > language R > version.string R version 2.12.2 (2011-02-25) > ### > > Paradoxically enough, R for Windows installed on this machine (running > with Wine) causes no problems, but on Linux ... Any ideas what that > could be? > > Thanks a lot, > STG > -- > Stefan Th. Gries > ----------------------------------------------- > University of California, Santa Barbara > http://www.linguistics.ucsb.edu/faculty/stgries > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Ista Zahn Graduate student University of Rochester Department of Clinical and Social Psychology http://yourpsyche.org From bt_jannis at yahoo.de Sat Mar 5 15:23:15 2011 From: bt_jannis at yahoo.de (Jannis) Date: Sat, 05 Mar 2011 15:23:15 +0100 Subject: [R] Change panel background color in spplot() In-Reply-To: <4D72273E.60704@gmx.at> References: <4D72273E.60704@gmx.at> Message-ID: <4D724753.6070309@yahoo.de> I would guess it works the same as for standard trellis graphs. Googleing: trellis R change panel background gives you links to some discussions about these issues. HTH Jannis On 03/05/2011 01:06 PM, Marcel J. wrote: > Hi! > > How does one change the background color of the map-panel in spplot()? 
> > Example: > > library(sp) > > data(meuse.grid) > gridded(meuse.grid) = ~x+y > > spplot(meuse.grid, "part.a") > > How would I get another background-color for the map-panel (but not > for the whole plot) here? > > Thank you! > > Marcel > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From bt_jannis at yahoo.de Sat Mar 5 15:28:46 2011 From: bt_jannis at yahoo.de (Jannis) Date: Sat, 05 Mar 2011 15:28:46 +0100 Subject: [R] retrieve x y coordinates of points in current plot In-Reply-To: <1299261317820-3335692.post@n4.nabble.com> References: <4D71204E.2030109@yahoo.de> <1299261317820-3335692.post@n4.nabble.com> Message-ID: <4D72489E.8050507@yahoo.de> Thanks for your replies, Dieter and Richard. I am aware of these two functions. I wanted, however, to write a similar function to identify by selecting large clouds of points. For this I would need to rertrieve their coordinates after the plot was called and created. As identify() is able to do this, I was wondering whether my function could do this as well in a similar manner. Until now, I have to supply the x and y coordinates to my function as well. I would like to avoid this. Jannis On 03/04/2011 06:55 PM, Dieter Menne wrote: > jannis-2 wrote: >> >> is it somehow possible to retrieve the x and y coordinates of points in >> a scatterplot after it has been plotted? identify() somehow seems to >> manage this, so I was wondering whether it is possible? >> > locator might be the more basic function you are looking for. > > Dieter > > > -- > View this message in context: http://r.789695.n4.nabble.com/retrieve-x-y-coordinates-of-points-in-current-plot-tp3335642p3335692.html > Sent from the R help mailing list archive at Nabble.com. 
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

From jrkrideau at yahoo.ca Sat Mar 5 16:04:06 2011
From: jrkrideau at yahoo.ca (John Kane)
Date: Sat, 5 Mar 2011 07:04:06 -0800 (PST)
Subject: [R] Contingency table in R
In-Reply-To:
Message-ID: <567999.85159.qm@web38407.mail.mud.yahoo.com>

Did you get a reply on this?

We really need to see your code and some sample data. Have a look at
?dput to supply some data.

At the moment it looks like R is not recognizing the data set. If ls()
lists the dataset, are you using mydata$excat?

--- On Wed, 3/2/11, Laura Clasemann wrote:

> From: Laura Clasemann
> Subject: [R] Contingency table in R
> To: r-help at r-project.org
> Received: Wednesday, March 2, 2011, 9:13 AM
>
> Hi,
>
> I have a table in R with data I needed and need to create a
> contingency table out of it. The table I have so far looks
> like this:
>
>
>            Binger
> DietType     No Yes
>   Dangerous  15  12
>   Healthy    52   9
>   None      134  24
>   Unhealthy  72  23
>
> These are the error messages that I keep getting whenever I
> try to get a contingency table. I'm not sure why it won't
> work for me, any help would be appreciated!
>
> nametable<-table(excat,recat)
> Error in table(excat, recat) : object 'excat' not found
>
>     [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

From mails4me at gmx.at Sat Mar 5 16:22:52 2011
From: mails4me at gmx.at (Marcel J.)
Date: Sat, 05 Mar 2011 16:22:52 +0100 Subject: [R] Change panel background color in spplot() In-Reply-To: <4D724753.6070309@yahoo.de> References: <4D72273E.60704@gmx.at> <4D724753.6070309@yahoo.de> Message-ID: <4D72554C.2060601@gmx.at> Thank you, Jannis! I came as far as that: library(sp) data(meuse.grid) gridded(meuse.grid) = ~x+y spplot(meuse.grid, zcol = "part.a", sp.layout= list("panel.fill", "grey")) but here not only the background is grey. Instead the whole panel turns grey... Help would be appreciated! Thank you, Marcel Am 2011-03-05 15:23, schrieb Jannis: > I would guess it works the same as for standard trellis graphs. > Googleing: > > trellis R change panel background > > gives you links to some discussions about these issues. > > > > > HTH > Jannis > > > On 03/05/2011 01:06 PM, Marcel J. wrote: >> Hi! >> >> How does one change the background color of the map-panel in spplot()? >> >> Example: >> >> library(sp) >> >> data(meuse.grid) >> gridded(meuse.grid) = ~x+y >> >> spplot(meuse.grid, "part.a") >> >> How would I get another background-color for the map-panel (but not >> for the whole plot) here? >> >> Thank you! >> >> Marcel >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > From john.bullock at yale.edu Sat Mar 5 13:59:43 2011 From: john.bullock at yale.edu (John G. Bullock) Date: Sat, 05 Mar 2011 07:59:43 -0500 Subject: [R] lattice: drawing strips for single-panel plots In-Reply-To: <4D71CA08.5010309@yale.edu> References: <4D71CA08.5010309@yale.edu> Message-ID: <4D7233BF.30502@yale.edu> > >> The strip argument to panel.xyplot seems to be ignored for >> single-panel plots. 
> The workaround is to give it a one-level factor:
>
> library(lattice)
> data(Chem97, package = "mlmRev")
> Chem97$what = as.factor("strip text")
> densityplot(~ gcsescore|what, data = Chem97)

Thank you. That works.

John

From nafsar at alfaisal.edu Sat Mar 5 12:56:24 2011
From: nafsar at alfaisal.edu (Nasir Afsar)
Date: Sat, 5 Mar 2011 14:56:24 +0300
Subject: [R] R Statistical Package Installation
Message-ID: <2B94B173C8F2BD459043153FF8529F322823E3AA8B@AUSV-MBX.alfaisal.edu>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From sravassi at gmail.com Sat Mar 5 15:51:00 2011
From: sravassi at gmail.com (santiagorf)
Date: Sat, 5 Mar 2011 06:51:00 -0800 (PST)
Subject: [R] integrate one single variable functions with constant parameters
In-Reply-To: <1299279679529-3336066.post@n4.nabble.com>
References: <1299279679529-3336066.post@n4.nabble.com>
Message-ID: <1299336660386-3336693.post@n4.nabble.com>

If f is a two-parameter function, how can I integrate it with respect to p?

--
View this message in context: http://r.789695.n4.nabble.com/integrate-one-single-variable-functions-with-constant-parameters-tp3336066p3336693.html
Sent from the R help mailing list archive at Nabble.com.

From sravassi at gmail.com Sat Mar 5 15:57:31 2011
From: sravassi at gmail.com (santiagorf)
Date: Sat, 5 Mar 2011 06:57:31 -0800 (PST)
Subject: [R] integrate one single variable functions with constant parameters
In-Reply-To: <1299336660386-3336693.post@n4.nabble.com>
References: <1299279679529-3336066.post@n4.nabble.com> <1299336660386-3336693.post@n4.nabble.com>
Message-ID: <1299337051022-3336702.post@n4.nabble.com>

I received the solution...
Hi:

This is what David means:

f <- function(x, p) x^p
integrate(f, lower = -1, upper = 1, p = 2)
0.6666667 with absolute error < 7.4e-15
integrate(f, lower = -1, upper = 1, p = 3)
0 with absolute error < 5.6e-15 # this is correct

--
View this message in context: http://r.789695.n4.nabble.com/integrate-one-single-variable-functions-with-constant-parameters-tp3336066p3336702.html
Sent from the R help mailing list archive at Nabble.com.

From statconsult90 at gmail.com Sat Mar 5 13:38:18 2011
From: statconsult90 at gmail.com (Stat Consult)
Date: Sat, 5 Mar 2011 16:08:18 +0330
Subject: [R] subscript out of bounds
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From statconsult90 at gmail.com Sat Mar 5 13:41:40 2011
From: statconsult90 at gmail.com (Stat Consult)
Date: Sat, 5 Mar 2011 16:11:40 +0330
Subject: [R] subscript out of bounds
In-Reply-To:
References:
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From statconsult90 at gmail.com Sat Mar 5 13:44:20 2011
From: statconsult90 at gmail.com (Stat Consult)
Date: Sat, 5 Mar 2011 16:14:20 +0330
Subject: [R] out of bound
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From dwinsemius at comcast.net Sat Mar 5 16:36:21 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Sat, 5 Mar 2011 10:36:21 -0500
Subject: [R] subscript out of bounds
In-Reply-To:
References:
Message-ID:

On Mar 5, 2011, at 7:41 AM, Stat Consult wrote:

> On Sat, Mar 5, 2011 at 4:08 PM, Stat Consult wrote:
>
>> Dear ALL
>>
>> I cannot run this line
>>
>> stat.obs <- apply (GS, 2, function(z) Hott2(t(DATA[which(z==1),]), cl))
>>
>> Error in Hott2 (t(DATA[which(z == 1), ]), cl) : subscript out of bounds
>>
>> I will be glad if you guide me.
>>
>>
>> *******************************************************************************
>>
>> *GS is a matrix 1857*200
>>
>> *DATA is a matrix 1857*79
>>
>> *cl <- as.factor(y)
>>
>> *y <- c(1,0,1,0,1,0,0,0,0,0,0,0,1,1,0,0,0,1,0,1,0,0,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,1,1,0,0,1,1,0,0,1,1,0,0,0,1,0,1,0,
>>         0,1,1,1,0,1,1,1,1,0,1,0,0,0,1,0,1,1,1,1,0,1,0,1,1)
>>
>> *Hott2 <- function(x, y, var.equal=TRUE)

What were you intending that Hott2 return? At the moment I see no function body.

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT

From dwinsemius at comcast.net Sat Mar 5 16:40:22 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Sat, 5 Mar 2011 10:40:22 -0500
Subject: [R] R Statistical Package Installation
In-Reply-To: <2B94B173C8F2BD459043153FF8529F322823E3AA8B@AUSV-MBX.alfaisal.edu>
References: <2B94B173C8F2BD459043153FF8529F322823E3AA8B@AUSV-MBX.alfaisal.edu>
Message-ID: <8AF7C9C5-1537-4A33-9BD3-8F5CE4ADB4EA@comcast.net>

On Mar 5, 2011, at 6:56 AM, Nasir Afsar wrote:

> Dear R-project team,
> I have tried but could not install the R statistical package
> (http://cran.ms.unimelb.edu.au/) even after the help of my institute's
> IT personnel. The setup file could not be downloaded. The latest file
> R-2.12.2.tar.gz does not start installation wizard. Kindly extend the
> technical support.

That is a source file. Are you properly prepared for a source installation on whatever OS (which I do not see mentioned) you might be targeting?

(I have trimmed my reply to go only to r-help. It is considered impolite to send to both r-help and r-devel for a basic question like this.)

>
> Best regards.
>
> Dr. Nasir Ali Afsar MBBS, M.Phil.
> Senior Lecturer in Pharmacology,
> College of Medicine,

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT

From ronggui.huang at gmail.com Sat Mar 5 16:36:52 2011
From: ronggui.huang at gmail.com (Wincent)
Date: Sat, 5 Mar 2011 23:36:52 +0800
Subject: [R] R Statistical Package Installation
In-Reply-To: <2B94B173C8F2BD459043153FF8529F322823E3AA8B@AUSV-MBX.alfaisal.edu>
References: <2B94B173C8F2BD459043153FF8529F322823E3AA8B@AUSV-MBX.alfaisal.edu>
Message-ID:

I guess your OS is Windows. In that case, you need to download the R binaries for Windows (e.g. http://ftp.ctex.org/mirrors/CRAN/bin/windows/) rather than the source tarball.

Regards,
Ronggui

On 5 March 2011 19:56, Nasir Afsar wrote:
> Dear R-project team,
> I have tried but could not install the R statistical package (http://cran.ms.unimelb.edu.au/ ) even after the help of my institute's IT personnel. The setup file could not be downloaded. The latest file R-2.12.2.tar.gz does not start installation wizard. Kindly extend the technical support.
>
> Best regards.
>
> Dr. Nasir Ali Afsar MBBS, M.Phil.
> Senior Lecturer in Pharmacology,
> College of Medicine,
> Alfaisal University,
> Al-Takhassusi Road,
> P.O. Box 50927,
> Riyadh-11533, Saudi Arabia.
> Tel: +966-1-2157679.
> nafsar at alfaisal.edu
> drnasirpk at yahoo.com
> www.alfaisal.edu
>
>
>
> ________________________________
> DISCLAIMER: This electronic mail transmission contains c...{{dropped:16}}
>

--
Wincent Ronggui HUANG
Sociology Department of Fudan University
PhD of City University of Hong Kong
http://asrr.r-forge.r-project.org/rghuang.html

From marc_schwartz at me.com Sat Mar 5 16:43:38 2011
From: marc_schwartz at me.com (Marc Schwartz)
Date: Sat, 05 Mar 2011 09:43:38 -0600
Subject: [R] R Statistical Package Installation
In-Reply-To: <2B94B173C8F2BD459043153FF8529F322823E3AA8B@AUSV-MBX.alfaisal.edu>
References: <2B94B173C8F2BD459043153FF8529F322823E3AA8B@AUSV-MBX.alfaisal.edu>
Message-ID:

On Mar 5, 2011, at 5:56 AM, Nasir Afsar wrote:

> Dear R-project team,
> I have tried but could not install the R statistical package (http://cran.ms.unimelb.edu.au/ ) even after the help of my institute's IT personnel. The setup file could not be downloaded. The latest file R-2.12.2.tar.gz does not start installation wizard. Kindly extend the technical support.

Nasir,

You are trying to install and run a compressed archive that contains the SOURCE CODE for R, not an executable version of the application.

You need to download the binary executable version of R for installation. You don't indicate your operating system, but go here:

http://cran.ms.unimelb.edu.au/

and look at the section entitled:

Download and Install R

where binaries for Windows and OSX are available by following the appropriate link.

If by chance you are on a Linux variant, there are binaries available for selected distributions, but we would need more information to guide you further.

HTH,

Marc Schwartz

P.S. I dropped the R-devel e-mail cc:. Please only send queries to one list.
See the Posting Guide: http://www.R-project.org/posting-guide.html

From dwinsemius at comcast.net Sat Mar 5 17:22:46 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Sat, 5 Mar 2011 11:22:46 -0500
Subject: [R] xts POSIXct index format
In-Reply-To: <1299283076151-3336136.post@n4.nabble.com>
References: <1299283076151-3336136.post@n4.nabble.com>
Message-ID: <186BD096-6149-4539-92A6-25A092115122@comcast.net>

On Mar 4, 2011, at 6:57 PM, rivercode wrote:

> Hi,
>
> I cannot figure out how to change the index format when displaying
> POSIXct objects.
>
> Would like the xts index to display as %H:%M:%OS3 when viewing the
> xts object.

If you are not satisfied with the default format, you have (at least) two options:
a) specify a format when you call print
b) redefine the default.

The default is determined within the print.xts function by a call to indexFormat

 > indexFormat
function (x)
{
    attr(x, ".indexFORMAT")
}

So it looks as though you would need to either replace indexFormat or change the attribute of your xts objects. Read further at:

?indexFormat

>
> Think I am missing the obvious.

If you mean looking at the source code, then perhaps you were missing the obvious, or at least merely readily accessible. I don't actually see the option of including a format string in the attribute, so modifying the function may be needed.

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT

From josh.m.ulrich at gmail.com Sat Mar 5 17:28:13 2011
From: josh.m.ulrich at gmail.com (Joshua Ulrich)
Date: Sat, 5 Mar 2011 10:28:13 -0600
Subject: [R] xts POSIXct index format
In-Reply-To: <1299283076151-3336136.post@n4.nabble.com>
References: <1299283076151-3336136.post@n4.nabble.com>
Message-ID:

Hi Chris,

Perhaps something like this?
require(xts)
ds <- options(digits.secs=6) # so we can see sub-seconds
x <- xts(1:10, as.POSIXct("2011-01-21") + c(1,1,1,2:8)/1e3)
x
indexFormat(x) <- "%H:%M:%OS3"
x

Hope that helps,
--
Joshua Ulrich | FOSS Trading: www.fosstrading.com

On Fri, Mar 4, 2011 at 5:57 PM, rivercode wrote:
> Hi,
>
> I cannot figure out how to change the index format when displaying POSIXct
> objects.
>
> Would like the xts index to display as %H:%M:%OS3 when viewing the xts
> object.
>
> Think I am missing the obvious.
>
> Cheers,
> Chris
>
> --
> View this message in context: http://r.789695.n4.nabble.com/xts-POSIXct-index-format-tp3336136p3336136.html
> Sent from the R help mailing list archive at Nabble.com.
>

From jhnedwards603 at gmail.com Sat Mar 5 17:36:48 2011
From: jhnedwards603 at gmail.com (John Edwards)
Date: Sat, 5 Mar 2011 10:36:48 -0600
Subject: [R] How to show non user defined data set such as cu.summary (from rpart)?
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From izahn at psych.rochester.edu Sat Mar 5 17:50:26 2011
From: izahn at psych.rochester.edu (Ista Zahn)
Date: Sat, 5 Mar 2011 16:50:26 +0000
Subject: [R] How to show non user defined data set such as cu.summary (from rpart)?
In-Reply-To:
References:
Message-ID:

Hi John,
Not sure how to do it with ls(), but easy enough with data():

data(package=library(base))

You'll need to consult both ?data and ?library to see why this works. library(base) could just as easily be library(stats) or whatever.

Best,
Ista

On Sat, Mar 5, 2011 at 4:36 PM, John Edwards wrote:
> Hi All,
>
> ls() doesn't show cu.summary.
> ?ls says "When invoked with no argument at the
> top level prompt, ‘ls’ shows what data sets and functions a user
> has defined." Therefore, the reason ls() doesn't show cu.summary is because
> cu.summary is from a package but not user defined. Is there a way to show
> not only user defined data sets but also data sets from loaded packages?
>
>> library(rpart)
>> str(cu.summary)
> 'data.frame': 117 obs. of 5 variables:
>  $ Price      : num 11950 6851 6995 8895 7402 ...
>  $ Country    : Factor w/ 10 levels "Brazil","England",..: 5 5 10 10 10 7 5 6 6 7 ...
>  $ Reliability: Ord.factor w/ 5 levels "Much worse"<"worse"<..: 5 NA 1 4 2 4 NA 5 5 2 ...
>  $ Mileage    : num NA NA NA 33 33 37 NA NA 32 NA ...
>  $ Type       : Factor w/ 6 levels "Compact","Large",..: 4 4 4 4 4 4 4 4 4 4 ...
>> ?ls
>> ls()
> character(0)
>
> -John
>
> [[alternative HTML version deleted]]
>

--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org

From to_vaib at yahoo.com Sat Mar 5 18:36:57 2011
From: to_vaib at yahoo.com (vaibhav dua)
Date: Sat, 5 Mar 2011 09:36:57 -0800 (PST)
Subject: [R] extractModelParameters HELP!!!
Message-ID: <828331.83424.qm@web39321.mail.mud.yahoo.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From ligges at statistik.tu-dortmund.de Sat Mar 5 18:50:18 2011
From: ligges at statistik.tu-dortmund.de (Uwe Ligges)
Date: Sat, 05 Mar 2011 18:50:18 +0100
Subject: [R] PCA - scores
In-Reply-To: <419BB6F5-F409-4880-8E3D-DA932AE4789C@ualberta.ca>
References: <131E1151-2802-4793-9901-5F8E7EBB3FB8@ualberta.ca> <419BB6F5-F409-4880-8E3D-DA932AE4789C@ualberta.ca>
Message-ID: <4D7277DA.7060309@statistik.tu-dortmund.de>

On 04.03.2011 17:52, Shari Clare wrote:
> Hi Bill and Josh:
>
> When I run any "principal" code with scores=TRUE, I get the following
> Error:
>
> Error in principal (my.data,3,scores=TRUE) : unused argument
> (scores=TRUE)
>
> Thoughts?

Your psych version (and probably also your R version) is outdated? Please upgrade both R and your packages.

Best,
Uwe Ligges

>
> Thanks,
> Shari
>
>
> On 3-Mar-11, at 9:42 PM, William Revelle wrote:
>
>> Shari,
>> Josh partly answered your question, but his example did not include
>> rotation because he took out just one factor.
>>
>> Try:
>>
>> require(psych)
>> mt.pc<- principal(mtcars,3,scores=TRUE) #this gives you the varimax rotated first 3 principal components
>> #pc.scores<- mt.pc$scores #here are the scores
>>
>> biplot(mt.pc) #show the data as well as the principal components in a biplot
>>
>>
>>
>> Bill
>>
>>
>> At 5:15 PM -0800 3/3/11, Joshua Wiley wrote:
>>> Hi Shari,
>>>
>>> Yes, please look at the documentation for principal. You can access
>>> this (assuming you have loaded psych) by typing at the console:
>>>
>>> ?principal
>>>
>>> note the logical argument "scores".
>>>
>>> Here is a small example:
>>>
>>> ##############################
>>> require(psych)
>>> require(GPArotation)
>>>
>>> dat<- principal(mtcars[, c("mpg", "hp", "wt")], nfactors = 1,
>>>   rotate = "oblimin", scores = TRUE)
>>>
>>> dat$scores
>>> ##############################
>>>
>>> Cheerio,
>>>
>>> Josh
>>>
>>> On Thu, Mar 3, 2011 at 1:02 PM, Shari Clare wrote:
>>>> I am running a PCA, but would like to rotate my data and limit the
>>>> number of factors that are analyzed. I can do this using the
>>>> "principal" command from the psych package [principal(my.data,
>>>> nfactors=3,rotate="varimax")], but the issue is that this does not
>>>> report scores for the Principal Components the way "princomp" does.
>>>>
>>>> My question is:
>>>>
>>>> Can you get an output of scores using "principal" OR, is there a way
>>>> to limit the number of factors that are included when you use
>>>> "princomp"?
>>>>
>>>> Thanks,
>>>> Shari Clare
>>>>
>>>> PhD Candidate
>>>> Department of Renewable Resources
>>>> University of Alberta
>>>> sclare at ualberta.ca
>>>> 780-492-2540
>>>>
>>>>
>>>> [[alternative HTML version deleted]]
>>>>
>>>
>>>
>>> --
>>> Joshua Wiley
>>> Ph.D. Student, Health Psychology
>>> University of California, Los Angeles
>>> http://www.joshuawiley.com/
>>>
>>
>
>
> [[alternative HTML version deleted]]
>

From ligges at statistik.tu-dortmund.de Sat Mar 5 18:53:16 2011
From: ligges at statistik.tu-dortmund.de (Uwe Ligges)
Date: Sat, 05 Mar 2011 18:53:16 +0100
Subject: [R] Fwd: r.dll
In-Reply-To:
References: <4D711EB6.2070601@statistik.tu-dortmund.de>
Message-ID: <4D72788C.2060508@statistik.tu-dortmund.de>

On 04.03.2011 19:40, wesley mathew wrote:
> Dear All
>
> I downloaded R-2.12.2.tar file, but I could not find R.dll file there. Do
> you mean the Windows binary distribution is R-2.12.2.tar

No, it is the source distribution.

1. Go to CRAN.
2. Click on "Windows"
3. Click on "base"
4. Click on "Previous releases" (since you want the outdated R-2.12.1)
5. Click on "R-2.12.1"
6. Click on "Download R 2.12.1 for Windows"

Uwe Ligges

> or another file?
> Can you help me to find R.dll of version 2.12.1
>
> Kind regards
> Wesley
>
> ---------- Forwarded message ----------
> From: Uwe Ligges
> Date: 2011/3/4
> Subject: Re: [R] r.dll
> To: wesley mathew
> Cc: r-help at r-project.org
>
>
> On 04.03.2011 18:15, wesley mathew wrote:
>
>> Dear all
>> I have some problem to execute jri package. R.dll file has to be copied to jri
>> directory for the execution of jar file in Eclipse. But R.dll file is not
>> available in the R version 2.12.1 .
>>
>
> It is, at least in the Windows binary distribution.
>
> Uwe Ligges
>
>
>> Is there any chance to get this
>> file. Thanks in advance
>>
>> Kind regards
>> W. Mathew
>>
>> [[alternative HTML version deleted]]
>>
>

From ligges at statistik.tu-dortmund.de Sat Mar 5 19:02:59 2011
From: ligges at statistik.tu-dortmund.de (Uwe Ligges)
Date: Sat, 05 Mar 2011 19:02:59 +0100
Subject: [R] How to show non user defined data set such as cu.summary (from rpart)?
In-Reply-To:
References:
Message-ID: <4D727AD3.4030206@statistik.tu-dortmund.de>

Yes, after you have loaded rpart using library("rpart"), you will find the search() list similar to this one:

R> search()
 [1] ".GlobalEnv"        "package:rpart"     "package:psych"     "package:stats"
 [5] "package:graphics"  "package:grDevices" "package:datasets"  "package:utils"
 [9] "package:fortunes"  "package:methods"   "Autoloads"         "package:base"

In order to list the contents of the environment of the rpart package, use ls(pos=2) now.

Uwe Ligges

On 05.03.2011 17:36, John Edwards wrote:
> Hi All,
>
> ls() doesn't show cu.summary. ?ls says "When invoked with no argument at the
> top level prompt, ‘ls’ shows what data sets and functions a user
> has defined." Therefore, the reason ls() doesn't show cu.summary is because
> cu.summary is from a package but not user defined. Is there a way to show
> not only user defined data sets but also data sets from loaded packages?
>
>> library(rpart)
>> str(cu.summary)
> 'data.frame': 117 obs. of 5 variables:
>  $ Price      : num 11950 6851 6995 8895 7402 ...
>  $ Country    : Factor w/ 10 levels "Brazil","England",..: 5 5 10 10 10 7 5 6 6 7 ...
>  $ Reliability: Ord.factor w/ 5 levels "Much worse"<"worse"<..: 5 NA 1 4 2 4 NA 5 5 2 ...
>  $ Mileage    : num NA NA NA 33 33 37 NA NA 32 NA ...
>  $ Type       : Factor w/ 6 levels "Compact","Large",..: 4 4 4 4 4 4 4 4 4 4 ...
>> ?ls
>> ls()
> character(0)
>
> -John
>
> [[alternative HTML version deleted]]
>

From Greg.Snow at imail.org Sat Mar 5 19:25:44 2011
From: Greg.Snow at imail.org (Greg Snow)
Date: Sat, 5 Mar 2011 11:25:44 -0700
Subject: [R] retrieve x y coordinates of points in current plot
In-Reply-To: <4D72489E.8050507@yahoo.de>
References: <4D71204E.2030109@yahoo.de> <1299261317820-3335692.post@n4.nabble.com> <4D72489E.8050507@yahoo.de>
Message-ID:

It is not completely clear what you are trying to accomplish. Do you want to draw a shape in the plot then identify all the points in that shape? You could use locator (with type='l') to draw a polygon, then there are functions in add-on packages (mostly the spatial ones) that will detect which points are within a polygon that you could use with the raw data and the polygon created.

If that is not what you want, then maybe describe your goals in more detail (examples are good if you can give one).

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Jannis
> Sent: Saturday, March 05, 2011 7:29 AM
> To: Dieter Menne
> Cc: r-help at r-project.org
> Subject: Re: [R] retrieve x y coordinates of points in current plot
>
> Thanks for your replies, Dieter and Richard. I am aware of these two
> functions. I wanted, however, to write a similar function to identify by
> selecting large clouds of points. For this I would need to retrieve
> their coordinates after the plot was called and created.
> As identify()
> is able to do this, I was wondering whether my function could do this as
> well in a similar manner. Until now, I have to supply the x and y
> coordinates to my function as well. I would like to avoid this.
>
>
> Jannis
>
> On 03/04/2011 06:55 PM, Dieter Menne wrote:
> > jannis-2 wrote:
> >>
> >> is it somehow possible to retrieve the x and y coordinates of points in
> >> a scatterplot after it has been plotted? identify() somehow seems to
> >> manage this, so I was wondering whether it is possible?
> >>
> > locator might be the more basic function you are looking for.
> >
> > Dieter
> >
> >
> > --
> > View this message in context: http://r.789695.n4.nabble.com/retrieve-x-y-coordinates-of-points-in-current-plot-tp3335642p3335692.html
> > Sent from the R help mailing list archive at Nabble.com.
> >
>

From bt_jannis at yahoo.de Sat Mar 5 19:36:02 2011
From: bt_jannis at yahoo.de (Jannis)
Date: Sat, 05 Mar 2011 19:36:02 +0100
Subject: [R] get name of script from where the command is invoked
Message-ID: <4D728292.5010902@yahoo.de>

Dear List members,

is it somehow possible to retrieve the name of the file from where some command is invoked in (!) interactive mode? That means that I am running single lines of code within ESS /R-GUI and do NOT use source to execute the whole script.
Cheers
Jannis

From bt_jannis at yahoo.de Sat Mar 5 19:54:28 2011
From: bt_jannis at yahoo.de (Jannis)
Date: Sat, 05 Mar 2011 19:54:28 +0100
Subject: [R] retrieve x y coordinates of points in current plot
In-Reply-To:
References: <4D71204E.2030109@yahoo.de> <1299261317820-3335692.post@n4.nabble.com> <4D72489E.8050507@yahoo.de>
Message-ID: <4D7286E4.5090209@yahoo.de>

On 03/05/2011 07:25 PM, Greg Snow wrote:
> It is not completely clear what you are trying to accomplish. Do you want to draw a shape in the plot then identify all the points in that shape? You could use locator (with type='l') to draw a polygon, then there are functions in add on packages (mostly the spatial ones) that will detect which points are within a polygon that you could use with the raw data and the polygon created.
>
> If that is not what you want, then maybe describe your goals in more detail (examples are good if you can give one).
>

That's exactly what I want: drawing a polygon in a plot and searching for the points inside the polygon. I managed to create that polygon and to check which points are inside, but only by supplying my function with the coordinates of the points. Now I was wondering whether it is also possible to retrieve these coordinates from the plot (similar to par()$usr and similar...).

Ideally it would work as follows:

x<-rnorm(20)
y=rnorm(20)
plot(x,y)
points.in.poly <- identify.poly() #see below
#now click on the plot to identify the points

Right now it only works like

points.in.poly <- identify.poly(x,y)

Anyway, supplying the points is not too complicated, it would just be easier to do without.
identify.poly <- function(x,y,col.points='red')
{
  require(sp)
  exit=FALSE
  i=0
  coords.all <- list(x=vector(length=100),y=vector(length=100))
  coords.all$x[1:100]<-NA
  coords.all$y[1:100]<-NA

  while (i<100)
  {
    coords.t <- locator(n=1)
    exit=!point.in.polygon(coords.t$x,coords.t$y,
                           par()$usr[c(1,2,2,1)],par()$usr[c(3,3,4,4)])
    if (exit)
    {
      break
    }
    i=i+1
    points(coords.t,col=col.points,pch='+')

    coords.all$x[i] <- coords.t$x
    coords.all$y[i] <- coords.t$y
    if (i>1)
      points(coords.all$x[(i-1):i],coords.all$y[(i-1):i],lty=2,
             col=col.points,type='l')
  }
  points(coords.all$x[c(i,1)],coords.all$y[c(i,1)],lty=2,
         col=col.points,type='l')
  coords.all$x <- na.omit(coords.all$x)
  coords.all$y <- na.omit(coords.all$y)

  points.inpoly <- point.in.polygon(point.x=x,point.y=y,
                                    pol.x=coords.all$x,pol.y=coords.all$y)
  points(x[points.inpoly==1],y[points.inpoly==1],pch=par()$pch,col=col.points)
  data.return=list(in.poly=!!points.inpoly,x=x,y=y)
}

From Tranlm at berkeley.edu Sat Mar 5 19:56:26 2011
From: Tranlm at berkeley.edu (Linh Tran)
Date: Sat, 5 Mar 2011 10:56:26 -0800
Subject: [R] subsetting data by specified observation number
Message-ID: <5597c0b8fb1e8ddd7130c48e894d7e5e.squirrel@calmail.berkeley.edu>

Hi members,

I'd like to thank you guys ahead of time for the help. I'm kind of stuck.

I have a data frame with ID and position numbers:
1> head(failed.3)
         id position
1  10000997        2
4  1000RW_M        2
15 1006RW_G        2
24 1012RW_M        3
28 10160917        2
30 1016RW_M       13

I'd like to use this to subset out a large dataset and keep only the observation number corresponding to the position number. So for example, ID 10000997 has 10 observations. I want to keep the 2nd one only.
Thanks,

-linh

From jdnewmil at dcn.davis.ca.us Sat Mar 5 20:02:32 2011
From: jdnewmil at dcn.davis.ca.us (jdnewmil@gmail.com)
Date: Sat, 05 Mar 2011 11:02:32 -0800
Subject: [R] subsetting data by specified observation number
In-Reply-To: <5597c0b8fb1e8ddd7130c48e894d7e5e.squirrel@calmail.berkeley.edu>
References: <5597c0b8fb1e8ddd7130c48e894d7e5e.squirrel@calmail.berkeley.edu>
Message-ID: <99f82683-0fed-406e-8f75-d0f37972ad02@email.android.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From Greg.Snow at imail.org Sat Mar 5 20:13:27 2011
From: Greg.Snow at imail.org (Greg Snow)
Date: Sat, 5 Mar 2011 12:13:27 -0700
Subject: [R] retrieve x y coordinates of points in current plot
In-Reply-To: <4D7286E4.5090209@yahoo.de>
References: <4D71204E.2030109@yahoo.de> <1299261317820-3335692.post@n4.nabble.com> <4D72489E.8050507@yahoo.de> <4D7286E4.5090209@yahoo.de>
Message-ID:

Theoretically possible, but it is going to be easiest to just supply the original data (like you do when you run identify).

You can look at the output of the plot2script function in the TeachingDemos package to see the idea. That function creates a script which will recreate the current plot (base graphics), within the script are the coordinates of the points, but also a bunch of other stuff, so you would need to parse all of that to find just the part that you want, then do what you did before.

What conditions are you in that you have the plot but not easy access to the data points plotted?

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111

> -----Original Message-----
> From: Jannis [mailto:bt_jannis at yahoo.de]
> Sent: Saturday, March 05, 2011 11:54 AM
> To: Greg Snow
> Cc: Dieter Menne; r-help at r-project.org
> Subject: Re: [R] retrieve x y coordinates of points in current plot
>
> On 03/05/2011 07:25 PM, Greg Snow wrote:
> > It is not completely clear what you are trying to accomplish.
> > Do you want to draw a shape in the plot then identify all the points in that
> > shape? You could use locator (with type='l') to draw a polygon, then
> > there are functions in add on packages (mostly the spatial ones) that
> > will detect which points are within a polygon that you could use with
> > the raw data and the polygon created.
> >
> > If that is not what you want, then maybe describe your goals in more
> > detail (examples are good if you can give one).
> >
> Thats exactly what I want. drawing a polygon in a plot and searching for
> the points inside the polygon. I managed to create that polygon and to
> check which points are inside but only by supplying my function with the
> coordinates of the points. Now I was wondering whether it is also
> possible to retrieve these coordinates from the plot (similar to
> par()$usr and similar...).
>
>
> ideally it would work as follows:
>
> x<-rnorm(20)
> y=rnorm(20)
> plot(x,y)
> points.in.poly <- identify.poly() #see below
> #now click on the plot to identify the points
>
> Right now it only works like
> points.in.poly <- identify.poly(x,y)
>
> Anyway, supplying the points is not too complicated, it would just be
> easier to do without.
>
> identify.poly <- function(x,y,col.points='red')
> {
>   require(sp)
>   exit=FALSE
>   i=0
>   coords.all <- list(x=vector(length=100),y=vector(length=100))
>   coords.all$x[1:100]<-NA
>   coords.all$y[1:100]<-NA
>
>   while (i<100)
>   {
>     coords.t <- locator(n=1)
>     exit=!point.in.polygon(coords.t$x,coords.t$y,
>                            par()$usr[c(1,2,2,1)],par()$usr[c(3,3,4,4)])
>     if (exit)
>     {
>       break
>     }
>     i=i+1
>     points(coords.t,col=col.points,pch='+')
>
>     coords.all$x[i] <- coords.t$x
>     coords.all$y[i] <- coords.t$y
>     if (i>1)
>       points(coords.all$x[(i-1):i],coords.all$y[(i-1):i],lty=2,
>              col=col.points,type='l')
>   }
>   points(coords.all$x[c(i,1)],coords.all$y[c(i,1)],lty=2,
>          col=col.points,type='l')
>   coords.all$x <- na.omit(coords.all$x)
>   coords.all$y <- na.omit(coords.all$y)
>
>   points.inpoly <- point.in.polygon(point.x=x,point.y=y,
>                                     pol.x=coords.all$x,pol.y=coords.all$y)
>   points(x[points.inpoly==1],y[points.inpoly==1],pch=par()$pch,col=col.points)
>   data.return=list(in.poly=!!points.inpoly,x=x,y=y)
> }

From Greg.Snow at imail.org Sat Mar 5 20:20:48 2011
From: Greg.Snow at imail.org (Greg Snow)
Date: Sat, 5 Mar 2011 12:20:48 -0700
Subject: [R] subsetting data by specified observation number
In-Reply-To: <5597c0b8fb1e8ddd7130c48e894d7e5e.squirrel@calmail.berkeley.edu>
References: <5597c0b8fb1e8ddd7130c48e894d7e5e.squirrel@calmail.berkeley.edu>
Message-ID:

Will failed.3 have each id exactly once? Or could it have multiple lines for a given id?

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Linh Tran
> Sent: Saturday, March 05, 2011 11:56 AM
> To: r-help at r-project.org
> Subject: [R] subsetting data by specified observation number
>
> Hi members,
>
> I'd like to thank you guys ahead of time for the help. I'm kind of
> stuck.
> > I have a data frame with ID and position numbers: > 1> head(failed.3) > id position > 1 10000997 2 > 4 1000RW_M 2 > 15 1006RW_G 2 > 24 1012RW_M 3 > 28 10160917 2 > 30 1016RW_M 13 > > I'd like to use this to subset out a large dataset and keep only the > observation number corresponding to the position number. So for > example, > ID 10000997 has 10 observations. I want to keep the 2nd one only. > > > Thanks, > > -linh > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. From Tranlm at berkeley.edu Sat Mar 5 20:22:30 2011 From: Tranlm at berkeley.edu (Linh Tran) Date: Sat, 5 Mar 2011 11:22:30 -0800 Subject: [R] subsetting data by specified observation number In-Reply-To: <99f82683-0fed-406e-8f75-d0f37972ad02@email.android.com> References: <5597c0b8fb1e8ddd7130c48e894d7e5e.squirrel@calmail.berkeley.edu> <99f82683-0fed-406e-8f75-d0f37972ad02@email.android.com> Message-ID: Hi, Thank you for the reply. That would only work for the first ID. In addition, the data frame that I'm trying to subset is separate from the data frame that has the position numbers. "failed.3" is the data frame with the position numbers that I would like to keep. "def3" is the data frame that I need to subset from. linh > What is wrong with > > subset( failed.3, position == 2 ) > > ? > -- > Sent from my Android phone with K-9 Mail. Please excuse my brevity. > > Linh Tran wrote: > > Hi members, I'd like to thank you guys ahead of time for the help. I'm > kind of stuck. I have a data frame with ID and position numbers: 1> > head(failed.3) id position 1 10000997 2 4 1000RW_M 2 15 1006RW_G 2 24 > 1012RW_M 3 28 10160917 2 30 1016RW_M 13 I'd like to use this to subset out > a large dataset and keep only the observation number corresponding to the > position number. 
So for example, ID 10000997 has 10 observations. I want
> to keep the 2nd one only. Thanks,
> -linh_____________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting
> guide http://www.R-project.org/posting-guide.html and provide commented,
> minimal, self-contained, reproducible code.
>
> Thanks, -linh

From Greg.Snow at imail.org Sat Mar 5 20:42:44 2011
From: Greg.Snow at imail.org (Greg Snow)
Date: Sat, 5 Mar 2011 12:42:44 -0700
Subject: [R] subsetting data by specified observation number
In-Reply-To: <5597c0b8fb1e8ddd7130c48e894d7e5e.squirrel@calmail.berkeley.edu>
References: <5597c0b8fb1e8ddd7130c48e894d7e5e.squirrel@calmail.berkeley.edu>
Message-ID:

Here is one way:

> tmp1 <- data.frame(Species=c('setosa','virginica','versicolor'),
+ row=c(7,20,18) )
>
> tmp.iris <- iris
> tmp.iris$row <- ave(iris$Sepal.Length, iris$Species, FUN=seq_along)
>
> out.iris <- merge(tmp.iris, tmp1, by=c('Species','row'))
>
>
> out.iris
     Species row Sepal.Length Sepal.Width Petal.Length Petal.Width
1     setosa   7          4.6         3.4          1.4         0.3
2 versicolor  18          5.8         2.7          4.1         1.0
3  virginica  20          6.0         2.2          5.0         1.5
>

Another way would be to use the split function on your big data set, then
use sapply to iterate over the resulting list and return just the rows named
in failed.3 for each group. Need to think a bit more about how that would
look. You could also just loop through the rows of failed.3 and grab the
corresponding pieces in the full dataset. There are probably a few other
ways as well.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center Intermountain Healthcare greg.snow at imail.org 801.408.8111 > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r- > project.org] On Behalf Of Linh Tran > Sent: Saturday, March 05, 2011 11:56 AM > To: r-help at r-project.org > Subject: [R] subsetting data by specified observation number > > Hi members, > > I'd like to thank you guys ahead of time for the help. I'm kind of > stuck. > > I have a data frame with ID and position numbers: > 1> head(failed.3) > id position > 1 10000997 2 > 4 1000RW_M 2 > 15 1006RW_G 2 > 24 1012RW_M 3 > 28 10160917 2 > 30 1016RW_M 13 > > I'd like to use this to subset out a large dataset and keep only the > observation number corresponding to the position number. So for > example, > ID 10000997 has 10 observations. I want to keep the 2nd one only. > > > Thanks, > > -linh > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. From ggrothendieck at gmail.com Sat Mar 5 21:10:02 2011 From: ggrothendieck at gmail.com (Gabor Grothendieck) Date: Sat, 5 Mar 2011 15:10:02 -0500 Subject: [R] Multi-line input to rsympy In-Reply-To: <4D70D176.6020608@ucl.ac.uk> References: <4D70D176.6020608@ucl.ac.uk> Message-ID: On Fri, Mar 4, 2011 at 6:48 AM, Joanna Lewis wrote: > Dear R users, > > I have been using rsympy to solve a set of simultaneous equations from R. > There are two solutions for the variable I'm interested in, xx[0] and xx[1], > which are in terms of symbols called lam and conc. I'd like to pick out the > one which is positive at (lam=0, conc=0) and call it mysol. > > In python I could write: > > if (xx[0].subs(lam,0)).subs(conc,0)>0: > ?mysol=xx[0] > else: > ?mysol=xx[1] > > but I'm not sure how to do it from R via rsympy. 
The various combinations of
> \t and \n characters and spaces I've tried haven't worked, and I haven't
> been able to find any examples online or in the help file.
>
> Do you know whether it is possible to enter multi-line input using rsympy,
> and if so how?
>

You can run multi-line python commands in Jython:

> .Jython$exec("x = 1")
> .Jython$exec("if x == 1:
+   z = 2
+ else:
+   z = 3")
> z <- .Jython$get("z")
> .jstrVal(z)
[1] "2"

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

From maechler at stat.math.ethz.ch Sat Mar 5 22:55:39 2011
From: maechler at stat.math.ethz.ch (Martin Maechler)
Date: Sat, 5 Mar 2011 22:55:39 +0100
Subject: [R] lty=NULL crashing R for x11(type="cairo")
In-Reply-To:
References:
Message-ID: <19826.45403.548408.563906@stat.math.ethz.ch>

>>>>> "IZ" == Ista Zahn
>>>>> on Sat, 5 Mar 2011 14:07:04 +0000 writes:

    IZ> I confirm this bug exists and is 100% replicable on R
    IZ> version 2.12.2 (2011-02-25) Platform: i686-pc-linux-gnu
    IZ> (32-bit)

WHoa... debugging ....
===> it *is* a bug in R after all :

> plot(1); axis(1, lty=NULL)

 *** caught segfault ***
address 0x7fff423ab000, cause 'memory not mapped'

and yes, the bug is device dependent:
E.g., it nicely works for postscript() or pdf()

> postscript(); plot(1); axis(1, lty=NULL) ; dev.off()
null device
          1

and it's ok for type = "Xlib", but not for the default type = "cairo":

> x11(type="Xlib")
> plot(1); axis(1, lty=NULL)
> x11(type="cairo")
> plot(1); axis(1, lty=NULL)

 *** caught segfault ***
address 0x7fffd875f000, cause 'memory not mapped'
/u/maechler/bin/R_arg: line 137: 14914 Segmentation fault $exe $@

Process R-devel exited abnormally with code 139 at Sat Mar 5 22:53:35 2011

and similarly for

> png(type="Xlib") # fine
> png() # not fine

From jasonkrupert at yahoo.com Sat Mar 5 23:38:08 2011
From: jasonkrupert at yahoo.com (Jason Rupert)
Date: Sat, 5 Mar 2011 14:38:08 -0800 (PST)
Subject: [R] Grouping data in ranges in table
Message-ID: <900905.92985.qm@web56003.mail.re3.yahoo.com>

Working with the built-in R data set Orange, e.g. with(Orange, table(age,
circumference)).

How should I go about grouping the ages and circumferences into the
following ranges and having them display as such in a table?

age range:
118 - 664
1004 - 1372
1582

circumference range:
30-58
62-115
120-142
145-177
179-214

Thanks for any feedback and insights, as I am hoping for an output that
looks something like the following:

              circumference range
              30-58   62-115   145-177 ....
age range
118 - 664     ...
1004 - 1372   ...
1582          ....

Thanks a ton.
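
A minimal sketch of how the grouping asked about above might be done with cut(), assuming that the gaps between the listed ranges (for example between 58 and 62) hold no data, so a break point can sit anywhere inside each gap. The break values, labels, and the names age_grp, circ_grp and tab are chosen here for illustration only.

```r
# Built-in data set Orange: columns Tree, age, circumference (35 rows).

# Bin ages; intervals are (0,700], (700,1400], (1400,Inf].
age_grp <- cut(Orange$age,
               breaks = c(0, 700, 1400, Inf),
               labels = c("118-664", "1004-1372", "1582"))

# Bin circumferences; break points fall in the gaps between the ranges.
circ_grp <- cut(Orange$circumference,
                breaks = c(0, 60, 118, 143, 178, Inf),
                labels = c("30-58", "62-115", "120-142", "145-177", "179-214"))

# Cross-tabulate the two grouped variables.
tab <- table(age_grp, circ_grp,
             dnn = c("age range", "circumference range"))
print(tab)
```

Because cut() returns a factor, empty combinations still appear in the table as zero counts rather than being dropped.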
From murdoch.duncan at gmail.com Sat Mar 5 23:44:29 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Sat, 05 Mar 2011 17:44:29 -0500 Subject: [R] get name of script from where the command is invoked In-Reply-To: <4D728292.5010902@yahoo.de> References: <4D728292.5010902@yahoo.de> Message-ID: <4D72BCCD.1010609@gmail.com> On 11-03-05 1:36 PM, Jannis wrote: > Dear List members, > > > is it somehow possible to retrieve the name of the file from where some > command is invoked in (!) interactive mode? That means that I am running > single lines of code within ESS /R-GUI and do NOT use source to execute > the whole script. If you're not using source(), R doesn't know the name of the file. ESS probably does, but you'll probably have to ask on an ESS list to find out if you can get to it. Duncan Murdoch From Greg.Snow at imail.org Sat Mar 5 23:44:36 2011 From: Greg.Snow at imail.org (Greg Snow) Date: Sat, 5 Mar 2011 15:44:36 -0700 Subject: [R] Grouping data in ranges in table In-Reply-To: <900905.92985.qm@web56003.mail.re3.yahoo.com> References: <900905.92985.qm@web56003.mail.re3.yahoo.com> Message-ID: ?cut -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.snow at imail.org 801.408.8111 > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r- > project.org] On Behalf Of Jason Rupert > Sent: Saturday, March 05, 2011 3:38 PM > To: R Project Help > Subject: [R] Grouping data in ranges in table > > Working with the built in R data set Orange, e.g. with(Orange, > table(age, > circumference)). > > > How should I go about about grouping the ages and circumferences in the > following ranges and having them display as such in a table? 
> age range: > 118 - 664 > 1004 - 1372 > 1582 > > circumference range: > 30-58 > 62- 115 > 120-142 > 145-177 > 179-214 > > Thanks for any feedback and insights, as I hoping for an output that > looks > something like the following: > circumference range > 30-58 62- 115 145-177.... > age range > 118 - 664 ... > 1004 - 1372 ... > 1582 .... > > > Thanks a ton. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. From jorgeivanvelez at gmail.com Sat Mar 5 23:46:33 2011 From: jorgeivanvelez at gmail.com (Jorge Ivan Velez) Date: Sat, 5 Mar 2011 17:46:33 -0500 Subject: [R] Grouping data in ranges in table In-Reply-To: <900905.92985.qm@web56003.mail.re3.yahoo.com> References: <900905.92985.qm@web56003.mail.re3.yahoo.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From xie at yihui.name Sun Mar 6 00:31:29 2011 From: xie at yihui.name (Yihui Xie) Date: Sat, 5 Mar 2011 17:31:29 -0600 Subject: [R] file mode lost in file.copy()? Message-ID: Hi, Recently I noticed file.copy() would discard the file mode information. Is this the expected behaviour or a bug for file.copy()? 
> file.create('testfile') [1] TRUE > file.info('testfile') size isdir mode mtime ctime testfile 0 FALSE 644 2011-03-05 17:06:39 2011-03-05 17:06:39 atime uid gid uname grname testfile 2011-03-05 17:06:40 1000 1000 yihui yihui > Sys.chmod('testfile', '0755') > file.info('testfile') size isdir mode mtime ctime testfile 0 FALSE 755 2011-03-05 17:06:39 2011-03-05 17:06:59 atime uid gid uname grname testfile 2011-03-05 17:07:00 1000 1000 yihui yihui > file.copy('testfile', 'testfile2') [1] TRUE > file.info('testfile2') size isdir mode mtime ctime testfile2 0 FALSE 644 2011-03-05 17:07:20 2011-03-05 17:07:20 atime uid gid uname grname testfile2 2011-03-05 17:07:21 1000 1000 yihui yihui > sessionInfo() R version 2.12.2 (2011-02-25) Platform: x86_64-unknown-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_US.utf8 LC_NUMERIC=C [3] LC_TIME=en_US.utf8 LC_COLLATE=en_US.utf8 [5] LC_MONETARY=C LC_MESSAGES=en_US.utf8 [7] LC_PAPER=en_US.utf8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base Regards, Yihui -- Yihui Xie Phone: 515-294-2465 Web: http://yihui.name Department of Statistics, Iowa State University 2215 Snedecor Hall, Ames, IA From dwinsemius at comcast.net Sun Mar 6 06:00:05 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Sun, 6 Mar 2011 00:00:05 -0500 Subject: [R] xts POSIXct index format In-Reply-To: References: <1299283076151-3336136.post@n4.nabble.com> Message-ID: <63408EC7-0E02-4969-A26F-8AB39DE2747E@comcast.net> On Mar 5, 2011, at 11:28 AM, Joshua Ulrich wrote: > Hi Chris, > > Perhaps something like this? > > require(xts) > ds <- options(digits.secs=6) # so we can see sub-seconds > x <- xts(1:10, as.POSIXct("2011-01-21") + c(1,1,1,2:8)/1e3) > x > indexFormat(x) <- "%H:%M:%OS3" > x > Joshua; Does your reading of help(indexFormat) lead you to that suggestion? 
When I read it I thought that indexFormat would only accept onoe of " Date, POSIXct, chron,yearmon, yearqtr or timeDate." Those were the formats mentioned in the immediately preceding paragraph. I went to the help page hoping I would be told I could use format strings but left it thinking that I could only specify format classes. -- David > Hope that helps, > -- > Joshua Ulrich | FOSS Trading: www.fosstrading.com > > > > On Fri, Mar 4, 2011 at 5:57 PM, rivercode wrote: >> Hi, >> >> I cannot figure out how to change the index format when displaying >> POSIXct >> objects. >> >> Would like the xts index to display as %H:%M:%OS3 when doing >> viewing the xts >> object. >> >> Think I am missing the obvious. >> >> Cheers, >> Chris >> >> -- >> View this message in context: http://r.789695.n4.nabble.com/xts-POSIXct-index-format-tp3336136p3336136.html >> Sent from the R help mailing list archive at Nabble.com. >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT From aquanyc at gmail.com Sun Mar 6 00:06:38 2011 From: aquanyc at gmail.com (rivercode) Date: Sat, 5 Mar 2011 15:06:38 -0800 (PST) Subject: [R] xts POSIXct index format In-Reply-To: References: <1299283076151-3336136.post@n4.nabble.com> Message-ID: <1299366398349-3337167.post@n4.nabble.com> Thank you for your help. indexFormat(x) solved the problem nicely. 
> head(a) 2011-03-04 09:30:00.0 22.10 2011-03-04 09:30:00.1 22.09 2011-03-04 09:30:00.2 22.10 2011-03-04 09:30:00.3 22.09 2011-03-04 09:30:00.4 22.10 2011-03-04 09:30:00.5 22.09 > indexFormat(a) <- "%H:%M:%OS3" > head(a) 09:30:00.000 22.10 09:30:00.100 22.09 09:30:00.200 22.10 09:30:00.300 22.09 09:30:00.400 22.10 09:30:00.500 22.09 -- View this message in context: http://r.789695.n4.nabble.com/xts-POSIXct-index-format-tp3336136p3337167.html Sent from the R help mailing list archive at Nabble.com. From ana-lee at web.de Sat Mar 5 21:02:48 2011 From: ana-lee at web.de (Anna Gretschel) Date: Sat, 05 Mar 2011 21:02:48 +0100 Subject: [R] testing power of correlation Message-ID: <4D7296E8.1070006@web.de> Dear List, does anyone know how I can test the strength of a correlation? Cheers, Anna From djbirdnerd at hotmail.com Sat Mar 5 18:40:41 2011 From: djbirdnerd at hotmail.com (djbirdnerd) Date: Sat, 5 Mar 2011 09:40:41 -0800 (PST) Subject: [R] Ordering several histograms In-Reply-To: References: <1299160800582-3333382.post@n4.nabble.com> Message-ID: <1299346841131-3336869.post@n4.nabble.com> not yet, but i have only just started programming in r... I don't know if i will able to... -- View this message in context: http://r.789695.n4.nabble.com/Ordering-several-histograms-tp3333382p3336869.html Sent from the R help mailing list archive at Nabble.com. From greener at uw.edu Sun Mar 6 03:39:16 2011 From: greener at uw.edu (Richard Green) Date: Sat, 5 Mar 2011 18:39:16 -0800 Subject: [R] How to load load multiple text files and order by id Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From kingsley at loaner.com Sun Mar 6 02:20:31 2011 From: kingsley at loaner.com (Kingsley G. Morse Jr.) Date: Sat, 5 Mar 2011 17:20:31 -0800 Subject: [R] Can body() return a function's body intact, in order, and as characters ready for editing? 
Message-ID: <20110306012031.GA6151@loaner.com>

Is my understanding correct that the body()
function currently can't return a function's body
intact, in order, and as characters ready for
editing?

My testing and reading of body()'s help indicate
that it can not.

Here's what I'm seeing.

Consider pasting

    1+

and a function containing

    x^2

together to get

    1+x^2

As you can see below, body() reports three
elements, out of order.

    > f<-function(x) x^2; b<-body(f); paste("1+",b, sep="")
    [1] "1+^" "1+x" "1+2"

I realize that this might be worked around with
something like

    > f<-function(x) x^2; b<-do.call(paste,as.list(c(deparse(body(f)),sep=""))); paste("1+",b, sep="")
    [1] "1+x^2"

However, I'm asking a different question.

Is my understanding correct that body() can't
return a function's body intact, in order, and as
characters ready for editing?

Thanks,
Kingsley

> sessionInfo()
R version 2.12.1 (2010-12-16)
Platform: i486-pc-linux-gnu (32-bit)

locale:
 [1] LC_CTYPE=en_US       LC_NUMERIC=C         LC_TIME=en_US
 [4] LC_COLLATE=en_US     LC_MONETARY=C        LC_MESSAGES=en_US
 [7] LC_PAPER=en_US       LC_NAME=C            LC_ADDRESS=C
[10] LC_TELEPHONE=C       LC_MEASUREMENT=en_US LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
 [1] optimx_0.88        setRNG_2009.11-1   minqa_1.1.13       Rcpp_0.9.2
 [5] Rvmmin_2011-2.11   Rcgmin_2011-2.10   ucminf_1.0-5       BB_2011.2-1
 [9] quadprog_1.5-3     numDeriv_2010.11-1 rgp_0.2-3          snowfall_1.84
[13] snow_0.3-3         rrules_0.1-0       emoa_0.4-3

loaded via a namespace (and not attached):
[1] tcltk_2.12.1 tools_2.12.1

> Sys.getlocale()
[1] "LC_CTYPE=en_US;LC_NUMERIC=C;LC_TIME=en_US;LC_COLLATE=en_US;LC_MONETARY=C;LC_MESSAGES=en_US;LC_PAPER=en_US;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US;LC_IDENTIFICATION=C"

From mark_difford at yahoo.co.uk Sat Mar 5 16:57:26 2011
From: mark_difford at yahoo.co.uk (Mark Difford)
Date: Sat, 5 Mar 2011 07:57:26 -0800 (PST)
Subject: [R] Change panel background color in spplot()
In-Reply-To:
<4D72554C.2060601@gmx.at> References: <4D72273E.60704@gmx.at> <4D724753.6070309@yahoo.de> <4D72554C.2060601@gmx.at> Message-ID: <1299340645868-3336769.post@n4.nabble.com> Marcel, Here is one way: spplot(meuse.grid, zcol = "part.a", par.settings = list(panel.background=list(col="grey"))) ## trellis.par.get() trellis.par.get()$panel.background Regards, Mark. > On 03/05/2011 01:06 PM, Marcel J. wrote: >> Hi! >> >> How does one change the background color of the map-panel in spplot()? >> >> Example: >> >> library(sp) >> >> data(meuse.grid) >> gridded(meuse.grid) = ~x+y >> >> spplot(meuse.grid, "part.a") >> >> How would I get another background-color for the map-panel (but not >> for the whole plot) here? >> >> Thank you! >> >> Marcel -- View this message in context: http://r.789695.n4.nabble.com/Change-panel-background-color-in-spplot-tp3336563p3336769.html Sent from the R help mailing list archive at Nabble.com. From rosyaraur at gmail.com Sat Mar 5 18:29:44 2011 From: rosyaraur at gmail.com (Umesh Rosyara) Date: Sat, 5 Mar 2011 12:29:44 -0500 Subject: [R] displaying label meeting condition (i.e. significant, i..e p value less than 005) in plot function Message-ID: <1017DE7935194ACBA31AFDC8B1B9B700@OwnerPC> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From rosyaraur at gmail.com Sun Mar 6 00:29:55 2011 From: rosyaraur at gmail.com (Umesh Rosyara) Date: Sat, 5 Mar 2011 18:29:55 -0500 Subject: [R] please help ! label selected data points in huge number of data points potentially as high as 50, 000 ! Message-ID: <72B118B90D234D6B8566E9137A635655@OwnerPC> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From tsvisabo at yahoo.com Sat Mar 5 17:45:59 2011 From: tsvisabo at yahoo.com (tsvi sabo) Date: Sat, 5 Mar 2011 08:45:59 -0800 (PST) Subject: [R] Creating a code Message-ID: <934961.61995.qm@web36808.mail.mud.yahoo.com> An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From uwe.wolfram at uni-ulm.de Sat Mar 5 17:14:12 2011 From: uwe.wolfram at uni-ulm.de (Uwe Wolfram) Date: Sat, 05 Mar 2011 17:14:12 +0100 Subject: [R] Coefficient of Determination for nonlinear function In-Reply-To: References: <1299246001.1764.17.camel@pollux> Message-ID: <1299341652.1849.5.camel@pollux> Dear Bert, dear Andy, thanks for your answers! I am quite aware that I do not fit a linear model, so r^2 in Pearson's sens is indeed meaningless. Instead, I am "fitting" an equation - or rather using an optimisation - were the experimentally derived point cloud (x1, x2, x3) should deliver something like 1 = f(x1, x2, x3). What I am trying to estimate is the quality of the fit. One thing I computed so far is the standard error of the equation (SEE) which is fine. My former question pointed in the direction of how I could compute a coefficient of determination to estimate a goodness of fit. Calling it r^2 may mislead but there must be something similar in nonlinear regressions. Thanks for your efforts, Uwe Am Freitag, den 04.03.2011, 11:44 -0500 schrieb Liaw, Andy: > As far as I can tell, Uwe is not even fitting a model, but instead just > solving a nonlinear equation, so I don't know why he wants a R^2. I > don't see a statistical model here, so I don't know why one would want a > statistical measure. > > Andy > > > -----Original Message----- > > From: r-help-bounces at r-project.org > > [mailto:r-help-bounces at r-project.org] On Behalf Of Bert Gunter > > Sent: Friday, March 04, 2011 11:21 AM > > To: uwe.wolfram at uni-ulm.de; r-help at r-project.org > > Subject: Re: [R] Coefficient of Determination for nonlinear function > > > > The coefficient of determination, R^2, is a measure of how well your > > model fits versus a "NULL" model, which is that the data are constant. > > In nonlinear models, as opposed to linear models, such a null model > > rarely makes sense. 
Therefore the coefficient of determination is > > generally not meaningful in nonlinear modeling. > > > > Yet another way in which linear and nonlinear models > > fundamentally differ. > > > > -- Bert > > > > On Fri, Mar 4, 2011 at 5:40 AM, Uwe Wolfram > > wrote: > > > Dear Subscribers, > > > > > > I did fit an equation of the form 1 = f(x1,x2,x3) using a > > minimization > > > scheme. Now I want to compute the coefficient of > > determination. Normally > > > I would compute it as > > > > > > r_square = 1- sserr/sstot with sserr = sum_i (y_i - f_i) and sstot = > > > sum_i (y_i - mean(y)) > > > > > > sserr is clear to me but how can I compute sstot when there > > is no such > > > thing than differing y_i. These are all one. Thus > > mean(y)=1. Therefore, > > > sstot is 0. > > > > > > Thank you very much for your efforts, > > > > > > Uwe > > > -- > > > Uwe Wolfram > > > Dipl.-Ing. (Ph.D Student) > > > __________________________________________________ > > > Institute of Orthopaedic Research and Biomechanics > > > Director and Chair: Prof. Dr. Anita Ignatius > > > Center of Musculoskeletal Research Ulm > > > University Hospital Ulm > > > Helmholtzstr. 14 > > > 89081 Ulm, Germany > > > Phone: +49 731 500-55301 > > > Fax: +49 731 500-55302 > > > http://www.biomechanics.de > > > > > > ______________________________________________ > > > R-help at r-project.org mailing list > > > https://stat.ethz.ch/mailman/listinfo/r-help > > > PLEASE do read the posting guide > > http://www.R-project.org/posting-guide.html > > > and provide commented, minimal, self-contained, reproducible code. 
> > > > > > > > > > > -- > > Bert Gunter > > Genentech Nonclinical Biostatistics > > 467-7374 > > http://devo.gene.com/groups/devo/depts/ncb/home.shtml > > > > ______________________________________________ > > R-help at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > Notice: This e-mail message, together with any attach...{{dropped:26}} From jorgeivanvelez at gmail.com Sun Mar 6 06:10:28 2011 From: jorgeivanvelez at gmail.com (Jorge Ivan Velez) Date: Sun, 6 Mar 2011 00:10:28 -0500 Subject: [R] testing power of correlation In-Reply-To: <4D7296E8.1070006@web.de> References: <4D7296E8.1070006@web.de> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From jeremy.miles at gmail.com Sun Mar 6 06:10:51 2011 From: jeremy.miles at gmail.com (Jeremy Miles) Date: Sat, 5 Mar 2011 21:10:51 -0800 Subject: [R] testing power of correlation In-Reply-To: <4D7296E8.1070006@web.de> References: <4D7296E8.1070006@web.de> Message-ID: Can you clarify what you mean? The strength of the correlation is the correlation. One (somewhat) useful definition is Cohen's, who said 0.1 is small, 0.3 is medium and 0.5 is large. Or do you (as your subject says) want to get the power for a correlation? This is a different thing. Jeremy On 5 March 2011 12:02, Anna Gretschel wrote: > Dear List, > > does anyone know how I can test the strength of a correlation? > > Cheers, Anna > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
> -- Jeremy Miles Psychology Research Methods Wiki: www.researchmethodsinpsychology.com From jorgeivanvelez at gmail.com Sun Mar 6 06:21:31 2011 From: jorgeivanvelez at gmail.com (Jorge Ivan Velez) Date: Sun, 6 Mar 2011 00:21:31 -0500 Subject: [R] displaying label meeting condition (i.e. significant, i..e p value less than 005) in plot function In-Reply-To: <1017DE7935194ACBA31AFDC8B1B9B700@OwnerPC> References: <1017DE7935194ACBA31AFDC8B1B9B700@OwnerPC> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From kingsley at loaner.com Sun Mar 6 06:40:49 2011 From: kingsley at loaner.com (Kingsley G. Morse Jr.) Date: Sat, 5 Mar 2011 21:40:49 -0800 Subject: [R] How to load load multiple text files and order by id In-Reply-To: References: Message-ID: <20110306054049.GB9454@loaner.com> Hi Richard, If you haven't tried it already, maybe you could read the files into separate data frames with read.table(), and then combine them with merge(). Type ?merge to learn more. Good luck, Kingsley On 03/05/11 18:39, Richard Green wrote: > Hello R users, > I am fairly new to R and was hoping you could point me in the right > direction I have a set of text files (36). > Each file has only two columns (id and count) , I am trying to figure out a > way to load all the files together and > then have them ordered by id into a matrix data frame. For example > > If each txt file has : > ID count > id_00002 20 > id_00003 3 > > A Merged File: > ID count_file1 count_file2 count_file3 count_file4 > id_00002 20 8 12 5 19 26 > id_00003 3 0 2 0 0 0 > id_00004 75 84 241 149 271 257 > > Is there a relatively simply way to do that in R? I was trying with <- > read.table > and then <- cbind but that does not appear to be working. Any suggestions > folks have are appreciated. 
> Thanks > -Rich > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From ggrothendieck at gmail.com Sun Mar 6 06:46:26 2011 From: ggrothendieck at gmail.com (Gabor Grothendieck) Date: Sun, 6 Mar 2011 00:46:26 -0500 Subject: [R] Can body() return a function's body intact, in order, and as characters ready for editing? In-Reply-To: <20110306012031.GA6151@loaner.com> References: <20110306012031.GA6151@loaner.com> Message-ID: On Sat, Mar 5, 2011 at 8:20 PM, Kingsley G. Morse Jr. wrote: > Is my understanding correct that the body() > function currently can't return a function's body > intact, in order, and as characters ready for > editing? > > My testing and reading of body()'s help indicate > that it can not. > > Here's what I'm seeing. > > Consider pasting > > ? ?1+ > > and a function containing > > ? ?x^2 > > together to get > > ? ?1+x^2 > > As you can see below, body() reports three > elements, out of order. > > ? ?> f<-function(x) x^2; b<-body(f); paste("1+",b, sep="") > ? ?[1] "1+^" "1+x" "1+2 > > I realize that this might be worked around with > something like > > ? ?> f<-function(x) x^2; b<-do.call(paste,as.list(c(deparse(body(f)),sep=""))); paste("1+",b, sep="") > ? ?[1] "1+x^2" > > However, I'm asking a different question. > > Is my understanding correct that body() can't > return a function's body intact, in order, and as > characters ready for editing? > body does not return character strings in the first place. It returns a language object. You are subsequently turning it into character strings and the problem occurs there. -- Statistics & Software Consulting GKX Group, GKX Associates Inc. 
tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com From scttchamberlain4 at gmail.com Sun Mar 6 07:03:34 2011 From: scttchamberlain4 at gmail.com (Scott Chamberlain) Date: Sun, 6 Mar 2011 00:03:34 -0600 Subject: [R] How to load load multiple text files and order by id In-Reply-To: <20110306054049.GB9454@loaner.com> References: <20110306054049.GB9454@loaner.com> Message-ID: <52B2AEC1EE1547F889E16E7D3AD2B004@gmail.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From djmuser at gmail.com Sun Mar 6 08:36:30 2011 From: djmuser at gmail.com (Dennis Murphy) Date: Sat, 5 Mar 2011 23:36:30 -0800 Subject: [R] How to load load multiple text files and order by id In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From arrayprofile at yahoo.com Sun Mar 6 09:34:20 2011 From: arrayprofile at yahoo.com (array chip) Date: Sun, 6 Mar 2011 00:34:20 -0800 (PST) Subject: [R] read.ssd() from foreign package Message-ID: <81899.5370.qm@web56306.mail.re3.yahoo.com> Hi, I am encountering a confusing problem when I tried to use read.ssd to read SAS datasets. For one SAS dataset "a.sas7bdat", it did not work; while for another SAS dataset "b.sas7bdat" it worked: > tmp<-read.ssd("C:\\SASdata", "a",sascmd="C:/Program >Files/SAS/SASFoundation/9.2/sas.exe") SAS failed. SAS program at C:\DOCUME~1\yiz01\LOCALS~1\Temp\RtmpVjJa6m\file12384509.sas The log file will be file12384509.log in the current directory Warning message: In read.ssd("C:\\SASdata", "a", sascmd = "C:/Program Files/SAS/SASFoundation/9.2/sas.exe") : SAS return code was 1 > tmp<-read.ssd("C:\\SASdata", "b",sascmd="C:/Program >Files/SAS/SASFoundation/9.2/sas.exe") The attached log files are also attached. File "file12384509.log" is for dataset "a.sas7bdat" that does not work; while file "file1ad46e5d.log" is for dataset "b.sas7bdat" that does work. Can anyone suggest why one worked, the other did not? 
Thanks John From loren.collingwood at gmail.com Sun Mar 6 07:59:51 2011 From: loren.collingwood at gmail.com (Loren Collingwood) Date: Sat, 5 Mar 2011 22:59:51 -0800 (PST) Subject: [R] cv.glmnet errors In-Reply-To: <4D5F1858.90703@ucalgary.ca> Message-ID: <4778104.552.1299394791104.JavaMail.geo-discussion-forums@prcm18> I came across the same thing, doing multinomial cross validation with cv.glmnet but also doing it with a for loop with subsets on the X matrix and y response categories. I've tested it out various ways and I think the problem occurs because in one of the folds there are no codes for at least one of the responses. From what I gather, this trips up glmnet. See in the table code below where in the first case no zeroes appear, but in the second a zero appears. rand <- sample(3,dim(alldata)[1], replace=T) # alldata is a dataframe; allcodes is factor response variables obj1 <- glmnet(x=alldata[rand!=2,],y=allcodes[rand!=2], family="multinomial",maxit=500) #Worked obj2 <- glmnet(x=alldata[rand!=3,],y=allcodes[rand!=3], family="multinomial",maxit=500) #doesn't work > table(allcodes[rand!=2]) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 84 31 14 67 8 9 8 16 31 5 11 3 35 3 9 7 2 17 18 12 3 1 4 1 > table(allcodes[rand!=3]) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 85 20 14 72 12 7 13 15 32 4 13 3 26 3 15 5 6 13 23 16 1 0 3 1 I've looked at this with various sequences and it always seems to work when there's no zeroes, and crashes when there are zeroes. I'm working on a small data frame here (because of memory issues) so I don't think in general I would have 0s in nfold code categories. 
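The diagnosis above (a response class with no observations left once a fold is held out) can be checked directly. What follows is only a minimal sketch in base R: the function names empty_classes and stratified_folds, and the toy vectors y and fold, are hypothetical stand-ins for allcodes and rand above, and nothing here calls glmnet itself.

```r
## Minimal sketch (base R only). 'y' is a factor of response codes and
## 'fold' a vector of fold labels; both are hypothetical stand-ins for
## allcodes and rand in the message above.

## For each fold k, list the classes that have no observations left in the
## training part (everything except fold k).
empty_classes <- function(y, fold) {
  stopifnot(is.factor(y), length(y) == length(fold))
  folds <- sort(unique(fold))
  out <- lapply(folds, function(k) {
    counts <- table(y[fold != k])   # unused levels are kept with count 0
    names(counts)[counts == 0]
  })
  names(out) <- folds
  out
}

## Assign folds class by class so that every class is spread over the folds.
## A class with a single observation can still vanish from one training set.
stratified_folds <- function(y, nfolds) {
  stopifnot(is.factor(y), nfolds >= 2)
  fold <- integer(length(y))
  for (lev in levels(y)) {
    idx <- which(y == lev)
    idx <- idx[sample.int(length(idx))]                 # shuffle within the class
    fold[idx] <- rep_len(seq_len(nfolds), length(idx))  # deal out fold labels
  }
  fold
}

## Toy data: class "c" occurs once, and only in fold 1.
y    <- factor(c("a", "a", "a", "b", "b", "b", "c", "a", "b", "a"),
               levels = c("a", "b", "c"))
fold <- c(1, 2, 3, 1, 2, 3, 1, 2, 3, 1)
print(empty_classes(y, fold))
```

If the list is empty for every fold, missing classes are probably not the cause. A vector built by stratified_folds() could presumably be handed to cv.glmnet through its foldid argument, but that step is an assumption about usage and is not exercised here.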
-Loren From pdalgd at gmail.com Sun Mar 6 10:51:51 2011 From: pdalgd at gmail.com (peter dalgaard) Date: Sun, 6 Mar 2011 10:51:51 +0100 Subject: [R] read.ssd() from foreign package In-Reply-To: <81899.5370.qm@web56306.mail.re3.yahoo.com> References: <81899.5370.qm@web56306.mail.re3.yahoo.com> Message-ID: On Mar 6, 2011, at 09:34 , array chip wrote: > Hi, I am encountering a confusing problem when I tried to use read.ssd to read > SAS datasets. For one SAS dataset "a.sas7bdat", it did not work; while for > another SAS dataset "b.sas7bdat" it worked: > >> tmp<-read.ssd("C:\\SASdata", "a",sascmd="C:/Program >> Files/SAS/SASFoundation/9.2/sas.exe") > SAS failed. SAS program at > C:\DOCUME~1\yiz01\LOCALS~1\Temp\RtmpVjJa6m\file12384509.sas > > The log file will be file12384509.log in the current directory > Warning message: > In read.ssd("C:\\SASdata", "a", sascmd = "C:/Program > Files/SAS/SASFoundation/9.2/sas.exe") : > SAS return code was 1 > >> tmp<-read.ssd("C:\\SASdata", "b",sascmd="C:/Program >> Files/SAS/SASFoundation/9.2/sas.exe") > > The attached log files are also attached. Nope... Presumably, your mailer encoded them as non-text and the mailing list software scrubbed them. Try inlining them. Not much we can do without an error message to go on. (I gather, by the way, that even SAS itself has trouble reading SAS files these days due to 32/64 bit issues. So my first question would be whether SAS can read both files.) > File "file12384509.log" is for dataset > "a.sas7bdat" that does not work; while file "file1ad46e5d.log" is for dataset > "b.sas7bdat" that does work. > > Can anyone suggest why one worked, the other did not? > > Thanks > > John > > > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
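One way to inline a log, as suggested above, is to print it from R and paste the text into the message body. This is a minimal sketch only: show_log_tail, log_path and n are hypothetical names, and the demonstration uses a throwaway file with made-up lines rather than a real SAS log.

```r
## Minimal sketch: print the last lines of a log file as plain text so they
## can be copied into an e-mail. 'log_path' and 'n' are hypothetical
## arguments; point log_path at the .log file that read.ssd() reported.
show_log_tail <- function(log_path, n = 40) {
  stopifnot(file.exists(log_path))
  lines <- readLines(log_path, warn = FALSE)
  shown <- utils::tail(lines, n)
  writeLines(shown)        # plain text on the console, ready to copy
  invisible(shown)
}

## Self-contained demonstration with a temporary file standing in for the log.
demo_log <- tempfile(fileext = ".log")
writeLines(c("NOTE: first line", "NOTE: second line", "ERROR: third line"),
           demo_log)
show_log_tail(demo_log, n = 2)
```

For the failing dataset in this thread the call would be something like show_log_tail("file12384509.log"), assuming the log really is in the current working directory as the read.ssd() message says.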
-- Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com From murdoch.duncan at gmail.com Sun Mar 6 11:44:35 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Sun, 06 Mar 2011 05:44:35 -0500 Subject: [R] Can body() return a function's body intact, in order, and as characters ready for editing? In-Reply-To: <20110306012031.GA6151@loaner.com> References: <20110306012031.GA6151@loaner.com> Message-ID: <4D736593.5050109@gmail.com> On 11-03-05 8:20 PM, Kingsley G. Morse Jr. wrote: > Is my understanding correct that the body() > function currently can't return a function's body > intact, in order, and as characters ready for > editing? Yes, that's not what body() returns. You can get what you want by printing the result of body(), e.g. capture.output(print(body(f))) However, your use of paste below isn't going to work on multi-line functions. Duncan Murdoch > > My testing and reading of body()'s help indicate > that it can not. > > Here's what I'm seeing. > > Consider pasting > > 1+ > > and a function containing > > x^2 > > together to get > > 1+x^2 > > As you can see below, body() reports three > elements, out of order. > > > f<-function(x) x^2; b<-body(f); paste("1+",b, sep="") > [1] "1+^" "1+x" "1+2 > > I realize that this might be worked around with > something like > > > f<-function(x) x^2; b<-do.call(paste,as.list(c(deparse(body(f)),sep=""))); paste("1+",b, sep="") > [1] "1+x^2" > > However, I'm asking a different question. > > Is my understanding correct that body() can't > return a function's body intact, in order, and as > characters ready for editing? 
> > Thanks, > Kingsley > >> sessionInfo() > R version 2.12.1 (2010-12-16) > Platform: i486-pc-linux-gnu (32-bit) > > locale: > [1] LC_CTYPE=en_US LC_NUMERIC=C LC_TIME=en_US > [4] LC_COLLATE=en_US LC_MONETARY=C LC_MESSAGES=en_US > [7] LC_PAPER=en_US LC_NAME=C LC_ADDRESS=C > [10] LC_TELEPHONE=C LC_MEASUREMENT=en_US LC_IDENTIFICATION=C > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] optimx_0.88 setRNG_2009.11-1 minqa_1.1.13 Rcpp_0.9.2 > [5] Rvmmin_2011-2.11 Rcgmin_2011-2.10 ucminf_1.0-5 BB_2011.2-1 > [9] quadprog_1.5-3 numDeriv_2010.11-1 rgp_0.2-3 snowfall_1.84 > [13] snow_0.3-3 rrules_0.1-0 emoa_0.4-3 > > loaded via a namespace (and not attached): > [1] tcltk_2.12.1 tools_2.12.1 > >> Sys.getlocale() > [1] "LC_CTYPE=en_US;LC_NUMERIC=C;LC_TIME=en_US;LC_COLLATE=en_US;LC_MONETARY=C;LC_MESSAGES=en_US;LC_PAPER=en_US;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US;LC_IDENTIFICATION=C" > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From ligges at statistik.tu-dortmund.de Sun Mar 6 12:00:29 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Sun, 06 Mar 2011 12:00:29 +0100 Subject: [R] Ordering several histograms In-Reply-To: <1299346841131-3336869.post@n4.nabble.com> References: <1299160800582-3333382.post@n4.nabble.com> <1299346841131-3336869.post@n4.nabble.com> Message-ID: <4D73694D.2030008@statistik.tu-dortmund.de> On 05.03.2011 18:40, djbirdnerd wrote: > not yet, but i have only just started programming in r... > I don't know if i will able to... Without quoting the former thread and without replying to a particular person (rather than the mailing list only): Do you expect anybody on the mailing list knows what you are talking about? 
Uwe Ligges

> --
> View this message in context: http://r.789695.n4.nabble.com/Ordering-several-histograms-tp3333382p3336869.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

From phhs80 at gmail.com Sun Mar 6 13:37:48 2011
From: phhs80 at gmail.com (Paul Smith)
Date: Sun, 6 Mar 2011 12:37:48 +0000
Subject: [R] Plot and curve inside C++
Message-ID:

Dear All,

I would like to use

- plot,
- curve

inside a C++ program. What R package do you recommend? Rcpp?

Thanks in advance,

Paul

From sarah.goslee at gmail.com Sun Mar 6 14:11:23 2011
From: sarah.goslee at gmail.com (Sarah Goslee)
Date: Sun, 6 Mar 2011 08:11:23 -0500
Subject: [R] please help ! label selected data points in huge number of data points potentially as high as 50, 000 !
In-Reply-To: <72B118B90D234D6B8566E9137A635655@OwnerPC>
References: <72B118B90D234D6B8566E9137A635655@OwnerPC>
Message-ID:

I think you've made your problem too complicated. Given your example
below (and THANK YOU for including a workable example), is this not
what you need?

sigdata <- dataf[dataf$p < 0.01,]
plot(dataf$xvar, dataf$p)
text(sigdata$xvar, sigdata$p, sigdata$name)

text() will take vectors of arguments.

Sarah

On Sat, Mar 5, 2011 at 6:29 PM, Umesh Rosyara wrote:
> Dear All
>
> I am reposting because I my problem is real issue and I have been working on
> this. I know this might be simple to those who know it ! Anyway I need help
> !
>
> Let me clear my point. I have huge number of datapoints plotted using either
> base plot function or xyplot in lattice (I have preference to use lattice).
>      name xvar           p
> 1      M1    1 0.107983837
> 2      M2   11 0.209125624
> 3      M3   21 0.163959428
> 4      M4   31 0.132469859
> 5      M5   41 0.086095130
> 6      M6   51 0.180822010
> 7      M7   61 0.246619925
> 8      M8   71 0.147363687
> 9      M9   81 0.162663127
> ........
> 5000 observations
>
> I need to plot xvar (x variable) and p (y variable) using either plot () or
> xyplot(). And I want show (print to graph) datapoint name labels to those
> rows that have p value < 0.01 (means that they are significant). With my
> limited R knowlege I can use text (x,y, labels) option to manually add the
> text, but I have huge number of data point(though I provide just 1000 here,
> potentially it can go upto 50,000). So I want to display name corresponding
> to those observations (rows) that have pvalue less than 0.05 (threshold).
>
> Here is my example dataset and my status:
> name <- c(paste ("M", 1:5000, sep = ""))
> xvar <- seq(1, 50000, 10)
> set.seed(134)
> p <- rnorm(5000, 0.15,0.05)
> dataf <- data.frame(name,xvar, p)
>
> # using lattice (my first preference)
> require(lattice)
> xyplot(p ~ xvar, dataf)
>
> #I want to display names for the following observation that meet requirement
> of p <0.01.
> which (dataf$p < 0.01)
> [1]  811  854 1636 1704 2148 2161 2244 3205 3268 4177 4564 4614 4639 4706
>
> Thus significant observations are:
>       name  xvar             p
> 811   M811  8101  0.0050637068
> 854   M854  8531 -0.0433901783
> 1636 M1636 16351 -0.0279014039
> 1704 M1704 17031  0.0029878335
> 2148 M2148 21471  0.0048898232
> 2161 M2161 21601 -0.0354130557
> 2244 M2244 22431  0.0003255200
> 3205 M3205 32041  0.0079758430
> 3268 M3268 32671  0.0012797145
> 4177 M4177 41761  0.0015487439
> 4564 M4564 45631  0.0024867152
> 4614 M4614 46131  0.0078381964
> 4639 M4639 46381 -0.0063151605
> 4706 M4706 47051  0.0032200517
>
> I want the datapoint (8101, 0.0050637068) with M811 in the plot. Similarly
> for all of the above (that are significant). I do not want to label all out
> of 5000 who do have p value < 0.01.
I know I can add manually - text (8101, > 0.0050637068, M811) in plot() in base. > > plot (dataf$xvar,p) > text (8101, 0.0050637068, "M811") > text (8531, -0.0433901783, "M854") > > I need more automation to deal with observations as high as 50,000. In real > sense I do not know how many variables there will be. > > You help is highly appreciated. Thank you; > > Best Regards > > Umesh R > > > > -- Sarah Goslee http://www.functionaldiversity.org From marchywka at hotmail.com Sun Mar 6 14:25:28 2011 From: marchywka at hotmail.com (Mike Marchywka) Date: Sun, 6 Mar 2011 08:25:28 -0500 Subject: [R] Coefficient of Determination for nonlinear function In-Reply-To: <1299341652.1849.5.camel@pollux> References: <1299246001.1764.17.camel@pollux>, , , <1299341652.1849.5.camel@pollux> Message-ID: ---------------------------------------- > From: uwe.wolfram at uni-ulm.de > To: andy_liaw at merck.com > Date: Sat, 5 Mar 2011 17:14:12 +0100 > CC: r-help at r-project.org; gunter.berton at gene.com > Subject: Re: [R] Coefficient of Determination for nonlinear function > > Dear Bert, dear Andy, > > thanks for your answers! I am quite aware that I do not fit a linear > model, so r^2 in Pearson's sens is indeed meaningless. Instead, I am > "fitting" an equation - or rather using an optimisation - were the > experimentally derived point cloud (x1, x2, x3) should deliver something > like 1 = f(x1, x2, x3). What I am trying to estimate is the quality of > the fit. One thing I computed so far is the standard error of the The quality of the fit is determined by how much additional funding it allows you to secure :) Obviously I'm being facetious but there are two real issues here. You may in fact be modeling revenue numbers as another poster here explicitly intended. Money or not, the quality is related to some underlying system you are presumably attempting to understand. 
Non-linear being a classification of exclusion it is quite open ended and any generic goodness measure may not be of much use to you. The other side of my first sentence would be that it is always easy to shop for a result you want for some purpose other than understanding your data. You may not state this, but you will likely find many ways to measure your results and then end up picking the one that agrees the most with what you want to believe. The great thing about R is that ad hoc exploratory work is easy and you may find simply plotting residuals and doing simple sensitivity tests by perturbing the data can be of some use. Or you may want a specific test to determine if you have ( say your nonlinear equation is a fit to a spectrum of some kind) a bunch of gaussian or lorentzian lines for example. I think I can say with reasonable certainty, "it depends." > equation (SEE) which is fine. My former question pointed in the > direction of how I could compute a coefficient of determination to > estimate a goodness of fit. Calling it r^2 may mislead but there must be > something similar in nonlinear regressions. > > Thanks for your efforts, > > Uwe > > > Am Freitag, den 04.03.2011, 11:44 -0500 schrieb Liaw, Andy: > > As far as I can tell, Uwe is not even fitting a model, but instead just > > solving a nonlinear equation, so I don't know why he wants a R^2. I > > don't see a statistical model here, so I don't know why one would want a > > statistical measure. > > > > Andy > > > > > -----Original Message----- > > > From: r-help-bounces at r-project.org > > > [mailto:r-help-bounces at r-project.org] On Behalf Of Bert Gunter > > > Sent: Friday, March 04, 2011 11:21 AM > > > To: uwe.wolfram at uni-ulm.de; r-help at r-project.org > > > Subject: Re: [R] Coefficient of Determination for nonlinear function > > > > > > The coefficient of determination, R^2, is a measure of how well your > > > model fits versus a "NULL" model, which is that the data are constant. 
> > > In nonlinear models, as opposed to linear models, such a null model
> > > rarely makes sense. Therefore the coefficient of determination is
> > > generally not meaningful in nonlinear modeling.
> > >
> > > Yet another way in which linear and nonlinear models
> > > fundamentally differ.
> > >
> > > -- Bert
> > >
> > > On Fri, Mar 4, 2011 at 5:40 AM, Uwe Wolfram wrote:
> > > > Dear Subscribers,
> > > >
> > > > I did fit an equation of the form 1 = f(x1,x2,x3) using a minimization
> > > > scheme. Now I want to compute the coefficient of determination. Normally
> > > > I would compute it as
> > > >
> > > > r_square = 1 - sserr/sstot with sserr = sum_i (y_i - f_i)^2 and sstot =
> > > > sum_i (y_i - mean(y))^2
> > > >
> > > > sserr is clear to me but how can I compute sstot when there is no such
> > > > thing as differing y_i. These are all one. Thus mean(y)=1. Therefore,
> > > > sstot is 0.
> > > >
> > > > Thank you very much for your efforts,
> > > >
> > > > Uwe
> > > > --
> > > > Uwe Wolfram
> > > > Dipl.-Ing. (Ph.D Student)
> > > > __________________________________________________
> > > > Institute of Orthopaedic Research and Biomechanics
> > > > Director and Chair: Prof. Dr. Anita Ignatius
> > > > Center of Musculoskeletal Research Ulm
> > > > University Hospital Ulm
> > > > Helmholtzstr. 14
> > > > 89081 Ulm, Germany
> > > > Phone: +49 731 500-55301
> > > > Fax: +49 731 500-55302
> > > > http://www.biomechanics.de
> > > >
> > > > ______________________________________________
> > > > R-help at r-project.org mailing list
> > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > > and provide commented, minimal, self-contained, reproducible code.
> > > > > > > > > > > > > > > > -- > > > Bert Gunter > > > Genentech Nonclinical Biostatistics > > > 467-7374 > > > http://devo.gene.com/groups/devo/depts/ncb/home.shtml > > > > > > ______________________________________________ > > > R-help at r-project.org mailing list > > > https://stat.ethz.ch/mailman/listinfo/r-help > > > PLEASE do read the posting guide > > > http://www.R-project.org/posting-guide.html > > > and provide commented, minimal, self-contained, reproducible code. > > > > > Notice: This e-mail message, together with any attach...{{dropped:26}} > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From bajaj141003 at gmail.com Sun Mar 6 15:03:03 2011 From: bajaj141003 at gmail.com (Nipesh Bajaj) Date: Sun, 6 Mar 2011 19:33:03 +0530 Subject: [R] Writing Rd files Message-ID: Hi all, I have created a package and now into writing it's help files. However I am having problem on, how to put a 'new line' in any statement of the help file? For example please consider following: \title{ This is a new function and this function will calculate the mean. } However I want to write it is this way: \title{ This is a new function & and, this function will calculate the mean. } I have tried using: "\n" or "\\" however could not achieve what I want. Can somebody please help me on how to tell that, break here and go to the next line? Thanks, From bajaj141003 at gmail.com Sun Mar 6 15:32:43 2011 From: bajaj141003 at gmail.com (Nipesh Bajaj) Date: Sun, 6 Mar 2011 20:02:43 +0530 Subject: [R] Seeking guidance in package creation when it contains s4 class Message-ID: Dear all, I am having problem to create a package when this package is supposed to have some newly created s4 class. 
Here is my workout: > #rm(list = ls()) > setClass("aClass", sealed=T, representation(slot1 = "vector", slot2 = "character")) [1] "aClass" > fn1 <- function(x, y, z) { + x <- x[1] + y <- y[1] + z <- as.character(z[1]) + new("aClass", slot1 = x+y, slot2 = z) + } > #fn1(1,2,3) > package.skeleton("trial11") Creating directories ... Creating DESCRIPTION ... Creating Read-and-delete-me ... Saving functions and data ... Making help files ... Done. Further steps are described in './trial11/Read-and-delete-me'. Warning message: In dump(internalObjs, file = file.path(code_dir, sprintf("%s-internal.R", : deparse of an S4 object will not be source()able While running package.skeleton, I got this warning message, then when I run R CMD INSTALL trial11, I got an error saying: ERROR: unable to collate files for package 'trial11' It would be really helpful if somebody can point me how different the package creation will be if it contains s4 class? Thanks, From johannes_graumann at web.de Sun Mar 6 11:04:12 2011 From: johannes_graumann at web.de (Johannes Graumann) Date: Sun, 6 Mar 2011 13:04:12 +0300 Subject: [R] read.table mystery Message-ID: Hello, Please have a look at the code below, which I use to read in the attached file. As line 18 of the file reads "1065:>sp|Q9V3T9|ADRO_DROME NADPH:adrenodoxin oxidoreductase, mitochondrial OS=Drosophila melanogaster GN=dare PE=2 SV=1", I expect the code below to produce a 3 column data frame with most of the last column empty and line 18 to produce a data.frame row like so: V1 1065 V2 >sp|Q9V3T9|ADRO_DROME NADPH V3 adrenodoxin oxidoreductase, mitochondrial OS=Drosophila melanogaster GN=dare PE=2 SV=1 Why is that not so? Thanks for any hint. Sincerely, Joh read.table( "/tmp/testfile.txt", sep=":", header=FALSE, quote="", fill=TRUE )[19,] -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: testfile.txt URL: From marchywka at hotmail.com Sun Mar 6 14:06:48 2011 From: marchywka at hotmail.com (Mike Marchywka) Date: Sun, 6 Mar 2011 08:06:48 -0500 Subject: [R] Rapache ( was Developing a web crawler ) In-Reply-To: <4D6FE62B.70209@Vanderbilt.edu> References: <1299144164900-3332993.post@n4.nabble.com>, , <4D6FE62B.70209@Vanderbilt.edu> Message-ID: ---------------------------------------- > Date: Thu, 3 Mar 2011 13:04:11 -0600 > From: Matt.Shotwell at vanderbilt.edu > To: r-help at r-project.org > Subject: Re: [R] Developing a web crawler / R "webkit" or something similar? [off topic] > > On 03/03/2011 08:07 AM, Mike Marchywka wrote: > > > > > > > > > > > > > > > >> Date: Thu, 3 Mar 2011 01:22:44 -0800 > >> From: antujsrv at gmail.com > >> To: r-help at r-project.org > >> Subject: [R] Developing a web crawler > >> > >> Hi, > >> > >> I wish to develop a web crawler in R. I have been using the functionalities > >> available under the RCurl package. > >> I am able to extract the html content of the site but i don't know how to go > > > > In general this can be a big effort but there may be things in > > text processing packages you could adapt to execute html and javascript. > > However, I guess what I'd be looking for is something like a "webkit" > > package or other open source browser with or without an "R" interface. > > This actually may be an ideal solution for a lot of things as you get > > all the content handlers of at least some browser. > > > > > > Now that you mention it, I wonder if there are browser plugins to handle > > "R" content ( I'd have to give this some thought, put a script up as > > a web page with mime type "test/R" and have it execute it in R. ) > > There are server-side solutions for this sort of thing. See > http://rapache.net/ . Also, there was a string of messages on R-devel > some years ago addressing the mime type issue; beginning here: > http://tolstoy.newcastle.edu.au/R/devel/05/11/3054.html . 
Though I don't > know whether there was a resolution. Some suggestions were text/x-R, > text/x-Rd, application/x-RData. > The rapache demo looks like something I could use right away but I haven't looked into the handlers yet. I have installed rapache now on my debian system ( still have config issues but I did get apach2 to restart LOL) Before I plow into this too far, how would this compare/compete with something like a PHP library for Rserve? That is the approach I had been pursuing. Thanks. > -Matt > > > From rosyaraur at gmail.com Sun Mar 6 14:17:29 2011 From: rosyaraur at gmail.com (Umesh Rosyara) Date: Sun, 6 Mar 2011 08:17:29 -0500 Subject: [R] Data lebals xylattice plot: RE: displaying label meeting condition (i.e. significant, i..e p value less than 005) in plot function In-Reply-To: References: <1017DE7935194ACBA31AFDC8B1B9B700@OwnerPC> Message-ID: <78EC64EDAC724C96BC94096382DFCAF9@OwnerPC> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From edd at debian.org Sun Mar 6 15:53:50 2011 From: edd at debian.org (Dirk Eddelbuettel) Date: Sun, 6 Mar 2011 08:53:50 -0600 Subject: [R] Plot and curve inside C++ In-Reply-To: References: Message-ID: <19827.40958.41376.97811@max.nulle.part> On 6 March 2011 at 12:37, Paul Smith wrote: | Dear All, | | I would like to use | | - plot, | - curve | | inside a C++ program. What R package do you recommend? Rcpp? You can use base R, embedding R is explained in the 'Writing R Extensions' manual. That said, the material is a little on the advanced side. Rcpp and RInside try to provide an easier API, and some users find it helpful. As for your question, I am committing the code below as rinside_sample11.cpp in the examples/standard/ directory of RInside. With the generic Makefile in that diretory, you just say 'make' and the rinside_sample11 binary results (as do all the other examples and tests there). 
I hope you find it mostly self-explanatory, if not please come to the
rcpp-devel list for help.

Dirk

// Simple example motivated by post from Paul Smith
// to r-help on 06 Mar 2011
//
// Copyright (C) 2011 Dirk Eddelbuettel and Romain Francois

#include <RInside.h>                    // for the embedded R via RInside

int main(int argc, char *argv[]) {

    // create an embedded R instance
    RInside R(argc, argv);

    // evaluate an R expression with curve()
    // because RInside defaults to interactive=false we use a file
    std::string cmd = "tmpf <- tempfile('curve'); "
                      "png(tmpf); "