From john.kerpel at gmail.com Thu Sep 1 00:36:04 2011
From: john.kerpel at gmail.com (John Kerpel)
Date: Wed, 31 Aug 2011 17:36:04 -0500
Subject: [R] MS-VAR introduction
In-Reply-To: <1314825458317-3782271.post@n4.nabble.com>
References: <005901c9ee6f$16a19ac0$43e4d040$@com> <1314825458317-3782271.post@n4.nabble.com>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From rshepard at appl-ecosys.com Thu Sep 1 01:55:22 2011
From: rshepard at appl-ecosys.com (Rich Shepard)
Date: Wed, 31 Aug 2011 16:55:22 -0700 (PDT)
Subject: [R] Correct Syntax for subset.data.frame()
Message-ID:

I want to create individual data.frames for each of the 8 param factors in
chemdata. The syntax I tried (based on Teetor's book, page 132) and R's
response are:

> ars <- subset(chemdata, select=c(site,sampdate,param,quant), subset=(param = "As"))
Error in subset.data.frame(chemdata, select = c(site, sampdate, param, :
  'subset' must evaluate to logical

I'm asking for columns of type factor, date, factor, and numeric. What have
I done incorrectly here?

Rich

From hindiogine at gmail.com Thu Sep 1 01:56:19 2011
From: hindiogine at gmail.com (Henri-Paul Indiogine)
Date: Wed, 31 Aug 2011 16:56:19 -0700
Subject: [R] dbWriteTable error message
In-Reply-To:
References:
Message-ID:

I am answering myself here....

2011/8/31 Henri-Paul Indiogine :
> dbWriteTable(con, "fileAttr", DF.4, row.names=FALSE, overwrite=TRUE)
>
> Then I get the following error:
>
> [1] FALSE
> Warning message:
> In value[[3L]](cond) :
>   RAW() can only be applied to a 'raw', not a 'character'

I have no idea what the cause was, but I rebuilt the data frame from
scratch and now it works.
HP -- Henri-Paul Indiogine Curriculum & Instruction Texas A&M University TutorFind Learning Centre Email: hindiogine at gmail.com Skype: hindiogine Website: http://people.cehd.tamu.edu/~sindiogine From jholtman at gmail.com Thu Sep 1 02:42:15 2011 From: jholtman at gmail.com (jim holtman) Date: Wed, 31 Aug 2011 20:42:15 -0400 Subject: [R] Correct Syntax for subset.data.frame() In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From xkziloj at gmail.com Thu Sep 1 03:32:55 2011 From: xkziloj at gmail.com (. .) Date: Wed, 31 Aug 2011 22:32:55 -0300 Subject: [R] Removing special chars in strings? Message-ID: Hi all, How can I replace those "\" in the str? Thanks in advance. func <- function(str) { print(gsub("\\","",str)) } func("bla\ble\bli") From mlt at gmx.us Thu Sep 1 03:40:15 2011 From: mlt at gmx.us (Mikhail Titov) Date: Wed, 31 Aug 2011 20:40:15 -0500 Subject: [R] Removing special chars in strings? In-Reply-To: References: Message-ID: <4E5EE27F.7070803@gmx.us> I usually use something like [\\] Mikhail On 08/31/2011 08:32 PM, . . wrote: > Hi all, > > How can I replace those "\" in the str? > > Thanks in advance. > > func <- function(str) { > print(gsub("\\","",str)) > } > func("bla\ble\bli") > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From wdunlap at tibco.com Thu Sep 1 03:47:19 2011 From: wdunlap at tibco.com (William Dunlap) Date: Thu, 1 Sep 2011 01:47:19 +0000 Subject: [R] Removing special chars in strings? In-Reply-To: References: Message-ID: There are no backslash characters in the string "bla\ble\bli". "\b" is used to indicate a backspace character, just as "\n" is used to indicate a newline character. 
You can get rid of the backslash characters with > gsub("\b","","bla\ble\bli") [1] "blaleli" or change them to b's with > gsub("\b","b","bla\ble\bli") [1] "blablebli" Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of . . > Sent: Wednesday, August 31, 2011 6:33 PM > To: R-help at r-project.org > Subject: [R] Removing special chars in strings? > > Hi all, > > How can I replace those "\" in the str? > > Thanks in advance. > > func <- function(str) { > print(gsub("\\","",str)) > } > func("bla\ble\bli") > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From wdunlap at tibco.com Thu Sep 1 03:50:04 2011 From: wdunlap at tibco.com (William Dunlap) Date: Thu, 1 Sep 2011 01:50:04 +0000 Subject: [R] Removing special chars in strings? In-Reply-To: References: Message-ID: Typo below: > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of William Dunlap > Sent: Wednesday, August 31, 2011 6:47 PM > To: . .; R-help at r-project.org > Subject: Re: [R] Removing special chars in strings? > > There are no backslash characters in the string "bla\ble\bli". > "\b" is used to indicate a backspace character, just > as "\n" is used to indicate a newline character. > > You can get rid of the XbackslashX characters with backspace > > gsub("\b","","bla\ble\bli") > [1] "blaleli" > or change them to b's with > > gsub("\b","b","bla\ble\bli") > [1] "blablebli" > > Bill Dunlap > Spotfire, TIBCO Software > wdunlap tibco.com > > > -----Original Message----- > > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of . . 
> > Sent: Wednesday, August 31, 2011 6:33 PM > > To: R-help at r-project.org > > Subject: [R] Removing special chars in strings? > > > > Hi all, > > > > How can I replace those "\" in the str? > > > > Thanks in advance. > > > > func <- function(str) { > > print(gsub("\\","",str)) > > } > > func("bla\ble\bli") > > > > ______________________________________________ > > R-help at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From xkziloj at gmail.com Thu Sep 1 03:58:45 2011 From: xkziloj at gmail.com (. .) Date: Wed, 31 Aug 2011 22:58:45 -0300 Subject: [R] Removing special chars in strings? In-Reply-To: References: Message-ID: I got it! Where did I find the table relating the code and the respective meaning? I want to replace ". Thanks On Wed, Aug 31, 2011 at 10:47 PM, William Dunlap wrote: > There are no backslash characters in the string "bla\ble\bli". > "\b" is used to indicate a backspace character, just > as "\n" is used to indicate a newline character. > > You can get rid of the backslash characters with > ?> gsub("\b","","bla\ble\bli") > ?[1] "blaleli" > or change them to b's with > ?> gsub("\b","b","bla\ble\bli") > ?[1] "blablebli" > > Bill Dunlap > Spotfire, TIBCO Software > wdunlap tibco.com > >> -----Original Message----- >> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of . . >> Sent: Wednesday, August 31, 2011 6:33 PM >> To: R-help at r-project.org >> Subject: [R] Removing special chars in strings? 
>>
>> Hi all,
>>
>> How can I replace those "\" in the str?
>>
>> Thanks in advance.
>>
>> func <- function(str) {
>>   print(gsub("\\","",str))
>> }
>> func("bla\ble\bli")
>

From wdunlap at tibco.com Thu Sep 1 04:38:58 2011
From: wdunlap at tibco.com (William Dunlap)
Date: Thu, 1 Sep 2011 02:38:58 +0000
Subject: [R] Removing special chars in strings?
In-Reply-To:
References:
Message-ID:

> -----Original Message-----
> From: . . [mailto:xkziloj at gmail.com]
> Sent: Wednesday, August 31, 2011 6:59 PM
> To: William Dunlap
> Cc: R-help at r-project.org
> Subject: Re: [R] Removing special chars in strings?
>
> I got it!
>
> Where did I find the table relating the code and the respective meaning?

Did you start with help("character")? Aside from the \a they are
the traditional C ones (e.g., Kernighan and Ritchie, 1978, p 181).

   \a   alert (bell)
   \b   backspace
   \t   tab
   \n   newline
   \v   vertical tab
   \f   formfeed
   \r   carriage return (without newline)
   \'   single quote
   \"   double quote

print() shows the backslashed version (except for \', which is not
required) and cat() causes the user interface to display their
"meaning". Not all user interfaces support all of them.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

>
> I want to replace ".
>
> Thanks
>
> On Wed, Aug 31, 2011 at 10:47 PM, William Dunlap wrote:
> > There are no backslash characters in the string "bla\ble\bli".
> > "\b" is used to indicate a backspace character, just
> > as "\n" is used to indicate a newline character.
> >
> > You can get rid of the backslash characters with
> >   > gsub("\b","","bla\ble\bli")
> >   [1] "blaleli"
> > or change them to b's with
> >   > gsub("\b","b","bla\ble\bli")
> >   [1] "blablebli"
> >
> > Bill Dunlap
> > Spotfire, TIBCO Software
> > wdunlap tibco.com
> >
> >> -----Original Message-----
> >> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of . .
> >> Sent: Wednesday, August 31, 2011 6:33 PM
> >> To: R-help at r-project.org
> >> Subject: [R] Removing special chars in strings?
> >>
> >> Hi all,
> >>
> >> How can I replace those "\" in the str?
> >>
> >> Thanks in advance.
> >>
> >> func <- function(str) {
> >>   print(gsub("\\","",str))
> >> }
> >> func("bla\ble\bli")
> >

From jholtman at gmail.com Thu Sep 1 05:07:08 2011
From: jholtman at gmail.com (Jim Holtman)
Date: Wed, 31 Aug 2011 23:07:08 -0400
Subject: [R] Basic question about re-writing for loop as a function
In-Reply-To:
References:
Message-ID:

Use Rprof to see where the time is spent. Take the strsplit out of the
loop and do it once outside to create an object you can test against in
the loop. You can probably get rid of the loop easily, but since there
is no example of the data, it is hard to create a solution.

Sent from my iPad

On Aug 29, 2011, at 9:55, Chris Beeley wrote:

> Hello-
>
> Sorry to ask a basic question, but I've spent many hours on this now
> and seem to be missing something.
> > I have a loop that looks like this: > > mainmat=data.frame(matrix(data=0, ncol=92, nrow=length(predata$Words_MH))) > > for(i in 1:length(predata$Words_MH)){ > for(j in 1:92){ > > mainmat[i,j]=ifelse(j %in% > as.numeric(unlist(strsplit(predata$Words_MH[i], split=","))), 1, 0) > > } > } > > What it's doing is creating a matrix with 92 columns, that's the > number of different codes, and then for every row of my data it looks > to see if the code (code 1, code 2, etc.) is in the string and if it > is, returns a 1 in the relevant column (column 1 for code 1, column 2 > for code 2, etc.) > > There are 1000 rows in the database, and I have to run several > versions of this code, so it just takes way too long, I have been > trying to rewrite using lapply. I tried this: > > myfunction=function(x, y) ifelse(x %in% > as.numeric(unlist(strsplit(predata$Words_MH[y], split=","))), 1, 0) > > for(j in 1:92){ > mainmat[,j]= lapply(predata$Words, myfunction) > } > > but I don't think I can use something that takes two inputs, and I > can't seem to remove either. > > Here's a dput of the first 10 rows of the variable in case that's helpful: > > predata$Words=c("1", "1", "1", "1", "2,3,4", "5", "1", "1", "6", "7,8,9,10") > > Given these data, I want the function to return, for the first column, > 1, 1, 1, 1, 0, 0, 1, 1, 0, 0 (because those are the values of Words > which contain a 1) and for the second column return 0, 0, 0, 0, 1, 0, > 0, 0, 0, 0 (because the fifth value is the only one that contains a > 2). > > Any suggestions gratefully received! > > Chris Beeley > Institute of Mental Health, UK > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From xkziloj at gmail.com Thu Sep 1 05:12:53 2011 From: xkziloj at gmail.com (. .) 
Date: Thu, 1 Sep 2011 00:12:53 -0300 Subject: [R] Evaluate an expresion comming from string Message-ID: Whats wrong here? I was expecting 11 as the result... Thanks in advance. bench <- function(str,...) { func <- function(x) x+10 expr <- list(...)[1] str <- gsub("XXX",expr,str) x <- as.call(gsub("\"","",str)) eval(x) } bench("func(XXX)", "1") From jholtman at gmail.com Thu Sep 1 05:18:55 2011 From: jholtman at gmail.com (Jim Holtman) Date: Wed, 31 Aug 2011 23:18:55 -0400 Subject: [R] Basic question about re-writing for loop as a function In-Reply-To: References: Message-ID: <4DC7719E-BE9E-400B-8D11-B7F8E6D0EF74@gmail.com> sorry, did not see your data at the bottom of the email Sent from my iPad On Aug 29, 2011, at 9:55, Chris Beeley wrote: > Hello- > > Sorry to ask a basic question, but I've spent many hours on this now > and seem to be missing something. > > I have a loop that looks like this: > > mainmat=data.frame(matrix(data=0, ncol=92, nrow=length(predata$Words_MH))) > > for(i in 1:length(predata$Words_MH)){ > for(j in 1:92){ > > mainmat[i,j]=ifelse(j %in% > as.numeric(unlist(strsplit(predata$Words_MH[i], split=","))), 1, 0) > > } > } > > What it's doing is creating a matrix with 92 columns, that's the > number of different codes, and then for every row of my data it looks > to see if the code (code 1, code 2, etc.) is in the string and if it > is, returns a 1 in the relevant column (column 1 for code 1, column 2 > for code 2, etc.) > > There are 1000 rows in the database, and I have to run several > versions of this code, so it just takes way too long, I have been > trying to rewrite using lapply. I tried this: > > myfunction=function(x, y) ifelse(x %in% > as.numeric(unlist(strsplit(predata$Words_MH[y], split=","))), 1, 0) > > for(j in 1:92){ > mainmat[,j]= lapply(predata$Words, myfunction) > } > > but I don't think I can use something that takes two inputs, and I > can't seem to remove either. 
> > Here's a dput of the first 10 rows of the variable in case that's helpful: > > predata$Words=c("1", "1", "1", "1", "2,3,4", "5", "1", "1", "6", "7,8,9,10") > > Given these data, I want the function to return, for the first column, > 1, 1, 1, 1, 0, 0, 1, 1, 0, 0 (because those are the values of Words > which contain a 1) and for the second column return 0, 0, 0, 0, 1, 0, > 0, 0, 0, 0 (because the fifth value is the only one that contains a > 2). > > Any suggestions gratefully received! > > Chris Beeley > Institute of Mental Health, UK > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From lothlorien90 at hotmail.com Thu Sep 1 04:25:15 2011 From: lothlorien90 at hotmail.com (Ash) Date: Wed, 31 Aug 2011 19:25:15 -0700 (PDT) Subject: [R] problem with "plm" package In-Reply-To: <80098b6d0911270708g216f6b01m5dd57131b199df66@mail.gmail.com> References: <80098b6d0911270708g216f6b01m5dd57131b199df66@mail.gmail.com> Message-ID: <1314843915121-3782639.post@n4.nabble.com> Hi, I am trying to complete a very simple panel analysis on some bank data. Call: Formula<-plm(RoE~RoA+CAR+Inc.Dep+Cash.TL+NPL.Loans, data=Banks1, model="random", index=c("Bank.I.D.","Year")) summary(Formula) I get the following error code: ERROR: missing value where TRUE/FALSE needed Does anyone know what true/false field I am missing? Thanks, -- View this message in context: http://r.789695.n4.nabble.com/problem-with-dynformula-from-plm-package-tp867660p3782639.html Sent from the R help mailing list archive at Nabble.com. From sharma.ram.h at gmail.com Thu Sep 1 03:45:32 2011 From: sharma.ram.h at gmail.com (Ram H. 
Sharma) Date: Wed, 31 Aug 2011 21:45:32 -0400 Subject: [R] UNSOLVED: Fwd: generate correlated qualitative data Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From mackay at nsw.chariot.net.au Thu Sep 1 00:59:31 2011 From: mackay at nsw.chariot.net.au (Duncan Mackay) Date: Thu, 01 Sep 2011 08:59:31 +1000 Subject: [R] Getting the values out of histogram (lattice) In-Reply-To: References: Message-ID: <201108312301.p7VN1O2N007997@mail12.tpg.com.au> Hi Monica An example abbreviated from ?histogram x = histogram( ~ height, data = singer) names(x) # to see what is there str(x) # information x$panel.args.common $breaks [1] 59.36 61.28 63.20 65.12 67.04 68.96 70.88 72.80 74.72 76.64 $type [1] "percent" $equal.widths [1] TRUE $nint [1] 9 # x$panel.args: name as number x[[35]] [[1]] [[1]]$x [1] 64 62 66 65 60 61 65 66 65 63 67 65 62 65 68 65 63 65 62 65 66 62 65 63 65 66 65 62 65 66 65 61 65 66 65 62 63 67 60 67 66 62 65 62 [45] 61 62 66 60 65 65 61 64 68 64 63 62 64 62 64 65 60 65 70 63 67 66 65 62 68 67 67 63 67 66 63 72 62 61 66 64 60 61 66 66 66 62 70 65 [89] 64 63 65 69 61 66 65 61 63 64 67 66 68 70 65 65 65 64 66 64 70 63 70 64 63 67 65 63 66 66 64 64 70 70 66 66 66 69 67 65 69 72 71 66 [133] 76 74 71 66 68 67 70 65 72 70 68 64 73 66 68 67 64 68 73 69 71 69 76 71 69 71 66 69 71 71 71 69 70 69 68 70 68 69 72 70 72 69 73 71 [177] 72 68 68 71 66 68 71 73 73 70 68 70 75 68 71 70 74 70 75 75 69 72 71 70 71 68 70 75 72 66 72 70 69 72 75 67 75 74 72 72 74 72 72 74 [221] 70 66 68 75 68 70 72 67 70 70 69 72 71 74 75 etc to suite your requirements HTH Regards Duncan Duncan Mackay Department of Agronomy and Soil Science University of New England ARMIDALE NSW 2351 Email: home mackay at northnet.com.au At 23:50 31/08/2011, you wrote: >Hi, > > > >I have a relatively big dataset and I want to construct >some histograms using the histogram function in lattice. 
>One thing I am interested in is to look at differences between
>density and percent. I know I can use the hist function but it seems
>that this function gives sometimes some wrong answers and the density
>is actually a percent since it is calculated as counts in the bin
>divided by the total no. of points. Let me explain.
>
>If I let the hist function to decide the breaks, or I use
>a small number, or one of the pre-determined methods to select breaks then
>everything seems to be in order. But if I decide to use - for example - 100 as
>a breaks (I have over 90000 data points so the number of breaks is not
>necessarily too large I would think) the density for the first bin is over 1,
>although for all the other breaks the density is actually a percent since
>it is the count for that bin divided by the total no. of points I have.
>So ... Here it is something wrong or most probably I am doing something wrong.
>
>If I use the function histogram from lattice it is
>obvious that there is a difference between the percent param and the density
>param. I looked at the function code and I didn't understand it - to be honest.
>It seems it calls inside the hist function, or a slightly modify variant of
>hist. Reading about the object trellis I saw I can access different info about
>the graph it generates but nothing about the actual data that goes into
>defining the histogram. How can I access the data from it?
>
>I am not sure if my problem is platform specific - it should
>not be - but I have Rx64 2.13.1 on windows machine, in case it counts.
>
>I appreciate your help, thanks,
>
>Monica
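
A rough sketch of how the pieces shown in the reply above might be combined: the raw values and the break points are pulled out of the trellis object and handed to hist() with plot = FALSE, which returns the bin counts and densities without drawing anything. It assumes a single-panel plot and that the breaks stored in panel.args.common are the ones lattice used for binning; the names vals, breaks and bars are illustrative only.

```r
library(lattice)

# Same example object as in the reply above (singer ships with lattice)
x <- histogram(~ height, data = singer)

# Raw data of the first (only) panel and the stored break points
vals   <- x$panel.args[[1]]$x
breaks <- x$panel.args.common$breaks

# Recompute the bins with base R; plot = FALSE returns only the numbers
h <- hist(vals, breaks = breaks, plot = FALSE)

# One row per bar: bin limits plus the three possible bar heights
bars <- data.frame(
  lower   = head(h$breaks, -1),
  upper   = tail(h$breaks, -1),
  count   = h$counts,
  percent = 100 * h$counts / sum(h$counts),
  density = h$density
)
print(bars)
```

For a multi-panel plot the same idea would presumably be repeated over each element of x$panel.args.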
From nilaya.sharma at gmail.com Thu Sep 1 03:10:06 2011
From: nilaya.sharma at gmail.com (Nilaya Sharma)
Date: Wed, 31 Aug 2011 21:10:06 -0400
Subject: [R] vector output loop or function
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From rolf.turner at xtra.co.nz Thu Sep 1 02:09:05 2011
From: rolf.turner at xtra.co.nz (Rolf Turner)
Date: Thu, 01 Sep 2011 12:09:05 +1200
Subject: [R] Getting the values out of histogram (lattice)
In-Reply-To:
References:
Message-ID: <4E5ECD21.1010008@xtra.co.nz>

I'm not entirely sure that I understand what your problem is. A
reproducible example would probably have helped.

However I conjecture that the problem boils down to confusing
"probability" with "probability *density*". Percentages are the
(estimated) bin probabilities times 100. The percentage for the i-th
bin is 100*n_i/n where n_i is the count for the i-th bin and n is the
sum of the n_i. The percentages sum to 100 (equivalent to probabilities
summing to 1).

The *densities* in contrast *integrate* to 1. The density value for the
i-th bin is n_i/(n * w_i) where w_i is the width of the i-th bin, so
that the density times the bin width gives back the bin probability
n_i/n. (If the breaks have been set sensibly, the w_i all have the same
value, i.e. the bin widths are all the same.)

Does this answer your question?

(In an example that I tried the percentages and the density values are
--- not surprisingly!!! --- completely consistent.)

You are correct in observing that it is difficult to dig out the
``histogram values'' (the bar heights) when using lattice. You can
actually get at them using lattice:::hist.constructor(), but it's not
for the fainthearted.

    cheers,

        Rolf Turner

P. S. You really should be absolutely certain that you know what you're
talking about before accusing a package of giving ``wrong answers''.

    R. T.

On 01/09/11 01:50, Monica Pisica wrote:
>
> Hi,
>
> I have a relatively big dataset and I want to construct
> some histograms using the histogram function in lattice.
> One thing I am interested in is to look at differences between density
> and percent. I know I can use the hist function but it seems that this
> function gives sometimes some wrong answers and the density is actually
> a percent since it is calculated as counts in the bin divided by the
> total no. of points. Let me explain.
>
> If I let the hist function to decide the breaks, or I use
> a small number, or one of the pre-determined methods to select breaks then
> everything seems to be in order. But if I decide to use - for example - 100 as
> a breaks (I have over 90000 data points so the number of breaks is not
> necessarily too large I would think) the density for the first bin is over 1,
> although for all the other breaks the density is actually a percent since it is
> the count for that bin divided by the total no. of points I have. So ... Here it
> is something wrong or most probably I am doing something wrong.
>
> If I use the function histogram from lattice it is
> obvious that there is a difference between the percent param and the density
> param. I looked at the function code and I didn't understand it - to be honest.
> It seems it calls inside the hist function, or a slightly modify variant of
> hist. Reading about the object trellis I saw I can access different info about
> the graph it generates but nothing about the actual data that goes into
> defining the histogram. How can I access the data from it?
>
> I am not sure if my problem is platform specific - it should
> not be - but I have Rx64 2.13.1 on windows machine, in case it counts.
>
> I appreciate your help, thanks,

From xkziloj at gmail.com Thu Sep 1 05:47:58 2011
From: xkziloj at gmail.com (. .)
Date: Thu, 1 Sep 2011 00:47:58 -0300
Subject: [R] Evaluate an expresion comming from string
In-Reply-To:
References:
Message-ID:

...and this also does not work:

bench <- function(str,...)
{
  func <- function(x) {
    x+10
  }
  expr <- list(...)[1]
  str <- gsub("XXX",expr,str)
  x <- as.expression(gsub("\"","",str))
  eval(x)
}
bench("func(XXX)", "1")

Comments are appreciated.

On Thu, Sep 1, 2011 at 12:12 AM, . . wrote:
> Whats wrong here?
>
> I was expecting 11 as the result...
>
> Thanks in advance.
>
> bench <- function(str,...) {
>   func <- function(x) x+10
>   expr <- list(...)[1]
>   str <- gsub("XXX",expr,str)
>   x <- as.call(gsub("\"","",str))
>   eval(x)
> }
> bench("func(XXX)", "1")
>

From jwiley.psych at gmail.com Thu Sep 1 06:25:50 2011
From: jwiley.psych at gmail.com (Joshua Wiley)
Date: Wed, 31 Aug 2011 21:25:50 -0700
Subject: [R] Evaluate an expresion comming from string
In-Reply-To:
References:
Message-ID:

Hi ". .",

You are confused about where the quotes come from. Your character
string does not have double quotes in it---they are just there for
printing. Compare:

nchar("XXX")
nchar("'XXX'")

for example. To do what you want, try:

bench <- function(str,...) {
  func <- function(x) x+10
  expr <- list(...)[1]
  str <- gsub("XXX",expr,str)
  eval(parse(text = str))
}
bench("func(XXX)", "1")

Then avoid using str as a variable name, since str() is an important
function in the utils package (part of R core/recommended). Then give
some serious thought about whether you really want to do what you are
doing by cobbling together text you parse and evaluate. There are
typically simpler, cleaner, more efficient, trustworthy, and generally
more wholesome ways of doing the above (but if you insist,
eval(parse(text = "stuff")) is probably the way to go).

Cheers,

Josh

On Wed, Aug 31, 2011 at 8:12 PM, . . wrote:
> Whats wrong here?
>
> I was expecting 11 as the result...
>
> Thanks in advance.
>
> bench <- function(str,...)
> {
>   func <- function(x) x+10
>   expr <- list(...)[1]
>   str <- gsub("XXX",expr,str)
>   x <- as.call(gsub("\"","",str))
>   eval(x)
> }
> bench("func(XXX)", "1")
>

--
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, ATS Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/

From erinm.hodgess at gmail.com Thu Sep 1 06:30:42 2011
From: erinm.hodgess at gmail.com (Erin Hodgess)
Date: Wed, 31 Aug 2011 23:30:42 -0500
Subject: [R] missing R.dll file
Message-ID:

Dear R People:

I was using R with no problems last night.

Now tonight, when I type in "Rgui", I get an error message saying that
R.dll is not found and that I should try to re-install.

Have any of you run into this, please?

Thanks,
Erin

--
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodgess at gmail.com

From rolf.turner at xtra.co.nz Thu Sep 1 06:59:23 2011
From: rolf.turner at xtra.co.nz (Rolf Turner)
Date: Thu, 01 Sep 2011 16:59:23 +1200
Subject: [R] Getting the values out of histogram (lattice)
In-Reply-To: <201108312301.p7VN1O2N007997@mail12.tpg.com.au>
References: <201108312301.p7VN1O2N007997@mail12.tpg.com.au>
Message-ID: <4E5F112B.5090803@xtra.co.nz>

'Scuse me, but I don't see anything in your example relating to what
the OP asked for. She wanted to get at the ``actual data defining the
histogram'', which I interpret as meaning the bar heights (the
percentages, density values, or counts, depending on "type"). These do
not appear to be stored in the object returned by histogram().
cheers, Rolf Turner On 01/09/11 10:59, Duncan Mackay wrote: > Hi Monica > > An example abbreviated from ?histogram > > x = histogram( ~ height, data = singer) > > names(x) > # to see what is there > str(x) > > # information > x$panel.args.common > $breaks > [1] 59.36 61.28 63.20 65.12 67.04 68.96 70.88 72.80 74.72 76.64 > > $type > [1] "percent" > > $equal.widths > [1] TRUE > > $nint > [1] 9 > > # x$panel.args: name as number > x[[35]] > [[1]] > [[1]]$x > [1] 64 62 66 65 60 61 65 66 65 63 67 65 62 65 68 65 63 65 62 65 66 62 > 65 63 65 66 65 62 65 66 65 61 65 66 65 62 63 67 60 67 66 62 65 62 > [45] 61 62 66 60 65 65 61 64 68 64 63 62 64 62 64 65 60 65 70 63 67 66 > 65 62 68 67 67 63 67 66 63 72 62 61 66 64 60 61 66 66 66 62 70 65 > [89] 64 63 65 69 61 66 65 61 63 64 67 66 68 70 65 65 65 64 66 64 70 63 > 70 64 63 67 65 63 66 66 64 64 70 70 66 66 66 69 67 65 69 72 71 66 > [133] 76 74 71 66 68 67 70 65 72 70 68 64 73 66 68 67 64 68 73 69 71 > 69 76 71 69 71 66 69 71 71 71 69 70 69 68 70 68 69 72 70 72 69 73 71 > [177] 72 68 68 71 66 68 71 73 73 70 68 70 75 68 71 70 74 70 75 75 69 > 72 71 70 71 68 70 75 72 66 72 70 69 72 75 67 75 74 72 72 74 72 72 74 > [221] 70 66 68 75 68 70 72 67 70 70 69 72 71 74 75 > > etc to suite your requirements > > HTH > > Regards > > Duncan > > > Duncan Mackay > Department of Agronomy and Soil Science > University of New England > ARMIDALE NSW 2351 > Email: home mackay at northnet.com.au > > > > At 23:50 31/08/2011, you wrote: > > > >> Hi, >> >> >> >> I have a relatively big dataset and I want to construct >> some histograms using the histogram function in lattice. One thing I am >> interested in is to look at differences between density and percent. >> I know I can >> use the hist function but it seems that this function gives sometimes >> some >> wrong answers and the density is actually a percent since it is >> calculated as counts in the bin divided by the total no. of points. >> Let me explain. 
>> >> >> >> If I let the hist function to decide the breaks, or I use >> a small number, or one of the pre-determined methods to select breaks >> then >> everything seems to be in order. But if I decide to use ? for example >> ? 100 as >> a breaks (I have over 90000 data points so the number of breaks is not >> necessarily too large I would think) the density for the first bin is >> over 1, >> although for all the other breaks the density is actually a percent >> since it is >> the count for that bin divided by the total no. of points I have. So >> ?. Here it >> is something wrong or most probably I am doing something wrong. >> >> >> >> If I use the function histogram from lattice it is >> obvious that there is a difference between the percent param and the >> density >> param. I looked at the function code and I didn't understand it ? to >> be honest. >> It seems it calls inside the hist function, or a slightly modify >> variant of >> hist. Reading about the object trellis I saw I can access different >> info about >> the graph it generates but nothing about the actual data that goes into >> defining the histogram. How can I access the data from it? >> >> >> >> I am not sure if my problem is platform specific ? it should >> not be ? but I have Rx64 2.13.1 on windows machine, in case it counts. >> >> >> >> I appreciate your help, thanks, >> >> >> >> Monica >> >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
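
The percent-versus-density question running through this thread can be checked numerically with base R alone. The following is only a sketch on made-up uniform data (the seed, sample size and break points are arbitrary assumptions, not the original poster's data); it illustrates that a density value above 1 is legitimate whenever the bins are narrower than 1, because it is the density times the bin width, not the density itself, that is a proportion.

```r
set.seed(1)                           # arbitrary seed, for reproducibility only
x <- runif(1000, min = 0, max = 0.5)  # made-up data confined to [0, 0.5]

h <- hist(x, breaks = seq(0, 0.5, by = 0.05), plot = FALSE)

n <- sum(h$counts)
w <- diff(h$breaks)                   # bin widths (all 0.05 here)

percent <- 100 * h$counts / n         # bar heights for type "percent"
density <- h$counts / (n * w)         # bar heights for type "density"

all.equal(density, h$density)         # hand computation agrees with hist()
sum(percent)                          # percentages sum to 100
sum(density * w)                      # density integrates to 1
max(density) > 1                      # narrow bins can push density above 1
```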
> From eran at taykey.com Thu Sep 1 08:27:56 2011 From: eran at taykey.com (Eran Eidinger) Date: Thu, 1 Sep 2011 09:27:56 +0300 Subject: [R] Namespace in packages Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From ripley at stats.ox.ac.uk Thu Sep 1 08:39:15 2011 From: ripley at stats.ox.ac.uk (Prof Brian Ripley) Date: Thu, 1 Sep 2011 07:39:15 +0100 (BST) Subject: [R] dbWriteTable error message In-Reply-To: References: Message-ID: On Wed, 31 Aug 2011, Henri-Paul Indiogine wrote: > I am at loss of what is going on here ... > > I am trying to write to a SQLite database: > > con <- dbConnect(dbDriver("SQLite"), dbname="pres-docs.rqda") > > I have a data frame that is 889 rows by 7 columns. The column number > and types agree with the database table columns and column type. > > dbWriteTable(con, "fileAttr", DF.4, row.names=FALSE, overwrite=TRUE) > > Then I get the following error: > > [1] FALSE > Warning message: > In value[[3L]](cond) : > RAW() can only be applied to a 'raw', not a 'character' > > Where could I start looking? For the record, this sort of message almost always indicates internal corruption of R objects, so run the example under valgrind. Oh, you failed to tell us the 'at a minimum' information required by the posting guide, so we don't know if valgrind runs on your platform. > > Thanks, > Henri-Paul > > > -- > Henri-Paul Indiogine > > Curriculum & Instruction > Texas A&M University > TutorFind Learning Centre > > Email: hindiogine at gmail.com > Skype: hindiogine > Website: http://people.cehd.tamu.edu/~sindiogine > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Brian D. 
Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 From alaios at yahoo.com Thu Sep 1 08:58:16 2011 From: alaios at yahoo.com (Alaios) Date: Wed, 31 Aug 2011 23:58:16 -0700 (PDT) Subject: [R] ggplot2 to create a "square" plot In-Reply-To: References: <1314811136.640.YahooMailNeo@web120110.mail.ne1.yahoo.com> Message-ID: <1314860296.53552.YahooMailNeo@web120112.mail.ne1.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From totangjie at gmail.com Thu Sep 1 09:11:13 2011 From: totangjie at gmail.com (Jie TANG) Date: Thu, 1 Sep 2011 15:11:13 +0800 Subject: [R] how to get the varifying character with two variables? Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From ripley at stats.ox.ac.uk Thu Sep 1 09:33:06 2011 From: ripley at stats.ox.ac.uk (Prof Brian Ripley) Date: Thu, 1 Sep 2011 08:33:06 +0100 (BST) Subject: [R] how to get the varifying character with two variables? In-Reply-To: References: Message-ID: See ?outer (use paste as the function, as in one of the examples). On Thu, 1 Sep 2011, Jie TANG wrote: > hi R user > mtdno<-paste("data",1:3,sep="") > tyno<-paste("obs",1:5,sep="") > flnm<-paste(mtdno,tyno,"_err.dat",sep="") > > flnm is > [1] "data1obs1_err.dat" "data2obs2_err.dat" "data3obs3_err.dat" > [4] "data1obs4_err.dat" "data2obs5_err.dat" > > but actually what i want is data from 1 to 3 and obs from 1 to 5. thus ,I > can read 15 files but not 5 files > > how could I do? > thanks. 
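The ?outer suggestion in the reply above can be sketched as follows. This is an illustration only; the variable names are taken from the question, and the ordering of the result (data index varying fastest) is simply what outer() produces.

```r
mtdno <- paste("data", 1:3, sep = "")
tyno  <- paste("obs", 1:5, sep = "")

# outer() applies paste() to every (mtdno, tyno) pair, giving a 3 x 5 matrix
combos <- outer(mtdno, tyno, paste, sep = "")

# Appending the suffix drops the matrix shape and leaves 15 file names
flnm <- paste(combos, "_err.dat", sep = "")

length(flnm)
flnm[1:4]
```

The first three names share obs1 and differ in the data index; the fourth starts the obs2 group.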
> -- > TANG Jie > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 From totangjie at gmail.com Thu Sep 1 09:45:51 2011 From: totangjie at gmail.com (Jie TANG) Date: Thu, 1 Sep 2011 15:45:51 +0800 Subject: [R] how to get the varifying character with two variables? In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From azamjaafari at yahoo.com Thu Sep 1 10:34:04 2011 From: azamjaafari at yahoo.com (azam jaafari) Date: Thu, 1 Sep 2011 01:34:04 -0700 (PDT) Subject: [R] save grid Message-ID: <1314866044.56708.YahooMailNeo@web37104.mail.mud.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From landronimirc at gmail.com Thu Sep 1 10:37:57 2011 From: landronimirc at gmail.com (Liviu Andronic) Date: Thu, 1 Sep 2011 10:37:57 +0200 Subject: [R] problem with "plm" package In-Reply-To: <1314843915121-3782639.post@n4.nabble.com> References: <80098b6d0911270708g216f6b01m5dd57131b199df66@mail.gmail.com> <1314843915121-3782639.post@n4.nabble.com> Message-ID: On Thu, Sep 1, 2011 at 4:25 AM, Ash wrote: > Hi, > > I am trying to complete a very simple panel analysis on some bank data. 
>
> Call:
> Formula<-plm(RoE~RoA+CAR+Inc.Dep+Cash.TL+NPL.Loans, data=Banks1,
> model="random", index=c("Bank.I.D.","Year"))
> summary(Formula)
>
> I get the following error code:
>
> ERROR:
>  missing value where TRUE/FALSE needed
>
> Does anyone know what true/false field I am missing?
>
I am not sure what goes wrong, but try first to

Banks1.p <- pdata.frame(Banks1, index=c("Bank.I.D.","Year"))

and see if that went fine, and then

Banks1.fit <- plm(RoE~RoA+CAR+Inc.Dep+Cash.TL+NPL.Loans, data=Banks1.p, model="random")

It may help if you posted
str(Banks1)

Liviu

From paul.hiemstra at knmi.nl Thu Sep 1 10:55:33 2011
From: paul.hiemstra at knmi.nl (Paul Hiemstra)
Date: Thu, 1 Sep 2011 08:55:33 +0000
Subject: [R] missing R.dll file
In-Reply-To:
References:
Message-ID: <4E5F4885.8000804@knmi.nl>

On 09/01/2011 04:30 AM, Erin Hodgess wrote:
> Dear R People:
>
> I was using R with no problems last night.
>
> Now tonight, when I type in "Rgui", I get an error message that R.dll
> is not found and try to re-install.
>
> Have any of you run into this, please?
>
> Thanks,
> Erin
>
>
Hi Erin,

Without more information, we cannot help you. Please read the posting
guide for suggestions. Specifically, which OS do you use (windows
probably), which version of R etc. The first thing I would do is start a
search for R.dll and see if the file is present. If not, try
reinstalling R and see if the problem persists.

good luck,
Paul

--
Paul Hiemstra, Ph.D.
Global Climate Division
Royal Netherlands Meteorological Institute (KNMI)
Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39
P.O. Box 201 | 3730 AE | De Bilt
tel: +31 30 2206 494

http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770

From paul.hiemstra at knmi.nl Thu Sep 1 10:57:23 2011
From: paul.hiemstra at knmi.nl (Paul Hiemstra)
Date: Thu, 01 Sep 2011 08:57:23 +0000
Subject: [R] !!!function to do the knn!!!
In-Reply-To: References: <1314801332615-3781137.post@n4.nabble.com> Message-ID: <4E5F48F3.7060405@knmi.nl> On 08/31/2011 02:55 PM, David Winsemius wrote: > Thank your for your entry in the Poorly Capitalized and Inadequately > Searched Posting Contest. You will be advised of your ranking in due > course. In the meantime, you may want to consult the recommended > search site for []functions and []Task Views that have "clustering" > and "classification" in their text: > > http://search.r-project.org/cgi-bin/namazu.cgi?query=classification+clustering&max=100&result=normal&sort=score&idxname=functions&idxname=views > > > I originally set the parameters at just "classification" in []Task > Views and []functions but that list was too long. > Candidate for fortunes? Paul -- Paul Hiemstra, Ph.D. Global Climate Division Royal Netherlands Meteorological Institute (KNMI) Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39 P.O. Box 201 | 3730 AE | De Bilt tel: +31 30 2206 494 http://intamap.geo.uu.nl/~paul http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770 From mdowle at mdowle.plus.com Thu Sep 1 10:59:36 2011 From: mdowle at mdowle.plus.com (Matthew Dowle) Date: Thu, 1 Sep 2011 09:59:36 +0100 Subject: [R] formatting a 6 million row data set; creating a censoring variable References: Message-ID: This is the fastest data.table way I can think of : ans = mydt[,list(mytime=.N),by=list(id,mygroup)] ans[,censor:=0L] ans[J(unique(id)), censor:=1L, mult="last"] id mygroup mytime censor [1,] 1 A 1 1 [2,] 2 B 3 0 [3,] 2 C 3 0 [4,] 2 D 6 1 [5,] 3 A 3 0 [6,] 3 B 3 1 [7,] 4 A 1 1 > I'll post the timings on the real data set shortly. Please do. Matthew "William Dunlap" wrote in message news:E66794E69CFDE04D9A70842786030B9304E857 at PA-MBX04.na.tibco.com... 
> I'll assume that all of an individual's data rows
> are contiguous and that an individual always passes through
> the groups in order (or, at least, the individual
> never leaves a group and then reenters it), so we
> can find everything we need to know by comparing each
> row with the previous row.
>
> You can use rle() to quickly make the time
> column:
> > rle(paste(d$mygroup, d$id))$lengths
> [1] 1 3 3 6 3 3 1
>
> For the censor column it is probably easiest to consider
> what rle() must do internally and use a modification of that.
> E.g.,
> isFirstInRun <- function(x) c(TRUE, x[-1] != x[-length(x)])
> isLastInRun <- function(x) c(x[-1] != x[-length(x)], TRUE)
> outputRows <- isLastInRun(d$mygroup) | isLastInRun(d$id)
> output <- d[outputRows, ]
> output$mytime <- diff(c(0, which(outputRows)))
> output$censor <- as.integer(isLastInRun(output$id))
> which gives you
> > output
>    gender mygroup id mytime censor
> 1       F       A  1      1      1
> 4       F       B  2      3      0
> 7       F       C  2      3      0
> 13      F       D  2      6      1
> 16      M       A  3      3      0
> 19      M       B  3      3      1
> 20      M       A  4      1      1
> You showed a rearrangement of the columns
> > output[, c("id", "mygroup", "mytime", "censor")]
>    id mygroup mytime censor
> 1   1       A      1      1
> 4   2       B      3      0
> 7   2       C      3      0
> 13  2       D      6      1
> 16  3       A      3      0
> 19  3       B      3      1
> 20  4       A      1      1
> This ought to be quicker than plyr, but data.table
> may do similar run-oriented operations.
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
>> On Behalf Of Juliet Hannah
>> Sent: Wednesday, August 31, 2011 10:51 AM
>> To: r-help at r-project.org
>> Subject: [R] formatting a 6 million row data set; creating a censoring
>> variable
>>
>> List,
>>
>> Consider the following data.
>> >> gender mygroup id >> 1 F A 1 >> 2 F B 2 >> 3 F B 2 >> 4 F B 2 >> 5 F C 2 >> 6 F C 2 >> 7 F C 2 >> 8 F D 2 >> 9 F D 2 >> 10 F D 2 >> 11 F D 2 >> 12 F D 2 >> 13 F D 2 >> 14 M A 3 >> 15 M A 3 >> 16 M A 3 >> 17 M B 3 >> 18 M B 3 >> 19 M B 3 >> 20 M A 4 >> >> Here is the reshaping I am seeking (explanation below). >> >> id mygroup mytime censor >> [1,] 1 A 1 1 >> [2,] 2 B 3 0 >> [3,] 2 C 3 0 >> [4,] 2 D 6 1 >> [5,] 3 A 3 0 >> [6,] 3 B 3 1 >> [7,] 4 A 1 1 >> >> I need to create 2 variables. The first one is a time variable. >> Observe that for id=2, the variable mygroup=B was observed 3 times. In >> the solution we see in row 2 that id=2 has a mytime variable of 3. >> >> Next, I need to create a censoring variable. >> >> Notice id=2 goes through has values of B, C, D for mygroup. This means >> the change from B to C and C to D is observed. There is no change >> from D. I need to indicate this with a 'censoring' variable. So B and >> C would have values 0, and D would have a value of 1. As another >> example, id=1 never changes, so I assign it censor= 1. Overall, if a >> change is observed, 0 should be assigned, and if a change is not >> observed 1 should be assigned. >> >> One potential challenge is that the original data set has over 5 >> million rows. I have ideas, but I'm still getting used the the >> data.table and plyr syntax. I also seek a base R solution. I'll post >> the timings on the real data set shortly. >> >> Thanks for your help. 
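The run-length idea from Bill Dunlap's reply quoted above can be put together as a self-contained base R sketch. The data frame d is rebuilt here from the 20 example rows in the post; the censor line uses output$id, which is an assumption about what the reply intended, and like the reply it assumes each id's rows are contiguous.

```r
# Rebuild the 20-row example from the post (values copied from the table)
d <- data.frame(
  gender  = rep(c("F", "M"), times = c(13, 7)),
  mygroup = rep(c("A", "B", "C", "D", "A", "B", "A"),
                times = c(1, 3, 3, 6, 3, 3, 1)),
  id      = rep(c("1", "2", "3", "4"), times = c(1, 12, 6, 1)),
  stringsAsFactors = FALSE
)

# TRUE where an element differs from the next one (or is the very last)
isLastInRun <- function(x) c(x[-1] != x[-length(x)], TRUE)

# Keep the last row of every (id, mygroup) run
outputRows <- isLastInRun(d$mygroup) | isLastInRun(d$id)
output <- d[outputRows, c("id", "mygroup")]

# Run length = distance between consecutive kept rows
output$mytime <- diff(c(0, which(outputRows)))

# Censored (1) when this is the final group observed for the id
output$censor <- as.integer(isLastInRun(output$id))

output
```

Because everything is a vectorised comparison of neighbouring rows, no per-id function call is needed, which is what should make it practical on several million rows.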
>> >> > sessionInfo() >> R version 2.13.1 (2011-07-08) >> Platform: x86_64-unknown-linux-gnu (64-bit) >> >> locale: >> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C >> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 >> [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 >> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C >> [9] LC_ADDRESS=C LC_TELEPHONE=C >> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C >> >> attached base packages: >> [1] stats graphics grDevices utils datasets methods base >> >> # Here is a simplified data set >> >> myData <- structure(list(gender = c("F", "F", "F", "F", "F", "F", "F", >> "F", "F", "F", "F", "F", "F", "M", "M", "M", "M", "M", "M", "M" >> ), mygroup = c("A", "B", "B", "B", "C", "C", "C", "D", "D", "D", >> "D", "D", "D", "A", "A", "A", "B", "B", "B", "A"), id = c("1", >> "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "3", >> "3", "3", "3", "3", "3", "4")), .Names = c("gender", "mygroup", >> "id"), class = "data.frame", row.names = c(NA, -20L)) >> >> >> # here is plyr solution with idata.frame >> >> library(plyr) >> imyData <- idata.frame(myData) >> timeData <- idata.frame(ddply(imyData, .(id,mygroup), summarize, >> mytime = length(mygroup))) >> >> makeCensor <- function(x) { >> myvec <- rep(0,length(x)) >> lastInd <- length(myvec) >> myvec[lastInd] = 1 >> myvec >> } >> >> >> plyrSolution <- ddply(timeData, "id", transform, censor = >> makeCensor(mygroup)) >> >> >> # here is a data table solution >> # use makeCensor function from above >> >> library(data.table) >> mydt <- data.table(myData) >> setkey(mydt,id,mygroup) >> >> timeData <- mydt[,list(mytime=length(gender)),by=list(id,mygroup)] >> makeCensor <- function(x) { >> myvec <- rep(0,length(x)) >> lastInd <- length(myvec) >> myvec[lastInd] = 1 >> myvec >> } >> >> mycensor <- timeData[,list(censor=makeCensor(mygroup)),by=id] >> datatableSolution <- cbind(timeData,mycensor[,list(censor)]) >> >> ______________________________________________ >> R-help at r-project.org mailing list >> 
https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > From d.carpenter at nhm.ac.uk Thu Sep 1 11:43:49 2011 From: d.carpenter at nhm.ac.uk (Dan Carpenter ) Date: Thu, 1 Sep 2011 10:43:49 +0100 Subject: [R] unequal bins in filled.contour In-Reply-To: <4E5E78B5.6010802@gmail.com> References: <2849ACCCD0E95246A1477249EEAA5DB9CD8BB7@HOMER.nhm.ac.uk> <4E5E78B5.6010802@gmail.com> Message-ID: <2849ACCCD0E95246A1477249EEAA5DB9CD8BB8@HOMER.nhm.ac.uk> That's brilliant, thanks Dr Dan Carpenter | Post-doctoral Research Assistant | Soil Biodiversity Group | Entomology Department | Natural History Museum | London | SW7 5BD | 0207 942 5208 | d.carpenter at nhm.ac.uk -----Original Message----- From: Duncan Murdoch [mailto:murdoch.duncan at gmail.com] Sent: 31 August 2011 19:09 To: Dan Carpenter Cc: r-help at r-project.org Subject: Re: [R] unequal bins in filled.contour On 31/08/2011 10:33 AM, Dan Carpenter wrote: > Hello, > > I am trying to plot SADIE red-blue plots of cluster indicies using > filled.contour. I want a plot which only has three bins for the data: > <-1.5, -1.5 - 1.5,>1.5, but I am having trouble getting there. > > example > X1 X2 X3 X4 X5 > 1 -5 -4.5 1.0 4.5 6 > 2 -3 -2.0 1.2 -1.0 3 > 3 0 0.0 0.0 -0.5 -1 > 4 -2 -3.0 1.0 1.5 3 > 5 -6 -2.0 0.5 3.0 2 > example<-as.matrix(example) > filled.contour(example) > filled.contour(example,levels=seq(min(example),max(example)), > color.palette=colorRampPalette(c("blue","white","red"))) > > I tried this to get just three bins, but data outside that range doesn't > plot. 
> seq(-4.5,4.5,by=3) > [1] -4.5 -1.5 1.5 4.5 > filled.contour(example, levels=seq(-4.5,4.5,by=3), > color.palette=colorRampPalette(c("blue","white","red"))) > > I increased the range but I now have two shades of red and blue, when > what I really want is one shade of each > seq(-7.5,7.5,by=3) > [1] -7.5 -4.5 -1.5 1.5 4.5 7.5 > filled.contour(example, levels=seq(-7.5,7.5,by=3), > color.palette=colorRampPalette(c("blue","white","red"))) > > Is there a way to either > 1) Create unequal bins, so all data below -1.5 is blue and all data > above 1.5 red > Or > 2) colour more than one bin the same shade 1) Just say where you want the breaks, e.g. levels=c(min(example), -1.5, 1.5, max(example)) 2) Give the colours you want using something like col=c("red", "white", "red") Duncan Murdoch From mailzhuyao at gmail.com Thu Sep 1 11:51:42 2011 From: mailzhuyao at gmail.com (zhu yao) Date: Thu, 1 Sep 2011 17:51:42 +0800 Subject: [R] How to retrieve bias-corrected probability from calibrate.rms Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From janko.thyson.rstuff at googlemail.com Thu Sep 1 11:59:13 2011 From: janko.thyson.rstuff at googlemail.com (Janko Thyson) Date: Thu, 01 Sep 2011 11:59:13 +0200 Subject: [R] Why does loading saved/cached objects add significantly to RAM consumption? In-Reply-To: References: <4E5CC298.6090306@googlemail.com> Message-ID: <4E5F5771.5050602@googlemail.com> On 30.08.2011 20:33, Henrik Bengtsson wrote: > Hi. 
> > On Tue, Aug 30, 2011 at 3:59 AM, Janko Thyson > wrote: >> Dear list, >> >> I make use of cached objects extensively for time consuming computations and >> yesterday I happened to notice some very strange behavior in that respect: >> When I execute a given computation whose result I'd like to cache (tried >> both saving it as '.Rdata' and via package 'R.cache' which uses a own >> filetype '.Rcache'), > Just to clarify, it is just the filename extension that is "custom"; > it uses base::save() internally. It is very unlikely that R.cache has > to do with your problem. Okay, got it. > >> my R session consumes about 200 MB of RAM, which is >> fine. Now, when I make use of the previously cached object (i.e. loading it, >> assigning it to a certain field of a Reference Class object), I noticed that >> RAM consumption of my R process jumps to about 250 MB! >> a >> Each new loading of cached/saved objects adds to that consumption (in total, >> I have about 5-8 objects that are processed this way), so at some point I >> easily get a RAM consumption of over 2 GB where I'm only at about 200 MB of >> consumption when I compute each object directly! Object sizes (checked with >> 'object.size()') remain fairly constant. What's even stranger: after loading >> cached objects and removing them (either via 'rm()' or by assigning a >> 'fresh' empty object to the respective Reference Class field, RAM >> consumption remains at this high level and never comes down again. >> >> I checked the behavior also in a small example which is a simplification of >> my use case and which you'll find below (checked both on Win XP and Win 7 32 >> bit). 
I couldn't quite reproduce an immediate increase in RAM consumption, > I couldn't reproduce it either using sessionInfo(): > > R version 2.13.1 Patched (2011-08-29 r56823) > Platform: x86_64-pc-mingw32/x64 (64-bit) > locale: > [1] LC_COLLATE=English_United States.1252 > [2] LC_CTYPE=English_United States.1252 > [3] LC_MONETARY=English_United States.1252 > [4] LC_NUMERIC=C > [5] LC_TIME=English_United States.1252 > attached base packages: > [1] stats graphics grDevices utils datasets methods base > loaded via a namespace (and not attached): > [1] tools_2.13.1 I'll try to come up with an example that resembles more of my actual use case. >> but what I still find really strange is >> a) why do repeated 'load()' calls result in an increase in RAM consumption? >> b) why does the latter not go down again after the objects have been removed >> from '.GlobalEnv'? Thanks for the hint to an explicit call to 'gc()'. That brings down memorey usage and would work if I wouldn't need the "content" of the objects I load and could therefore remove them ('rm(x)'; 'gc()'), but that's exactly what I need: load data and assign it to some environments. > Removed objects may still sit in memory - it is only when R's garbage > collector (GC) comes around and removes them that the memory usage > goes down. You can force the garbage collector to run by calling > gc(), but normally it is automatically triggered whenever needed. > > Note that the GC will only be able to clean up the memory of removed > objects IFF there are no other references to that object/piece of > memory. When you use References classes (cf. setRefClass()) and > environments, you end up keeping references internally in objects > without being aware of it. My guess is that your other code may have > such issues, whereas the code below does not. > > There is also the concept of "promises" [see 'R Language Definition' > document], which *may* also be involved. 
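The point about lingering references can be illustrated with a small sketch. The environment e and the object names are hypothetical, and the exact memory figures reported by gc() will vary by platform; the sketch only shows that rm() on one name does not free data that another environment still holds.

```r
e <- new.env()

big <- rnorm(1e6)        # about 8 MB of doubles
e$held <- big            # a second reference to the same vector

rm(big)                  # removes only the name in the workspace
invisible(gc())          # nothing to reclaim yet: e$held is still reachable

still_there <- exists("held", envir = e, inherits = FALSE)
size_mb <- as.numeric(object.size(e$held)) / 2^20

rm("held", envir = e)    # drop the last remaining reference
invisible(gc())          # now the memory can be returned to R's pool

gone <- !exists("held", envir = e, inherits = FALSE)
```

Whether the operating system then shows a smaller process size is a separate matter; R may keep freed memory for reuse.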
> > FYI, the Sysinternals Process Explorer > [http://technet.microsoft.com/en-us/sysinternals/bb896653] is a useful > tool for studying individual processes such as R. Thanks for that one as well! I'll have a more detailed look into this. Best regards, Janko > My $.02 > > Henrik > >> Did anyone of you experience a similar behavior? Or even better, does anyone >> know why this is happening and how it might be fixed (or be worked around)? >> ;-) >> >> I really need your help on this one as it's crucial for my thesis, thanks a >> lot for anyone replying!! >> >> Regards, >> Janko >> >> ##### EXAMPLE ##### >> >> setRefClass("A", fields=list(.PRIMARY="environment")) >> setRefClass("Test", fields=list(a="A")) >> >> obj.1<- lapply(1:5000, function(x){ >> rnorm(x) >> }) >> names(obj.1)<- paste("sample", 1:5000, sep=".") >> obj.1<- as.environment(obj.1) >> >> test<- new("Test", a=new("A", .PRIMARY=obj.1)) >> test$a$.PRIMARY$sample.10 >> >> #+++++ >> >> object.size(test) >> object.size(test$a) >> object.size(obj.1) >> # RAM used by R session: 118 MB >> >> save(obj.1, file="C:/obj.1.Rdata") >> # Results in an object of ca. 94 MB >> save(test, file="C:/test.Rdata") >> # Results in an object of ca. 
94 MB >> >> ##### START A NEW R SESSION ##### >> >> load("C:/test.Rdata") >> # RAM consumption still fine at 115 - 118 MB >> >> # But watch how it goes up as we repeatedly load objects >> for(x in 1:5){ >> load("C:/test.Rdata") >> } >> for(x in 1:5){ >> load("C:/obj.1.Rdata") >> } >> # Somehow there seems to be an upper limit, though >> >> # Removing the objects does not bring down RAM consumption >> rm(obj.1) >> rm(test) >> >> ########## >> >>> Sys.info() >> sysname release >> "Windows" "XP" >> version nodename >> "build 2600, Service Pack 3" "ASHB-109C-02" >> machine login >> "x86" "wwa418" >> user >> "wwa418" >> >>> sessionInfo() >> R version 2.13.1 (2011-07-08) >> Platform: i386-pc-mingw32/i386 (32-bit) >> >> locale: >> [1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252 >> [3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C >> [5] LC_TIME=German_Germany.1252 >> >> attached base packages: >> [1] stats graphics grDevices utils datasets methods base >> >> loaded via a namespace (and not attached): >> [1] codetools_0.2-8 tools_2.13.1 >> >> From petr.pikal at precheza.cz Thu Sep 1 12:16:45 2011 From: petr.pikal at precheza.cz (Petr PIKAL) Date: Thu, 1 Sep 2011 12:16:45 +0200 Subject: [R] !!!function to do the knn!!! In-Reply-To: <1314812135626-3781738.post@n4.nabble.com> References: <1314801332615-3781137.post@n4.nabble.com> <1314812135626-3781738.post@n4.nabble.com> Message-ID: Hi Try to call 112 if you are in Europe. Regards Petr > Re: [R] !!!function to do the knn!!! > > help, help ,help!!! > > -- > View this message in context: http://r.789695.n4.nabble.com/function-to- > do-the-knn-tp3781137p3781738.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
From landronimirc at gmail.com Thu Sep 1 12:40:06 2011 From: landronimirc at gmail.com (Liviu Andronic) Date: Thu, 1 Sep 2011 12:40:06 +0200 Subject: [R] !!!function to do the knn!!! In-Reply-To: References: <1314801332615-3781137.post@n4.nabble.com> <1314812135626-3781738.post@n4.nabble.com> Message-ID: I would nominate the following fortune: On Thu, Sep 1, 2011 at 12:16 PM, Petr PIKAL wrote: > Try to call 112 if you are in Europe. > David Winsemius: Thank your for your entry in the Poorly Capitalized and Inadequately Searched Posting Contest. mark: help, help ,help!!! Petr PIKAL: Try to call 112 if you are in Europe. Liviu From JRadinger at gmx.at Thu Sep 1 12:41:50 2011 From: JRadinger at gmx.at (Johannes Radinger) Date: Thu, 01 Sep 2011 12:41:50 +0200 Subject: [R] read.xlsx handle NAs Message-ID: <20110901104150.296910@gmx.net> Hello, I import a xlsx-table with read.xlx which contains NAs in 4 columns. In the excel table these 4 columns contain floats/integers and the word NA. But when I am importing the NAs are recognized as strings rather than as missing values. How can I set that if a cell contains the word NA that this is a real missing value? Is there any option (probably colClasses) where I can set it for the import, and how should it look like? Or do I have to treat the imported dataframe, and if yes how? Thank you Johannes -- From paul.hiemstra at knmi.nl Thu Sep 1 12:55:15 2011 From: paul.hiemstra at knmi.nl (Paul Hiemstra) Date: Thu, 01 Sep 2011 10:55:15 +0000 Subject: [R] save grid In-Reply-To: <1314866044.56708.YahooMailNeo@web37104.mail.mud.yahoo.com> References: <1314866044.56708.YahooMailNeo@web37104.mail.mud.yahoo.com> Message-ID: <4E5F6493.8030204@knmi.nl> An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From murdoch.duncan at gmail.com Thu Sep 1 13:02:59 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Thu, 01 Sep 2011 07:02:59 -0400 Subject: [R] Namespace in packages In-Reply-To: References: Message-ID: <4E5F6663.8050908@gmail.com> On 11-09-01 2:27 AM, Eran Eidinger wrote: > Hello, > > I wonder how I might create a package that only reveals some of the function > in the package to the user. > > I've tried creating an R package using the following: > f<- function(x,y) x+y > g<- function(x,y) x-y > h<- function(x,y) f(x,y)*g(x,y) > > package.skeleton(list=c("f","g","h"), name="mypkg") > > and would like only h() to be available when I load it, and exposed. > I tried creating a file called NAMESPACE that has one line: > export(h) > But after running R CMD install, when I try to load it with: > library("mypkg",lib.loc=...) i get "Error in namespaceExport(ns, exports) : > undefined exports: h"/ > What am I doing wrong? Your description of what you did sounds like the right thing. Does your package work without the NAMESPACE file? I.e. are all of f, g and h visible? I would guess there is something else wrong in it, and h is really not there. > (BTW, side question, if, after loading, i overload f() or g() with a new > definition from the R console, will h() be affected?) No. Duncan Murdoch > > Thanks, > Eran Eidinger. > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
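The answer to the side question can be illustrated without building a package. In this sketch a plain environment called ns stands in for the package namespace (an assumption for illustration only, not how R actually loads packages). h() looks up f() and g() in the environment where it was defined, so a new f() typed at the console does not change its result.

```r
# Stand-in for the package namespace (hypothetical, for illustration)
ns <- new.env()

# Define the three functions inside ns, as the package code would be
local({
  f <- function(x, y) x + y
  g <- function(x, y) x - y
  h <- function(x, y) f(x, y) * g(x, y)
}, envir = ns)

# Only h is made visible to the user, mimicking export(h)
h <- ns$h

# A user-level redefinition of f
f <- function(x, y) 0

h(3, 2)   # still uses the f and g stored in ns: (3 + 2) * (3 - 2)
f(3, 2)   # the console version
```

In a real package the NAMESPACE file with export(h) plays the role of the single assignment h <- ns$h above.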
From eran at taykey.com Thu Sep 1 13:04:20 2011 From: eran at taykey.com (Eran Eidinger) Date: Thu, 1 Sep 2011 14:04:20 +0300 Subject: [R] Namespace in packages In-Reply-To: <4E5F6663.8050908@gmail.com> References: <4E5F6663.8050908@gmail.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From murdoch.duncan at gmail.com Thu Sep 1 13:06:51 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Thu, 01 Sep 2011 07:06:51 -0400 Subject: [R] Namespace in packages In-Reply-To: References: <4E5F6663.8050908@gmail.com> Message-ID: <4E5F674B.9050102@gmail.com> On 11-09-01 7:04 AM, Eran Eidinger wrote: > Yes, the package works fine without the NAMESPACE file, and all 3 functions > are visible. When you include the NAMESPACE file, what does R CMD check tell you? Duncan Murdoch > > > On Thu, Sep 1, 2011 at 2:02 PM, Duncan Murdochwrote: > >> On 11-09-01 2:27 AM, Eran Eidinger wrote: >> >>> Hello, >>> >>> I wonder how I might create a package that only reveals some of the >>> function >>> in the package to the user. >>> >>> I've tried creating an R package using the following: >>> f<- function(x,y) x+y >>> g<- function(x,y) x-y >>> h<- function(x,y) f(x,y)*g(x,y) >>> >>> package.skeleton(list=c("f","**g","h"), name="mypkg") >>> >>> and would like only h() to be available when I load it, and exposed. >>> I tried creating a file called NAMESPACE that has one line: >>> export(h) >>> But after running R CMD install, when I try to load it with: >>> library("mypkg",lib.loc=...) i get "Error in namespaceExport(ns, exports) >>> : >>> undefined exports: h"/ >>> What am I doing wrong? >>> >> >> Your description of what you did sounds like the right thing. >> >> Does your package work without the NAMESPACE file? I.e. are all of f, g >> and h visible? I would guess there is something else wrong in it, and h is >> really not there. 
>> >> >> >> (BTW, side question, if, after loading, i overload f() or g() with a new >>> definition from the R console, will h() be affected?) >>> >> >> No. >> >> Duncan Murdoch >> >> >>> Thanks, >>> Eran Eidinger. >>> >>> [[alternative HTML version deleted]] >>> >>> ______________________________**________________ >>> R-help at r-project.org mailing list >>> https://stat.ethz.ch/mailman/**listinfo/r-help >>> PLEASE do read the posting guide http://www.R-project.org/** >>> posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >>> >> >> > > From xkziloj at gmail.com Thu Sep 1 14:10:18 2011 From: xkziloj at gmail.com (. .) Date: Thu, 1 Sep 2011 09:10:18 -0300 Subject: [R] Measuring CPU time Message-ID: Why time is increasing for the same operation? I was expecting +/- the same time for each n. Thanks in advance. bench <- function(f1, n, ...) { t <- 0 for(i in 1:n) { func <- function(x) x^2 expr <- list(...)[1] f1 <- c("system.time(y <- ", gsub("XXX",expr,f1),")[3]") t1 <- eval(parse(text = f1)) printf("time %d: %f\n", i, t1) t <- t + t1 } t <- t/n printf("mean time: %f", t) } bench("func(XXX)", 10, "1:100") From erich.neuwirth at univie.ac.at Thu Sep 1 14:20:45 2011 From: erich.neuwirth at univie.ac.at (Erich Neuwirth) Date: Thu, 01 Sep 2011 14:20:45 +0200 Subject: [R] Namespace in packages In-Reply-To: References: <4E5F6663.8050908@gmail.com> Message-ID: <4E5F789D.1040705@univie.ac.at> On 9/1/2011 1:04 PM, Eran Eidinger wrote: >> >>> Hello, >>> >>> I wonder how I might create a package that only reveals some of the >>> function >>> in the package to the user. >>> >>> I've tried creating an R package using the following: >>> f<- function(x,y) x+y >>> g<- function(x,y) x-y >>> h<- function(x,y) f(x,y)*g(x,y) >>> >>> package.skeleton(list=c("f","**g","h"), name="mypkg") ^ what is the meaning of ** here? such an object seems not to be defined. 
From djmuser at gmail.com Thu Sep 1 14:23:53 2011
From: djmuser at gmail.com (Dennis Murphy)
Date: Thu, 1 Sep 2011 05:23:53 -0700
Subject: [R] vector output loop or function
In-Reply-To:
References:
Message-ID:

Hi:

Here's one approach:

X1 <- sample(1:4, 10, replace = TRUE, prob = c(0.4, 0.2, 0.2, 0.2))

foo <- function(x) {
    m <- matrix(NA, nrow = length(x), ncol = length(x))
    m[, 1] <- x
    idx <- seq_len(length(x))
    for(j in idx[-1]) {
        k <- sample(idx, 2)
        x <- replace(x, k, 5)
        m[, j] <- x
    }
    m
}

foo(X1)

# For an example I ran, I got
> foo(X1)
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    4    4    4    4    4    4    4    4    4     4
 [2,]    2    2    2    2    5    5    5    5    5     5
 [3,]    4    4    4    4    4    4    4    5    5     5
 [4,]    1    1    1    1    1    1    1    5    5     5
 [5,]    1    5    5    5    5    5    5    5    5     5
 [6,]    2    2    2    2    2    2    2    2    5     5
 [7,]    3    3    5    5    5    5    5    5    5     5
 [8,]    4    4    4    4    5    5    5    5    5     5
 [9,]    3    5    5    5    5    5    5    5    5     5
[10,]    4    4    4    5    5    5    5    5    5     5

The function returns a matrix (which should make the function a bit
faster). You can always convert it to a data frame and assign column
names to it afterward, or you can modify the function to return a data
frame rather than a matrix (as.data.frame(m) in the last line).

HTH,
Dennis

On Wed, Aug 31, 2011 at 6:10 PM, Nilaya Sharma wrote:
> Dear all
>
> Sorry for simple question:
>
> I want to put the following option into look as number of X is large 1000
> variables
>
> X1 <- sample(c(1,2, 3, 4),10, replace = T, prob = c(0.4, 0.2, 0.2, 0.2))
>
> cv1 <- round(runif(2, 1, 10))
>
>
> # X2 is copy of X1
>
> X2 <- X1
>
> # now X2 is different in cv1 random positions
>
> X2[cv1] <- 5
>
> cv2 <- round(runif(2, 1, 10))
>
>
> # X3 is copy of X2
>
> X3 <- X2
>
> X3[cv2] <- 5
>
> ...
>
> So on till X10
>
> mydf <- data.frame ( X1, X2, X3, X4, X5, X6, X7, X8, X9, X10)
>
>
>
> The basic idea is the X2 is like X1 but is different at two positions where
> the normal value is replaced with 5, the position is defined by cv1 . The
> process is repeated till the last variable.
>
>
>
> I tried several way.
One of unsuccessful function:
>
> v = 2:10
>
> mufun3 <- function(v, x){
>
>            x[,v] <- x[,v-1]
>
>            cv1 <- round(runif(3, 1, 10))
>
>            }
>
> mufun3 (v, X1)
>
>
> Thank you for the help
>
>
> NIL
>
>        [[alternative HTML version deleted]]
>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>

From ggrothendieck at gmail.com Thu Sep 1 14:25:06 2011
From: ggrothendieck at gmail.com (Gabor Grothendieck)
Date: Thu, 1 Sep 2011 08:25:06 -0400
Subject: [R] Measuring CPU time
In-Reply-To:
References:
Message-ID:

On Thu, Sep 1, 2011 at 8:10 AM, . . wrote:
> Why time is increasing for the same operation?
>
> I was expecting +/- the same time for each n.
>
> Thanks in advance.
>
> bench <- function(f1, n, ...) {
>  t <- 0
>  for(i in 1:n) {
>    func <- function(x) x^2
>    expr <- list(...)[1]
>    f1 <- c("system.time(y <- ", gsub("XXX",expr,f1),")[3]")
>    t1 <- eval(parse(text = f1))
>    printf("time %d: %f\n", i, t1)
>    t <- t + t1
>  }
>  t <- t/n
>  printf("mean time: %f", t)
> }
> bench("func(XXX)", 10, "1:100")
>

On each iteration f1 gets larger. (Also printf is not defined.)

Check out the rbenchmark package.

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

From jholtman at gmail.com Thu Sep 1 14:33:30 2011
From: jholtman at gmail.com (jim holtman)
Date: Thu, 1 Sep 2011 08:33:30 -0400
Subject: [R] Measuring CPU time
In-Reply-To:
References:
Message-ID:

Do a little debugging on your code (put print(f1)) and you will see
that you keep adding to the length of the expression to be evaluated
and the results you see are correct. Learn how to debug your
functions.

On Thu, Sep 1, 2011 at 8:10 AM, . .
wrote:
> Why time is increasing for the same operation?
>
> I was expecting +/- the same time for each n.
>
> Thanks in advance.
>
> bench <- function(f1, n, ...) {
>  t <- 0
>  for(i in 1:n) {
>    func <- function(x) x^2
>    expr <- list(...)[1]
>    f1 <- c("system.time(y <- ", gsub("XXX",expr,f1),")[3]")
>    t1 <- eval(parse(text = f1))
>    printf("time %d: %f\n", i, t1)
>    t <- t + t1
>  }
>  t <- t/n
>  printf("mean time: %f", t)
> }
> bench("func(XXX)", 10, "1:100")
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?

From azamjaafari at yahoo.com Thu Sep 1 14:38:09 2011
From: azamjaafari at yahoo.com (azam jaafari)
Date: Thu, 1 Sep 2011 05:38:09 -0700 (PDT)
Subject: [R] convert to grid file
Message-ID: <1314880689.25668.YahooMailNeo@web37101.mail.mud.yahoo.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From f.harrell at vanderbilt.edu Thu Sep 1 14:43:27 2011
From: f.harrell at vanderbilt.edu (Frank Harrell)
Date: Thu, 1 Sep 2011 05:43:27 -0700 (PDT)
Subject: [R] How to retrieve bias-corrected probability from calibrate.rms
In-Reply-To:
References:
Message-ID: <1314881007611-3783420.post@n4.nabble.com>

cal <- calibrate(fit, ...); note that cal is a matrix. colnames(cal) will
tell you what to pick, in this case cal[,'calibrated.corrected'].

Be sure to follow the posting guide.

Frank

yz wrote:
>
> Dear R users:
>
> In Prof. Harrell's library rms, calibrate.rms plot the Bias-corrected
> Probability and Apparent Probability.
> The latter one can be retrieved from class calibrate.default. But how to
> retrieve the former one.
>
> BW
>
> *Yao Zhu*
> *Department of Urology
> Fudan University Shanghai Cancer Center
> Shanghai, China*
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: http://r.789695.n4.nabble.com/How-to-retrieve-bias-corrected-probability-from-calibrate-rms-tp3783160p3783420.html
Sent from the R help mailing list archive at Nabble.com.

From igorcarron at gmail.com Thu Sep 1 12:16:47 2011
From: igorcarron at gmail.com (Igor Carron)
Date: Thu, 1 Sep 2011 12:16:47 +0200
Subject: [R] Newer Matrix Factorization Techniques
Message-ID:

Hi,

I am not sure if this should go to the r-help or the r-dev list. I have
looked at some archives of R libraries but cannot seem to find a project
that focuses on the new matrix factorization techniques that are showing
up in the literature. I have made a list of them:

https://sites.google.com/site/igorcarron2/matrixfactorizations

They include Robust PCA, Dictionary Learning and Sparse PCA, and are
mostly implemented in Matlab. What is most interesting is that even
though they share names with techniques already listed in some R
libraries, they really implement very different algorithms that perform
better.

Here is the question: is there a repository somewhere in the R world
that lists new implementations of an algorithm like, say, robust PCA?

Thanks in advance,

Igor.

------------------------
Igor Carron, Ph.D.
http://nuit-blanche.blogspot.com From petr.pikal at precheza.cz Thu Sep 1 14:59:47 2011 From: petr.pikal at precheza.cz (Petr PIKAL) Date: Thu, 1 Sep 2011 14:59:47 +0200 Subject: [R] convert to grid file In-Reply-To: <1314880689.25668.YahooMailNeo@web37101.mail.mud.yahoo.com> References: <1314880689.25668.YahooMailNeo@web37101.mail.mud.yahoo.com> Message-ID: Hi > > Hi > > I computed probability in each cell. > I have: > > [99883,] -0.0062412957690 > [99884,] -0.0062412957690 > [99885,] -0.0062412957690 > [99886,] -0.0062412957690 > [99887,] -0.0062412957690 > [99888,] -0.0062412957690 > [99889,] 0.9909126638948 > [99890,] 0.9909126638948 > [99891,] 0.9909126638948 > [99892,] 0.9909126638948 > [99893,] 0.9909126638948 > [99894,] 0.9909126638948 > [99895,] 0.9909126638948 > [99896,] 0.9909126638948 > [99897,] 0.9909126638948 > [99898,] 0.9909126638948 > [99899,] 0.9909126638948 > [99900,] 0.9909126638948 > [99901,] 0.9909126638948 > [99902,] 0.9909126638948 > [99903,] 0.9909126638948 > > [99999,] -0.0062412957690 > [ reached getOption("max.print") -- omitted 839931 rows ]] > > How want to convert this matrix to a grid file with 970*960 pixel. Assuming your object is one column matrix called mat1 You can simply change its dimension dim(mat) <- c(970, 960) regards Petr > > Thanks alot > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
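The dim() assignment suggested above can be tried on a small stand-in before applying it to the full data. This is a minimal sketch, assuming a one-column matrix of 6 values reshaped to 3 x 2 in place of the 970 x 960 grid; the object names mat, grid and grid_by_row are made up for the example. One point worth checking is fill order: dim<- fills column by column, so if the cell values were stored row by row, a byrow = TRUE call to matrix() may be what is wanted instead.

```r
# Small stand-in for the one-column matrix in the post (6 values, not 970 * 960).
mat <- matrix(c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6), ncol = 1)

# Assigning to dim() reshapes the object; values fill column by column.
grid <- mat
dim(grid) <- c(3, 2)
grid
#      [,1] [,2]
# [1,]  0.1  0.4
# [2,]  0.2  0.5
# [3,]  0.3  0.6

# If the cells were stored row by row instead, fill by row explicitly.
grid_by_row <- matrix(as.vector(mat), nrow = 3, ncol = 2, byrow = TRUE)
```

The product of the new dimensions has to equal the number of values, otherwise the dim() assignment stops with an error.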
From nvanzuydam at gmail.com Thu Sep 1 10:34:45 2011 From: nvanzuydam at gmail.com (natalie.vanzuydam) Date: Thu, 1 Sep 2011 01:34:45 -0700 (PDT) Subject: [R] Multiple events Cox's model and proportional hazards Message-ID: <1314866085787-3783031.post@n4.nabble.com> Hi, I am using the survival package to perform a Cox's regression analysis on multiple events of myocardial infarctions. I have been using the Andersen and Gill model: coxph(Surv(time1,time2,status)~factor(treatment)+age+sex+cluster(id). I was just wondering if this model should satisfy proportional hazards assumptions. I have run the cox.zph function and the age parameter violates the proportional hazards? What would be the best way to construct this model. Should I include time dependent covariates? Thanks, Natalie ----- Natalie Van Zuydam PhD Student University of Dundee nvanzuydam at dundee.ac.uk -- View this message in context: http://r.789695.n4.nabble.com/Multiple-events-Cox-s-model-and-proportional-hazards-tp3783031p3783031.html Sent from the R help mailing list archive at Nabble.com. From jaugusiak at googlemail.com Thu Sep 1 13:18:42 2011 From: jaugusiak at googlemail.com (J. Augusiak) Date: Thu, 1 Sep 2011 13:18:42 +0200 Subject: [R] Help with creating date as POSIXct Message-ID: <007e01cc6898$ea5f65e0$bf1e31a0$@gmail.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From r_b_hamilton at yahoo.co.uk Thu Sep 1 13:28:02 2011 From: r_b_hamilton at yahoo.co.uk (betty_d) Date: Thu, 1 Sep 2011 04:28:02 -0700 (PDT) Subject: [R] betareg question - keeping the mean fixed? Message-ID: <1314876482055-3783303.post@n4.nabble.com> Hello, I have a dataset with proportions that vary around a fixed mean, is it possible to use betareg to look at variance in the dispersion parameter while keeping the mean fixed? 
I am very new to R but have tried the following: svec<-c(qlogis(mean(data1$scaled)),0,0,0) f<-betareg(scaled~-1 | expt_label + grouped_hpi, data=data1, link.phi="log", control=betareg.control(start=svec)) I understood that y~-1 could be used to give a fixed mean of 0.5 however I get the following error: Error in linkinv(x %*% beta + offset) : Argument eta must be a nonempty numeric vector I think I can work round this by using: svec2<-c(qlogis(mean(data1$scaled)),0,0,0,0,0) f2<-betareg(scaled ~ expt_label + grouped_hpi | expt_label + grouped_hpi, data=data1, + link.phi="log",control=betareg.control(start=svec2)) This appears to work (ie doesn't return errors), but given i know the mean to be fixed I would like to do this in the model (especially as my data set is small) - is this possible? Thanks in advance. -- View this message in context: http://r.789695.n4.nabble.com/betareg-question-keeping-the-mean-fixed-tp3783303p3783303.html Sent from the R help mailing list archive at Nabble.com. From dwinsemius at comcast.net Thu Sep 1 14:30:50 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Thu, 1 Sep 2011 08:30:50 -0400 Subject: [R] how to get the varifying character with two variables? In-Reply-To: References: Message-ID: <2F061E06-AA51-49B6-9100-257A06876CDE@comcast.net> On Sep 1, 2011, at 3:45 AM, Jie TANG wrote: > thank you. it works. > but further question is that if we can let the " flnm" to be a 2- > dimension > matrix [3,5]? > since > mtdno<-paste("data",1:3,sep="") > tyno<-paste("obs",1:5,sep="") > flnm<-paste(mtdno,tyno,"_err.dat",sep="") > flnm would be a 1-dimension with 15 elements? Just take out the c() call that converted the matrix from outer into a vector. 
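A minimal sketch of that suggestion, reusing the mtdno and tyno vectors from the question and the outer() call from Jorge's reply with the surrounding c() dropped. The commented values are what the sketch should give and are not output copied from the thread.

```r
mtdno <- paste("data", 1:3, sep = "")
tyno  <- paste("obs", 1:5, sep = "")

# outer() returns a 3 x 5 character matrix; without c() it keeps that shape.
# The extra arguments ("_err.dat", sep = "") are passed on to paste().
flnm <- outer(mtdno, tyno, FUN = paste, "_err.dat", sep = "")

dim(flnm)    # 3 5
flnm[2, 4]   # "data2obs4_err.dat"
```

Rows then index the data number and columns the obs number, so flnm[i, j] is the file name for data i and obs j.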
-- David > > thankyou > > 2011/9/1 Jorge I Velez > >> Hi Jie, >> >> Try >> >> c(outer(mtdno, tyno, FUN = paste, "_err.dat", sep = "")) >> >> HTH, >> Jorge >> >> >> On Thu, Sep 1, 2011 at 3:11 AM, Jie TANG <> wrote: >> >>> hi R user >>> mtdno<-paste("data",1:3,sep="") >>> tyno<-paste("obs",1:5,sep="") >>> flnm<-paste(mtdno,tyno,"_err.dat",sep="") >>> >>> flnm is >>> [1] "data1obs1_err.dat" "data2obs2_err.dat" "data3obs3_err.dat" >>> [4] "data1obs4_err.dat" "data2obs5_err.dat" >>> >>> but actually what i want is data from 1 to 3 and obs from 1 to 5. >>> thus ,I >>> can read 15 files but not 5 files >>> >>> how could I do? >>> thanks. >>> -- >>> TANG Jie >>> >>> [[alternative HTML version deleted]] >>> >>> ______________________________________________ >>> R-help at r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide >>> http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >>> >> >> > > > -- > TANG Jie > Email: totangjie at gmail.com > Tel: 0086-2154896104 > Shanghai Typhoon Institute,China > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT From cenae27 at hotmail.com Thu Sep 1 14:36:28 2011 From: cenae27 at hotmail.com (cenae27) Date: Thu, 1 Sep 2011 05:36:28 -0700 (PDT) Subject: [R] R Help finding Mean Message-ID: <1314880588900-3783400.post@n4.nabble.com> bob<-read.csv('shi.csv', header=T) newmean<-matrix(0, test, dim(bob)[2]-6);a<-0; for (i in c(4,8:(dim(bob)[2]))) {a<-a+1;newmean[,a]<-tapply(bob[,i], bob$Exam, mean)} colnames(newmean)<-colnames(bob)[c(4,8:(dim(bob)[2]))] Could anyone please help me what does the above code does ... 
I want to find mean ... but would like to know what exactly is the above code doing. Thanks for your help. Cenae -- View this message in context: http://r.789695.n4.nabble.com/R-Help-finding-Mean-tp3783400p3783400.html Sent from the R help mailing list archive at Nabble.com. From jim.silverton at gmail.com Thu Sep 1 15:14:16 2011 From: jim.silverton at gmail.com (Jim Silverton) Date: Thu, 1 Sep 2011 09:14:16 -0400 Subject: [R] Negative Binomial GLM Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From azamjaafari at yahoo.com Thu Sep 1 15:14:22 2011 From: azamjaafari at yahoo.com (azam jaafari) Date: Thu, 1 Sep 2011 06:14:22 -0700 (PDT) Subject: [R] convert to grid file In-Reply-To: References: <1314880689.25668.YahooMailNeo@web37101.mail.mud.yahoo.com> Message-ID: <1314882862.81234.YahooMailNeo@web37102.mail.mud.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From djmuser at gmail.com Thu Sep 1 15:18:43 2011 From: djmuser at gmail.com (Dennis Murphy) Date: Thu, 1 Sep 2011 06:18:43 -0700 Subject: [R] ggplot2 to create a "square" plot In-Reply-To: <1314860296.53552.YahooMailNeo@web120112.mail.ne1.yahoo.com> References: <1314811136.640.YahooMailNeo@web120110.mail.ne1.yahoo.com> <1314860296.53552.YahooMailNeo@web120112.mail.ne1.yahoo.com> Message-ID: Hi: On Wed, Aug 31, 2011 at 11:58 PM, Alaios wrote: > Dear Dennis, > I would like to thank you for your reply. > I also checked the web sites that you gave me, it is hard to find everything > about ggplot2 at one place with concrete examples that help you understand > directly what you are plotting. There is work in progress to alleviate that problem, but it is still being worked on and it may be a little while before it becomes publicly available. That could mean anywhere from a few days to a few months. I don't know what is the current status wrt the project. 
> As you have already mentioned ggsave can save the image as I want to and
> that is what I am using now.
> One minor issue (as you have also mentioned is to remove they gray border
> between the x and y legend and the red and blue area.

See scale_continuous and look at the expand = argument; something like

scale_x_continuous(expand = c(0, 0)) +
scale_y_continuous(expand = c(0, 0))

might be what you need, but try it for yourself.

HTH,
Dennis

> Thus I checked the website and tried by applying the options to remove it.
> Unfortunately I ended up with a full list of more arguments
> print(v + geom_tile(aes(fill=dB))+
> opts(axis.text.x=theme_text(size=20),axis.text.y=theme_text(size=20),
> axis.title.x=theme_text(size=25) , axis.title.y=theme_text(size=25),
> legend.title=theme_text(size=25,hjust=-0.4) ,
> legend.text=theme_text(size=20) , panel.background=theme_blank() ,
> panel.margin=unit(100,"lines") , panel.grid.major=theme_line(size=0.1) ,
> plot.margin=unit(c(0,0,0,0),"lines") ) + scale_x_continuous('km') +
> scale_y_continuous('km') )
> so far I have not remove that extra space..
> Do you have any suggestion of how I can remove it?
> I would like to thank all for their time.
> B.R
> Alex
> ________________________________
> From: Dennis Murphy
> To: Alaios
> Cc: "R-help at r-project.org"
> Sent: Wednesday, August 31, 2011 9:34 PM
> Subject: Re: [R] ggplot2 to create a "square" plot
>
> Hi:
>
> I'd suggest using ggsave(); in particular, see its height = and width
> = arguments. If you have some time, you could look at some examples of
> ggplot2 themes:
> https://github.com/hadley/ggplot2/wiki/themes
> and some examples of how to use various opts():
> https://github.com/hadley/ggplot2/wiki/%2Bopts%28%29-List
>
> These can be useful if you need to reduce the amount of space around
> the plot or reposition the legend to the top or bottom to get more
> horizontal space for the plot. This sometimes is a germane issue when
> the plot is intended to be square.
>
> HTH,
> Dennis
>
> On Wed, Aug 31, 2011 at 10:18 AM, Alaios wrote:
>> Dear all,
>> I am using ggplot with geom_tile to print as an image a matrix I have. My
>> matrix is a squared one of 512*512 cells.
>>
>> The code that does that is written below
>>
>>
>>> print(v + geom_tile(aes(fill=dB))+
>>> opts(axis.text.x=theme_text(size=20),axis.text.y=theme_text(size=20),
>>> axis.title.x=theme_text(size=25) , axis.title.y=theme_text(size=25),
>>> legend.title=theme_text(size=25,hjust=-0.4) ,
>>> legend.text=theme_text(size=20)) + scale_x_continuous('km') +
>>> scale_y_continuous('km') )
>>
>>
>>
>> as you can see from the picture below
>>
>> http://imageshack.us/photo/my-images/171/backupf.jpg/
>>
>> this squared matrix is printed a bit squeezed with the height being bigger
>> than the width. Would be possible somehow to print that plot by keeping the
>> square-look of the matrix in the plot? Of course the other elements like
>> axis and legend will make the over all plot to not be square but I do not
>> care as the blue and red region forms a square.
>>
>> I would like to thank you in advance for your help
>> B.R
>> Alex
>>
>>        [[alternative HTML version deleted]]
>>
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>
>
>

From therneau at mayo.edu Thu Sep 1 15:23:39 2011
From: therneau at mayo.edu (Terry Therneau)
Date: Thu, 01 Sep 2011 08:23:39 -0500
Subject: [R] Weights using Survreg
Message-ID: <1314883419.2400.7.camel@nemo>

The survreg function uses case weights. That is, if a subject is given
a weight of 2, the result is the same as if there were a second
observation (exactly the same).
Early in my career data sets that contained only categorical variables were often collapsed in just this way, in order to save on computer memory and allow the analysis of larger problems. (Continuous variables such as age might be turned into categorical to facilitate the collapse). Long, long ago in computer time... Terry Therneau From Bettina.Gruen at jku.at Thu Sep 1 15:24:35 2011 From: Bettina.Gruen at jku.at (Bettina Gruen) Date: Thu, 01 Sep 2011 23:24:35 +1000 Subject: [R] betareg question - keeping the mean fixed? In-Reply-To: <1314876482055-3783303.post@n4.nabble.com> References: <1314876482055-3783303.post@n4.nabble.com> Message-ID: <4E5F8793.1090308@jku.at> Hi, > I have a dataset with proportions that vary around a fixed mean, is it > possible to use betareg to look at variance in the dispersion parameter > while keeping the mean fixed? > > I am very new to R but have tried the following: > > svec<-c(qlogis(mean(data1$scaled)),0,0,0) > f<-betareg(scaled~-1 | expt_label + grouped_hpi, data=data1, link.phi="log", > control=betareg.control(start=svec)) > > I understood that y~-1 could be used to give a fixed mean of 0.5 however I > get the following error: > Error in linkinv(x %*% beta + offset) : > Argument eta must be a nonempty numeric vector If you want to have a fixed mean, i.e., only fit an intercept, you need to specify it using y ~ 1 | exp_label + grouped_hpi. Including -1 in the formula on the right hand side makes only sense if you have other covariates included and explicitly want to exclude the intercept. 
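The effect of 1 versus -1 on the right-hand side can be seen with base R's model.matrix(), independently of betareg. This is a minimal sketch on a made-up three-row data frame d; it only illustrates the design matrix for the mean part of a formula and makes no claim about what betareg does internally, although an empty design matrix would be consistent with the "nonempty numeric vector" error quoted above.

```r
# Hypothetical toy data, standing in for data1$scaled in the question.
d <- data.frame(y = c(0.2, 0.5, 0.7))

X_intercept <- model.matrix(y ~ 1, data = d)   # one column: "(Intercept)"
X_none      <- model.matrix(y ~ -1, data = d)  # intercept removed, nothing left

ncol(X_intercept)      # 1
colnames(X_intercept)  # "(Intercept)"
ncol(X_none)           # 0
```

With y ~ 1 the mean model has a single coefficient to estimate, which is the "fixed mean" case described in the reply; with y ~ -1 there are no coefficients at all.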
HTH,
Bettina

--
-------------------------------------------------------------------
Bettina Grün
Institut für Angewandte Statistik / IFAS
Johannes Kepler Universität Linz
Altenbergerstraße 69
4040 Linz, Austria

Tel: +43 732 2468-5889
Fax: +43 732 2468-9846
E-Mail: Bettina.Gruen at jku.at
www.ifas.jku.at
-------------------------------------------------------------------

From mailzhuyao at gmail.com Thu Sep 1 15:29:02 2011
From: mailzhuyao at gmail.com (zhu yao)
Date: Thu, 1 Sep 2011 21:29:02 +0800
Subject: [R] How to retrieve bias-corrected probability from calibrate.rms
In-Reply-To: <1314881007611-3783420.post@n4.nabble.com>
References: <1314881007611-3783420.post@n4.nabble.com>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From millerlp at gmail.com Thu Sep 1 15:37:39 2011
From: millerlp at gmail.com (Luke Miller)
Date: Thu, 1 Sep 2011 09:37:39 -0400
Subject: [R] Help with creating date as POSIXct
In-Reply-To: <007e01cc6898$ea5f65e0$bf1e31a0$@gmail.com>
References: <007e01cc6898$ea5f65e0$bf1e31a0$@gmail.com>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From michael.weylandt at gmail.com Thu Sep 1 15:37:56 2011
From: michael.weylandt at gmail.com (R. Michael Weylandt)
Date: Thu, 1 Sep 2011 09:37:56 -0400
Subject: [R] rJava Installation Problems: 'cannot open compressed file 'rJava/DESCRIPTION', probable reason 'No such file or directory''
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From xkziloj at gmail.com Thu Sep 1 15:42:48 2011
From: xkziloj at gmail.com (. .)
Date: Thu, 1 Sep 2011 10:42:48 -0300 Subject: [R] Alternatives to integrate? Message-ID: Hi all, is there any alternative to the function integrate? Any comments are welcome. Thanks in advance. From bps0002 at auburn.edu Thu Sep 1 15:53:13 2011 From: bps0002 at auburn.edu (B77S) Date: Thu, 1 Sep 2011 06:53:13 -0700 (PDT) Subject: [R] Alternatives to integrate? In-Reply-To: References: Message-ID: <1314885193156-3783645.post@n4.nabble.com> package "caTools" see ?trapz . wrote: > > Hi all, > > is there any alternative to the function integrate? > > Any comments are welcome. > > Thanks in advance. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- View this message in context: http://r.789695.n4.nabble.com/Alternatives-to-integrate-tp3783624p3783645.html Sent from the R help mailing list archive at Nabble.com. From michael.weylandt at gmail.com Thu Sep 1 15:55:08 2011 From: michael.weylandt at gmail.com (R. Michael Weylandt) Date: Thu, 1 Sep 2011 09:55:08 -0400 Subject: [R] Alternatives to integrate? In-Reply-To: <1314885193156-3783645.post@n4.nabble.com> References: <1314885193156-3783645.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From bps0002 at auburn.edu Thu Sep 1 15:55:52 2011 From: bps0002 at auburn.edu (B77S) Date: Thu, 1 Sep 2011 06:55:52 -0700 (PDT) Subject: [R] R Help finding Mean In-Reply-To: <1314880588900-3783400.post@n4.nabble.com> References: <1314880588900-3783400.post@n4.nabble.com> Message-ID: <1314885352131-3783652.post@n4.nabble.com> see ?mean Then avoid other peoples code. 
cenae27 wrote:
>
> bob<-read.csv('shi.csv', header=T)
>
> newmean<-matrix(0, test, dim(bob)[2]-6);a<-0; for (i in
> c(4,8:(dim(bob)[2])))
> {a<-a+1;newmean[,a]<-tapply(bob[,i], bob$Exam, mean)}
> colnames(newmean)<-colnames(bob)[c(4,8:(dim(bob)[2]))]
>
> Could anyone please help me what does the above code does ... I want to
> find mean ... but would like to know what exactly is the above code doing.
>
> Thanks for your help.
> Cenae
>

--
View this message in context: http://r.789695.n4.nabble.com/R-Help-finding-Mean-tp3783400p3783652.html
Sent from the R help mailing list archive at Nabble.com.

From jholtman at gmail.com Thu Sep 1 15:55:55 2011
From: jholtman at gmail.com (jim holtman)
Date: Thu, 1 Sep 2011 09:55:55 -0400
Subject: [R] R Help finding Mean
In-Reply-To: <1314880588900-3783400.post@n4.nabble.com>
References: <1314880588900-3783400.post@n4.nabble.com>
Message-ID:

If you format it a little differently, it is easier to read:

bob<-read.csv('shi.csv', header=T)

newmean<-matrix(0, test, dim(bob)[2]-6)
a<-0
for (i in c(4,8:(dim(bob)[2]))){
    a<-a+1
    newmean[,a]<-tapply(bob[,i], bob$Exam, mean)
}
colnames(newmean)<-colnames(bob)[c(4,8:(dim(bob)[2]))]

It looks like it is computing the means of column 4, then 8 through
the last one, but since we don't know what 'shi.csv' is, it is hard to
tell. You might have problems in the assignment to 'newmean' depending
on whether the length of the result of tapply is equal to the number of
rows. It appears that each row should be an Exam.

There are other approaches (data.table, plyr) that might be applicable,
but you need to follow the posting guidelines.

On Thu, Sep 1, 2011 at 8:36 AM, cenae27 wrote:
> bob<-read.csv('shi.csv', header=T)
>
> newmean<-matrix(0, test, dim(bob)[2]-6);a<-0; for (i in
> c(4,8:(dim(bob)[2])))
> {a<-a+1;newmean[,a]<-tapply(bob[,i], bob$Exam, mean)}
> colnames(newmean)<-colnames(bob)[c(4,8:(dim(bob)[2]))]
>
> Could anyone please help me what does the above code does ... I want to find
> mean ...
but would like to know what exactly is the above code doing.
>
> Thanks for your help.
> Cenae
>
> --
> View this message in context: http://r.789695.n4.nabble.com/R-Help-finding-Mean-tp3783400p3783400.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?

From jholtman at gmail.com Thu Sep 1 16:01:02 2011
From: jholtman at gmail.com (jim holtman)
Date: Thu, 1 Sep 2011 10:01:02 -0400
Subject: [R] Help with creating date as POSIXct
In-Reply-To: <007e01cc6898$ea5f65e0$bf1e31a0$@gmail.com>
References: <007e01cc6898$ea5f65e0$bf1e31a0$@gmail.com>
Message-ID:

try this:

> myDate <- as.POSIXct(day, format = "%y%m%d")
> myDate
[1] "2011-08-09 EDT"
> myDate <- myDate + 0:3600
> str(myDate)
 POSIXct[1:3601], format: "2011-08-09 00:00:00" "2011-08-09 00:00:01" "2011-08-09 00:00:02" ...
>

On Thu, Sep 1, 2011 at 7:18 AM, J. Augusiak wrote:
> Dear list,
>
>
>
> I want to create a POSIX time vector as follows:
>
>
>
> day    <- as.character("110809")
>
> time.t <- 1:3600
>
> t.min  <- time.t %/% 60
>
> t.sec  <- time.t-t.min*60
>
> DATE   <- as.POSIXct(strptime(paste(day,t.min,t.sec),"%y%m%d %M%S"))
>
> Tail(DATE)
>
>
>
>
>
> The problem is that the last element (3600) returns a NA and I don't
> understand why. 600, 1200, 2400 no problem, only 3600. Any helpful advice is
> highly appreciated :)
>
>
>
> Cheers,
>
>
>
> Jacqueline
>
>
>
>
>
>
>
[[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?

From kw.stat at gmail.com Thu Sep 1 16:06:02 2011
From: kw.stat at gmail.com (Kevin Wright)
Date: Thu, 1 Sep 2011 09:06:02 -0500
Subject: [R] Newer Matrix Factorization Techniques
In-Reply-To:
References:
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From petr.pikal at precheza.cz Thu Sep 1 16:22:41 2011
From: petr.pikal at precheza.cz (Petr PIKAL)
Date: Thu, 1 Sep 2011 16:22:41 +0200
Subject: [R] convert to grid file
In-Reply-To: <1314882862.81234.YahooMailNeo@web37102.mail.mud.yahoo.com>
References: <1314880689.25668.YahooMailNeo@web37101.mail.mud.yahoo.com> <1314882862.81234.YahooMailNeo@web37102.mail.mud.yahoo.com>
Message-ID:

Hi

> Thank you Petr
>
> It work.
>
> Now I have a matrix 970*960. If I want to convert to spatial grid (each
> pixel has x and y coordinate).
> How can I do?

I do not understand. What do you want to do with your data? Maybe you
could consult spatial package or CRAN Task views. One option could be to
make vectors of row and columns coordinates. But it depends on what you
want to do with your data.

Regards
Petr

>
> Thanks
>
>
> From: Petr PIKAL
> To: azam jaafari
> Cc: R-help
> Sent: Thursday, September 1, 2011 8:59 AM
> Subject: Re: [R] convert to grid file
>
> Hi
>
> >
> > Hi
> >
> > I computed probability in each cell.
> > I have: > > > > [99883,] -0.0062412957690 > > [99884,] -0.0062412957690 > > [99885,] -0.0062412957690 > > [99886,] -0.0062412957690 > > [99887,] -0.0062412957690 > > [99888,] -0.0062412957690 > > [99889,] 0.9909126638948 > > [99890,] 0.9909126638948 > > [99891,] 0.9909126638948 > > [99892,] 0.9909126638948 > > [99893,] 0.9909126638948 > > [99894,] 0.9909126638948 > > [99895,] 0.9909126638948 > > [99896,] 0.9909126638948 > > [99897,] 0.9909126638948 > > [99898,] 0.9909126638948 > > [99899,] 0.9909126638948 > > [99900,] 0.9909126638948 > > [99901,] 0.9909126638948 > > [99902,] 0.9909126638948 > > [99903,] 0.9909126638948 > > > > [99999,] -0.0062412957690 > > [ reached getOption("max.print") -- omitted 839931 rows ]] > > > > How want to convert this matrix to a grid file with 970*960 pixel. > > Assuming your object is one column matrix called mat1 > > You can simply change its dimension > > dim(mat) <- c(970, 960) > > regards > Petr > > > > > > > > Thanks alot > > [[alternative HTML version deleted]] > > > > ______________________________________________ > > R-help at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > From xkziloj at gmail.com Thu Sep 1 16:33:53 2011 From: xkziloj at gmail.com (. .) Date: Thu, 1 Sep 2011 11:33:53 -0300 Subject: [R] Alternatives to integrate? In-Reply-To: References: <1314885193156-3783645.post@n4.nabble.com> Message-ID: Hi Michael, This is the problem: func <- Vectorize(function(x, a, sad, samp="pois", trunc=0, ...) 
{
    result <- function(x) {
        f1 <- function(n) {
            f <- function() {
                dcom <- paste("d", sad, sep="")
                dots <- c(as.name("n"), list(...))
                do.call(dcom, dots)
            }
            g <- function() {
                dcom <- paste("d", samp, sep="")
                lambda <- a * n
                dots <- c(as.name("x"), as.name("lambda"))
                do.call(dcom, dots)
            }
            f() * g()
        }
        integrate(f1,0,2000)$value
        # adaptIntegrate(f1,0,2000)$integral
        # n <- 0:2000
        # trapz(n,f1(n))
        # area(f1, 0, 2000, limit=10000, eps=1e-100)
    }
    return(result(x) / (1 - result(trunc)))
}, "x")

func(200, 0.05, "exp", rate=0.001)

If you could propose something I will be grateful.

Thanks in advance.

On Thu, Sep 1, 2011 at 10:55 AM, R. Michael Weylandt wrote:
> Mr ". .",
>
> MASS::area comes to mind but it may be more helpful if you could say what
> you are looking for / why integrate is not appropriate it is for whatever
> you are doing.
>
> Strictly speaking, I suppose there are all sorts of "alternatives" to
> integrate() if you are willing to be really creative and build something
> from scratch: diff(), cumsum(), lm(), hist(), t(), c(), ....
>
> Michael Weylandt
>
> On Thu, Sep 1, 2011 at 9:53 AM, B77S wrote:
>>
>> package "caTools"
>> see ?trapz
>>
>>
>> . wrote:
>> >
>> > Hi all,
>> >
>> > is there any alternative to the function integrate?
>> >
>> > Any comments are welcome.
>> >
>> > Thanks in advance.
>> >
>> > ______________________________________________
>> > R-help at r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> > http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>> >
>>
>> --
>> View this message in context:
>> http://r.789695.n4.nabble.com/Alternatives-to-integrate-tp3783624p3783645.html
>> Sent from the R help mailing list archive at Nabble.com.
>> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > From azamjaafari at yahoo.com Thu Sep 1 16:44:17 2011 From: azamjaafari at yahoo.com (azam jaafari) Date: Thu, 1 Sep 2011 07:44:17 -0700 (PDT) Subject: [R] convert to grid file In-Reply-To: References: <1314880689.25668.YahooMailNeo@web37101.mail.mud.yahoo.com> <1314882862.81234.YahooMailNeo@web37102.mail.mud.yahoo.com> Message-ID: <1314888257.64489.YahooMailNeo@web37106.mail.mud.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From jcbouette at gmail.com Thu Sep 1 16:39:47 2011 From: jcbouette at gmail.com (=?ISO-8859-1?Q?Jean=2DChristophe_BOU=CBTT=C9?=) Date: Thu, 1 Sep 2011 10:39:47 -0400 Subject: [R] qqplot for count data Message-ID: Dear list, I just tried to do the same thing, and did not find anything on a weighted qqplot. My weights are actually counts (positive integers). Here is a modification of qqplot, following Duncan Murdoch's suggestion. Any feedback would be welcome! Thanks, Jean-Christophe weighted.qqplot <- function (x, y, plot.it = TRUE, xlab = deparse(substitute(x)), ylab = deparse(substitute(y)), x.counts=rep(1L,length.out=length(x)), y.counts=rep(1L,length.out=length(y)), ...){ sx <- sort(x) sy <- sort(y) swx <- cumsum(x.counts[order(x)]) swy <- cumsum(y.counts[order(y)]) lenx <- length(sx) leny <- length(sy) sx <- approx(swx, sx, n=min(lenx,leny))$y sy <- approx(swy, sy, n=min(lenx,leny))$y if (plot.it) plot(sx, sy, xlab = xlab, ylab = ylab, ...) 
  invisible(list(x = sx, y = sy))
}

#Sample example
n <- 15
a <- runif(n);b <- 1L:length(a);x <- rep(a,b)
c <- runif(n);d <- length(c):1L;y <- rep(c,d)
weighted.qqplot(x,y,type="b")
par(new=TRUE)
weighted.qqplot(a,c,x.counts=b,y.counts=d,type="b",pch="*",col="grey")
par(new=TRUE)
qqplot(x,y,type="b",pch="+",col="red")

From: Duncan Murdoch Date: Thu 16 Mar 2006 - 05:50:27 EST
On 3/15/2006 1:38 PM, Vivek Satsangi wrote:
> Folks,
> I am documenting what I finally did, for the next person who comes along...
>
> Following Dr. Murdoch's suggestion, I looked at qqplot. The following
> approach might be helpful to get to the same information as given by
> qqplot.
> To summarize the ask: given x, y, xw and yw, show (visually is okay)
> whether a and b are from the same distribution. xw is the weight of
> each x observation and yw is the weight of each y observation.
>
> Put x and xw into a dataframe.
> Sort by x.
> Calculate cumulative x weights, normalized to total 1.
>
> Put y and yw into a dataframe.
> Sort by y
> Calculate cumulative weights, normalized to total 1.
>
> Plot x and y against cumulative normalized weights. The shapes of the
> two lines should be similar (to the eye)-- or the distribution is
> "different".

One variation that would make the result more like a qqplot: you could work out a vector of weights w (perhaps the cumulative weights from x or from y or perhaps something else) and plot y(w) versus x(w), where y(w) and x(w) are the linear interpolation values that approx gives you.

Duncan Murdoch

From michael.weylandt at gmail.com Thu Sep 1 16:49:50 2011 From: michael.weylandt at gmail.com (R. Michael Weylandt) Date: Thu, 1 Sep 2011 10:49:50 -0400 Subject: [R] Alternatives to integrate? In-Reply-To: References: <1314885193156-3783645.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed...
Name: not available URL: From Yan_Li at ibi.com Thu Sep 1 16:59:49 2011 From: Yan_Li at ibi.com (Li, Yan) Date: Thu, 1 Sep 2011 10:59:49 -0400 Subject: [R] iSeries and R Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From petr.pikal at precheza.cz Thu Sep 1 17:01:58 2011 From: petr.pikal at precheza.cz (Petr PIKAL) Date: Thu, 1 Sep 2011 17:01:58 +0200 Subject: [R] convert to grid file In-Reply-To: <1314888257.64489.YahooMailNeo@web37106.mail.mud.yahoo.com> References: <1314880689.25668.YahooMailNeo@web37101.mail.mud.yahoo.com> <1314882862.81234.YahooMailNeo@web37102.mail.mud.yahoo.com> <1314888257.64489.YahooMailNeo@web37106.mail.mud.yahoo.com> Message-ID: Hi I do not know the sp package, so I cannot help you with it. In any case, without knowing "what is wrong" and without a reproducible example you will hardly get a reasonable answer. It seems to me that your list d, especially x and y, is not what you expect it to be; see str(d). 430071.4887:460006.8067 will result in a sequence in steps of 1 from 430071.4887 to 460006.4887 (roughly 30,000 values). I believe that is not what you want. Regards Petr > > Thank you > > I want to make a map from my spatial data that I can show it in GIS. > I used the package sp and > > d<- list(x=430071.4887:460006.8067, y =3390040.0591:3420006.2701, z = > matrix(mat, 970, 960)) > gt<- GridTopology(cellcentre.offset = c(d$x[1], d$y[1]), > cellsize=c(diff(d$x[1:2]), diff(d$y[1:2])), cells.dim = dim(d$z)) > grdatts<-SpatialGridDataFrame(gt, data.frame(depth = > as.vector(d$z[,ncol(d$z):1]))) > image(grd.atts, axes=TRUE) > grd<-SpatialGridDataFrame(gt, data.frame(depth = > as.vector(d$z[,ncol(d$z):1]))) > > It doesn't work and I think it is wrong. > > Thanks > > From: Petr PIKAL > To: azam jaafari > Cc: R-help > Sent: Thursday, September 1, 2011 10:22 AM > Subject: Re: [R] convert to grid file > > Hi > > > Thank you Petr > > > > It work. > > > > Now I have a matrix 970*960. If I want to convert to spatial grid (each > > pixel has x and y coordinate). 
> > How can I do? > > I do not understand. What do you want to do with your data? Maybe you > could consult spatial package or CRAN Task views. One option could be to > make vectors of row and columns coordinates. But it depends on what you > want to do with your data. > > Regards > Petr > > > > > > Thanks > > > > > > From: Petr PIKAL > > To: azam jaafari > > Cc: R-help > > Sent: Thursday, September 1, 2011 8:59 AM > > Subject: Re: [R] convert to grid file > > > > Hi > > > > > > > > Hi > > > > > > I computed probability in each cell. > > > I have: > > > > > > [99883,] -0.0062412957690 > > > [99884,] -0.0062412957690 > > > [99885,] -0.0062412957690 > > > [99886,] -0.0062412957690 > > > [99887,] -0.0062412957690 > > > [99888,] -0.0062412957690 > > > [99889,] 0.9909126638948 > > > [99890,] 0.9909126638948 > > > [99891,] 0.9909126638948 > > > [99892,] 0.9909126638948 > > > [99893,] 0.9909126638948 > > > [99894,] 0.9909126638948 > > > [99895,] 0.9909126638948 > > > [99896,] 0.9909126638948 > > > [99897,] 0.9909126638948 > > > [99898,] 0.9909126638948 > > > [99899,] 0.9909126638948 > > > [99900,] 0.9909126638948 > > > [99901,] 0.9909126638948 > > > [99902,] 0.9909126638948 > > > [99903,] 0.9909126638948 > > > > > > [99999,] -0.0062412957690 > > > [ reached getOption("max.print") -- omitted 839931 rows ]] > > > > > > How want to convert this matrix to a grid file with 970*960 pixel. > > > > Assuming your object is one column matrix called mat1 > > > > You can simply change its dimension > > > > dim(mat) <- c(970, 960) > > > > regards > > Petr > > > > > > > > > > > > > > Thanks alot > > > [[alternative HTML version deleted]] > > > > > > ______________________________________________ > > > R-help at r-project.org mailing list > > > https://stat.ethz.ch/mailman/listinfo/r-help > > > PLEASE do read the posting guide > > http://www.R-project.org/posting-guide.html > > > and provide commented, minimal, self-contained, reproducible code. 
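A minimal sketch of the two suggestions made in this thread (reshaping with dim<- and building per-row and per-column coordinate vectors). It assumes, without confirmation from the posts, that matrix rows run along x and columns along y, and that the quoted numbers are the extreme cell-centre coordinates. The toy sizes nr and nc and the name cells are only illustrative; the real data would use 970 and 960.

```r
# Toy stand-in for the one-column result matrix (the real one has 970 * 960 rows).
nr <- 4; nc <- 3
mat <- matrix(seq_len(nr * nc) / 10, ncol = 1)

# Reshape in place; R fills the new matrix column by column.
dim(mat) <- c(nr, nc)

# Evenly spaced coordinates between the stated extremes.
# (from:to would instead step by 1 and give about 30,000 values.)
x <- seq(430071.4887, 460006.8067, length.out = nr)
y <- seq(3390040.0591, 3420006.2701, length.out = nc)

# One row per cell; expand.grid varies x fastest, which matches as.vector(mat).
cells <- expand.grid(x = x, y = y)
cells$z <- as.vector(mat)
head(cells)
```

A long table like cells is one common starting point for spatial classes, but whether it suits the sp workflow quoted above depends on the data and is not settled in the thread.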
> > > > > > From bt_jannis at yahoo.de Thu Sep 1 17:14:28 2011 From: bt_jannis at yahoo.de (Jannis) Date: Thu, 1 Sep 2011 16:14:28 +0100 (BST) Subject: [R] get arguments passed to function inside a function Message-ID: <1314890068.48163.YahooMailClassic@web28201.mail.ukl.yahoo.com> Dear list, I am wondering whether there is an (easy) way to access all arguments and their values passed to a function inside this function and (for example) store them in a list object? I could imagine using ls() inside this function and then looping through all names and assigning list entries with the values of the respective objects but I could imagine that there is already something ready made in R to achieve this. Does anybody have an idea? Thanks Jannis From hpaul.benton08 at imperial.ac.uk Thu Sep 1 17:13:13 2011 From: hpaul.benton08 at imperial.ac.uk (Benton, Paul) Date: Thu, 1 Sep 2011 15:13:13 +0000 Subject: [R] readBin fails to read large files References: <8CCE7727-9F99-4033-9611-C10B62489077@ic.ac.uk> Message-ID: <1D34FD4B-D539-4758-A6B1-EF8295970485@imperial.ac.uk> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From j.maas at uea.ac.uk Thu Sep 1 17:16:42 2011 From: j.maas at uea.ac.uk (Jim Maas) Date: Thu, 01 Sep 2011 16:16:42 +0100 Subject: [R] cannot correct step size, geweke.diag of coda Message-ID: <4E5FA1DA.9090707@uea.ac.uk> Intermittently I'm getting this error from the geweke.diag function of the coda package. Would anyone be kind enough to enlighten me as to the possible source of such an error, or how to debug/locate it? Error in { : task 22 failed - "inner loop 1; cannot correct step size" Thanks a bunch. J -- Dr. 
Jim Maas University of East Anglia From jholtman at gmail.com Thu Sep 1 17:22:30 2011 From: jholtman at gmail.com (jim holtman) Date: Thu, 1 Sep 2011 11:22:30 -0400 Subject: [R] readBin fails to read large files In-Reply-To: <1D34FD4B-D539-4758-A6B1-EF8295970485@imperial.ac.uk> References: <8CCE7727-9F99-4033-9611-C10B62489077@ic.ac.uk> <1D34FD4B-D539-4758-A6B1-EF8295970485@imperial.ac.uk> Message-ID: Are you running a 64-bit version of R? It sounds like your operating system is not giving you enough memory. It looks like this is not under Windows in a native mode. On Thu, Sep 1, 2011 at 11:13 AM, Benton, Paul wrote: > Posting for a friend > > Begin forwarded message: > > From: "Geier, Florian" > > Subject: Fwd: readBin fails to read large files > Date: September 1, 2011 4:10:53 PM GMT+01:00 > To: > > > > Begin forwarded message: > > Date: 1 September 2011 16:01:45 GMT+01:00 > Subject: readBin fails to read large files > > Dear all, > > I am trying to read a large file (~2GB) of unsigned ints into R. Using the command: > > raw<-readBin("file",n=10^8, integer(),endian="little",signed=FALSE) > > It works fine for n=10^8, but fails for n=10^9 (or even at n=6*10^8). My machine$sizeof.long is 8 bit. > I am running R 2.13.1 on a x86_64-apple-darwin9.8.0/x86_64 (64-bit) architecture. > > Thanks for your help > > Florian > > -- > AXA doctoral fellow > Bundy lab - Biomolecular Medicine > Imperial College London > > > > > > -- > AXA doctoral fellow > Bundy lab - Biomolecular Medicine > Imperial College London > > > > > > > ? ? ? ?[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? 
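One way to answer the 64-bit question from inside R, as a small sketch (the variable names are only illustrative). Note that the sizeof entries of .Machine are reported in bytes, and that R's integer type is 32-bit on both kinds of build.

```r
# Pointer size in bytes: 8 on a 64-bit build, 4 on a 32-bit build.
ptr_bytes <- .Machine$sizeof.pointer
arch <- R.version$arch            # for example "x86_64"
int_max <- .Machine$integer.max   # largest R integer, the same on both builds
cat("pointer bytes:", ptr_bytes, " arch:", arch, " integer.max:", int_max, "\n")
```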
From SPhillips at Lexington1.net Thu Sep 1 17:23:11 2011 From: SPhillips at Lexington1.net (Shane Phillips) Date: Thu, 1 Sep 2011 11:23:11 -0400 Subject: [R] Converting anova/ancova summary to data frame In-Reply-To: <4E5E9D20.7010607@statistik.tu-dortmund.de> References: <4E5E9D20.7010607@statistik.tu-dortmund.de> Message-ID: Perfect! Thank you! S -----Original Message----- From: Uwe Ligges [mailto:ligges at statistik.tu-dortmund.de] Sent: Wednesday, August 31, 2011 4:44 PM To: Shane Phillips Cc: r-help at r-project.org Subject: Re: [R] Converting anova/ancova summary to data frame On 31.08.2011 22:33, Shane Phillips wrote: > Hi! > > Can anyone tell me how to convert the anova/ancova summary output into a data frame? For the first component of such an object: as.data.frame(summary.aov_object[[1]]) Uwe Ligges > Thanks! > > Shane Phillips > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From totangjie at gmail.com Thu Sep 1 17:28:37 2011 From: totangjie at gmail.com (Jie TANG) Date: Thu, 1 Sep 2011 23:28:37 +0800 Subject: [R] how to get the varifying character with two variables? In-Reply-To: <2F061E06-AA51-49B6-9100-257A06876CDE@comcast.net> References: <2F061E06-AA51-49B6-9100-257A06876CDE@comcast.net> Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From murdoch.duncan at gmail.com Thu Sep 1 17:27:25 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Thu, 01 Sep 2011 11:27:25 -0400 Subject: [R] get arguments passed to function inside a function In-Reply-To: <1314890068.48163.YahooMailClassic@web28201.mail.ukl.yahoo.com> References: <1314890068.48163.YahooMailClassic@web28201.mail.ukl.yahoo.com> Message-ID: <4E5FA45D.40701@gmail.com> On 01/09/2011 11:14 AM, Jannis wrote: > Dear list, > > > I am wondering whether there is an (easy) way to access all arguments and their values passed to a function inside this function and (for example) store them in a list object? > > I could imagine using ls() inside this function and then looping through all names and assigning list entries with the values of the respective objects but I could imagine that there is already something ready made in R to achieve this. > > Does anybody have an idea? as.list(environment()) should do it. Duncan Murdoch From murdoch.duncan at gmail.com Thu Sep 1 17:30:32 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Thu, 01 Sep 2011 11:30:32 -0400 Subject: [R] readBin fails to read large files In-Reply-To: <1D34FD4B-D539-4758-A6B1-EF8295970485@imperial.ac.uk> References: <8CCE7727-9F99-4033-9611-C10B62489077@ic.ac.uk> <1D34FD4B-D539-4758-A6B1-EF8295970485@imperial.ac.uk> Message-ID: <4E5FA518.2050904@gmail.com> On 01/09/2011 11:13 AM, Benton, Paul wrote: > Posting for a friend > What does "fails" mean, i.e. what is the error message? (You might want to get Florian online here.) Duncan Murdoch > Begin forwarded message: > > From: "Geier, Florian"> > Subject: Fwd: readBin fails to read large files > Date: September 1, 2011 4:10:53 PM GMT+01:00 > To: > > > > Begin forwarded message: > > Date: 1 September 2011 16:01:45 GMT+01:00 > Subject: readBin fails to read large files > > Dear all, > > I am trying to read a large file (~2GB) of unsigned ints into R. 
Using the command: > > raw<-readBin("file",n=10^8, integer(),endian="little",signed=FALSE) > > It works fine for n=10^8, but fails for n=10^9 (or even at n=6*10^8). My machine$sizeof.long is 8 bit. > I am running R 2.13.1 on a x86_64-apple-darwin9.8.0/x86_64 (64-bit) architecture. > > Thanks for your help > > Florian > > -- > AXA doctoral fellow > Bundy lab - Biomolecular Medicine > Imperial College London > > > > > > -- > AXA doctoral fellow > Bundy lab - Biomolecular Medicine > Imperial College London > > > > > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From jvadams at usgs.gov Thu Sep 1 17:48:19 2011 From: jvadams at usgs.gov (Jean V Adams) Date: Thu, 1 Sep 2011 10:48:19 -0500 Subject: [R] Treat an Unquoted Character String as a Data Frame Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From bill_harris at facilitatedsystems.com Thu Sep 1 17:50:35 2011 From: bill_harris at facilitatedsystems.com (Bill Harris) Date: Thu, 1 Sep 2011 08:50:35 -0700 Subject: [R] Hysteresis modeling and simulation Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From aelmore at usgs.gov Thu Sep 1 17:29:52 2011 From: aelmore at usgs.gov (aelmore) Date: Thu, 1 Sep 2011 08:29:52 -0700 (PDT) Subject: [R] executing R scripts - viewing results and errors Message-ID: <1314890992773-3783946.post@n4.nabble.com> Hi, For the first time, I am trying to call/run an R script from another program, passing parameters and results back and forth. I have been learning a little programming as I have progressed through the various projects I have been working on, but I haven't had any formal training and am now stumped. 
Specifically, how do you pass results and errors out of R so that you can debug? My R script works fine as a standalone. My Python script seems clean and isn't giving me any errors, but the R script isn't running. R is obviously not receiving the inputs because the run-time is zero for the R portion of my model execution. Since I haven't been able to figure out how to "see" into R, however, I am unable to figure out where the problem lies. Can anyone give me a leg up on how you can get at outputs and errors when you do this sort of batch execution? If it's of any help, I can send along the Python and R scripts, but since I don't know that it's really necessary for this sort of question I won't clutter the thread, up front. Thanks, Annie -- View this message in context: http://r.789695.n4.nabble.com/executing-R-scripts-viewing-results-and-errors-tp3783946p3783946.html Sent from the R help mailing list archive at Nabble.com. From florian.geier08 at imperial.ac.uk Thu Sep 1 17:32:49 2011 From: florian.geier08 at imperial.ac.uk (Geier, Florian) Date: Thu, 1 Sep 2011 15:32:49 +0000 Subject: [R] readBin fails to read large files In-Reply-To: References: <8CCE7727-9F99-4033-9611-C10B62489077@ic.ac.uk> <1D34FD4B-D539-4758-A6B1-EF8295970485@imperial.ac.uk> Message-ID: Hi Jim, yes - it definitely is 64 bit. I call it with r64 and .Platform$r_arch [1] "x86_64" It is on a apple snow leopard (10.6.8) with 16 GB of Ram - not windows Florian On 1 Sep 2011, at 16:22, jim holtman wrote: > Are you running a 64-bit version of R? It sounds like your operating > system is not giving you enough memory. It looks like this is not > under Windows in a native mode. 
> > On Thu, Sep 1, 2011 at 11:13 AM, Benton, Paul > wrote: >> Posting for a friend >> >> Begin forwarded message: >> >> From: "Geier, Florian" > >> Subject: Fwd: readBin fails to read large files >> Date: September 1, 2011 4:10:53 PM GMT+01:00 >> To: >> >> >> >> Begin forwarded message: >> >> Date: 1 September 2011 16:01:45 GMT+01:00 >> Subject: readBin fails to read large files >> >> Dear all, >> >> I am trying to read a large file (~2GB) of unsigned ints into R. Using the command: >> >> raw<-readBin("file",n=10^8, integer(),endian="little",signed=FALSE) >> >> It works fine for n=10^8, but fails for n=10^9 (or even at n=6*10^8). My machine$sizeof.long is 8 bit. >> I am running R 2.13.1 on a x86_64-apple-darwin9.8.0/x86_64 (64-bit) architecture. >> >> Thanks for your help >> >> Florian >> >> -- >> AXA doctoral fellow >> Bundy lab - Biomolecular Medicine >> Imperial College London >> >> >> >> >> >> -- >> AXA doctoral fellow >> Bundy lab - Biomolecular Medicine >> Imperial College London >> >> >> >> >> >> >> [[alternative HTML version deleted]] >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > > > -- > Jim Holtman > Data Munger Guru > > What is the problem that you are trying to solve? > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
-- AXA doctoral fellow Bundy lab - Biomolecular Medicine Imperial College London From florian.geier08 at imperial.ac.uk Thu Sep 1 17:39:43 2011 From: florian.geier08 at imperial.ac.uk (Geier, Florian) Date: Thu, 1 Sep 2011 15:39:43 +0000 Subject: [R] readBin fails to read large files In-Reply-To: <4E5FA518.2050904@gmail.com> References: <8CCE7727-9F99-4033-9611-C10B62489077@ic.ac.uk> <1D34FD4B-D539-4758-A6B1-EF8295970485@imperial.ac.uk> <4E5FA518.2050904@gmail.com> Message-ID: <32FD52CC-01C2-4B4B-9727-C5527B03A5E1@ic.ac.uk> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From jcbouette at gmail.com Thu Sep 1 17:24:05 2011 From: jcbouette at gmail.com (Jean-Christophe BOUËTTÉ) Date: Thu, 1 Sep 2011 11:24:05 -0400 Subject: [R] cannot correct step size, geweke.diag of coda In-Reply-To: <4E5FA1DA.9090707@uea.ac.uk> References: <4E5FA1DA.9090707@uea.ac.uk> Message-ID: Kindly provide a reproducible example. 2011/9/1 Jim Maas : > Intermittently I'm getting this error from the geweke.diag function of the > coda package. Would anyone be kind enough to enlighten me as to the > possible source of such an error, or how to debug/locate it? > > Error in { : task 22 failed - "inner loop 1; cannot correct step size" > > Thanks a bunch. > > J > > -- > Dr. Jim Maas > University of East Anglia > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
> From Thomas.Chesney at nottingham.ac.uk Thu Sep 1 16:54:24 2011 From: Thomas.Chesney at nottingham.ac.uk (Thomas Chesney) Date: Thu, 1 Sep 2011 15:54:24 +0100 Subject: [R] Automatic Recoding Message-ID: <5EAA21940C65214F9C11DA5FBBC14F0B2D13C4B100@EXCHANGE2.ad.nottingham.ac.uk> I have a text file full of numbers (it's a edgelist for a graph) and I would like to recode the numbers as they are way too big to work with. So for instance the following: 676529098667 1000198767829 676529098667 100867672856227 676529098667 91098726278 676529098667 98928373 1092837363526 716172829 would become: 0 1 0 2 0 3 0 4 5 6 i.e. all 676529098667 would become 0, all 1000198767829 would become 1 etc. If I read all the values into a matrix, is there a pre-existing function that can do the recoding? Thank you! Thomas ChesneyThis message and any attachment are intended solely for the addressee and may contain confidential information. If you have received this message in error, please send it back to me, and immediately delete it. Please do not use, copy or disclose the information contained in this message or in any attachment. Any views or opinions expressed by the author of this email do not necessarily reflect the views of the University of Nottingham. This message has been checked for viruses but the contents of an attachment may still contain software viruses which could damage your computer system: you are advised to perform your own checks. Email communications with the University of Nottingham may be monitored as permitted by UK legislation. 
From murdoch.duncan at gmail.com Thu Sep 1 18:06:45 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Thu, 01 Sep 2011 12:06:45 -0400 Subject: [R] executing R scripts - viewing results and errors In-Reply-To: <1314890992773-3783946.post@n4.nabble.com> References: <1314890992773-3783946.post@n4.nabble.com> Message-ID: <4E5FAD95.7020105@gmail.com> On 01/09/2011 11:29 AM, aelmore wrote: > Hi, > > For the first time, I am trying to call/run an R script from another > program, passing parameters and results back and forth. I have been > learning a little programming as I have progressed through the various > projects I have been working on, but I haven't had any formal training and > am now stumped. Specifically, how do you pass results and errors out of R > so that you can debug? > > My R script works fine as a standalone. My Python script seems clean and > isn't giving me any errors, but the R script isn't running. R is obviously > not receiving the inputs because the run-time is zero for the R portion of > my model execution. Since I haven't been able to figure out how to "see" > into R, however, I am unable to figure out where the problem lies. > > Can anyone give me a leg up on how you can get at outputs and errors when > you do this sort of batch execution? > > If it's of any help, I can send along the Python and R scripts, but since I > don't know that it's really necessary for this sort of question I won't > clutter the thread, up front. If you are running batch scripts (as opposed to running interactively), the basic debugging tool is to put lots of cat() and print() calls into your code so that you can monitor things as they go. You might need to write to a file to serve as a log of the run, if your Python code is reading the standard output. Duncan Murdoch From totangjie at gmail.com Thu Sep 1 18:10:57 2011 From: totangjie at gmail.com (Jie TANG) Date: Fri, 2 Sep 2011 00:10:57 +0800 Subject: [R] how to plot a series of data in a dataframe? 
Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dwinsemius at comcast.net Thu Sep 1 18:23:37 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Thu, 1 Sep 2011 12:23:37 -0400 Subject: [R] Automatic Recoding In-Reply-To: <5EAA21940C65214F9C11DA5FBBC14F0B2D13C4B100@EXCHANGE2.ad.nottingham.ac.uk> References: <5EAA21940C65214F9C11DA5FBBC14F0B2D13C4B100@EXCHANGE2.ad.nottingham.ac.uk> Message-ID: <9ADF565B-97A5-45A6-8298-093118EA8590@comcast.net>

On Sep 1, 2011, at 10:54 AM, Thomas Chesney wrote:
> I have a text file full of numbers (it's a edgelist for a graph) and
> I would like to recode the numbers as they are way too big to work
> with. So for instance the following:
>
> 676529098667 1000198767829
> 676529098667 100867672856227
> 676529098667 91098726278
> 676529098667 98928373
> 1092837363526 716172829
>
> would become:
>
> 0 1
> 0 2
> 0 3
> 0 4
> 5 6
>
> i.e. all 676529098667 would become 0, all 1000198767829 would become
> 1 etc.

Depending on how that set of numbers was entered, see if this is helpful:

1) First entering across first then down.

x <- c(676529098667 , 1000198767829, 676529098667 , 100867672856227,
       676529098667 , 91098726278, 676529098667 , 98928373,
       1092837363526 ,716172829)
as.numeric(factor(x, levels=unique(x)) )
# [1] 1 2 1 3 1 4 1 5 6 7

2) Now entering first down then over.

x2 <- matrix(x, ncol=2, byrow=TRUE)
# Matrices are column first ordered.
as.numeric(factor(x2, levels=unique(c(x2))) ) # need c() to avoid warning.
# [1] 1 1 1 1 2 3 4 5 6 7

> If I read all the values into a matrix, is there a pre-existing
> function that can do the recoding?

You can just subtract one from the factor results. The trick is to use explicit levels determined to match the sort order you want. Otherwise the levels would be sorted first.
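The same idea can be sketched with match(), which keeps the two-column shape and gives the zero-based, row-by-row numbering asked for in the question. Reading the IDs as character strings is an assumption made here to sidestep any doubt about numeric precision; the names edges, ids and recoded are illustrative and not from the thread.

```r
# The example edge list, kept as character so the long IDs are compared exactly.
edges <- matrix(c("676529098667",  "1000198767829",
                  "676529098667",  "100867672856227",
                  "676529098667",  "91098726278",
                  "676529098667",  "98928373",
                  "1092837363526", "716172829"),
                ncol = 2, byrow = TRUE)

# Distinct IDs in order of first appearance, reading the edge list row by row.
ids <- unique(as.vector(t(edges)))

# match() gives the 1-based position of each ID; subtract 1 for zero-based codes.
recoded <- matrix(match(edges, ids) - 1L, ncol = 2)
print(recoded)
```

Keeping ids around makes it possible to map the small codes back to the original identifiers later, for example ids[recoded[, 1] + 1].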
-- David Winsemius, MD West Hartford, CT

From jholtman at gmail.com Thu Sep 1 18:29:31 2011 From: jholtman at gmail.com (jim holtman) Date: Thu, 1 Sep 2011 12:29:31 -0400 Subject: [R] how to plot a series of data in a dataframe? In-Reply-To: References: Message-ID:

try this:

> x <- read.table('clipboard', header = TRUE)
> x
    V1 V2 V3 V4   V5    V6    V7    V8    V9  V10  V11  V12  V13
1 1001  3 24 12 24.7  44.4  70.1  49.3  33.7  3.0  6.8  2.7   NA
2 1001  3 25  0 70.1  49.3  33.7 138.2 152.5   NA  4.2  6.9 17.5
3 1001  3 25 12 33.7 187.7 286.5 386.7    NA 16.2 46.0 48.8 43.1
4 1001  3 26  0 88.6 129.4    NA    NA    NA 55.5 26.5   NA   NA
5 1001  3 24 12 24.7  24.1  44.3 109.1  96.3  3.0  6.8  9.3 17.2
> boxplot(x[, 5:13])

On Thu, Sep 1, 2011 at 12:10 PM, Jie TANG wrote:
> hi
>
> i have a dataframe with the name "obsdata"
>     V1 V2 V3 V4   V5    V6    V7    V8    V9  V10  V11  V12  V13
> 1 1001  3 24 12 24.7  44.4  70.1  49.3  33.7  3.0  6.8  2.7   NA
> 2 1001  3 25  0 70.1  49.3  33.7 138.2 152.5   NA  4.2  6.9 17.5
> 3 1001  3 25 12 33.7 187.7 286.5 386.7    NA 16.2 46.0 48.8 43.1
> 4 1001  3 26  0 88.6 129.4    NA    NA    NA 55.5 26.5   NA   NA
> 5 1001  3 24 12 24.7  24.1  44.3 109.1  96.3  3.0  6.8  9.3 17.2
>
> now i want to boxplot the data from V5 to V11 ,what can i do ?
>
> it seems that boxplot(obsdata$V5) can only plot one group of data ?
>
> --
> TANG Jie
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- Jim Holtman Data Munger Guru What is the problem that you are trying to solve?

From anopheles123 at gmail.com Thu Sep 1 18:29:36 2011 From: anopheles123 at gmail.com (Weidong Gu) Date: Thu, 1 Sep 2011 12:29:36 -0400 Subject: [R] how to plot a series of data in a dataframe?
In-Reply-To: References: Message-ID:

This can be done using bwplot in the lattice package. Also, it is better to organize your data in 'long' format. Look at the functions reshape or melt in the reshape package.

Weidong Gu

On Thu, Sep 1, 2011 at 12:10 PM, Jie TANG wrote:
> hi
>
> i have a dataframe with the name "obsdata"
>     V1 V2 V3 V4   V5    V6    V7    V8    V9  V10  V11  V12  V13
> 1 1001  3 24 12 24.7  44.4  70.1  49.3  33.7  3.0  6.8  2.7   NA
> 2 1001  3 25  0 70.1  49.3  33.7 138.2 152.5   NA  4.2  6.9 17.5
> 3 1001  3 25 12 33.7 187.7 286.5 386.7    NA 16.2 46.0 48.8 43.1
> 4 1001  3 26  0 88.6 129.4    NA    NA    NA 55.5 26.5   NA   NA
> 5 1001  3 24 12 24.7  24.1  44.3 109.1  96.3  3.0  6.8  9.3 17.2
>
> now i want to boxplot the data from V5 to V11 ,what can i do ?
>
> it seems that boxplot(obsdata$V5) can only plot one group of data ?
>
> --
> TANG Jie
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

From ripley at stats.ox.ac.uk Thu Sep 1 18:36:13 2011 From: ripley at stats.ox.ac.uk (Prof Brian Ripley) Date: Thu, 1 Sep 2011 17:36:13 +0100 (BST) Subject: [R] readBin fails to read large files In-Reply-To: <1D34FD4B-D539-4758-A6B1-EF8295970485@imperial.ac.uk> References: <8CCE7727-9F99-4033-9611-C10B62489077@ic.ac.uk> <1D34FD4B-D539-4758-A6B1-EF8295970485@imperial.ac.uk> Message-ID:

readBin is intended to read a few items at a time, not 10^9. You are probably getting 32-bit integer overflow inside your OS, since the number of bytes you are trying to read in one go exceeds 2GB. Don't do that: read, say, a million at a time. And BTW, if these really are unsigned ints you will get wraparound.
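A hedged sketch of the chunked approach described above. The helper name read_uint32 and the chunk size are hypothetical. Because readBin only honours signed = FALSE for integer sizes 1 and 2, this version reads each 32-bit value as two unsigned 16-bit halves and combines them as doubles, which is one way (not necessarily the poster's eventual solution) to avoid the wraparound mentioned above. It assumes a little-endian file whose length is a multiple of 4 bytes.

```r
# Hypothetical helper: read little-endian unsigned 32-bit values in chunks.
read_uint32 <- function(path, chunk = 1e6) {
  con <- file(path, "rb")
  on.exit(close(con))
  pieces <- list()
  repeat {
    # Two unsigned 16-bit halves per value; signed = FALSE is honoured for size 2.
    h <- readBin(con, what = integer(), n = 2 * chunk, size = 2L,
                 signed = FALSE, endian = "little")
    if (length(h) == 0L) break
    lo <- h[c(TRUE, FALSE)]   # low half comes first in a little-endian file
    hi <- h[c(FALSE, TRUE)]
    # Doubles represent every value from 0 to 2^32 - 1 exactly.
    pieces[[length(pieces) + 1L]] <- lo + hi * 65536
  }
  unlist(pieces)
}

# Tiny demonstration file holding 1, 2^32 - 1 and 2^31.
tmp <- tempfile()
writeBin(as.raw(c(0x01, 0x00, 0x00, 0x00,
                  0xff, 0xff, 0xff, 0xff,
                  0x00, 0x00, 0x00, 0x80)), tmp)
vals <- read_uint32(tmp, chunk = 2)
print(vals)
unlink(tmp)
```

For a file with around 10^9 values the combined result would need roughly 8 GB as doubles, so in practice it may be preferable to summarise or store each chunk inside the loop rather than collect them all.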
On Thu, 1 Sep 2011, Benton, Paul wrote: > Posting for a friend > > Begin forwarded message: > > From: "Geier, Florian" > > Subject: Fwd: readBin fails to read large files > Date: September 1, 2011 4:10:53 PM GMT+01:00 > To: > > > > Begin forwarded message: > > Date: 1 September 2011 16:01:45 GMT+01:00 > Subject: readBin fails to read large files > > Dear all, > > I am trying to read a large file (~2GB) of unsigned ints into R. Using the command: > > raw<-readBin("file",n=10^8, integer(),endian="little",signed=FALSE) > > It works fine for n=10^8, but fails for n=10^9 (or even at n=6*10^8). My machine$sizeof.long is 8 bit. > I am running R 2.13.1 on a x86_64-apple-darwin9.8.0/x86_64 (64-bit) architecture. > > Thanks for your help > > Florian > > -- > AXA doctoral fellow > Bundy lab - Biomolecular Medicine > Imperial College London > > > > > > -- > AXA doctoral fellow > Bundy lab - Biomolecular Medicine > Imperial College London > > > > > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 From xkziloj at gmail.com Thu Sep 1 18:37:51 2011 From: xkziloj at gmail.com (. .) Date: Thu, 1 Sep 2011 13:37:51 -0300 Subject: [R] Alternatives to integrate? In-Reply-To: References: <1314885193156-3783645.post@n4.nabble.com> Message-ID: So, please excuse me Michael, you are completely sure. I will try describe I am trying to do, please let me know if I can provide more info. 
The idea is to provide "func" with two probability density functions (PDFs) and obtain another PDF that is a compound of them. In the final analysis this characterizes an abundance distribution for me. The two PDFs are provided through "f" and "g", and there is some manipulation here because I need the flexibility to easily change these two functions. In the code provided, "f" is the Exponential distribution and "g" is the Poisson distribution. For this case I have the analytical solution, below, so I can check the result. But I am also considering other combinations of "f" and "g" that have a difficult analytical solution, or none at all. This is the reason why I am trying to develop "func".

func2 <- function(y, frac, rate, trunc=0, log=FALSE) {
  is.wholenumber <- function(x, tol = .Machine$double.eps^0.5) abs(x - round(x)) < tol
  if(FALSE %in% sapply(y,is.wholenumber)) print("y must be integer because dpoix is a discrete PDF.")
  else {
    f <- function(y){
      b <- y*log(frac)
      m <- log(rate)
      n <- (y+1)*log(rate+frac)
      if(log)b+m-n else exp(b+m-n)
    }
    f(y)/(1-f(trunc))
  }
}

> func2(200,0.05,0.001)
[1] 0.000381062

In theory, the interval of integration is 0 to Inf, but in some tests I did, going up to 2000 may still provide reasonable results.

Also, as it seems, I am still writing my first functions in R and suggestions are welcome, please.

Again, apologies for my previous mistake. It was not my intention to blame "integrate".

On Thu, Sep 1, 2011 at 11:49 AM, R. Michael Weylandt wrote:
> I'm going to try to put this nicely:
>
> What you provided is not a problem with integrate. Instead, you provided a
> rather unintelligible and badly-written piece of code that (miraculously)
> seems to work, though it's not well documented so I have no idea if 1.3e-21
> is what you want to get.
>
> Let's try this again: per your original request, what is the problem with
> integrate?
> > If instead you feel there's something wrong with your code, might I suggest > you just say that and ask for help, rather than passing the blame onto a > perfectly useful base function. > > Oh, and since you asked that I propose something: comment your code. > > Michael > > On Thu, Sep 1, 2011 at 10:33 AM, . . wrote: >> >> Hi Michael, >> >> This is the problem: >> >> func <- Vectorize(function(x, a, sad, samp="pois", trunc=0, ...) { >> ?result <- function(x) { >> ? ?f1 <- function(n) { >> ? ? ? ? ? ? ? ? ? ? ? ?f <- function() { >> ? ? ? ?dcom <- paste("d", sad, sep="") >> ? ? ? ?dots <- c(as.name("n"), list(...)) >> ? ? ? ?do.call(dcom, dots) >> ? ? ? ? ? ? ? ? ? ? ? ?} >> ? ? ?g <- function() { >> ? ? ? ?dcom <- paste("d", samp, sep="") >> ? ? ? ?lambda <- a * n >> ? ? ? ?dots <- c(as.name("x"), as.name("lambda")) >> ? ? ? ?do.call(dcom, dots) >> ? ? ?} >> ? ? ?f() * g() >> ? ?} >> ? ?integrate(f1,0,2000)$value >> # ? ? adaptIntegrate(f1,0,2000)$integral >> >> # ? ? n <- 0:2000 >> # ? ? trapz(n,f1(n)) >> >> # ? ? area(f1, 0, 2000, limit=10000, eps=1e-100) >> ?} >> ?return(result(x) / (1 - result(trunc))) >> }, "x") >> func(200, 0.05, "exp", rate=0.001) >> >> If you could propose something I will be gratefull. >> >> Thanks in advance. >> >> On Thu, Sep 1, 2011 at 10:55 AM, R. Michael Weylandt >> wrote: >> > Mr ". .", >> > >> > MASS::area comes to mind but it may be more helpful if you could say >> > what >> > you are looking for / why integrate is not appropriate it is for >> > whatever >> > you are doing. >> > >> > Strictly speaking, I suppose there are all sorts of "alternatives" to >> > integrate() if you are willing to be really creative and build something >> > from scratch: diff(), cumsum(), lm(), hist(), t(), c(), .... >> > >> > Michael Weylandt >> > >> > On Thu, Sep 1, 2011 at 9:53 AM, B77S wrote: >> >> >> >> package "caTools" >> >> see ?trapz >> >> >> >> >> >> . 
wrote: >> >> > >> >> > Hi all, >> >> > >> >> > is there any alternative to the function integrate? >> >> > >> >> > Any comments are welcome. >> >> > >> >> > Thanks in advance. >> >> > >> >> > ______________________________________________ >> >> > R-help at r-project.org mailing list >> >> > https://stat.ethz.ch/mailman/listinfo/r-help >> >> > PLEASE do read the posting guide >> >> > http://www.R-project.org/posting-guide.html >> >> > and provide commented, minimal, self-contained, reproducible code. >> >> > >> >> >> >> -- >> >> View this message in context: >> >> >> >> http://r.789695.n4.nabble.com/Alternatives-to-integrate-tp3783624p3783645.html >> >> Sent from the R help mailing list archive at Nabble.com. >> >> >> >> ______________________________________________ >> >> R-help at r-project.org mailing list >> >> https://stat.ethz.ch/mailman/listinfo/r-help >> >> PLEASE do read the posting guide >> >> http://www.R-project.org/posting-guide.html >> >> and provide commented, minimal, self-contained, reproducible code. >> > >> > > > From ripley at stats.ox.ac.uk Thu Sep 1 18:41:49 2011 From: ripley at stats.ox.ac.uk (Prof Brian Ripley) Date: Thu, 1 Sep 2011 17:41:49 +0100 (BST) Subject: [R] readBin fails to read large files In-Reply-To: References: <8CCE7727-9F99-4033-9611-C10B62489077@ic.ac.uk> <1D34FD4B-D539-4758-A6B1-EF8295970485@imperial.ac.uk> Message-ID: On Thu, 1 Sep 2011, Prof Brian Ripley wrote: > readBin is intended to read a few items at a time, not 10^9. You are > probably getting 32-bit integer overflow inside your OS, since the number of > bytes you are trying to read in one go exceeds 2GB. > > Don't do that: read say a million at time. > > And BTW, if these really are unsigned ints you will get wraparound. It seems someone did not read the help: signed: logical. Only used for integers of sizes 1 and 2, when it determines if the quantity on file should be regarded as a signed or unsigned integer. 
and you are using size = 4 implicitly this will be ignored. > > On Thu, 1 Sep 2011, Benton, Paul wrote: > >> Posting for a friend >> >> Begin forwarded message: >> >> From: "Geier, Florian" >> > >> Subject: Fwd: readBin fails to read large files >> Date: September 1, 2011 4:10:53 PM GMT+01:00 >> To: >> >> >> >> Begin forwarded message: >> >> Date: 1 September 2011 16:01:45 GMT+01:00 >> Subject: readBin fails to read large files >> >> Dear all, >> >> I am trying to read a large file (~2GB) of unsigned ints into R. Using the >> command: >> >> raw<-readBin("file",n=10^8, integer(),endian="little",signed=FALSE) >> >> It works fine for n=10^8, but fails for n=10^9 (or even at n=6*10^8). My >> machine$sizeof.long is 8 bit. >> I am running R 2.13.1 on a x86_64-apple-darwin9.8.0/x86_64 (64-bit) >> architecture. >> >> Thanks for your help >> >> Florian >> >> -- >> AXA doctoral fellow >> Bundy lab - Biomolecular Medicine >> Imperial College London >> >> >> >> >> >> -- >> AXA doctoral fellow >> Bundy lab - Biomolecular Medicine >> Imperial College London >> >> >> >> >> >> >> [[alternative HTML version deleted]] >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > -- > Brian D. Ripley, ripley at stats.ox.ac.uk > Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ > University of Oxford, Tel: +44 1865 272861 (self) > 1 South Parks Road, +44 1865 272866 (PA) > Oxford OX1 3TG, UK Fax: +44 1865 272595 > -- Brian D. 
Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 From wdunlap at tibco.com Thu Sep 1 18:51:03 2011 From: wdunlap at tibco.com (William Dunlap) Date: Thu, 1 Sep 2011 16:51:03 +0000 Subject: [R] Automatic Recoding In-Reply-To: <9ADF565B-97A5-45A6-8298-093118EA8590@comcast.net> References: <5EAA21940C65214F9C11DA5FBBC14F0B2D13C4B100@EXCHANGE2.ad.nottingham.ac.uk> <9ADF565B-97A5-45A6-8298-093118EA8590@comcast.net> Message-ID: You could also use match() directly instead of going through factors. Any of the following would map your inputs to small integers > match(x, x)-1 [1] 0 1 0 3 0 5 0 7 8 9 > match(x, unique(x))-1 [1] 0 1 0 2 0 3 0 4 5 6 > match(x, sort(unique(x)))-1 [1] 3 4 3 6 3 2 3 0 5 1 Your numbers are pretty big, c. 2^46. If you get bigger than 2^53 you won't always be able to distinguish between adjacent numbers > (2^53 + 5) == (2^53 + 4) [1] TRUE so you may want to input them as character strings. Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of David Winsemius > Sent: Thursday, September 01, 2011 9:24 AM > To: Thomas Chesney > Cc: r-help at r-project.org > Subject: Re: [R] Automatic Recoding > > > On Sep 1, 2011, at 10:54 AM, Thomas Chesney wrote: > > > I have a text file full of numbers (it's a edgelist for a graph) and > > I would like to recode the numbers as they are way too big to work > > with. So for instance the following: > > > > 676529098667 1000198767829 > > 676529098667 100867672856227 > > 676529098667 91098726278 > > 676529098667 98928373 > > 1092837363526 716172829 > > > > would become: > > > > 0 1 > > 0 2 > > 0 3 > > 0 4 > > 5 6 > > > > i.e. all 676529098667 would become 0, all 1000198767829 would become > > 1 etc. 
> > Depending on how that set of numbers was entered see if this is helpful: > > 1) First entering across first then down. > > x <- c(676529098667 , 1000198767829, > 676529098667 , 100867672856227, > 676529098667 , 91098726278, > 676529098667 , 98928373, > 1092837363526 ,716172829) > as.numeric(factor(x, levels=unique(x)) ) > # [1] 1 2 1 3 1 4 1 5 6 7 > > 2( Now entering first down then over. > > x2 <- matrix(x, ncol=2, byrow=TRUE) # Matrices are column first > ordered. > > as.numeric(factor(x2, levels=unique(c(x2))) ) # need c() to avoid > warning. > # [1] 1 1 1 1 2 3 4 5 6 7 > > > If I read all the values into a matrix, is there a pre-existing > > function that can do the recoding? > > You can just subtract one from the factor results. The trick is to use > explicit levels determined to match the sort order you want. Other > wise the levels would be first collated. > > -- > David Winsemius, MD > West Hartford, CT > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From jroll at lcog.org Thu Sep 1 19:11:28 2011 From: jroll at lcog.org (LCOG1) Date: Thu, 1 Sep 2011 10:11:28 -0700 (PDT) Subject: [R] Oh apply functions, how you confuse me Message-ID: <1314897088411-3784212.post@n4.nabble.com> Hi guys, I have a crap load of data to parse and have enjoyed creating a script that takes this data and creates a number of useful graphics for our area. I am unable to figure out one summary though and its all cause I dont fully understand the apply family of functions. 
Consider the following: #Create data Df..<-rbind(data.frame(Id=1:1008,Dir=rep(c("NB","NB","SB","SB"),252),Mph=runif(1008,0,65), Volume=runif(1008,0,19),Hour=rep(00,1008),Min5Break=rep(1:12,84),Day=rep(1,1008)), data.frame(Id=2009:2016,Dir=rep(c("NB","NB","SB","SB"),252),Mph=runif(1008,0,65), Volume=runif(1008,0,19),Hour=rep(01,1008),Min5Break=rep(1:12,84),Day=rep(2,1008))) #Example calc Results_<-list() #Sum Volume by 5 minute break by Day by Direction Results_$FiveMin.Direction<-tapply(Df..$Volume,list(Df..$Min5Break,Df..$Day,Df..$Hour,Df..$Dir),sum) The data is a snap shot of what im working with and I am trying to get to something similar to the last line where the volumes are summed. What i want to do is to do a weighted average for the speed by 5 minute break. So for all the speeds and volumes in a given hour of 5 minute break(12 per hour), i would want to sum(Volumes[1:12]*Speed[1:12]) / sum(Volumes[1:12] The output resembling the one from the above but having these weighted values. I am assuming the sum function in the above would be replaced by a function doing the calculation but I am still not sure how to do this using apply functions, so perhaps this isnt the best option. Hope this is clear and hope you guys(and of course ladies) can offer some guidance. Cheers, Josh -- View this message in context: http://r.789695.n4.nabble.com/Oh-apply-functions-how-you-confuse-me-tp3784212p3784212.html Sent from the R help mailing list archive at Nabble.com. From totangjie at gmail.com Thu Sep 1 19:31:23 2011 From: totangjie at gmail.com (Jie TANG) Date: Fri, 2 Sep 2011 01:31:23 +0800 Subject: [R] two question about plot Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available
URL:

From matt at biostatmatt.com Thu Sep 1 19:36:25 2011
From: matt at biostatmatt.com (Matt Shotwell)
Date: Thu, 01 Sep 2011 12:36:25 -0500
Subject: [R] readBin fails to read large files
In-Reply-To: References: <8CCE7727-9F99-4033-9611-C10B62489077@ic.ac.uk> <1D34FD4B-D539-4758-A6B1-EF8295970485@imperial.ac.uk>
Message-ID: <1314898585.4392.21.camel@pal>

On Thu, 2011-09-01 at 17:36 +0100, Prof Brian Ripley wrote:
> readBin is intended to read a few items at a time, not 10^9. You are
> probably getting 32-bit integer overflow inside your OS, since the
> number of bytes you are trying to read in one go exceeds 2GB.
>
> Don't do that: read say a million at a time.
>
> And BTW, if these really are unsigned ints you will get wraparound.

To elaborate, ?readBin states that the 'signed' argument is only used for integers of size 1 and 2 bytes. These are ultimately converted to signed 4 byte integers, because that's how R stores integers. To be exact, if your file contains integers larger than 2^31-1 = 2147483647, wraparound would occur. In actuality, R returns NA for those values.
I'm bringing this up because R normally issues a warning: R> 2147483647L + 1L [1] NA Warning message: In 2147483647L + 1L : NAs produced by integer overflow But, a similar warning isn't issued by readBin when NA results from signed integer overflow: #The raw vector below represents 2147483647L and 2147483647L + 1L #in little endian, unsigned, 4 byte integers R> dat <- as.raw(c(0xff,0xff,0xff,0x7f,0x00,0x00,0x00,0x80)) R> writeBin(dat, 'test.bin') R> readBin('test.bin', n=2, integer(), signed=FALSE) [1] 2147483647 NA > On Thu, 1 Sep 2011, Benton, Paul wrote: > > > Posting for a friend > > > > Begin forwarded message: > > > > From: "Geier, Florian" > > > Subject: Fwd: readBin fails to read large files > > Date: September 1, 2011 4:10:53 PM GMT+01:00 > > To: > > > > > > > > Begin forwarded message: > > > > Date: 1 September 2011 16:01:45 GMT+01:00 > > Subject: readBin fails to read large files > > > > Dear all, > > > > I am trying to read a large file (~2GB) of unsigned ints into R. Using the command: > > > > raw<-readBin("file",n=10^8, integer(),endian="little",signed=FALSE) > > > > It works fine for n=10^8, but fails for n=10^9 (or even at n=6*10^8). My machine$sizeof.long is 8 bit. > > I am running R 2.13.1 on a x86_64-apple-darwin9.8.0/x86_64 (64-bit) architecture. > > > > Thanks for your help > > > > Florian > > > > -- > > AXA doctoral fellow > > Bundy lab - Biomolecular Medicine > > Imperial College London > > > > > > > > > > > > -- > > AXA doctoral fellow > > Bundy lab - Biomolecular Medicine > > Imperial College London > > > > > > > > > > > > > > [[alternative HTML version deleted]] > > > > ______________________________________________ > > R-help at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. 
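
One possible way around the NA shown above is to read each 4-byte value as two unsigned 2-byte halves (for which signed=FALSE is honoured) and combine them as doubles. This is only a sketch under the assumption of little-endian data; the helper name read_uint32 is made up for illustration.

```r
# Hypothetical helper: recover unsigned 32-bit values as doubles
read_uint32 <- function(con, n) {
  halves <- readBin(con, what = integer(), n = 2 * n, size = 2,
                    signed = FALSE, endian = "little")
  lo <- halves[c(TRUE, FALSE)]   # low 16 bits come first in little endian
  hi <- halves[c(FALSE, TRUE)]   # high 16 bits follow
  hi * 65536 + lo                # double arithmetic, exact for these sizes
}

# Same bytes as in the example above: 2147483647 and 2147483647 + 1
dat <- as.raw(c(0xff, 0xff, 0xff, 0x7f, 0x00, 0x00, 0x00, 0x80))
print(read_uint32(dat, n = 2), digits = 10)
```

The result is a numeric (double) vector rather than an integer one, which is what allows values above 2^31-1 to be represented.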
> > > From pburns at pburns.seanet.com Thu Sep 1 19:37:12 2011 From: pburns at pburns.seanet.com (Patrick Burns) Date: Thu, 01 Sep 2011 18:37:12 +0100 Subject: [R] Oh apply functions, how you confuse me In-Reply-To: <1314897088411-3784212.post@n4.nabble.com> References: <1314897088411-3784212.post@n4.nabble.com> Message-ID: <4E5FC2C8.8010100@pburns.seanet.com> I would suggest using a 'for' loop rather than an apply function. The advantage is that you will probably understand the loop that you write, and it will run in roughly the same amount of time as a complicated call to an apply function that you don't understand. On 01/09/2011 18:11, LCOG1 wrote: > Hi guys, > I have a crap load of data to parse and have enjoyed creating a script that > takes this data and creates a number of useful graphics for our area. I am > unable to figure out one summary though and its all cause I dont fully > understand the apply family of functions. Consider the following: > > > > #Create data > Df..<-rbind(data.frame(Id=1:1008,Dir=rep(c("NB","NB","SB","SB"),252),Mph=runif(1008,0,65), > Volume=runif(1008,0,19),Hour=rep(00,1008),Min5Break=rep(1:12,84),Day=rep(1,1008)), > data.frame(Id=2009:2016,Dir=rep(c("NB","NB","SB","SB"),252),Mph=runif(1008,0,65), > Volume=runif(1008,0,19),Hour=rep(01,1008),Min5Break=rep(1:12,84),Day=rep(2,1008))) > > #Example calc > Results_<-list() > > #Sum Volume by 5 minute break by Day by Direction > Results_$FiveMin.Direction<-tapply(Df..$Volume,list(Df..$Min5Break,Df..$Day,Df..$Hour,Df..$Dir),sum) > > The data is a snap shot of what im working with and I am trying to get to > something similar to the last line where the volumes are summed. What i > want to do is to do a weighted average for the speed by 5 minute break. So > for all the speeds and volumes in a given hour of 5 minute break(12 per > hour), i would want to > > sum(Volumes[1:12]*Speed[1:12]) / sum(Volumes[1:12] > > The output resembling the one from the above but having these weighted > values. 
I am assuming the sum function in the above would be replaced by a > function doing the calculation but I am still not sure how to do this using > apply functions, so perhaps this isnt the best option. > > Hope this is clear and hope you guys(and of course ladies) can offer some > guidance. > > Cheers, > Josh > > > > > -- > View this message in context: http://r.789695.n4.nabble.com/Oh-apply-functions-how-you-confuse-me-tp3784212p3784212.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Patrick Burns pburns at pburns.seanet.com twitter: @portfolioprobe http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of 'Some hints for the R beginner' and 'The R Inferno') From dwinsemius at comcast.net Thu Sep 1 19:42:17 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Thu, 1 Sep 2011 13:42:17 -0400 Subject: [R] iSeries and R In-Reply-To: References: Message-ID: <3D8DB829-D14E-4F93-AB90-C484DDA5E9AA@comcast.net> On Sep 1, 2011, at 10:59 AM, Li, Yan wrote: > Hi All, > > Does anyone has experiences installing R in iSeries? I was unable to find any postings to r-help that mentioned that hardware line in particular. But Linux and (IBM's) AIX are capable of supporting R and both those OSes run on iSeries boxes. http://www-03.ibm.com/systems/i/os/linux/ > Does R supports iSeries? Any documentation on this topic? Thank you > very much! The R FAQ? http://cran.r-project.org/doc/FAQ/R-FAQ.html#What-machines-does-R-run-on_003f There is also an AIX section in the R Installation and Administration Guide. 
And there is an R-forge AIX page: http://r-forge.r-project.org/projects/aix/ -- David Winsemius, MD West Hartford, CT From michael.weylandt at gmail.com Thu Sep 1 19:44:33 2011 From: michael.weylandt at gmail.com (R. Michael Weylandt) Date: Thu, 1 Sep 2011 13:44:33 -0400 Subject: [R] Alternatives to integrate? In-Reply-To: References: <1314885193156-3783645.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From changbind at gmail.com Thu Sep 1 19:53:13 2011 From: changbind at gmail.com (Changbin Du) Date: Thu, 1 Sep 2011 10:53:13 -0700 Subject: [R] how to split a data frame by two variables Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From landronimirc at gmail.com Thu Sep 1 19:55:53 2011 From: landronimirc at gmail.com (Liviu Andronic) Date: Thu, 1 Sep 2011 19:55:53 +0200 Subject: [R] Treat an Unquoted Character String as a Data Frame In-Reply-To: References: Message-ID: On Thu, Sep 1, 2011 at 5:48 PM, Jean V Adams wrote: > Try this > > data <- eval(parse(text=paste(study, level, ".", population, sep=""))) > ..or this: data <- get(paste(study, level, ".", population, sep="")) Liviu > Jean > > ----- > > dbateman wrote on 08/31/2011 17:44:44: > > I have several datasets that come from different studies (fv02 and fv03), > they represent different levels (patients and lesions), and they have > different patient populations (itt, mitt, mitt3). ?I wanted to write some > code that would pass my three requirements into a function I wrote, > produce the output, but not have to require me to also pass a unique plot > title or output filename for each function call. ?The title and file name > would be created in the function according to the three input parameters. > > My datasets are named like this (all six are repeated for "fv03" in place > of "fv02"): > ? fv02patients.itt > ? fv02patients.mitt > ? fv02patients.mitt3 > ? fv02lesions.itt > ? fv02lesions.mitt > ? 
fv02lesions.mitt3
>
> Taking the first dataset as an example, I currently have this code:
>   study="fv02"
>   level="patients"
>   population="itt"
>   noquote(paste(study,level,".",population,sep=""))
>
> This produces the desired fv02patients.itt, but it is just an unquoted
> character string rather than a data frame.
>
> Here is a condensed look at my function:
>
> waterfall=function(data,title,file)
> {
>   barplot(height=data$brpercent,main=paste("Best Change in Tumor Volume\n",title))
>   savePlot(filename=file,type="pdf")
> }
> waterfall(data=fv02patients.itt,title="EC-FV-02 ITT Patients",file="fv02_itt_patients_waterfall")
>
>
> It may be easier to leave it the way I have it, but if this is possible, I
> would still be interested in knowing.
>
> [[alternative HTML version deleted]]
>

--
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail

From jholtman at gmail.com Thu Sep 1 19:59:18 2011
From: jholtman at gmail.com (jim holtman)
Date: Thu, 1 Sep 2011 13:59:18 -0400
Subject: [R] how to split a data frame by two variables
In-Reply-To: References: Message-ID:

try this:

> split(x, list(x$let, x$g))
$a.1
   num let g
1   10   a 1
11  21   a 1

$b.1
   num let g
7   52   b 1
17  56   b 1

$c.1
   num let g
3   12   c 1
13  32   c 1

$d.1
   num let g
9   12   d 1
19  76   d 1

$e.1
   num let g
5   23   e 1
15  24   e 1

On Thu, Sep 1, 2011 at 1:53 PM, Changbin Du wrote:
> HI, Dear R community,
>
> I want to split a data frame by using two variables: let and g
>
>> x = data.frame(num =
> c(10,11,12,43,23,14,52,52,12,23,21,23,32,31,24,45,56,56,76,45), let =
> letters[1:5], g = 1:2)
>> x
>    num let g
> 1   10   a 1
> 2   11   b 2
> 3   12   c 1
> 4   43   d 2
> 5   23   e 1
> 6   14   a 2
> 7   52   b 1
> 8   52   c 2
> 9   12   d 1
> 10  23   e 2
> 11  21   a 1
> 12  23   b 2
> 13  32   c 1
> 14  31   d 2
> 15  24   e 1
> 16  45   a 2
> 17  56   b 1
> 18  56   c 2
> 19  76   d 1
> 20  45   e 2
>
> I tried the following:
>
> xs = split(x,x$g*x$let)
>
> *Warning message:
> In Ops.factor(x$g, x$let) : * not meaningful for factors*
>
>
> xs = split(x,c(x$g*x$let))
>
> *Warning message:
> In Ops.factor(x$g, x$let) : * not meaningful for factors
> *
>
> Can someone give some hints?
>
> Thanks!
>
>
> --
> Sincerely,
> Changbin
> --
>
> [[alternative HTML version deleted]]
>

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
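
As a runnable sketch of the split() call above, using the data frame from the question: passing a list of grouping vectors crosses them, and the pieces are named "let.g". The drop = TRUE argument is an addition here, not part of the original reply; it makes no difference for this data but would discard combinations that never occur.

```r
x <- data.frame(num = c(10, 11, 12, 43, 23, 14, 52, 52, 12, 23,
                        21, 23, 32, 31, 24, 45, 56, 56, 76, 45),
                let = letters[1:5], g = 1:2)

# One data frame per observed (let, g) combination
xs <- split(x, list(x$let, x$g), drop = TRUE)

names(xs)          # "a.1" "b.1" ... "e.2"
xs[["a.1"]]        # rows 1 and 11 of x
sapply(xs, nrow)   # every combination holds two rows here
```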
From landronimirc at gmail.com Thu Sep 1 20:00:07 2011 From: landronimirc at gmail.com (Liviu Andronic) Date: Thu, 1 Sep 2011 20:00:07 +0200 Subject: [R] how to split a data frame by two variables In-Reply-To: References: Message-ID: On Thu, Sep 1, 2011 at 7:53 PM, Changbin Du wrote: > HI, Dear R community, > > I want to split a data frame by using two variables: let and g > It's not clear what you want to do, but investigate the following: > require(plyr) Loading required package: plyr > ddply(x, .(let, g), function(y) mean(y$g)) let g V1 1 a 1 1 2 a 2 2 3 b 1 1 4 b 2 2 5 c 1 1 6 c 2 2 7 d 1 1 8 d 2 2 9 e 1 1 10 e 2 2 This splits the df in groups of unique 'let' and 'g', and computes the mean for each such group. Regards Liviu From dwinsemius at comcast.net Thu Sep 1 20:01:21 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Thu, 1 Sep 2011 14:01:21 -0400 Subject: [R] how to split a data frame by two variables In-Reply-To: References: Message-ID: <02D94A4D-0C2C-4D74-A149-A53800FAC193@comcast.net> On Sep 1, 2011, at 1:53 PM, Changbin Du wrote: > HI, Dear R community, > > I want to split a data frame by using two variables: let and g > >> x = data.frame(num = > c(10,11,12,43,23,14,52,52,12,23,21,23,32,31,24,45,56,56,76,45), let = > letters[1:5], g = 1:2) >> x > num let g > 1 10 a 1 > 2 11 b 2 > 3 12 c 1 > 4 43 d 2 > 5 23 e 1 > 6 14 a 2 > 7 52 b 1 > 8 52 c 2 > 9 12 d 1 > 10 23 e 2 > 11 21 a 1 > 12 23 b 2 > 13 32 c 1 > 14 31 d 2 > 15 24 e 1 > 16 45 a 2 > 17 56 b 1 > 18 56 c 2 > 19 76 d 1 > 20 45 e 2 > > I tried the following: > > xs = split(x,x$g*x$let) Probably xs = split(x,list(x$g,x$let)) > > *Warning message: > In Ops.factor(x$g, x$let) : * not meaningful for factors* > > > xs = split(x,c(x$g*x$let)) > > *Warning message: > In Ops.factor(x$g, x$let) : * not meaningful for factors > * > > Can someone give some hints? > > Thanks! 
> > > -- > Sincerely, > Changbin > -- > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT From sarah.goslee at gmail.com Thu Sep 1 20:04:50 2011 From: sarah.goslee at gmail.com (Sarah Goslee) Date: Thu, 1 Sep 2011 14:04:50 -0400 Subject: [R] two question about plot In-Reply-To: References: Message-ID: The help for boxplot offers suggestions for both those things. You may be particularly interested in: names: group labels which will be printed under each boxplot. Can be a character vector or an expression (see plotmath). add: logical, if true _add_ boxplot to current plot. Sarah On Thu, Sep 1, 2011 at 1:31 PM, Jie TANG wrote: > 1) how to modify the the tickment of x-axis or y-axis. > ? boxplot(data[,1:5]) > ?the tickment in x-axis in V1 V2 V3 V4 V5 ,I want to be some name for > example > ?name<-c("1day","2day","3day","4day","5day") > > 2) how to overlap two plot into one figure? > ?plot(data[1:5]) > ?boxplot(newdata[,1:5]) > ? > > -- > TANG Jie > -- Sarah Goslee http://www.functionaldiversity.org From changbind at gmail.com Thu Sep 1 20:08:22 2011 From: changbind at gmail.com (Changbin Du) Date: Thu, 1 Sep 2011 11:08:22 -0700 Subject: [R] how to split a data frame by two variables In-Reply-To: <02D94A4D-0C2C-4D74-A149-A53800FAC193@comcast.net> References: <02D94A4D-0C2C-4D74-A149-A53800FAC193@comcast.net> Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From jholtman at gmail.com Thu Sep 1 20:19:54 2011 From: jholtman at gmail.com (jim holtman) Date: Thu, 1 Sep 2011 14:19:54 -0400 Subject: [R] Oh apply functions, how you confuse me In-Reply-To: <1314897088411-3784212.post@n4.nabble.com> References: <1314897088411-3784212.post@n4.nabble.com> Message-ID: Is this close to what you are asking for: > require(data.table) > Dt.. <- data.table(Df..) > R <- Dt..[, + list( + sum = sum(Volume) + , weight = sum(Volume * Mph) / sum(Volume) + ) + , by = list(Min5Break, Day, Hour, Dir) + ] > R Min5Break Day Hour Dir sum weight 1 1 0 NB 730.8880 32.60224 2 1 0 NB 766.4083 35.88443 3 1 0 SB 776.7592 32.66822 4 1 0 SB 768.0923 33.55988 5 1 0 NB 767.5472 36.00546 6 1 0 NB 767.6600 30.38747 7 1 0 SB 814.9662 31.88483 8 1 0 SB 795.4855 30.91495 9 1 0 NB 828.4439 31.57477 10 1 0 NB 797.7522 29.49832 11 1 0 SB 826.5165 32.74487 12 1 0 SB 824.0942 36.28309 1 2 1 NB 830.0683 29.59320 2 2 1 NB 838.8179 34.59878 3 2 1 SB 877.3518 30.77636 4 2 1 SB 838.9765 30.90577 5 2 1 NB 736.6560 30.54381 6 2 1 NB 772.3622 31.40094 7 2 1 SB 819.2347 29.22674 8 2 1 SB 840.9048 32.59222 9 2 1 NB 818.8383 37.55142 10 2 1 NB 783.8896 32.54565 11 2 1 SB 699.0401 30.76466 12 2 1 SB 773.5594 35.87076 cn Min5Break Day Hour Dir sum weight > On Thu, Sep 1, 2011 at 1:11 PM, LCOG1 wrote: > Hi guys, > I have a crap load of data to parse and have enjoyed creating a script that > takes this data and creates a number of useful graphics for our area. ?I am > unable to figure out one summary though and its all cause I dont fully > understand the apply family of functions. 
?Consider the following: > > > > #Create data > Df..<-rbind(data.frame(Id=1:1008,Dir=rep(c("NB","NB","SB","SB"),252),Mph=runif(1008,0,65), > Volume=runif(1008,0,19),Hour=rep(00,1008),Min5Break=rep(1:12,84),Day=rep(1,1008)), > data.frame(Id=2009:2016,Dir=rep(c("NB","NB","SB","SB"),252),Mph=runif(1008,0,65), > Volume=runif(1008,0,19),Hour=rep(01,1008),Min5Break=rep(1:12,84),Day=rep(2,1008))) > > #Example calc > Results_<-list() > > #Sum Volume by 5 minute break by Day by Direction > Results_$FiveMin.Direction<-tapply(Df..$Volume,list(Df..$Min5Break,Df..$Day,Df..$Hour,Df..$Dir),sum) > > The data is a snap shot of what im working with and I am trying to get to > something similar to the last line where the volumes are summed. ?What i > want to do is to do a weighted average for the speed by 5 minute break. ?So > for all the speeds and volumes in a given hour of 5 minute break(12 per > hour), i would want to > > sum(Volumes[1:12]*Speed[1:12]) / sum(Volumes[1:12] > > The output resembling the one from the above but having these weighted > values. ?I am assuming the sum function in the above would be replaced by a > function doing the calculation but I am still not sure how to do this using > apply functions, so perhaps this isnt the best option. > > Hope this is clear and hope you guys(and of course ladies) can offer some > guidance. > > Cheers, > ?Josh > > > > > -- > View this message in context: http://r.789695.n4.nabble.com/Oh-apply-functions-how-you-confuse-me-tp3784212p3784212.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? 
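
For comparison, the same volume-weighted speed can also be obtained with tapply() alone, in the shape of the original sum. This is only a sketch: the small data frame below is invented (column names follow the post, values and sizes do not), and it groups by Min5Break and Dir only to keep the output small.

```r
set.seed(1)
Df <- data.frame(Dir       = rep(c("NB", "SB"), each = 24),
                 Min5Break = rep(1:12, times = 4),
                 Mph       = runif(48, 0, 65),
                 Volume    = runif(48, 0, 19))

groups <- list(Df$Min5Break, Df$Dir)

# weighted mean = sum(Volume * Mph) / sum(Volume) within each cell
num <- tapply(Df$Volume * Df$Mph, groups, sum)
den <- tapply(Df$Volume, groups, sum)
wtd_speed <- num / den

dim(wtd_speed)     # 12 five-minute breaks by 2 directions
```

Adding Day and Hour to the groups list should give a four-dimensional array like the one produced by the tapply() call in the question.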
From macqueen1 at llnl.gov Thu Sep 1 20:28:08 2011 From: macqueen1 at llnl.gov (MacQueen, Don) Date: Thu, 1 Sep 2011 11:28:08 -0700 Subject: [R] how to split a data frame by two variables In-Reply-To: Message-ID: Even though it's not needed, here's a small followup. I usually use this split(x, paste(x$let,x$g)) But since split(x, list(x$let,x$g)) works, so does split(x, x[,c('let','g')]) > all.equal( split(x, x[,c('let','g')]) , split(x,list(x$let,x$g))) [1] TRUE As to which is the best, hard to say. If the variable names you want to split by are held in character vector, then the third one has an advantage splt.by <- c('let','g') split(x, x[,splt.by] ) If x were large, and the number of columns to split by were large, there might be performance differences, but I suspect they would have to be *very* large before it mattered. -Don -- Don MacQueen Lawrence Livermore National Laboratory 7000 East Ave., L-627 Livermore, CA 94550 925-423-1062 On 9/1/11 11:08 AM, "Changbin Du" wrote: >Thanks for the great helps from David, Jim and Liviu. It solved my >problem. > >Appreciated! 
>
>On Thu, Sep 1, 2011 at 11:01 AM, David Winsemius
>wrote:
>
>>
>> On Sep 1, 2011, at 1:53 PM, Changbin Du wrote:
>>
>> HI, Dear R community,
>>>
>>> I want to split a data frame by using two variables: let and g
>>>
>>> x = data.frame(num =
>>> c(10,11,12,43,23,14,52,52,12,23,21,23,32,31,24,45,56,56,76,45),
>>> let = letters[1:5], g = 1:2)
>>>
>>>> x
>>>
>>>    num let g
>>> 1   10   a 1
>>> 2   11   b 2
>>> 3   12   c 1
>>> 4   43   d 2
>>> 5   23   e 1
>>> 6   14   a 2
>>> 7   52   b 1
>>> 8   52   c 2
>>> 9   12   d 1
>>> 10  23   e 2
>>> 11  21   a 1
>>> 12  23   b 2
>>> 13  32   c 1
>>> 14  31   d 2
>>> 15  24   e 1
>>> 16  45   a 2
>>> 17  56   b 1
>>> 18  56   c 2
>>> 19  76   d 1
>>> 20  45   e 2
>>>
>>> I tried the following:
>>>
>>> xs = split(x,x$g*x$let)
>>>
>>
>> Probably
>>
>> xs = split(x,list(x$g,x$let))
>>
>>>
>>> Warning message:
>>> In Ops.factor(x$g, x$let) : * not meaningful for factors
>>>
>>>
>>> xs = split(x,c(x$g*x$let))
>>>
>>> Warning message:
>>> In Ops.factor(x$g, x$let) : * not meaningful for factors
>>>
>>> Can someone give some hints?
>>>
>>> Thanks!
>>>
>>>
>>> --
>>> Sincerely,
>>> Changbin
>>> --
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>> David Winsemius, MD
>> West Hartford, CT
>>
>>
>
>
>--
>Sincerely,
>Changbin
>--
>
>Changbin Du
>Data Analysis Group, Affymetrix Inc
>6550 Emeryville, CA, 94608
>
> [[alternative HTML version deleted]]
>
>______________________________________________
>R-help at r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.
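
[Editorial sketch, not part of the thread] The approaches discussed in this thread can be tried on the data frame from the original question. This is a minimal sketch; the object names `by_list`, `by_inter`, `sub` and `n_nonempty` are illustrative. It also shows the `drop = TRUE` argument of `split()`, which is not mentioned in the messages but seems relevant when some let/g pairs never occur.

```r
# Data frame from the original question
x <- data.frame(
  num = c(10, 11, 12, 43, 23, 14, 52, 52, 12, 23,
          21, 23, 32, 31, 24, 45, 56, 56, 76, 45),
  let = letters[1:5],
  g   = 1:2
)

# A list of grouping vectors is combined into their interaction
by_list <- split(x, list(x$let, x$g))

# interaction() builds the same combined factor explicitly
by_inter <- split(x, interaction(x$let, x$g))

# Every let/g combination occurs here, so there are 5 * 2 = 10 pieces
length(by_list)

# drop = TRUE removes empty combinations when some pairs never occur;
# here the rows with let == "a" and g == 1 are removed first
sub <- x[x$let != "a" | x$g != 1, ]
n_nonempty <- length(split(sub, list(sub$let, sub$g), drop = TRUE))
```

Multiplying the two columns, as in the original attempt, fails because arithmetic is not defined for factors; passing a list (or a data frame of the grouping columns) avoids the problem.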
From dan.abner99 at gmail.com Thu Sep 1 20:59:00 2011
From: dan.abner99 at gmail.com (Dan Abner)
Date: Thu, 1 Sep 2011 14:59:00 -0400
Subject: [R] Including only a subset of the levels of a factor XXXX
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From changbind at gmail.com Thu Sep 1 21:02:24 2011
From: changbind at gmail.com (Changbin Du)
Date: Thu, 1 Sep 2011 12:02:24 -0700
Subject: [R] how to split a data frame by two variables
In-Reply-To:
References:
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From michael.weylandt at gmail.com Thu Sep 1 21:11:50 2011
From: michael.weylandt at gmail.com (R. Michael Weylandt)
Date: Thu, 1 Sep 2011 15:11:50 -0400
Subject: [R] Including only a subset of the levels of a factor XXXX
In-Reply-To:
References:
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From dwinsemius at comcast.net Thu Sep 1 21:29:05 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Thu, 1 Sep 2011 15:29:05 -0400
Subject: [R] Including only a subset of the levels of a factor XXXX
In-Reply-To:
References:
Message-ID:

On Sep 1, 2011, at 2:59 PM, Dan Abner wrote:

> Hello everyone,
>
> I have the following factor:
>
> levels(pp_income)
> [1] ""       "1"      "2"      "3"      "4"      "5"      "6"      "7"
> [9] "8"      "9"      "Renter"
>
> I want to subset so that only values 1:9 are included. I have the
> following:
>
>> income<-pp_income[pp_income %in% c(1:9)]
>>
>> levels(income)
> [1] ""       "1"      "2"      "3"      "4"      "5"      "6"      "7"
> [9] "8"      "9"      "Renter"
>
> Why is this not working

Actually it could be that it did succeed but you just have levels
attributes that are unpopulated in your result. Try:

table(income)

If that looks correct, then do this:

income <- factor(income) # will drop unused levels

> and can someone please suggest a solution?
If on the other hand you got the wrong values then there was an
undesired coercion of either 'factor' class to 'numeric' or of
'numeric' to 'character'. I am fairly sure this will remove any
ambiguity:

income<-pp_income[as.character(pp_income) %in% as.character(1:9)]

(You would still get the puzzling extra levels if you looked with
levels(income).)

>
> Thank you!
>
> Dan
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT

From JRoll at lcog.org Thu Sep 1 21:55:52 2011
From: JRoll at lcog.org (ROLL Josh F)
Date: Thu, 1 Sep 2011 12:55:52 -0700
Subject: [R] Oh apply functions, how you confuse me
In-Reply-To:
References: <1314897088411-3784212.post@n4.nabble.com>
Message-ID: <506F591A54F21D4480EE337F77369C056B10065D@LCEXG03.lc100.net>

Dang Jim, this looks to do the trick, though I had never heard of a
data.table. Interesting, I will explore more. Thank you very much.

-----Original Message-----
From: jim holtman [mailto:jholtman at gmail.com]
Sent: Thursday, September 01, 2011 11:20 AM
To: ROLL Josh F
Cc: r-help at r-project.org
Subject: Re: [R] Oh apply functions, how you confuse me

Is this close to what you are asking for:

> require(data.table)
> Dt.. <- data.table(Df..)
> R <- Dt..[,
+     list(
+         sum = sum(Volume)
+         , weight = sum(Volume * Mph) / sum(Volume)
+     )
+     , by = list(Min5Break, Day, Hour, Dir)
+ ]
> R
   Min5Break Day Hour Dir      sum   weight
           1   1    0  NB 730.8880 32.60224
           2   1    0  NB 766.4083 35.88443
           3   1    0  SB 776.7592 32.66822
           4   1    0  SB 768.0923 33.55988
           5   1    0  NB 767.5472 36.00546
           6   1    0  NB 767.6600 30.38747
           7   1    0  SB 814.9662 31.88483
           8   1    0  SB 795.4855 30.91495
           9   1    0  NB 828.4439 31.57477
          10   1    0  NB 797.7522 29.49832
          11   1    0  SB 826.5165 32.74487
          12   1    0  SB 824.0942 36.28309
           1   2    1  NB 830.0683 29.59320
           2   2    1  NB 838.8179 34.59878
           3   2    1  SB 877.3518 30.77636
           4   2    1  SB 838.9765 30.90577
           5   2    1  NB 736.6560 30.54381
           6   2    1  NB 772.3622 31.40094
           7   2    1  SB 819.2347 29.22674
           8   2    1  SB 840.9048 32.59222
           9   2    1  NB 818.8383 37.55142
          10   2    1  NB 783.8896 32.54565
          11   2    1  SB 699.0401 30.76466
          12   2    1  SB 773.5594 35.87076
cn Min5Break Day Hour Dir      sum   weight
>

On Thu, Sep 1, 2011 at 1:11 PM, LCOG1 wrote:
> Hi guys,
> I have a crap load of data to parse and have enjoyed creating a script
> that takes this data and creates a number of useful graphics for our
> area. I am unable to figure out one summary though and its all cause
> I dont fully understand the apply family of functions. Consider the following:
>
> #Create data
> Df..<-rbind(data.frame(Id=1:1008,Dir=rep(c("NB","NB","SB","SB"),252),Mph=runif(1008,0,65),
> Volume=runif(1008,0,19),Hour=rep(00,1008),Min5Break=rep(1:12,84),Day=rep(1,1008)),
> data.frame(Id=2009:2016,Dir=rep(c("NB","NB","SB","SB"),252),Mph=runif(1008,0,65),
> Volume=runif(1008,0,19),Hour=rep(01,1008),Min5Break=rep(1:12,84),Day=rep(2,1008)))
>
> #Example calc
> Results_<-list()
>
> #Sum Volume by 5 minute break by Day by Direction
> Results_$FiveMin.Direction<-tapply(Df..$Volume,list(Df..$Min5Break,Df..$Day,Df..$Hour,Df..$Dir),sum)
>
> The data is a snap shot of what im working with and I am trying to get
> to something similar to the last line where the volumes are summed.
> What i want to do is to do a weighted average for the speed by 5
> minute break. So for all the speeds and volumes in a given hour of 5
> minute break(12 per hour), i would want to
>
> sum(Volumes[1:12]*Speed[1:12]) / sum(Volumes[1:12])
>
> The output resembling the one from the above but having these weighted
> values. I am assuming the sum function in the above would be replaced
> by a function doing the calculation but I am still not sure how to do
> this using apply functions, so perhaps this isnt the best option.
>
> Hope this is clear and hope you guys(and of course ladies) can offer
> some guidance.
>
> Cheers,
> Josh
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Oh-apply-functions-how-you-confuse-me-tp3784212p3784212.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?

From Greg.Snow at imail.org Thu Sep 1 22:51:42 2011
From: Greg.Snow at imail.org (Greg Snow)
Date: Thu, 1 Sep 2011 14:51:42 -0600
Subject: [R] get arguments passed to function inside a function
In-Reply-To: <1314890068.48163.YahooMailClassic@web28201.mail.ukl.yahoo.com>
References: <1314890068.48163.YahooMailClassic@web28201.mail.ukl.yahoo.com>
Message-ID:

The sys.call or match.call functions may be what you are looking for:

> tmpfun <- function(x,y,z,...) {
+ as.list( sys.call() )
+ }
>
> tmpfun( x=5, w=3, 1:10 )
[[1]]
tmpfun

$x
[1] 5

$w
[1] 3

[[4]]
1:10

> tmpfun2 <- function(x,y,z,...) {
+ as.list( match.call() )
+ }
> tmpfun2( x=5, w=3, 1:10 )
[[1]]
tmpfun2

$x
[1] 5

$y
1:10

$w
[1] 3

>

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center Intermountain Healthcare greg.snow at imail.org 801.408.8111 > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r- > project.org] On Behalf Of Jannis > Sent: Thursday, September 01, 2011 9:14 AM > To: r-help at r-project.org > Subject: [R] get arguments passed to function inside a function > > Dear list, > > > I am wondering whether there is an (easy) way to access all arguments > and their values passed to a function inside this function and (for > example) store them in a list object? > > I could imagine using ls() inside this function and then looping > through all names and assigning list entries with the values of the > respective objects but I could imagine that there is already something > ready made in R to achieve this. > > Does anybody have an idea? > > Thanks > Jannis > > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. From jared.harpole at gmail.com Thu Sep 1 22:37:54 2011 From: jared.harpole at gmail.com (drj571) Date: Thu, 1 Sep 2011 13:37:54 -0700 (PDT) Subject: [R] Kernel Density Estimation in R Message-ID: <1314909474160-3784714.post@n4.nabble.com> Hello, I am wanting to run a simulation study in R comparing several different bandwidth selection methods for data simulated from several different distribution types (normal, lognormal, bimodal, etc.) and wanted to know how to calculate the mean integrated square errors for the optimal smoothing parameter values. Does anyone have insight on this? Also, I believe in simulating this type of data and trying to solve for the optimal bandwidth parameter some type of grid search may be necessary, does anyone have any insight into this as well? Thanks for your consideration. 
-- View this message in context: http://r.789695.n4.nabble.com/Kernel-Density-Estimation-in-R-tp3784714p3784714.html Sent from the R help mailing list archive at Nabble.com. From baptiste.auguie at googlemail.com Thu Sep 1 23:02:46 2011 From: baptiste.auguie at googlemail.com (baptiste auguie) Date: Fri, 2 Sep 2011 09:02:46 +1200 Subject: [R] ggplot2 to create a "square" plot In-Reply-To: <1314811136.640.YahooMailNeo@web120110.mail.ne1.yahoo.com> References: <1314811136.640.YahooMailNeo@web120110.mail.ne1.yahoo.com> Message-ID: Hi, Are you after this? last_plot() + opts(aspect.ratio=1) Also, see https://github.com/hadley/ggplot2/wiki/Themes for some settings re: plot margins. HTH, baptiste On 1 September 2011 05:18, Alaios wrote: > Dear all, > I am using ggplot with geom_tile to print as an image a matrix? I have. My matrix is a squared one of 512*512 cells. > > The code that does that is written below > > >> print(v + geom_tile(aes(fill=dB))+ opts(axis.text.x=theme_text(size=20),axis.text.y=theme_text(size=20), axis.title.x=theme_text(size=25) , axis.title.y=theme_text(size=25), legend.title=theme_text(size=25,hjust=-0.4) , legend.text=theme_text(size=20)) + scale_x_continuous('km')? + scale_y_continuous('km')??? ) > > > > as you can see from the picture below > > http://imageshack.us/photo/my-images/171/backupf.jpg/ > > this squared matrix is printed a bit squeezed with the height being bigger than the width. Would be possible somehow to print that plot by keeping the square-look of the matrix in the plot? Of course the other elements like axis and legend will make the over all plot to not be square but I do not care as the blue and red region forms a square. > > I would like to thank you in advance for your help > B.R > Alex > > ? ? ? 
?[[alternative HTML version deleted]] > > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > From andra_isan at yahoo.com Thu Sep 1 23:07:00 2011 From: andra_isan at yahoo.com (Andra Isan) Date: Thu, 1 Sep 2011 14:07:00 -0700 (PDT) Subject: [R] Question about BIC of two different regression models? how should we compare two regression models? Message-ID: <1314911220.72871.YahooMailClassic@web120611.mail.ne1.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From f.harrell at vanderbilt.edu Thu Sep 1 23:53:58 2011 From: f.harrell at vanderbilt.edu (Frank Harrell) Date: Thu, 1 Sep 2011 14:53:58 -0700 (PDT) Subject: [R] How to retrieve bias-corrected probability from calibrate.rms In-Reply-To: References: <1314881007611-3783420.post@n4.nabble.com> Message-ID: <1314914038696-3784868.post@n4.nabble.com> I'm not clear on what you would use that for, but you can use approx(original prob from calibrate, calibrated prob from calibrate, xout=vector of original predicted values)$y to get this. Frank yz wrote: > > Thanks Frank > > I got the predicted probability. > > But can I get the bootstrap corrected probability for individual subject. > > for instance, I can get predicted probability from predict(fit, > type="fitted"). Is there similar one to retrieve the bootstrap corrected > probability for individual subject. > > THANKS > > *Yao Zhu* > *Department of Urology > Fudan University Shanghai Cancer Center > Shanghai, China* > > > 2011/9/1 Frank Harrell <f.harrell at vanderbilt.edu> > >> cal <- calibrate(fit, ...); note that cal is a matrix. colnames(cal) >> will >> tell you what to pick, in this case cal[,'calibrated.corrected']. >> >> Be sure to follow the posting guide. 
>> Frank >> >> >> yz wrote: >> > >> > Dear R users: >> > >> > In Prof. Harrell's library rms, calibrate.rms plot the Bias-corrected >> > Probability and Apparent Probability. >> > The latter one can be retrieved from class calibrate.default. But how >> to >> > retrieve the former one. >> > >> > BW >> > >> > *Yao Zhu* >> > *Department of Urology >> > Fudan University Shanghai Cancer Center >> > Shanghai, China* >> > >> > [[alternative HTML version deleted]] >> > >> > ______________________________________________ >> > R-help at r-project.org mailing list >> > https://stat.ethz.ch/mailman/listinfo/r-help >> > PLEASE do read the posting guide >> > http://www.R-project.org/posting-guide.html >> > and provide commented, minimal, self-contained, reproducible code. >> > >> >> >> ----- >> Frank Harrell >> Department of Biostatistics, Vanderbilt University >> -- >> View this message in context: >> http://r.789695.n4.nabble.com/How-to-retrieve-bias-corrected-probability-from-calibrate-rms-tp3783160p3783420.html >> Sent from the R help mailing list archive at Nabble.com. >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > ----- Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/How-to-retrieve-bias-corrected-probability-from-calibrate-rms-tp3783160p3784868.html Sent from the R help mailing list archive at Nabble.com. 
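
[Editorial sketch, not part of the thread] The interpolation step suggested with `approx()` can be illustrated in isolation. The numbers below are made up and the names `pred_grid`, `corrected_grid`, `subject_pred` and `subject_corrected` are hypothetical; in the setting of the thread the two grids would presumably be taken from the matrix returned by `calibrate()`, which is not reproduced here.

```r
# Hypothetical calibration curve: a grid of predicted probabilities and
# the matching bias-corrected values (made-up numbers for illustration)
pred_grid      <- c(0.10, 0.30, 0.50, 0.70, 0.90)
corrected_grid <- c(0.12, 0.28, 0.47, 0.66, 0.85)

# Hypothetical predicted probabilities for three individual subjects
subject_pred <- c(0.20, 0.50, 0.80)

# Linear interpolation along the calibration curve
subject_corrected <- approx(x = pred_grid, y = corrected_grid,
                            xout = subject_pred)$y
subject_corrected

# With the default rule = 1, values outside the range of pred_grid
# come back as NA rather than being extrapolated
outside <- approx(pred_grid, corrected_grid, xout = 0.95)$y
```

Because `approx()` only interpolates between the supplied grid points, subjects whose predictions fall outside the calibrated range need separate handling (for example `rule = 2`, which holds the end values constant).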
From luciano_machain at yahoo.com.ar Thu Sep 1 23:07:13 2011 From: luciano_machain at yahoo.com.ar (housy) Date: Thu, 1 Sep 2011 14:07:13 -0700 (PDT) Subject: [R] MS-VAR introduction In-Reply-To: References: <005901c9ee6f$16a19ac0$43e4d040$@com> <1314825458317-3782271.post@n4.nabble.com> Message-ID: <1314911233499-3784774.post@n4.nabble.com> Anybody knows where I can find the MSVAR for ox package mentioned above? The websited is not working anymore :( -- View this message in context: http://r.789695.n4.nabble.com/MS-VAR-Introduction-tp896008p3784774.html Sent from the R help mailing list archive at Nabble.com. From mathijsdevaan at gmail.com Thu Sep 1 23:29:01 2011 From: mathijsdevaan at gmail.com (mdvaan) Date: Thu, 1 Sep 2011 14:29:01 -0700 (PDT) Subject: [R] Selections in lists In-Reply-To: References: <1314285852902-3768562.post@n4.nabble.com> Message-ID: <1314912541049-3784816.post@n4.nabble.com> Thanks David and Jorge for your comments! -- View this message in context: http://r.789695.n4.nabble.com/Selections-in-lists-tp3768562p3784816.html Sent from the R help mailing list archive at Nabble.com. From mathijsdevaan at gmail.com Fri Sep 2 00:20:43 2011 From: mathijsdevaan at gmail.com (mdvaan) Date: Thu, 1 Sep 2011 15:20:43 -0700 (PDT) Subject: [R] Selecting and multiplying Message-ID: <1314915643870-3784901.post@n4.nabble.com> Hi, I have created two objects: object c contains yearly "distances" between cases and object g contains yearly interactions between cases. For each case and every year I would like to calculate the following value: Vit = sum(Dabt * Iait * Ibit) Where Vit is the value of case i in year t, Dabt is the distance between all cases a and b that have interacted more than 0 times with i, Iait is the number of times i has interacted with a and Ibit is the number of times i has interacted with b. 
So for 8027 in 1999 Vit becomes: (0.27547644 * 1 * 2) + (0.31481129 * 1 * 1) + (0.09896982 * 2 * 1) = 1.06370381 How do I create a dataframe that accomodates the values for each case in each year? Thanks in advance! Example: library(zoo) DF1 = data.frame(read.table(textConnection(" B C D E F G 8025 1995 0 4 1 2 8025 1997 1 1 3 4 8026 1995 0 7 0 0 8026 1996 1 2 3 0 8026 1997 1 2 3 1 8026 1998 6 0 0 4 8026 1999 3 7 0 3 8027 1997 1 2 3 9 8027 1998 1 2 3 1 8027 1999 6 0 0 2 8028 1999 3 7 0 0 8029 1995 0 2 3 3 8029 1998 1 2 3 2 8029 1999 6 0 0 1"),head=TRUE,stringsAsFactors=FALSE)) a <- read.zoo(DF1, split = 1, index = 2, FUN = identity) sum.na <- function(x) if (any(!is.na(x))) sum(x, na.rm = TRUE) else NA b <- rollapply(a, 3, sum.na, align = "right", partial = TRUE) newDF <- lapply(1:nrow(b), function(i) prop.table(na.omit(matrix(b[i,], nc = 4, byrow = TRUE, dimnames = list(unique(DF1$B), names(DF1)[-1:-2]))), 1)) names(newDF) <- time(a) c<-lapply(newDF, function(mat) 1-tcrossprod(mat / sqrt(rowSums(mat^2)))) c<-lapply(c, function (x) ifelse(x<0.000000111, 0, x)) DF2 = data.frame(read.table(textConnection(" A B C 80 8025 1995 80 8026 1995 80 8029 1995 81 8026 1996 82 8025 1997 82 8026 1997 83 8025 1997 83 8027 1997 90 8026 1998 90 8027 1998 90 8029 1998 84 8026 1999 84 8027 1999 85 8028 1999 85 8029 1999"),head=TRUE,stringsAsFactors=FALSE)) e <- function(y) crossprod(table(DF2[DF2$C %in% y, 1:2])) years <- sort(unique(DF2$C)) f <- as.data.frame(embed(years, 3)) g<-lapply(split(f, f[, 1]), e) -- View this message in context: http://r.789695.n4.nabble.com/Selecting-and-multiplying-tp3784901p3784901.html Sent from the R help mailing list archive at Nabble.com. From xkziloj at gmail.com Fri Sep 2 02:57:28 2011 From: xkziloj at gmail.com (. .) Date: Thu, 1 Sep 2011 21:57:28 -0300 Subject: [R] Alternatives to integrate? 
In-Reply-To: References: <1314885193156-3783645.post@n4.nabble.com> Message-ID: Thanks for your reply Michael, it seems I have a lot of things to learn yet but for sure, your response is being very helpful in this proccess. I will try to explore every point you said: A doubt I have is, if I define "func <- function(x,y) x + y" how can I integrate it only in "x"? My solution for this would be to define "func <- function(x) x + y". Is not ok? Also, with respect to the helper functions I'd created, I am wondering if you can see a better organization for my code. It is so because this is the only way I can see. Particularly I do not like how I am using "results", but I can not think in another form. Thanks in advance. On Thu, Sep 1, 2011 at 2:44 PM, R. Michael Weylandt wrote: > Leaving aside some other issues that this whole email chain has opened up, > > I'd guess that your most immediate problem is that you are trying to > numerically integrate the PMF of a discrete distribution but you are > treating it as a continuous distribution. If you took the time to properly > debug (as you were instructed yesterday) you'd probably find that whenever > you call dpois(x, lambda) for x not an integer you get a warning message. > > Specifically, check this out > >> integrate(dpois,0,Inf,1) > 9.429158e-13 with absolute error < 1.7e-12 > >> n = 0:1000; sum(dpois(n,1)) > 1 > > I could be entirely off base here, but I'm guessing that many of your > problems derive from this. > > > > On another basis, please, please read this: > http://google-styleguide.googlecode.com/svn/trunk/google-r-style.html > or this > http://had.co.nz/stat405/resources/r-style-guide.html > > And, perhaps most importantly, don't rely on the black magic of values > moving in and out of functions (lexical scoping). Seriously, just don't do > it. > > If you have helper functions that need values, actively pass them: you will > save yourself hours of trouble when (not if) you debug your functions. 
I'm > looking, for example, at g() in the first big block of code you provided. > Call it g(a,n) and spend the extra 4 keystrokes to pass the values. It makes > everyone happier. > > Michael > > On Thu, Sep 1, 2011 at 12:37 PM, . . wrote: >> >> So, please excuse me Michael, you are completely sure. I will try >> describe I am trying to do, please let me know if I can provide more >> info. >> >> The idea is provide to "func" two probability density functions(PDFs) >> and obtain another PDF that is a compound of them. In a final analysis >> this characterize an abundance distribution for me. The two PDFs are >> provided through "f" and "g" and there is some manipulation here >> because I need flexibility to easily change this two funcions. >> >> In the code provided, "f" is the Exponential distribution and "g" is >> the Poisson distribution. For this case, I have the analytical >> solution, below. This way I can check the result. But I am also >> considering other combinations of ?"f" and "g" that have difficult, or >> even does not have analitical solution. This is the reason why I am >> trying to develop "func". >> >> func2 <- function(y, frac, rate, trunc=0, log=FALSE) { >> ? ?is.wholenumber <- function(x, tol = .Machine$double.eps^0.5) >> ? ? ? ?abs(x - round(x)) < tol >> ? ?if(FALSE %in% sapply(y,is.wholenumber)) >> ? ? ? ?print("y must be integer because dpoix is a discrete PDF.") >> ? ?else { >> ? ? ? ?f <- function(y){ >> ? ? ? ? ? ?b <- y*log(frac) >> ? ? ? ? ? ?m <- log(rate) >> ? ? ? ? ? ?n <- (y+1)*log(rate+frac) >> ? ? ? ? ? ?if(log)b+m-n else exp(b+m-n) >> ? ? ? ?} >> ? ? ? ?f(y)/(1-f(trunc)) >> ? ?} >> } >> > func2(200,0.05,0.001) >> [1] 0.000381062 >> >> In theory, the interval of integration is 0 to Inf, but for some tests >> I did, go up to 2000 may still provide reasonable results. >> >> Also, as it seems, I am still writing my first functions in R and >> suggestions are welcome, please. >> >> Again, appologies for my previous mistake. 
It was not my intention to >> blame about "integrate". >> >> On Thu, Sep 1, 2011 at 11:49 AM, R. Michael Weylandt >> wrote: >> > I'm going to try to put this nicely: >> > >> > What you provided is not a problem with integrate. Instead, you provided >> > a >> > rather unintelligible and badly-written piece of code that >> > (miraculously) >> > seems to work, though it's not well documented so I have no idea if >> > 1.3e-21 >> > is what you want to get. >> > >> > Let's try this again: per your original request, what is the problem >> > with >> > integrate? >> > >> > If instead you feel there's something wrong with your code, might I >> > suggest >> > you just say that and ask for help, rather than passing the blame onto a >> > perfectly useful base function. >> > >> > Oh, and since you asked that I propose something: comment your code. >> > >> > Michael >> > >> > On Thu, Sep 1, 2011 at 10:33 AM, . . wrote: >> >> >> >> Hi Michael, >> >> >> >> This is the problem: >> >> >> >> func <- Vectorize(function(x, a, sad, samp="pois", trunc=0, ...) { >> >> ?result <- function(x) { >> >> ? ?f1 <- function(n) { >> >> ? ? ? ? ? ? ? ? ? ? ? ?f <- function() { >> >> ? ? ? ?dcom <- paste("d", sad, sep="") >> >> ? ? ? ?dots <- c(as.name("n"), list(...)) >> >> ? ? ? ?do.call(dcom, dots) >> >> ? ? ? ? ? ? ? ? ? ? ? ?} >> >> ? ? ?g <- function() { >> >> ? ? ? ?dcom <- paste("d", samp, sep="") >> >> ? ? ? ?lambda <- a * n >> >> ? ? ? ?dots <- c(as.name("x"), as.name("lambda")) >> >> ? ? ? ?do.call(dcom, dots) >> >> ? ? ?} >> >> ? ? ?f() * g() >> >> ? ?} >> >> ? ?integrate(f1,0,2000)$value >> >> # ? ? adaptIntegrate(f1,0,2000)$integral >> >> >> >> # ? ? n <- 0:2000 >> >> # ? ? trapz(n,f1(n)) >> >> >> >> # ? ? area(f1, 0, 2000, limit=10000, eps=1e-100) >> >> ?} >> >> ?return(result(x) / (1 - result(trunc))) >> >> }, "x") >> >> func(200, 0.05, "exp", rate=0.001) >> >> >> >> If you could propose something I will be gratefull. >> >> >> >> Thanks in advance. 
>> >> >> >> On Thu, Sep 1, 2011 at 10:55 AM, R. Michael Weylandt >> >> wrote: >> >> > Mr ". .", >> >> > >> >> > MASS::area comes to mind but it may be more helpful if you could say >> >> > what >> >> > you are looking for / why integrate is not appropriate it is for >> >> > whatever >> >> > you are doing. >> >> > >> >> > Strictly speaking, I suppose there are all sorts of "alternatives" to >> >> > integrate() if you are willing to be really creative and build >> >> > something >> >> > from scratch: diff(), cumsum(), lm(), hist(), t(), c(), .... >> >> > >> >> > Michael Weylandt >> >> > >> >> > On Thu, Sep 1, 2011 at 9:53 AM, B77S wrote: >> >> >> >> >> >> package "caTools" >> >> >> see ?trapz >> >> >> >> >> >> >> >> >> . wrote: >> >> >> > >> >> >> > Hi all, >> >> >> > >> >> >> > is there any alternative to the function integrate? >> >> >> > >> >> >> > Any comments are welcome. >> >> >> > >> >> >> > Thanks in advance. >> >> >> > >> >> >> > ______________________________________________ >> >> >> > R-help at r-project.org mailing list >> >> >> > https://stat.ethz.ch/mailman/listinfo/r-help >> >> >> > PLEASE do read the posting guide >> >> >> > http://www.R-project.org/posting-guide.html >> >> >> > and provide commented, minimal, self-contained, reproducible code. >> >> >> > >> >> >> >> >> >> -- >> >> >> View this message in context: >> >> >> >> >> >> >> >> >> http://r.789695.n4.nabble.com/Alternatives-to-integrate-tp3783624p3783645.html >> >> >> Sent from the R help mailing list archive at Nabble.com. >> >> >> >> >> >> ______________________________________________ >> >> >> R-help at r-project.org mailing list >> >> >> https://stat.ethz.ch/mailman/listinfo/r-help >> >> >> PLEASE do read the posting guide >> >> >> http://www.R-project.org/posting-guide.html >> >> >> and provide commented, minimal, self-contained, reproducible code. >> >> > >> >> > >> > >> > > > From michael.weylandt at gmail.com Fri Sep 2 03:41:28 2011 From: michael.weylandt at gmail.com (R. 
Michael Weylandt) Date: Thu, 1 Sep 2011 21:41:28 -0400 Subject: [R] Alternatives to integrate? In-Reply-To: References: <1314885193156-3783645.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From totangjie at gmail.com Fri Sep 2 04:46:02 2011 From: totangjie at gmail.com (Jie TANG) Date: Fri, 2 Sep 2011 10:46:02 +0800 Subject: [R] how to return back to go on my cycle while read my files Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dwinsemius at comcast.net Fri Sep 2 05:02:25 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Thu, 1 Sep 2011 23:02:25 -0400 Subject: [R] how to return back to go on my cycle while read my files In-Reply-To: References: Message-ID: <2C4B7B37-071F-476D-9D04-78A46D49E53B@comcast.net> On Sep 1, 2011, at 10:46 PM, Jie TANG wrote: > hi ,when i read a lots of files > > for (i in 1:totnum) > { > cop_x_data<-read.table(flnm[i],skip=2) > if(i==1) {cop_data=cop_x_data} > else {cop_data=rbind(cop_data,cop_x_data)} > } > > some of the files are missing . so this loop can not go on .I wonder > how can > I go on > the loop cycle while reading the files just like the command > read(unit,err=linenum) in fortran ? 
I don't know fortran but why not:

filevec <- flnm %in% list.files()
for (i in seq_along(filevec) ) { ...}

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT

From jholtman at gmail.com Fri Sep 2 05:27:46 2011
From: jholtman at gmail.com (Jim Holtman)
Date: Thu, 1 Sep 2011 23:27:46 -0400
Subject: [R] how to return back to go on my cycle while read my files
In-Reply-To:
References:
Message-ID: <599CE33D-C6C9-4AAA-952B-123DDE2BC64C@gmail.com>

?try

Sent from my iPad

On Sep 1, 2011, at 22:46, Jie TANG wrote:

> hi ,when i read a lots of files
>
> for (i in 1:totnum)
> {
> cop_x_data<-read.table(flnm[i],skip=2)
> if(i==1) {cop_data=cop_x_data}
> else {cop_data=rbind(cop_data,cop_x_data)}
> }
>
> some of the files are missing . so this loop can not go on .I wonder how can
> I go on
> the loop cycle while reading the files just like the command
> read(unit,err=linenum) in fortran ?
>
> thank you .
>
> --
> TANG Jie
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

From worikr at gmail.com Fri Sep 2 07:13:59 2011
From: worikr at gmail.com (Worik R)
Date: Fri, 2 Sep 2011 17:13:59 +1200
Subject: [R] Advice on large data structures
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From markm0705 at gmail.com Fri Sep 2 01:47:48 2011
From: markm0705 at gmail.com (markm0705)
Date: Thu, 1 Sep 2011 16:47:48 -0700 (PDT)
Subject: [R] Background fill and border for a legend in dotplot
Message-ID: <1314920868893-3785003.post@n4.nabble.com>

Dear R help group

I've been working on this plot for a while now and am now getting around
to the minor adjustments. I would like to be able to put a border and
background fill around the legend in this plot.
I understand the legend 'bty' should do this have this capablity but not sure how the syntax works in this case ###### initalise library("lattice") library(latticeExtra) # for mergedTrellisLegendGrob() ##read the data to a variable #---------------------------------------------------------------------------------------- Cal_dat <- read.table("Calibration2.dat",header = TRUE,sep = "\t",) ## set up plotting colours #---------------------------------------------------------------------------------------- col.pat<-c("violet","cyan","green","red","blue","black","yellow") sym.pat<-c(19,20,21) ##set up the plot key #---------------------------------------------------------------------------------------- key1 <- draw.key(list(text=list(levels(Cal_dat$Commodity)), title="Ore type", points=list(pch=22, cex=1.3, fill=col.pat, col="black")), draw = FALSE) key2 <- draw.key(list(text=list(levels(factor(Cal_dat$Year))), title="Year", points = list(pch = c(21, 22, 23), cex=1.3, col="black")), draw = FALSE) mkey <- mergedTrellisLegendGrob(list(fun = key2), list(fun = key1), vertical = TRUE ) ##set some parameters for the plot #---------------------------------------------------------------------------------------- trellis.par.set( dot.line=list(col = "grey90", lty="dashed"), axis.line=list(col = "grey50"), axis.text=list(col ="grey50", cex=0.8), panel.background=list(col="transparent"), par.xlab.text= list(col="grey50"), ) ## Create the dot plot #---------------------------------------------------------------------------------------- with(Cal_dat, dotplot(reorder(paste(Mine,Company), Resc_Gt) ~ Resc_Gt, fill_var = Commodity, pch_var = factor(Year), cex=1.2, pch = c(21, 22, 23), col = "black", fill = col.pat, aspect = 2.0, alpha=0.6, legend = list(inside = list(fun = mkey,corner = c(0.95, 0.01))), scales = list(x = list(log = 10)), xscale.components = xscale.components.log10ticks, origin = 0, type = c("p","a"), main = "Mineral resources", xlab= "Total tonnes (billions)", panel = 
function(x, y, ..., subscripts, fill, pch, fill_var, pch_var) { pch <- pch[pch_var[subscripts]] fill <- fill[fill_var[subscripts]] panel.dotplot(x, y, pch = pch, fill = fill, ...) })) panel = function(x, y, ..., subscripts, fill, pch, fill_var, pch_var) { pch <- pch[pch_var[subscripts]] fill <- fill[fill_var[subscripts]] panel.dotplot(x, y, pch = pch, fill = fill, ...) } http://r.789695.n4.nabble.com/file/n3785003/Calibration2.dat Calibration2.dat -- View this message in context: http://r.789695.n4.nabble.com/Background-fill-and-border-for-a-legend-in-dotplot-tp3785003p3785003.html Sent from the R help mailing list archive at Nabble.com. From yeyefrankd at snsbank.nl Fri Sep 2 05:01:59 2011 From: yeyefrankd at snsbank.nl (Frank Mill) Date: Fri, 2 Sep 2011 04:01:59 +0100 Subject: [R] Urgente Message-ID: <666593251/mk-filter-1.mail.uk.tiscali.com/81.179.237.152/09/02/11/04:02:12@mk-filter-1.mail.uk.tiscali.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From bbolker at gmail.com Fri Sep 2 08:35:39 2011 From: bbolker at gmail.com (Ben Bolker) Date: Fri, 2 Sep 2011 06:35:39 +0000 Subject: [R] Question about BIC of two different regression models? how should we compare two regression models? References: <1314911220.72871.YahooMailClassic@web120611.mail.ne1.yahoo.com> Message-ID: Andra Isan yahoo.com> writes: > > Hi All, > In order to compare two different logistic regressions, > I think I need to compare them based on their BIC > values, but I am not sure if the smaller BIC would mean a better > model or the reverse is true? > Thanks a lot, Andra Smaller (i.e. lower value) BIC is always better (even if BIC happens to be negative, as can happen in some cases; i.e. BIC=-1002 is better than BIC=-1000, BIC=1000 is better than BIC=1002). I would suggest however that (a) there are better venues for this question (e.g.
stats.stackexchange.com), since it's a stats and not an R question; (b) it might be a good idea to review a stats text, or even http://en.wikipedia.org/wiki/Bayesian_information_criterion , since this is a pretty basic question. From petr.pikal at precheza.cz Fri Sep 2 08:39:32 2011 From: petr.pikal at precheza.cz (Petr PIKAL) Date: Fri, 2 Sep 2011 08:39:32 +0200 Subject: [R] two question about plot In-Reply-To: References: Message-ID: Hi > > Re: [R] two question about plot > > The help for boxplot offers suggestions for both those things. You may be > particularly interested in: > > names: group labels which will be printed under each boxplot. Can > be a character vector or an expression (see plotmath). > > add: logical, if true _add_ boxplot to current plot. > > Sarah It could be also worth to consult also bwplot from lattice or especially ggplot2 package for plotting. Regards Petr > > > On Thu, Sep 1, 2011 at 1:31 PM, Jie TANG wrote: > > 1) how to modify the the tickment of x-axis or y-axis. > > boxplot(data[,1:5]) > > the tickment in x-axis in V1 V2 V3 V4 V5 ,I want to be some name for > > example > > name<-c("1day","2day","3day","4day","5day") > > > > 2) how to overlap two plot into one figure? > > plot(data[1:5]) > > boxplot(newdata[,1:5]) > > ? > > > > -- > > TANG Jie > > > > > > -- > Sarah Goslee > http://www.functionaldiversity.org > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
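A minimal sketch of the two boxplot suggestions quoted above (the `names` and `add` arguments), for readers who want something to run. The poster's `data` and `newdata` were never shown, so the objects below are made-up stand-ins, and `day_labels` is just an illustrative name for the label vector from the question.

```r
# Made-up stand-ins for the poster's objects (assumption: 5 numeric columns)
set.seed(42)
data    <- as.data.frame(matrix(rnorm(100), ncol = 5))
newdata <- as.data.frame(matrix(rnorm(100, mean = 0.5), ncol = 5))
day_labels <- c("1day", "2day", "3day", "4day", "5day")

# 1) Replace the default V1..V5 labels via the 'names' argument
bp <- boxplot(data[, 1:5], names = day_labels)

# 2) Overlay two plots: draw the first plot, then add the boxplots to it.
#    Here the first plot shows the column means of 'data' as points.
plot(1:5, colMeans(data[, 1:5]),
     xlim = c(0.5, 5.5),
     ylim = range(unlist(data), unlist(newdata)),
     xaxt = "n", xlab = "", ylab = "value", pch = 19)
boxplot(newdata[, 1:5], names = day_labels, add = TRUE)
```

The first plot sets the coordinate system, so its `xlim` and `ylim` need to be wide enough for whatever is added afterwards.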
From paul.hiemstra at knmi.nl Fri Sep 2 09:37:36 2011 From: paul.hiemstra at knmi.nl (Paul Hiemstra) Date: Fri, 02 Sep 2011 07:37:36 +0000 Subject: [R] ggplot2 to create a "square" plot In-Reply-To: References: <1314811136.640.YahooMailNeo@web120110.mail.ne1.yahoo.com> Message-ID: <4E6087C0.4070706@knmi.nl> On 09/01/2011 09:02 PM, baptiste auguie wrote: > Hi, > > Are you after this? > > last_plot() + opts(aspect.ratio=1) Even better (I once got correct by Hadley for using aspect.ratio, but this was plotting spatial data...) last_plot() + coord_equal() cheers, Paul > Also, see https://github.com/hadley/ggplot2/wiki/Themes for some > settings re: plot margins. > > HTH, > > baptiste > > On 1 September 2011 05:18, Alaios wrote: >> Dear all, >> I am using ggplot with geom_tile to print as an image a matrix I have. My matrix is a squared one of 512*512 cells. >> >> The code that does that is written below >> >> >>> print(v + geom_tile(aes(fill=dB))+ opts(axis.text.x=theme_text(size=20),axis.text.y=theme_text(size=20), axis.title.x=theme_text(size=25) , axis.title.y=theme_text(size=25), legend.title=theme_text(size=25,hjust=-0.4) , legend.text=theme_text(size=20)) + scale_x_continuous('km') + scale_y_continuous('km') ) >> >> >> as you can see from the picture below >> >> http://imageshack.us/photo/my-images/171/backupf.jpg/ >> >> this squared matrix is printed a bit squeezed with the height being bigger than the width. Would be possible somehow to print that plot by keeping the square-look of the matrix in the plot? Of course the other elements like axis and legend will make the over all plot to not be square but I do not care as the blue and red region forms a square. 
>> >> I would like to thank you in advance for your help >> B.R >> Alex >> >> [[alternative HTML version deleted]] >> >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> >> > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -- Paul Hiemstra, Ph.D. Global Climate Division Royal Netherlands Meteorological Institute (KNMI) Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39 P.O. Box 201 | 3730 AE | De Bilt tel: +31 30 2206 494 http://intamap.geo.uu.nl/~paul http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770 From alaios at yahoo.com Fri Sep 2 10:08:04 2011 From: alaios at yahoo.com (Alaios) Date: Fri, 2 Sep 2011 01:08:04 -0700 (PDT) Subject: [R] ggplot2 to create a "square" plot In-Reply-To: References: <1314811136.640.YahooMailNeo@web120110.mail.ne1.yahoo.com> <1314860296.53552.YahooMailNeo@web120112.mail.ne1.yahoo.com> Message-ID: <1314950884.58696.YahooMailNeo@web120116.mail.ne1.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From thomas.chesney at nottingham.ac.uk Fri Sep 2 10:14:30 2011 From: thomas.chesney at nottingham.ac.uk (thomas.chesney) Date: Fri, 2 Sep 2011 01:14:30 -0700 (PDT) Subject: [R] Automatic Recoding In-Reply-To: <5EAA21940C65214F9C11DA5FBBC14F0B2D13C4B100@EXCHANGE2.ad.nottingham.ac.uk> References: <5EAA21940C65214F9C11DA5FBBC14F0B2D13C4B100@EXCHANGE2.ad.nottingham.ac.uk> Message-ID: <1314951270314-3785565.post@n4.nabble.com> Thank you both for your replies. I've tried it with a small sample of the data and it works perfectly. 
I have no idea yet how it works but I will spend some time to figure it out. Thank you! Thomas -- View this message in context: http://r.789695.n4.nabble.com/Automatic-Recoding-tp3784043p3785565.html Sent from the R help mailing list archive at Nabble.com. From emailvessel at gmail.com Fri Sep 2 08:23:06 2011 From: emailvessel at gmail.com (Katerina Karayianni) Date: Fri, 2 Sep 2011 09:23:06 +0300 Subject: [R] calling R from PHP error In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From justin_bem at yahoo.fr Fri Sep 2 09:02:20 2011 From: justin_bem at yahoo.fr (justin bem) Date: Fri, 2 Sep 2011 08:02:20 +0100 (BST) Subject: [R] Re : P values for vglm(zibinomial) function in VGAM In-Reply-To: <1314108257721-3762858.post@n4.nabble.com> References: <1314108257721-3762858.post@n4.nabble.com> Message-ID: <1314946940.92120.YahooMailNeo@web29517.mail.ird.yahoo.com> An embedded text encoded in an unknown character set was scrubbed... Name: not available URL: From kristian.langgaard.lind at gmail.com Fri Sep 2 11:24:26 2011 From: kristian.langgaard.lind at gmail.com (Kristian Lind) Date: Fri, 2 Sep 2011 11:24:26 +0200 Subject: [R] Using capture.output within a function Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From murdoch.duncan at gmail.com Fri Sep 2 12:07:35 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Fri, 02 Sep 2011 06:07:35 -0400 Subject: [R] Using capture.output within a function In-Reply-To: References: Message-ID: <4E60AAE7.1050602@gmail.com> On 11-09-02 5:24 AM, Kristian Lind wrote: > Dear R-users > > I'm running a maximum likelihood procedure using the spg package.
I'd like > to save some output produced in each iteration to a file, but if I put the > capture.output() within the function I get the following message; Error in > spg(par = startval, fn = loglik, gr = NULL, method = 3, lower = lo, : > Failure in initial function evaluation!Error in -fn(par, ...) : invalid > argument to unary operator It looks as though you put capture.output() last in your function, so the result of the function is the result of the capture.output call, not the function value. Duncan Murdoch > > I have considered putting the capture.output() after the function, but there > are some issues with R stalling on me so I'd like that the output is saved > for each iteration and not only at completion. > > Any suggestions on how to get this done would be much appreciated. > > Kristian Lind > > *Below an example of what I'm trying to do...* > > > loglik<- function(w){ > > state<- c( b_1 = 0, > b_2 = 0, > a = 0) > #declaring ODEs > Kristian<-function(t, state, w){ > with(as.list(c(state, w)), > { > db_1 = -((w[1]+w[8])*b_1+(w[2]+w[6]*w[8] > +w[7]*w[9])*b_2+0.5*(b_1)^2+w[6]*b_1*b_2+0.5* > ((w[6])^2+(w[7])^2)*(b_2)^2) > db_2 = -w[3]*b_2+1 > da = w[1]*w[4]*b_1+(w[2]*w[4]+w[3]*w[5])*b_2 > list(c(db_1, db_2, da)) > }) > } > > # time making a sequence from t to T evaluated at each delta seq(t, T, > by = delta) > times<- seq(0, 10, by = 0.5) > > outmat<- ode(y = state, times = times, func = Kristian, parms = w) > print(w) > print(outmat) > . > . > . > f<-rep(NA, 1) > f[1]<- 1/(T-1)*sum(log(pJ$p)-log(pJ$J)) > f > capture.output(outmat, file = "spgoutput.txt", append = TRUE) > } > fit<- spg(fn =loglik, ...) > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
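A minimal sketch of the point made in the reply above: if capture.output() (with a `file` argument) is the last expression in the objective function, the function returns NULL instead of the objective value. The sketch logs first and returns the value last. The objective here is a toy quadratic, not the poster's likelihood, and optim() from base R stands in for spg() so the example has no package dependency; `loglik_sketch`, `logfile`, and `log_path` are illustrative names.

```r
# Toy objective standing in for the poster's log-likelihood (assumption)
loglik_sketch <- function(w, logfile) {
  value <- sum((w - c(1, 2))^2)

  # Log the current parameters and value on every call.
  # With 'file' given, capture.output() returns NULL invisibly,
  # so it must not be the last expression in the function.
  capture.output(print(w), print(value), file = logfile, append = TRUE)

  value  # the function value is the last expression, so it is what gets returned
}

log_path <- tempfile(fileext = ".txt")

# optim() passes extra arguments (here 'logfile') on to the objective
fit <- optim(par = c(0, 0), fn = loglik_sketch, logfile = log_path)

# The log file grows with each evaluation, so it is available
# even if the optimisation is interrupted part-way through.
head(readLines(log_path))
```

Because the file is appended to on each evaluation, the output is saved as the run proceeds rather than only at completion, which was the concern raised in the original question.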
From mauricio.zambrano at jrc.ec.europa.eu Fri Sep 2 09:55:24 2011 From: mauricio.zambrano at jrc.ec.europa.eu (Mauricio Zambrano-Bigiarini) Date: Fri, 2 Sep 2011 09:55:24 +0200 Subject: [R] [R-pkgs] hydroTSM 0.3-0 and hydroGOF 0.3-0 Message-ID: <4E608BEC.6090600@jrc.ec.europa.eu> Dear R users and hydrological/environmental community, I'm glad to announce that a major (and recommended) update for the packages hydroTSM and hydroGOF are now available on CRAN: -) hydroTSM: http://cran.r-project.org/web/packages/hydroTSM/ -) hydroGOF: http://cran.r-project.org/web/packages/hydroGOF/ ################### # hydroTSM v0.3-0 # ################### hydroTSM is a package for management, analysis, interpolation and plotting of time series used in hydrology and related environmental sciences. This new release collects feedback received since the first public release of the package (~ 1 year ago). Major changes are related to improved plotting of time series, better and faster support for zoo objects, and new features in some functions. A full list of changes can be found on: http://www.rforge.net/hydroTSM/news.html ################### # hydroGOF v0.3-0 # ################### hydroGOF is a package implementing both statistical and graphical goodness-of-fit measures between observed and simulated values, mainly oriented to be used during the calibration, validation, and application of hydrological/environmental models. Major changes in this new release are related to improved plotting of simulated vs observed values and a new vignette. 
A full list of changes can be found on: http://www.rforge.net/hydroGOF/news.html ################### # Related Links # ################### http://meetingorganizer.copernicus.org/EGU2010/EGU2010-13008.pdf http://www.slideshare.net/hzambran/egu2010-ra-statisticalenvironmentfordoinghydrologicalanalysis-9095709 http://www.r-project.org/conferences/useR-2009/slides/Zambrano+Bigiarini.pdf ################### Bugs / comments / questions / collaboration of any kind are very welcomed, and in particular, datasets that can be included in the packages for academic purposes. Kind regards, Mauricio Zambrano-Bigiarini -- ======================================================= FLOODS Action Land Management and Natural Hazards Unit Institute for Environment and Sustainability (IES) European Commission, Joint Research Centre (JRC) webinfo : http://floods.jrc.ec.europa.eu/ ======================================================= DISCLAIMER:\ "The views expressed are purely those of th...{{dropped:7}} From r_b_hamilton at yahoo.co.uk Fri Sep 2 11:20:40 2011 From: r_b_hamilton at yahoo.co.uk (betty_d) Date: Fri, 2 Sep 2011 02:20:40 -0700 (PDT) Subject: [R] betareg question - keeping the mean fixed? In-Reply-To: <4E5F8793.1090308@jku.at> References: <1314876482055-3783303.post@n4.nabble.com> <4E5F8793.1090308@jku.at> Message-ID: <1314955240369-3785683.post@n4.nabble.com> Thanks for your response, that does work, however, it is still not quite what want. I would like to tell betareg what the mean is (in my case, 0.5) and force it to use that value. Is this possible? -- View this message in context: http://r.789695.n4.nabble.com/betareg-question-keeping-the-mean-fixed-tp3783303p3785683.html Sent from the R help mailing list archive at Nabble.com. 
From thiem at sipo.gess.ethz.ch Fri Sep 2 12:06:47 2011 From: thiem at sipo.gess.ethz.ch (Thiem Alrik) Date: Fri, 2 Sep 2011 10:06:47 +0000 Subject: [R] Maximum Likelihood using optim() Message-ID: <11CF903D5D22CD42AF2157C898E05EB204A7C43C@MBX12.d.ethz.ch> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From jholtman at gmail.com Fri Sep 2 12:33:13 2011 From: jholtman at gmail.com (Jim Holtman) Date: Fri, 2 Sep 2011 06:33:13 -0400 Subject: [R] Advice on large data structures In-Reply-To: References: Message-ID: <6682A7D3-A31E-493D-B046-327407DCE982@gmail.com> i would suggest that if you want to use R that you get a 64-bit version with 24GB of memory to start. if your data is a numeric matrix, you will need 8GB for a single copy. Do you really need it all in memory at once, or can you partition the problem? Can you use a database to access the portion you need at any time? If you only need one, or two, columns at a time, then the use of a database storing the columns might work. You probably need some more analysis on exactly how you want to solve your problem understanding the limitations of the system. Sent from my iPad On Sep 2, 2011, at 1:13, Worik R wrote: > Friends > > I am starting on a (section of the) project where I need to build a matrix > with on the order of 5 million rows and 200 columns > > I am wondering if I can stay in R. > > I need to do rollapply type operations on the columns, including some that > will be functions of (windows of) two columns. > > I have been looking at the ff and bigmemory packages but am not sure that > they will do. > > Before I get too deep can some one offer some wisdom about what the best > direction to go would be? 
> > Switching to C/C++ is definitely an option if it is all too hard > > cheers > Worik > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From kristian.langgaard.lind at gmail.com Fri Sep 2 12:53:31 2011 From: kristian.langgaard.lind at gmail.com (Kristian Lind) Date: Fri, 2 Sep 2011 12:53:31 +0200 Subject: [R] Using capture.output within a function In-Reply-To: <4E60AAE7.1050602@gmail.com> References: <4E60AAE7.1050602@gmail.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From anna.dunietz at gmail.com Fri Sep 2 14:16:45 2011 From: anna.dunietz at gmail.com (Anna Dunietz) Date: Fri, 2 Sep 2011 14:16:45 +0200 Subject: [R] If NA Problem! Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From djmuser at gmail.com Fri Sep 2 14:17:21 2011 From: djmuser at gmail.com (Dennis Murphy) Date: Fri, 2 Sep 2011 05:17:21 -0700 Subject: [R] Background fill and border for a legend in dotplot In-Reply-To: <1314920868893-3785003.post@n4.nabble.com> References: <1314920868893-3785003.post@n4.nabble.com> Message-ID: Hi: Try this: key1 <- draw.key(list(text=list(levels(Cal_dat$Commodity)), title="Ore type", border = TRUE, background = 'ivory', points=list(pch=22, cex=1.3, fill=col.pat, col="black")), draw = FALSE) key2 <- draw.key(list(text=list(levels(factor(Cal_dat$Year))), title="Year", border = TRUE, background = 'ivory', points = list(pch = c(21, 22, 23), cex=1.3, col="black")), draw = FALSE) mkey <- mergedTrellisLegendGrob(list(fun = key2), list(fun = key1), vertical = TRUE ) Now rerun your dotplot; from the result I got, you may need to do some positional tweaking and may well want to change the background color of the legend to something else. HTH, Dennis On Thu, Sep 1, 2011 at 4:47 PM, markm0705 wrote: > Dear R help group > > I've been working on this plot for a while now and now getting around to the > minor adjusments. I would like to be able to put a border and background > fill around the legend in this plot.
> > I understand the legend 'bty' should do this have this capablity but not > sure how the syntax works in this case > > ###### initalise > library("lattice") > library(latticeExtra) # for mergedTrellisLegendGrob() > > ##read the data to a variable > #---------------------------------------------------------------------------------------- > > Cal_dat <- read.table("Calibration2.dat",header = TRUE,sep = "\t",) > > ## set up plotting colours > #---------------------------------------------------------------------------------------- > col.pat<-c("violet","cyan","green","red","blue","black","yellow") > sym.pat<-c(19,20,21) > > ##set up the plot key > #---------------------------------------------------------------------------------------- > key1 <- > draw.key(list(text=list(levels(Cal_dat$Commodity)), > title="Ore type", > points=list(pch=22, cex=1.3, fill=col.pat, col="black")), > draw = FALSE) > key2 <- > draw.key(list(text=list(levels(factor(Cal_dat$Year))), > title="Year", > points = list(pch = c(21, 22, 23), cex=1.3, col="black")), > draw = FALSE) > > mkey <- > mergedTrellisLegendGrob(list(fun = key2), > list(fun = key1), > vertical = TRUE > ) > > ##set some parameters for the plot > #---------------------------------------------------------------------------------------- > trellis.par.set( > dot.line=list(col = "grey90", lty="dashed"), > axis.line=list(col = "grey50"), > axis.text=list(col ="grey50", cex=0.8), > panel.background=list(col="transparent"), > par.xlab.text= list(col="grey50"), > ) > > ## Create the dot plot > #---------------------------------------------------------------------------------------- > with(Cal_dat, > dotplot(reorder(paste(Mine,Company), Resc_Gt) ~ Resc_Gt, > fill_var = Commodity, > pch_var = factor(Year), > cex=1.2, >
pch = c(21, 22, 23), > col = "black", > fill = col.pat, > aspect = 2.0, > alpha=0.6, > legend = list(inside = list(fun = mkey,corner = c(0.95, 0.01))), > scales = list(x = list(log = 10)), > xscale.components = xscale.components.log10ticks, > origin = 0, > type = c("p","a"), > main = "Mineral resources", > xlab= "Total tonnes (billions)", > panel = function(x, y, ..., subscripts, > fill, pch, fill_var, pch_var) { > pch <- pch[pch_var[subscripts]] > fill <- fill[fill_var[subscripts]] > panel.dotplot(x, y, pch = pch, fill = fill, ...) > })) > > panel = function(x, y, ..., subscripts, > fill, pch, fill_var, pch_var) { > pch <- pch[pch_var[subscripts]] > fill <- fill[fill_var[subscripts]] > panel.dotplot(x, y, pch = pch, fill = fill, ...) > } > > http://r.789695.n4.nabble.com/file/n3785003/Calibration2.dat > Calibration2.dat > > -- > View this message in context: http://r.789695.n4.nabble.com/Background-fill-and-border-for-a-legend-in-dotplot-tp3785003p3785003.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From Joao.Fadista at med.lu.se Fri Sep 2 14:34:08 2011 From: Joao.Fadista at med.lu.se (Joao Fadista) Date: Fri, 2 Sep 2011 12:34:08 +0000 Subject: [R] merge some columns Message-ID: An embedded and charset-unspecified text was scrubbed...
Name: not available URL: From jvadams at usgs.gov Fri Sep 2 14:39:30 2011 From: jvadams at usgs.gov (Jean V Adams) Date: Fri, 2 Sep 2011 07:39:30 -0500 Subject: [R] If NA Problem! In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From JSorkin at grecc.umaryland.edu Fri Sep 2 14:48:39 2011 From: JSorkin at grecc.umaryland.edu (John Sorkin) Date: Fri, 2 Sep 2011 08:48:39 -0400 Subject: [R] Question about BIC of two different regression models? how should we compare two regression models? In-Reply-To: References: <1314911220.72871.YahooMailClassic@web120611.mail.ne1.yahoo.com> Message-ID: <4E609867020000CB00095193@med-webappgwia1.medicine.umaryland.edu> I believe when using BIC one needs to compare nested models, i.e. , when comparing models A and B one must make sure that model A contains all the parameters of model B and additionally A contains one or more extra parameter beyond those in B. Further the comparison of BICs requires that models A and B be run on the same data. Thus if we have model A y=age+sex, model B y=age, if some subjects are missing data on their sex, model A would be run on a subset of the data used when running model B. In this case the comparison of BICs would not be valid. John John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) >>> Ben Bolker 9/2/2011 2:35 AM >>> Andra Isan yahoo.com> writes: > > Hi All, > In order to compare two different logistic regressions, > I think I need to compare them based on their BIC > values, but I am not sure if the smaller BIC would mean a better > model or the reverse is true? > Thanks a lot,Andra Smaller (i.e. 
lower value) BIC is always better (even if BIC happens to be negative, as can happen in some cases; i.e. BIC=-1002 is better than BIC=-1000, BIC=1000 is better than BIC=1002). I would suggest however that (a) there are better venues for this question (e.g. stats.stackexchange.com), since it's a stats and not an R question; (b) it might be a good idea to review a stats text, or even http://en.wikipedia.org/wiki/Bayesian_information_criterion , since this is a pretty basic question. ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Confidentiality Statement: This email message, including any attachments, is for th...{{dropped:6}} From djmuser at gmail.com Fri Sep 2 15:15:12 2011 From: djmuser at gmail.com (Dennis Murphy) Date: Fri, 2 Sep 2011 06:15:12 -0700 Subject: [R] merge some columns In-Reply-To: References: Message-ID: Hi: Here's one approach: d <- read.table(textConnection(" V1 V2 V3 V4 V5 V6 1 G A G G G G 2 A A G A A G"), header = TRUE, stringsAsFactors = FALSE) closeAllConnections() # Create two vectors of variable names, one for odd numbered, # one for even numbered vars1 <- names(d)[seq_along(names(d)) %% 2 == 1] vars2 <- names(d)[seq_along(names(d)) %% 2 == 0] # Apply the paste sequentially to corresponding pairs # in vars1 and vars2; get() is used to get the data associated # with the variable names in vars1 and vars2 d2 <- sapply(seq_along(vars1), function(j) with(d, paste(get(vars1[j]), get(vars2[j]), sep = '/'))) # Convert to data frame: d2 <- as.data.frame(d2, stringsAsFactors = FALSE) str(d2) HTH, Dennis On Fri, Sep 2, 2011 at 5:34 AM, Joao Fadista wrote: > Dear all, > > I would like to know how to merge columns like: > > Input file: > ?V1 V2 V3 V4 V5 V6 > 1 ?G ?A ?G ?G ?G ?G > 2 ?A ?A ?G ?A ?A ?G > > Desired output file: > ? 
V1 V2 V3 > 1 G/A G/G G/G > 2 A/A G/A A/G > > So for every 2 consecutive columns merge their content into one. > Thanks in advance. > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From dwinsemius at comcast.net Fri Sep 2 15:21:01 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Fri, 2 Sep 2011 09:21:01 -0400 Subject: [R] Automatic Recoding In-Reply-To: <1314951270314-3785565.post@n4.nabble.com> References: <5EAA21940C65214F9C11DA5FBBC14F0B2D13C4B100@EXCHANGE2.ad.nottingham.ac.uk> <1314951270314-3785565.post@n4.nabble.com> Message-ID: On Sep 2, 2011, at 4:14 AM, thomas.chesney wrote: > Thank you both for your replies. I've tried it with a small sample > of the > data and it works perfectly. I have no idea yet how it works but I > will > spend some time to figure it out. > When you get around to putting in the time to figure it out, the solution to start with would be Dunlap's match strategy. Mine is really a perversion of an aspect of how factors work but isn't really how you are supposed to use them. -- David Winsemius, MD West Hartford, CT From borisberanger at gmail.com Fri Sep 2 14:27:39 2011 From: borisberanger at gmail.com (Boris Beranger) Date: Fri, 2 Sep 2011 05:27:39 -0700 (PDT) Subject: [R] Weights using Survreg In-Reply-To: <1314883419.2400.7.camel@nemo> References: <1314883419.2400.7.camel@nemo> Message-ID: <1314966459758-3785931.post@n4.nabble.com> Thank you for your reply, it has been helpful. Do you know if the parameters estimators are MLE estimators?
One more question: In my case study I have failures that occured on different objects that have different age and length, could I use weight to find the estimates of a weibull law and so to find the probabilty of failure per unit of length for example? Thank you very much again for your help, Boris -- View this message in context: http://r.789695.n4.nabble.com/Weights-using-Survreg-tp3781803p3785931.html Sent from the R help mailing list archive at Nabble.com. From alexandra.soberon at unican.es Fri Sep 2 14:26:03 2011 From: alexandra.soberon at unican.es (Soberon Velez, Alexandra Pilar) Date: Fri, 2 Sep 2011 12:26:03 +0000 Subject: [R] Bandwith selectors for multivariate local regression Message-ID: <7546B009C5D8DF4FA41549EE1FBB2EDE23ECF276@mbx01.unican.es> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dwinsemius at comcast.net Fri Sep 2 15:30:02 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Fri, 2 Sep 2011 09:30:02 -0400 Subject: [R] merge some columns In-Reply-To: References: Message-ID: <6DA8097B-3142-41A0-A022-CAA68FF1AE91@comcast.net> On Sep 2, 2011, at 8:34 AM, Joao Fadista wrote: > Dear all, > > I would like to know how to merge columns like: > > Input file: > V1 V2 V3 V4 V5 V6 > 1 G A G G G G > 2 A A G A A G > Looked like an mapply-type problem: > with(dat, mapply(paste, list(V1, V3, V5), list(V2, V4, V6), MoreArgs=list(sep="/") ) ) [,1] [,2] [,3] [1,] "G/A" "G/G" "G/G" [2,] "A/A" "G/A" "A/G" > Desired output file: > V1 V2 V3 > 1 G/A G/G G/G > 2 A/A G/A A/G > > So for every 2 consecutive columns merge their content into one. > Thanks in advance. > > > [[alternative HTML version deleted]] -- David Winsemius, MD West Hartford, CT From erik.svensson at biol.lu.se Fri Sep 2 12:41:40 2011 From: erik.svensson at biol.lu.se (ErikiLund) Date: Fri, 2 Sep 2011 03:41:40 -0700 (PDT) Subject: [R] Standard errors of sexual dimorphism? Message-ID: <1314960100934-3785770.post@n4.nabble.com> Hello! 
I am working on a manuscript on sexual dimorphism in an aquatic invertebrate, where we have estimated sexual dimorphism (SD) for 7 different traits in four populations (a total of 28 SD-estimates). We have used the following formula for estimating SD: 100 * (mean male trait value - mean female trait value)/overall trait mean). Then, we have used these SD-estimates to perform a GLM against other interesting variables, such as the intersexual genetic correlations for each of the traits. Here are my questions: 1. Is there any procedure in "R" you would recommend that takes in to account the sampling variance of the SD-estimates, rather than using the mean value of each (which is supposed to reduce error and increase Type I-error rates? 2. Is there a procedure to estimate SE for ratios such as this SD-estimate? 3. The data in these GLM:s might not be entirely statistically non-independent (i. e. intersexual genetic correlations). Can you recommend any R-procedure (package) that can deal with this problem (e. g. bootstrapping or resampling)? Many thanks in advance for input! Erik Svensson -- View this message in context: http://r.789695.n4.nabble.com/Standard-errors-of-sexual-dimorphism-tp3785770p3785770.html Sent from the R help mailing list archive at Nabble.com. 
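For illustration only, a minimal sketch of the SD formula given in the post above, together with one possible way to attach a standard error to it: a nonparametric bootstrap that resamples individuals within each sex. This is not a recommendation from the thread. It assumes individual-level measurements are available, takes "overall trait mean" to be the pooled mean of all individuals, and uses simulated data; `trait`, `sex`, and `sd_index` are hypothetical names.

```r
# Simulated stand-in for one trait in one population (assumption)
set.seed(1)
trait <- c(rnorm(30, mean = 10, sd = 1),   # females
           rnorm(30, mean = 12, sd = 1))   # males
sex   <- rep(c("F", "M"), each = 30)

# SD index as written in the post:
# 100 * (mean male - mean female) / overall trait mean
sd_index <- function(trait, sex) {
  100 * (mean(trait[sex == "M"]) - mean(trait[sex == "F"])) / mean(trait)
}

est <- sd_index(trait, sex)

# Bootstrap: resample individuals with replacement within each sex,
# so the number of females and males is the same in every replicate
boot_sd <- replicate(2000, {
  idx <- c(sample(which(sex == "F"), replace = TRUE),
           sample(which(sex == "M"), replace = TRUE))
  sd_index(trait[idx], sex[idx])
})

se <- sd(boot_sd)                          # bootstrap standard error
ci <- quantile(boot_sd, c(0.025, 0.975))   # percentile interval
```

Whether resampling within sex is appropriate, and how to carry the resulting uncertainty into the downstream GLM, depends on the sampling design and is not addressed here.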
From dwinsemius at comcast.net Fri Sep 2 15:42:51 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Fri, 2 Sep 2011 09:42:51 -0400 Subject: [R] merge some columns In-Reply-To: <6DA8097B-3142-41A0-A022-CAA68FF1AE91@comcast.net> References: <6DA8097B-3142-41A0-A022-CAA68FF1AE91@comcast.net> Message-ID: On Sep 2, 2011, at 9:30 AM, David Winsemius wrote: > > On Sep 2, 2011, at 8:34 AM, Joao Fadista wrote: > >> Dear all, >> >> I would like to know how to merge columns like: >> >> Input file: >> V1 V2 V3 V4 V5 V6 >> 1 G A G G G G >> 2 A A G A A G >> > > Looked like an mapply-type problem: > > > with(dat, > mapply(paste, > list(V1, V3, V5), > list(V2, V4, V6), > MoreArgs=list(sep="/") ) > ) There is a further refinement that is possible that will result in naming of the columns made possible by the behavior of the USE.NAMES feature of mapply. From the help page: "use names if the first ... argument has names, or if it is a character vector, use that character vector as the names"; with(dat, mapply(paste, list(V1 =V1, V2=V3, V3=V5), list(V2, V4, V6), MoreArgs=list(sep="/") ) ) V1 V2 V3 [1,] "G/A" "G/G" "G/G" [2,] "A/A" "G/A" "A/G" > > [,1] [,2] [,3] > [1,] "G/A" "G/G" "G/G" > [2,] "A/A" "G/A" "A/G" > > >> Desired output file: >> V1 V2 V3 >> 1 G/A G/G G/G >> 2 A/A G/A A/G >> >> So for every 2 consecutive columns merge their content into one. >> Thanks in advance. >> >> >> [[alternative HTML version deleted]] > -- > > David Winsemius, MD > West Hartford, CT > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
David Winsemius, MD West Hartford, CT From rhelp.stats at gmail.com Fri Sep 2 16:15:31 2011 From: rhelp.stats at gmail.com (Sam Stewart) Date: Fri, 2 Sep 2011 11:15:31 -0300 Subject: [R] Maximum Likelihood using optim() In-Reply-To: <11CF903D5D22CD42AF2157C898E05EB204A7C43C@MBX12.d.ethz.ch> References: <11CF903D5D22CD42AF2157C898E05EB204A7C43C@MBX12.d.ethz.ch> Message-ID: I think the following pdf will explain the details of how to use the optim function. http://www.unc.edu/~monogan/computing/r/MLE_in_R.pdf Hope that helps, Sam On Fri, Sep 2, 2011 at 7:06 AM, Thiem Alrik wrote: > Dear mailing list, > > I would like to use the optim() command in order to maximize the logged likelihood of the following function, where p is the parameter of interest and should be constrained between 0 and positive infinity. > > y = ?1/2 * ((te - x)/(te - tc))^p > > x and y are given by > > x <- c(5.18, 6.28, 7.00, 7.08, 7.54, 7.90, 8.24, 8.64, 12.17, 12.89, 14.27, 15.38, 15.80, 16.46, 20.41, 21.27, 22.91) > y <- c(0.63, 0.64, 0.66, 0.68, 0.69, 0.71, 0.73, 0.75, 0.76, 0.78, 0.8, 0.81, 0.83, 0.85, 0.86, 0.88, 0.9) > > te and tc are fixed at 5 and 25. > > What is a good way to achieve this? Thanks a lot. > > Alrik Thiem > Department of Humanities, Social and Political Sciences > ETH Zurich > > ? ? ? ?[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From jiliguala at mail.com Fri Sep 2 16:50:00 2011 From: jiliguala at mail.com (jiliguala) Date: Fri, 2 Sep 2011 07:50:00 -0700 (PDT) Subject: [R] !!!function to do the knn!!! In-Reply-To: References: <1314801332615-3781137.post@n4.nabble.com> <1314812135626-3781738.post@n4.nabble.com> Message-ID: <1314975000484-3786251.post@n4.nabble.com> really thxs to David Winsemius.. 
this website helps a lot. -- View this message in context: http://r.789695.n4.nabble.com/function-to-do-the-knn-tp3781137p3786251.html Sent from the R help mailing list archive at Nabble.com. From jiliguala at mail.com Fri Sep 2 16:50:30 2011 From: jiliguala at mail.com (jiliguala) Date: Fri, 2 Sep 2011 07:50:30 -0700 (PDT) Subject: [R] !!!function to do the knn!!! In-Reply-To: References: <1314801332615-3781137.post@n4.nabble.com> Message-ID: <1314975030987-3786253.post@n4.nabble.com> Thanks a lot, David. It helps a lot. -- View this message in context: http://r.789695.n4.nabble.com/function-to-do-the-knn-tp3781137p3786253.html Sent from the R help mailing list archive at Nabble.com. From suuz_beck at hotmail.com Fri Sep 2 16:30:46 2011 From: suuz_beck at hotmail.com (suuz) Date: Fri, 2 Sep 2011 07:30:46 -0700 (PDT) Subject: [R] post hoc testing of glmer in lme4 Message-ID: <1314973846309-3786201.post@n4.nabble.com> I have a mixed model with a binomial response, four factor variables and one random factor. m1 <- glmer(nbhf.hour ~ Season + Diel + Tidal.phase + Tidal.cycle + (1 | POD.ID), family = binomial, data = bl1, control = list(msVerbose = TRUE)) I really need to find a post hoc test for this model that gives the pairwise comparisons; note that the dataset is unbalanced. I have read many questions on this and there doesn't seem to be an accepted answer, although perhaps those were asked some time ago and the question can now be answered? Any help is greatly appreciated. Thanks in advance -- View this message in context: http://r.789695.n4.nabble.com/post-hoc-testing-of-glmer-in-lme4-tp3786201p3786201.html Sent from the R help mailing list archive at Nabble.com. From patrick.breheny at uky.edu Fri Sep 2 17:09:50 2011 From: patrick.breheny at uky.edu (Patrick Breheny) Date: Fri, 2 Sep 2011 11:09:50 -0400 Subject: [R] Question about BIC of two different regression models? how should we compare two regression models?
In-Reply-To: <4E609867020000CB00095193@med-webappgwia1.medicine.umaryland.edu> References: <1314911220.72871.YahooMailClassic@web120611.mail.ne1.yahoo.com> <4E609867020000CB00095193@med-webappgwia1.medicine.umaryland.edu> Message-ID: <4E60F1BE.4060000@uky.edu> On 09/02/2011 08:48 AM, John Sorkin wrote: > I believe when using BIC one needs to compare nested models This is wrong. Hypothesis tests rely on nested models; information criteria do not. -- Patrick Breheny Assistant Professor Department of Biostatistics Department of Statistics University of Kentucky From ehlers at ucalgary.ca Fri Sep 2 17:21:35 2011 From: ehlers at ucalgary.ca (Peter Ehlers) Date: Fri, 02 Sep 2011 08:21:35 -0700 Subject: [R] If NA Problem! In-Reply-To: References: Message-ID: <4E60F47F.60301@ucalgary.ca> On 2011-09-02 05:16, Anna Dunietz wrote: > Hi All! > > Please find code and the respective lists below. My problem: I specify the > case that lilwin[[p]] is not an NA and want the code found in iwish to be > returned ONLY for that case. Why do I get a list of length 2 (and why is > NULL the first element)? I understand that the code below is quite > senseless. I have run into a problem while working on a large project and > wanted to simplify it in order for it to be more understandable and > accessible. If I should not be using the if function, please let me know > what I should be doing instead. I know that I must use the for function for > my project. The thing I most want to understand is how, after specifying a > certain condition, one may save certain data that occurs when that condition > is met. I hope I have been clear enough! > > Thank you very much for your help! 
> Anna > > biglist<-list(a=1:4,b=2:6) > lilwin<-list(x=NA,y=2) > lilloss<-list(m=1,n=3) > > > >> biglist$a > [1] 1 2 3 4 > > $b > [1] 2 3 4 5 6 >> lilwin$x > [1] NA > > $y > [1] 2 >> lilloss$m > [1] 1 > > $n > [1] 3 > > > iwish<-list() > for(p in 1:length(biglist)){ > if(is.na(lilwin[[p]])==F) iwish[p]<-list(biglist[[p]][lilwin[[p]]]) > } > > >> iwish[[1]] > NULL > > [[2]] > [1] 3 Jean has given you one fix. Here's another (see ?'c'): iwish<-list() for(p in seq_along(biglist)){ if(!is.na(lilwin[[p]])) iwish <- c(iwish, biglist[[p]][lilwin[[p]]]) } BTW, it's not a good idea to use 'F' instead of FALSE and the negation operator is usually a better way to test. Peter Ehlers > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From ripley at stats.ox.ac.uk Fri Sep 2 17:26:06 2011 From: ripley at stats.ox.ac.uk (Prof Brian Ripley) Date: Fri, 2 Sep 2011 16:26:06 +0100 Subject: [R] Question about BIC of two different regression models? how should we compare two regression models? In-Reply-To: <4E60F1BE.4060000@uky.edu> References: <1314911220.72871.YahooMailClassic@web120611.mail.ne1.yahoo.com> <4E609867020000CB00095193@med-webappgwia1.medicine.umaryland.edu> <4E60F1BE.4060000@uky.edu> Message-ID: On Fri, 2 Sep 2011, Patrick Breheny wrote: > On 09/02/2011 08:48 AM, John Sorkin wrote: >> I believe when using BIC one needs to compare nested models > > This is wrong. Hypothesis tests rely on nested models; information criteria > do not. Actually, this is off-topic on this list. But blanket statements are often themselves untrue: there are hypothesis tests of non-nested models (most famously due to Cox, 1961), and Akaike explicitly considered only nested models in his paper introducing AIC. 
Certainly criteria such as AIC and BIC (in the sense of Schwarz: there are several criteria of that name) can be used with non-nested models but are much sharper tools for nested models. > > -- > Patrick Breheny > Assistant Professor > Department of Biostatistics > Department of Statistics > University of Kentucky > -- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 From gunter.berton at gene.com Fri Sep 2 17:27:14 2011 From: gunter.berton at gene.com (Bert Gunter) Date: Fri, 2 Sep 2011 08:27:14 -0700 Subject: [R] Question about BIC of two different regression models? how should we compare two regression models? In-Reply-To: <4E60F1BE.4060000@uky.edu> References: <1314911220.72871.YahooMailClassic@web120611.mail.ne1.yahoo.com> <4E609867020000CB00095193@med-webappgwia1.medicine.umaryland.edu> <4E60F1BE.4060000@uky.edu> Message-ID: Inline: On Fri, Sep 2, 2011 at 8:09 AM, Patrick Breheny wrote: > On 09/02/2011 08:48 AM, John Sorkin wrote: >> >> I believe when using BIC one needs to compare nested models > > This is wrong. Hypothesis tests rely on nested models; information criteria > do not. > Yes, indeed. It may additionally be worth noting what has been noted on this list before: the actual definition of such criteria is given only up to a constant, so different **software** may give different answers on the same data.
Hence be sure to compare results using the same software or make any necessary additive adjustments based on the details of how the software does the calculation when results from different software are being compared. Cheers, Bert > -- > Patrick Breheny > Assistant Professor > Department of Biostatistics > Department of Statistics > University of Kentucky > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- "Men by nature long to get on to the ultimate truths, and will often be impatient with elementary studies or fight shy of them. If it were possible to reach the ultimate truths without the elementary studies usually prefixed to them, these would not be preparatory studies but superfluous diversions." -- Maimonides (1135-1204) Bert Gunter Genentech Nonclinical Biostatistics From scott.mcgrane at abdn.ac.uk Fri Sep 2 17:35:16 2011 From: scott.mcgrane at abdn.ac.uk (ScottM) Date: Fri, 2 Sep 2011 08:35:16 -0700 (PDT) Subject: [R] Mann Kendall Test for Trend Message-ID: <1314977716843-3786392.post@n4.nabble.com> Hi there, I'm trying to apply the Mann Kendall test for trend analysis of a time series. I have downloaded and installed the package Kendall and subsequently loaded it into the software. My time series is a .txt file with 2 columns - column 1 is the year (1985 - 2009) and column 2 is the corresponding entry variable. According to the R guidelines, the call should be: MannKendall(x) [whereby x is a data vector, usually a time series] As such, I've loaded in my file 'data.txt' and then called on: MannKendall(data), to which I get the following: Error in Kendall(1:length(x), x) : length(x)<3 What do I need to do to get beyond this highly annoying error? 
I've tried MannKendall(1:27(data), data), but then keep getting this: Error in Kendall(1:length(x), x) : attempt to apply non-function Any help greatly received! S -- View this message in context: http://r.789695.n4.nabble.com/Mann-Kendall-Test-for-Trend-tp3786392p3786392.html Sent from the R help mailing list archive at Nabble.com. From ccquant at gmail.com Fri Sep 2 17:35:39 2011 From: ccquant at gmail.com (Ben qant) Date: Fri, 2 Sep 2011 09:35:39 -0600 Subject: [R] previous monday date Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From mail at joeconway.com Fri Sep 2 17:36:12 2011 From: mail at joeconway.com (Joe Conway) Date: Fri, 02 Sep 2011 08:36:12 -0700 Subject: [R] Advice on large data structures In-Reply-To: References: Message-ID: <4E60F7EC.6070100@joeconway.com> On 09/01/2011 10:13 PM, Worik R wrote: > I am starting on a (section of the) project where I need to build a matrix > with on the order of 5 million rows and 200 columns > > I am wondering if I can stay in R. > > I need to do rollapply type operations on the columns, including some that > will be functions of (windows of) two columns. Perhaps useful to you -- I recently added WINDOW FUNCTION support to PL/R*. Currently this new feature is only available in git master, but within a few days I will push a new release. 
You can download the source from git here if you want: https://github.com/jconway/plr The official docs have not been updated yet, but see the pre-release docs here (specifically chapter 9): http://www.joeconway.com/plr/doc/plr-git-US.pdf HTH, Joe *PL/R allows you to execute R functions from within a PostgreSQL database -- Joe Conway credativ LLC: http://www.credativ.us Linux, PostgreSQL, and general Open Source Training, Service, Consulting, & 24x7 Support From patrick.breheny at uky.edu Fri Sep 2 17:39:54 2011 From: patrick.breheny at uky.edu (Patrick Breheny) Date: Fri, 2 Sep 2011 11:39:54 -0400 Subject: [R] Question about BIC of two different regression models? how should we compare two regression models? In-Reply-To: References: <1314911220.72871.YahooMailClassic@web120611.mail.ne1.yahoo.com> <4E609867020000CB00095193@med-webappgwia1.medicine.umaryland.edu> <4E60F1BE.4060000@uky.edu> Message-ID: <4E60F8CA.3080902@uky.edu> On 09/02/2011 11:26 AM, Prof Brian Ripley wrote: >> This is wrong. Hypothesis tests rely on nested models; information criteria >> do not. > > Actually, this is off-topic on this list. But blanket statements are > often themselves untrue: there are hypothesis tests of non-nested > models (most famously due to Cox, 1961), and Akaike explicitly > considered only nested models in his paper introducing AIC. > Certainly criteria such as AIC and BIC (in the sense of Schwarz: there > are several criteria of that name) can be used with non-nested models > but are much sharper tools for nested models. Good point; my remark was only meant to refer to the simple case of logistic regression in the original post, and certainly should not be construed as a blanket statement applying to all possible hypothesis tests of all possible models. -- Patrick Breheny Assistant Professor Department of Biostatistics Department of Statistics University of Kentucky From michael.weylandt at gmail.com Fri Sep 2 17:46:25 2011 From: michael.weylandt at gmail.com (R. 
Michael Weylandt) Date: Fri, 2 Sep 2011 11:46:25 -0400 Subject: [R] Mann Kendall Test for Trend In-Reply-To: <1314977716843-3786392.post@n4.nabble.com> References: <1314977716843-3786392.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From marc_schwartz at me.com Fri Sep 2 17:59:37 2011 From: marc_schwartz at me.com (Marc Schwartz) Date: Fri, 02 Sep 2011 10:59:37 -0500 Subject: [R] previous monday date In-Reply-To: References: Message-ID: On Sep 2, 2011, at 10:35 AM, Ben qant wrote: > Hello, > > I'm attempting to return the date (in form '%Y-%m-%d') of the Monday > previous to the current date. For example: since it is 2011-09-02 today, I > would expect 2011-08-29 to be the return value. > > I found the following in: > http://www.mail-archive.com/r-help at r-project.org/msg144184.html > > Start quote from link: > prevmonday <- function(x) 7 * floor(as.numeric(x-1+4) / 7) + as.Date(1-4) > > For example, > >> prevmonday(Sys.Date()) > [1] "2011-08-15" >> prevmonday(prevmonday(Sys.Date())) > [1] "2011-08-15" > > End quote from link. > > But when I do it I get: >> prevmonday <- function(x) 7 * floor(as.numeric(x-1+4) / 7) + as.Date(1-4) >> prevmonday(Sys.Date()) > Error in as.Date.numeric(1 - 4) : 'origin' must be supplied > > I've tried setting the 'origin' argument in as.Date() in different ways, but > it returns inaccurate results. 
> > Thanks, > > Ben If memory serves, this is because Gabor used the version of as.Date() from his 'zoo' package in that post, which does not require an origin to be specified, whereas the default as.Date() function in R's base package does: prevmonday <- function(x) 7 * floor(as.numeric(x-1+4) / 7) + as.Date(1-4) > prevmonday(Sys.Date()) Error in as.Date.numeric(1 - 4) : 'origin' must be supplied > require(zoo) Loading required package: zoo Attaching package: 'zoo' The following object(s) are masked from 'package:base': as.Date > prevmonday(Sys.Date()) [1] "2011-08-29" # Remove 'zoo' to use the base function detach(package:zoo) > prevmonday(Sys.Date()) Error in as.Date.numeric(1 - 4) : 'origin' must be supplied # Fix the function to use base::as.Date() prevmonday <- function(x) 7 * floor(as.numeric(x-1+4) / 7) + as.Date(1-4, origin = "1970-01-01") > prevmonday(Sys.Date()) [1] "2011-08-29" See ?as.Date HTH, Marc Schwartz From scott.mcgrane at abdn.ac.uk Fri Sep 2 18:00:05 2011 From: scott.mcgrane at abdn.ac.uk (ScottM) Date: Fri, 2 Sep 2011 09:00:05 -0700 (PDT) Subject: [R] Mann Kendall Test for Trend In-Reply-To: References: <1314977716843-3786392.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From ccquant at gmail.com Fri Sep 2 18:00:29 2011 From: ccquant at gmail.com (Ben qant) Date: Fri, 2 Sep 2011 10:00:29 -0600 Subject: [R] previous monday date In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From ccquant at gmail.com Fri Sep 2 18:02:18 2011 From: ccquant at gmail.com (Ben qant) Date: Fri, 2 Sep 2011 10:02:18 -0600 Subject: [R] previous monday date In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From michael.weylandt at gmail.com Fri Sep 2 18:10:20 2011 From: michael.weylandt at gmail.com (R. 
Michael Weylandt) Date: Fri, 2 Sep 2011 12:10:20 -0400 Subject: [R] Mann Kendall Test for Trend In-Reply-To: References: <1314977716843-3786392.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From marc_schwartz at me.com Fri Sep 2 18:25:08 2011 From: marc_schwartz at me.com (Marc Schwartz) Date: Fri, 02 Sep 2011 11:25:08 -0500 Subject: [R] previous monday date In-Reply-To: References: Message-ID: > On Fri, Sep 2, 2011 at 9:59 AM, Marc Schwartz wrote: > >> On Sep 2, 2011, at 10:35 AM, Ben qant wrote: >> >>> Hello, >>> >>> I'm attempting to return the date (in form '%Y-%m-%d') of the Monday >>> previous to the current date. For example: since it is 2011-09-02 today, >> I >>> would expect 2011-08-29 to be the return value. >>> >>> I found the following in: >>> http://www.mail-archive.com/r-help at r-project.org/msg144184.html >>> >>> Start quote from link: >>> prevmonday <- function(x) 7 * floor(as.numeric(x-1+4) / 7) + as.Date(1-4) >>> >>> For example, >>> >>>> prevmonday(Sys.Date()) >>> [1] "2011-08-15" >>>> prevmonday(prevmonday(Sys.Date())) >>> [1] "2011-08-15" >>> >>> End quote from link. >>> >>> But when I do it I get: >>>> prevmonday <- function(x) 7 * floor(as.numeric(x-1+4) / 7) + >> as.Date(1-4) >>>> prevmonday(Sys.Date()) >>> Error in as.Date.numeric(1 - 4) : 'origin' must be supplied >>> >>> I've tried setting the 'origin' argument in as.Date() in different ways, >> but >>> it returns inaccurate results. 
>>> >>> Thanks, >>> >>> Ben >> >> >> If memory serves, this is because Gabor used the version of as.Date() from >> his 'zoo' package in that post, which does not require an origin to be >> specified, whereas the default as.Date() function in R's base package does: >> >> prevmonday <- function(x) 7 * floor(as.numeric(x-1+4) / 7) + as.Date(1-4) >> >>> prevmonday(Sys.Date()) >> Error in as.Date.numeric(1 - 4) : 'origin' must be supplied >> >>> require(zoo) >> Loading required package: zoo >> >> Attaching package: 'zoo' >> >> The following object(s) are masked from 'package:base': >> >> as.Date >> >>> prevmonday(Sys.Date()) >> [1] "2011-08-29" >> >> >> # Remove 'zoo' to use the base function >> detach(package:zoo) >> >>> prevmonday(Sys.Date()) >> Error in as.Date.numeric(1 - 4) : 'origin' must be supplied >> >> >> # Fix the function to use base::as.Date() >> prevmonday <- function(x) 7 * floor(as.numeric(x-1+4) / 7) + as.Date(1-4, >> origin = "1970-01-01") >> >>> prevmonday(Sys.Date()) >> [1] "2011-08-29" >> >> >> See ?as.Date >> >> HTH, >> >> Marc Schwartz On Sep 2, 2011, at 11:02 AM, Ben qant wrote: > Oh OK, missed that. > > Here is a solution using base: (already posted) > > I didn't sort out the issue in my email below but here is a (not very > R'ish) solution: > >> pm = function(x) { > + for(i in 1:7){ > + if(format(as.Date(Sys.Date()- > i),'%w') == 1){ > + d = Sys.Date() - i; > + } > + } > + d > + } >> pm(Sys.Date()) > [1] "2011-08-29" It occurs to me that another solution would be: > as.Date(cut(Sys.Date(), "weeks")) [1] "2011-08-29" This uses cut.Date() to create a vector that contains the beginning of weekly intervals for each date value, with the week beginning on a Monday by default. Note that cut.Date() returns a factor, not a Date class object, hence the additional coercion of the result. 
See ?cut.Date So if you want that in a function wrapper, you could do: prev.monday <- function(x) as.Date(cut(x, "weeks")) > prev.monday(Sys.Date()) [1] "2011-08-29" HTH, Marc From noahsilverman at ucla.edu Fri Sep 2 18:43:56 2011 From: noahsilverman at ucla.edu (Noah Silverman) Date: Fri, 2 Sep 2011 09:43:56 -0700 Subject: [R] Avoiding for Loop for moving average Message-ID: <197716DF-0074-4CF2-9B83-E1B29182C7EC@ucla.edu> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From michael.weylandt at gmail.com Fri Sep 2 18:47:59 2011 From: michael.weylandt at gmail.com (R. Michael Weylandt) Date: Fri, 2 Sep 2011 12:47:59 -0400 Subject: [R] Avoiding for Loop for moving average In-Reply-To: <197716DF-0074-4CF2-9B83-E1B29182C7EC@ucla.edu> References: <197716DF-0074-4CF2-9B83-E1B29182C7EC@ucla.edu> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From lorenzo.isella at gmail.com Fri Sep 2 18:50:04 2011 From: lorenzo.isella at gmail.com (Lorenzo Isella) Date: Fri, 2 Sep 2011 18:50:04 +0200 Subject: [R] Hints for Data Clustering Message-ID: <4E61093C.8030508@gmail.com> Dear All, I will be confronted (relatively soon) with the following problem: given a set of known statistical indicators {s_i} , i=1,2...N for a N countries I would like to be able to do some data clustering i.e. determining the best way to partition the N countries according to their known properties, encoded by the {s_i} set of indicators for those countries. Some properties of these countries may be categorical or anyway non-numerical variables (e.g. the fact of belonging/not belonging to a certain group; joining/not joining a certain treaty etc...). I have seen some data clustering examples, but without categorical variables and I wonder if this is an inherent limitation of the methodology (on the top of my head, I would not know how to define the distance between non-numerical variables). 
Any suggestions about the general methodology and R packages/code snippets is really appreciated. And also: do the units in which I express a statistical indicator play a role? For instance: for 2 given countries I could have the average age of the population, the average life expectancy and the average income per year in thousands of dollars. This would give rise e.g. to (40,72,26) and (44,75,36), but if I measure the average income in dollars, then I would get (40,72,26000) (44,75,36000). Would the units that I choose for an indicator impact on the clustering results? They should not, in my view, since the income does not change whichever way I express it, but I am not sure about the algorithm results. Many thanks Lorenzo From josh.m.ulrich at gmail.com Fri Sep 2 18:58:56 2011 From: josh.m.ulrich at gmail.com (Joshua Ulrich) Date: Fri, 2 Sep 2011 11:58:56 -0500 Subject: [R] Avoiding for Loop for moving average In-Reply-To: References: <197716DF-0074-4CF2-9B83-E1B29182C7EC@ucla.edu> Message-ID: On Fri, Sep 2, 2011 at 11:47 AM, R. Michael Weylandt wrote: > Have you looked at SMA/EMA from the TTR package? That's a pretty quick > implementation. > > runmean from caTools is even better for the SMA but I don't think there's an > easy way to turn that into an EWMA. > SMA still calls Fortran code, so that's why it's slower than caTools::runmean. I've moved the EMA code to C, so it's about as fast as it can be. Noah, use EMA's ratio argument to replicate your for loop. > Hope this helps, > > Michael Weylandt > Best, -- Joshua Ulrich | FOSS Trading: www.fosstrading.com > On Fri, Sep 2, 2011 at 12:43 PM, Noah Silverman wrote: > >> Hello, >> >> I need to calculate a moving average and an exponentially weighted moving >> average over a fairly large data set (500K rows). >> >> Doing this in a for loop works nicely, but is slow. >> >> ewma <- data$col[1] >> N <- dim(data)[1] >> for(i in 2:N){ >> ? ? ? 
?data$ewma <- alpha * data$ewma[i-1] + (1-alpha) * data$value[i] >> } >> >> >> Since the moving average "accumulates" as we move through the data, I'm not >> sure on the best/fastest way to do this. >> >> Does anyone have any suggestions on how to avoid a loop doing this? >> >> >> >> >> -- >> Noah Silverman >> UCLA Department of Statistics >> 8117 Math Sciences Building #8208 >> Los Angeles, CA 90095 >> >> >> ? ? ? ?[[alternative HTML version deleted]] >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > ? ? ? ?[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From noahsilverman at ucla.edu Fri Sep 2 19:06:03 2011 From: noahsilverman at ucla.edu (Noah Silverman) Date: Fri, 2 Sep 2011 10:06:03 -0700 Subject: [R] Avoiding for Loop for moving average In-Reply-To: References: <197716DF-0074-4CF2-9B83-E1B29182C7EC@ucla.edu> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From aikidasgupta at gmail.com Fri Sep 2 19:11:55 2011 From: aikidasgupta at gmail.com (Abhijit Dasgupta) Date: Fri, 02 Sep 2011 13:11:55 -0400 Subject: [R] Avoiding for Loop for moving average In-Reply-To: <197716DF-0074-4CF2-9B83-E1B29182C7EC@ucla.edu> References: <197716DF-0074-4CF2-9B83-E1B29182C7EC@ucla.edu> Message-ID: <4E610E5B.5090608@araastat.com> An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From josh.m.ulrich at gmail.com Fri Sep 2 19:32:15 2011 From: josh.m.ulrich at gmail.com (Joshua Ulrich) Date: Fri, 2 Sep 2011 12:32:15 -0500 Subject: [R] Avoiding for Loop for moving average In-Reply-To: References: <197716DF-0074-4CF2-9B83-E1B29182C7EC@ucla.edu> Message-ID: On Fri, Sep 2, 2011 at 12:06 PM, Noah Silverman wrote: > Joshua, > > Thanks for the tip. > > I need to "roll my own" code on this. ?But perhaps I can borrow some code from the package you mentioned. > > Is the package just performing the loop, but in a faster language? > As I said, the function is in C. You could also use the compiler package to compile your pure R function for a 3-4x speedup. Best, -- Joshua Ulrich | FOSS Trading: www.fosstrading.com > > -- > Noah Silverman > UCLA Department of Statistics > 8117 Math Sciences Building #8208 > Los Angeles, CA 90095 > > On Sep 2, 2011, at 9:58 AM, Joshua Ulrich wrote: > >> On Fri, Sep 2, 2011 at 11:47 AM, R. Michael Weylandt >> wrote: >>> Have you looked at SMA/EMA from the TTR package? That's a pretty quick >>> implementation. >>> >>> runmean from caTools is even better for the SMA but I don't think there's an >>> easy way to turn that into an EWMA. >>> >> SMA still calls Fortran code, so that's why it's slower than >> caTools::runmean. ?I've moved the EMA code to C, so it's about as fast >> as it can be. >> >> Noah, use EMA's ratio argument to replicate your for loop. >> >>> Hope this helps, >>> >>> Michael Weylandt >>> >> >> Best, >> -- >> Joshua Ulrich ?| ?FOSS Trading: www.fosstrading.com >> >> >> >>> On Fri, Sep 2, 2011 at 12:43 PM, Noah Silverman wrote: >>> >>>> Hello, >>>> >>>> I need to calculate a moving average and an exponentially weighted moving >>>> average over a fairly large data set (500K rows). >>>> >>>> Doing this in a for loop works nicely, but is slow. >>>> >>>> ewma <- data$col[1] >>>> N <- dim(data)[1] >>>> for(i in 2:N){ >>>> ? ? ? 
?data$ewma <- alpha * data$ewma[i-1] + (1-alpha) * data$value[i] >>>> } >>>> >>>> >>>> Since the moving average "accumulates" as we move through the data, I'm not >>>> sure on the best/fastest way to do this. >>>> >>>> Does anyone have any suggestions on how to avoid a loop doing this? >>>> >>>> >>>> >>>> >>>> -- >>>> Noah Silverman >>>> UCLA Department of Statistics >>>> 8117 Math Sciences Building #8208 >>>> Los Angeles, CA 90095 >>>> >>>> >>>> ? ? ? ?[[alternative HTML version deleted]] >>>> >>>> ______________________________________________ >>>> R-help at r-project.org mailing list >>>> https://stat.ethz.ch/mailman/listinfo/r-help >>>> PLEASE do read the posting guide >>>> http://www.R-project.org/posting-guide.html >>>> and provide commented, minimal, self-contained, reproducible code. >>>> >>> >>> ? ? ? ?[[alternative HTML version deleted]] >>> >>> ______________________________________________ >>> R-help at r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >>> > > > ? ? ? ?[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From rruizeuler at ucsd.edu Fri Sep 2 19:29:02 2011 From: rruizeuler at ucsd.edu (Alex Ruiz Euler) Date: Fri, 2 Sep 2011 10:29:02 -0700 Subject: [R] Advice on large data structures In-Reply-To: <6682A7D3-A31E-493D-B046-327407DCE982@gmail.com> References: <6682A7D3-A31E-493D-B046-327407DCE982@gmail.com> Message-ID: <20110902102902.3c86666c@XXX> Along the lines of one of Jim's suggestions, if you have some basic MySQL knowledge check out the RMySQL package. 
I use it to convert / partition a matrix similar to yours to R objects and it works fine. Hope this helps, A. On Fri, 2 Sep 2011 06:33:13 -0400 Jim Holtman wrote: > i would suggest that if you want to use R that you get a 64-bit version with 24GB of memory to start. if your data is a numeric matrix, you will need 8GB for a single copy. > > Do you really need it all in memory at once, or can you partition the problem? Can you use a database to access the portion you need at any time? > > If you only need one, or two, columns at a time, then the use of a database storing the columns might work. You probably need some more analysis on exactly how you want to solve your problem understanding the limitations of the system. > > Sent from my iPad > > On Sep 2, 2011, at 1:13, Worik R wrote: > > > Friends > > > > I am starting on a (section of the) project where I need to build a matrix > > with on the order of 5 million rows and 200 columns > > > > I am wondering if I can stay in R. > > > > I need to do rollapply type operations on the columns, including some that > > will be functions of (windows of) two columns. > > > > I have been looking at the ff and bigmemory packages but am not sure that > > they will do. > > > > Before I get too deep can some one offer some wisdom about what the best > > direction to go would be? > > > > Switching to C/C++ is definitely an option if it is all too hard > > > > cheers > > Worik > > > > [[alternative HTML version deleted]] > > > > ______________________________________________ > > R-help at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. 
> > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From jvadams at usgs.gov Fri Sep 2 19:59:28 2011 From: jvadams at usgs.gov (Jean V Adams) Date: Fri, 2 Sep 2011 12:59:28 -0500 Subject: [R] Hints for Data Clustering In-Reply-To: <4E61093C.8030508@gmail.com> References: <4E61093C.8030508@gmail.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From noahsilverman at ucla.edu Fri Sep 2 20:08:31 2011 From: noahsilverman at ucla.edu (Noah Silverman) Date: Fri, 2 Sep 2011 11:08:31 -0700 Subject: [R] Avoiding for Loop for moving average In-Reply-To: References: <197716DF-0074-4CF2-9B83-E1B29182C7EC@ucla.edu> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From pdalgd at gmail.com Fri Sep 2 20:19:24 2011 From: pdalgd at gmail.com (peter dalgaard) Date: Fri, 2 Sep 2011 20:19:24 +0200 Subject: [R] Including only a subset of the levels of a factor XXXX In-Reply-To: References: Message-ID: <1D1B4043-9289-48D6-AFCF-E8469EACD56A@gmail.com> On Sep 1, 2011, at 21:11 , R. Michael Weylandt wrote: > Dropping all occurences of a factor does not drop that level. This actually > turns out to be much more useful than it first might appear, but if you > really need to get around it, it can be done. ...most expediently by using factor(), as others have pointed out. Or droplevels() for data frames. We had the converse issue just the other day (Aug 30) when someone had problems with "showing zero frequencies in xtabs", which turned out to be caused by the tabulated data _not_ being factors, hence not containing information about which values could have been there but wasn't. 
The behavior of subsetting operators is so as to make things like tables and barplots consistent across subsets, but there are cases where you want the extra levels dropped. However, the default is as it is, because it is easier to drop levels than to reinstate them. Neither is impossible, of course. -- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com "Døden skal tape!" --- Nordahl Grieg From pburns at pburns.seanet.com Fri Sep 2 20:37:51 2011 From: pburns at pburns.seanet.com (Patrick Burns) Date: Fri, 02 Sep 2011 19:37:51 +0100 Subject: [R] Avoiding for Loop for moving average In-Reply-To: References: <197716DF-0074-4CF2-9B83-E1B29182C7EC@ucla.edu> Message-ID: <4E61227F.5040507@pburns.seanet.com> The 'filter' function should be able to do what you want efficiently. On 02/09/2011 18:06, Noah Silverman wrote: > Joshua, > > Thanks for the tip. > > I need to "roll my own" code on this. But perhaps I can borrow some code from the package you mentioned. > > Is the package just performing the loop, but in a faster language? > > > -- > Noah Silverman > UCLA Department of Statistics > 8117 Math Sciences Building #8208 > Los Angeles, CA 90095 > > On Sep 2, 2011, at 9:58 AM, Joshua Ulrich wrote: > >> On Fri, Sep 2, 2011 at 11:47 AM, R. Michael Weylandt >> wrote: >>> Have you looked at SMA/EMA from the TTR package? That's a pretty quick >>> implementation. >>> >>> runmean from caTools is even better for the SMA but I don't think there's an >>> easy way to turn that into an EWMA. >>> >> SMA still calls Fortran code, so that's why it's slower than >> caTools::runmean. I've moved the EMA code to C, so it's about as fast >> as it can be. >> >> Noah, use EMA's ratio argument to replicate your for loop.
>> >>> Hope this helps, >>> >>> Michael Weylandt >>> >> >> Best, >> -- >> Joshua Ulrich | FOSS Trading: www.fosstrading.com >> >> >> >>> On Fri, Sep 2, 2011 at 12:43 PM, Noah Silvermanwrote: >>> >>>> Hello, >>>> >>>> I need to calculate a moving average and an exponentially weighted moving >>>> average over a fairly large data set (500K rows). >>>> >>>> Doing this in a for loop works nicely, but is slow. >>>> >>>> ewma<- data$col[1] >>>> N<- dim(data)[1] >>>> for(i in 2:N){ >>>> data$ewma<- alpha * data$ewma[i-1] + (1-alpha) * data$value[i] >>>> } >>>> >>>> >>>> Since the moving average "accumulates" as we move through the data, I'm not >>>> sure on the best/fastest way to do this. >>>> >>>> Does anyone have any suggestions on how to avoid a loop doing this? >>>> >>>> >>>> >>>> >>>> -- >>>> Noah Silverman >>>> UCLA Department of Statistics >>>> 8117 Math Sciences Building #8208 >>>> Los Angeles, CA 90095 >>>> >>>> >>>> [[alternative HTML version deleted]] >>>> >>>> ______________________________________________ >>>> R-help at r-project.org mailing list >>>> https://stat.ethz.ch/mailman/listinfo/r-help >>>> PLEASE do read the posting guide >>>> http://www.R-project.org/posting-guide.html >>>> and provide commented, minimal, self-contained, reproducible code. >>>> >>> >>> [[alternative HTML version deleted]] >>> >>> ______________________________________________ >>> R-help at r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >>> > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
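
A minimal sketch of the 'filter' idea, assuming the recursion intended by the quoted loop is ewma[i] = alpha * ewma[i-1] + (1 - alpha) * value[i] with ewma[1] = value[1]. The function names `ewma_loop` and `ewma_filter` and the test vector are illustrative only and do not come from the thread; the loop is included purely as a reference to compare against.

```r
# Reference: explicit loop with indexed assignment.
ewma_loop <- function(value, alpha) {
  out <- numeric(length(value))
  out[1] <- value[1]
  for (i in seq_along(value)[-1]) {
    out[i] <- alpha * out[i - 1] + (1 - alpha) * value[i]
  }
  out
}

# Same recursion via stats::filter(method = "recursive"), which computes
# y[i] = x[i] + alpha * y[i - 1]. Scaling the input by (1 - alpha), except
# for the first element, reproduces the initial condition ewma[1] = value[1].
ewma_filter <- function(value, alpha) {
  x <- c(value[1], (1 - alpha) * value[-1])
  as.numeric(stats::filter(x, filter = alpha, method = "recursive"))
}

set.seed(42)
v <- rnorm(1000)
all.equal(ewma_loop(v, 0.9), ewma_filter(v, 0.9))
```

Because the recursion runs in compiled code inside `filter`, this avoids the R-level loop without needing an extra package.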
> -- Patrick Burns pburns at pburns.seanet.com twitter: @portfolioprobe http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of 'Some hints for the R beginner' and 'The R Inferno') From listanand at gmail.com Fri Sep 2 20:23:28 2011 From: listanand at gmail.com (andy1234) Date: Fri, 2 Sep 2011 11:23:28 -0700 (PDT) Subject: [R] Classifying large text corpora using R Message-ID: <1314987808606-3786787.post@n4.nabble.com> Dear everyone, I am new to R, and I am looking at doing text classification on a huge collection of documents (>500,000) which are distributed among 300 classes (so basically, this is my training data). Would someone please be kind enough to let me know about the R packages to use and their scalability (time and space)? I am very new to R and do not know of the right packages to use. I started off by trying to use the tm package (http://cran.r-project.org/package=tm) for pre-processing and FSelector (http://cran.r-project.org/web/packages/FSelector/index.html) package for feature selection - but both of these are incredibly slow and completely unusable for my task. So the question is what are the right packages to use (for pre-processing, feature selection, and classification)? Please consider the fact that I may be dealing with data of millions of dimensions which may not even fit in memory. I posted on this issue twice (http://r.789695.n4.nabble.com/Entropy-based-feature-selection-in-R-td3708056.html , http://r.789695.n4.nabble.com/R-s-handling-of-high-dimensional-data-td3741758.html) but did not get any response. This is a very critical piece of my research and I have been struggling with this issue for a long time. Please consider helping me out, directly or by pointing me to any other software/website that you think may be more appropriate. Many thanks in advance. 
-- View this message in context: http://r.789695.n4.nabble.com/Classifying-large-text-corpora-using-R-tp3786787p3786787.html Sent from the R help mailing list archive at Nabble.com. From hzd3 at cdc.gov Fri Sep 2 21:13:57 2011 From: hzd3 at cdc.gov (Durant, James T. (ATSDR/DTEM/PRMSB)) Date: Fri, 2 Sep 2011 19:13:57 +0000 Subject: [R] Chemical Names in Data Frames Message-ID: <448D194A2ED2F1468E14C86F539ECCEA284AC446@EMBX-CHAM3.cdc.gov> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dwinsemius at comcast.net Fri Sep 2 21:24:39 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Fri, 2 Sep 2011 15:24:39 -0400 Subject: [R] Chemical Names in Data Frames In-Reply-To: <448D194A2ED2F1468E14C86F539ECCEA284AC446@EMBX-CHAM3.cdc.gov> References: <448D194A2ED2F1468E14C86F539ECCEA284AC446@EMBX-CHAM3.cdc.gov> Message-ID: <229F7E85-A2CF-4C35-83BF-CD0762231B55@comcast.net> On Sep 2, 2011, at 3:13 PM, Durant, James T. (ATSDR/DTEM/PRMSB) wrote: > Greetings - > > I am working on some data that contain chemical names with air > concentrations, and I am creating a data frame with date/time and > each chemical having its own column. However, these are organic > chemicals (e.g. 1-butene, 2,3,4-trimethylbenzene etc). The package I > am going to be using the data with is openair, and many of the great > functions require you to specify a column name which does not seem > to work with improper column names- e.g. smoothTrend(mydata, > pollutant="1-Butene" and smoothTrend(mydata, pollutant=mydata[,"1- > Butene"]) > > I was wondering if there was a function to automatically convert > these chemical names (with all sorts of numbers and minuses in the > beginning) to something openair can handle? Or am I going to be > stuck recoding several hundred chemical names in my database? > Try using back-ticks on the invalid names, rather than quotes. 
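
A small sketch of the back-tick idea on a toy data frame, alongside base R's `make.names()` as a renaming alternative. The column names and values are made up, and whether openair's `pollutant` argument accepts non-syntactic names is not verified here, so renaming is shown as the conservative route.

```r
# Toy data with non-syntactic (chemical) column names.
chem <- data.frame(check.names = FALSE,
                   date = as.Date("2011-09-01") + 0:2,
                   `1-Butene` = c(0.10, 0.12, 0.09),
                   `2,3,4-trimethylbenzene` = c(1.1, 0.9, 1.3))

# Back-ticks let you refer to such a column directly.
mean(chem$`1-Butene`)

# make.names() converts names to syntactically valid ones;
# unique = TRUE guards against collisions after conversion.
lookup <- data.frame(original = names(chem),
                     valid = make.names(names(chem), unique = TRUE),
                     stringsAsFactors = FALSE)
names(chem) <- lookup$valid
lookup
```

Keeping the `lookup` table around makes it possible to map the converted names back to the original chemical names for plot labels.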
-- David Winsemius, MD West Hartford, CT From gustavo.bio+R at gmail.com Fri Sep 2 21:28:48 2011 From: gustavo.bio+R at gmail.com (Gustavo Carvalho) Date: Fri, 2 Sep 2011 16:28:48 -0300 Subject: [R] Chemical Names in Data Frames In-Reply-To: <448D194A2ED2F1468E14C86F539ECCEA284AC446@EMBX-CHAM3.cdc.gov> References: <448D194A2ED2F1468E14C86F539ECCEA284AC446@EMBX-CHAM3.cdc.gov> Message-ID: ?make.names perhaps. On Fri, Sep 2, 2011 at 4:13 PM, Durant, James T. (ATSDR/DTEM/PRMSB) wrote: > Greetings - > > I am working on some data that contain chemical names with air concentrations, and I am creating a data frame with date/time and each chemical having its own column. However, these are organic chemicals (e.g. 1-butene, 2,3,4-trimethylbenzene etc). The package I am going to be using the data with is openair, and many of the great functions require you to specify a column name which does not seem to work with improper column names- e.g. smoothTrend(mydata, pollutant="1-Butene" and smoothTrend(mydata, pollutant=mydata[,"1-Butene"]) > > I was wondering if there was a function to automatically convert these chemical names (with all sorts of numbers and minuses in the beginning) to something openair can handle? ?Or am I going to be stuck recoding several hundred chemical names in my database? > > VR > > Jim > > James T. Durant, MSPH CIH > Emergency Response Coordinator > US Agency for Toxic Substances and Disease Registry > Atlanta, GA 30341 > 770-378-1695 > > > > > > ? ? ? ?[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
> From spencer.graves at structuremonitoring.com Fri Sep 2 22:02:31 2011 From: spencer.graves at structuremonitoring.com (Spencer Graves) Date: Fri, 02 Sep 2011 13:02:31 -0700 Subject: [R] Chemical Names in Data Frames In-Reply-To: References: <448D194A2ED2F1468E14C86F539ECCEA284AC446@EMBX-CHAM3.cdc.gov> Message-ID: <4E613657.9010000@structuremonitoring.com> possibly with unique = TRUE: > make.names(c("'", "'")) [1] "X." "X." > make.names(c("'", "'"), unique=TRUE) [1] "X." "X..1" > Spencer On 9/2/2011 12:28 PM, Gustavo Carvalho wrote: > ?make.names perhaps. > > On Fri, Sep 2, 2011 at 4:13 PM, Durant, James T. (ATSDR/DTEM/PRMSB) > wrote: >> Greetings - >> >> I am working on some data that contain chemical names with air concentrations, and I am creating a data frame with date/time and each chemical having its own column. However, these are organic chemicals (e.g. 1-butene, 2,3,4-trimethylbenzene etc). The package I am going to be using the data with is openair, and many of the great functions require you to specify a column name which does not seem to work with improper column names- e.g. smoothTrend(mydata, pollutant="1-Butene" and smoothTrend(mydata, pollutant=mydata[,"1-Butene"]) >> >> I was wondering if there was a function to automatically convert these chemical names (with all sorts of numbers and minuses in the beginning) to something openair can handle? Or am I going to be stuck recoding several hundred chemical names in my database? >> >> VR >> >> Jim >> >> James T. Durant, MSPH CIH >> Emergency Response Coordinator >> US Agency for Toxic Substances and Disease Registry >> Atlanta, GA 30341 >> 770-378-1695 >> >> >> >> >> >> [[alternative HTML version deleted]] >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. 
>> -- Spencer Graves, PE, PhD President and Chief Technology Officer Structure Inspection and Monitoring, Inc. 751 Emerson Ct. San José, CA 95126 ph: 408-655-4567 web: www.structuremonitoring.com From emammendes at gmail.com Fri Sep 2 22:05:46 2011 From: emammendes at gmail.com (Eduardo M. A. M.Mendes) Date: Fri, 2 Sep 2011 17:05:46 -0300 Subject: [R] How to keep the same class? Message-ID: <0a8d01cc69ab$baab62e0$300228a0$@gmail.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From wwwhsd at gmail.com Fri Sep 2 22:12:25 2011 From: wwwhsd at gmail.com (Henrique Dallazuanna) Date: Fri, 2 Sep 2011 17:12:25 -0300 Subject: [R] How to keep the same class? In-Reply-To: <0a8d01cc69ab$baab62e0$300228a0$@gmail.com> References: <0a8d01cc69ab$baab62e0$300228a0$@gmail.com> Message-ID: Try this: predict(fit10,testX[1,,drop = FALSE]) On Fri, Sep 2, 2011 at 5:05 PM, Eduardo M. A. M.Mendes wrote: > Hello > > Please see the example below > >> class(testX) > [1] "matrix" >> class(testX[1,]) > [1] "numeric" > > Why not matrix? What am I missing here? Is there a way to keep the same > class? > > The reason for the question is that I want to implement a k-step ahead > prediction for my own routines and R wrecks does not seem to like [1,] as > shown below. > >> predict(fit10,testX[1,]) > Error in knnregTrain(train = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, : > dims of 'test' and 'train differ >> predict(fit10,testX[1:2,]) > [1] 81.00 76.36 > > Many thanks > > Ed > -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40" S 49°
16' 22" O From marc_schwartz at me.com Fri Sep 2 22:16:25 2011 From: marc_schwartz at me.com (Marc Schwartz) Date: Fri, 02 Sep 2011 15:16:25 -0500 Subject: [R] How to keep the same class? In-Reply-To: <0a8d01cc69ab$baab62e0$300228a0$@gmail.com> References: <0a8d01cc69ab$baab62e0$300228a0$@gmail.com> Message-ID: <0AD9E009-C76B-47FD-A2BB-127DE6A351F1@me.com> On Sep 2, 2011, at 3:05 PM, Eduardo M. A. M.Mendes wrote: > Hello > > > > Please see the example below > > > >> class(testX) > > [1] "matrix" > >> class(testX[1,]) > > [1] "numeric" > > > > Why not matrix? What am I missing here? Is there a way to keep the same > class? > > > > The reason for the question is that I want to implement a k-step ahead > prediction for my own routines and R wrecks does not seem to like [1,] as > shown below. > > > >> predict(fit10,testX[1,]) > Error in knnregTrain(train = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, : > dims of 'test' and 'train differ >> predict(fit10,testX[1:2,]) > [1] 81.00 76.36 > > > > Many thanks > > > > Ed Ed, See: http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-do-my-matrices-lose-dimensions_003f and then use: predict(fit10, testX[1, , drop = FALSE]) HTH, Marc Schwartz From emammendes at gmail.com Fri Sep 2 23:47:48 2011 From: emammendes at gmail.com (Eduardo M. A. M.Mendes) Date: Fri, 2 Sep 2011 18:47:48 -0300 Subject: [R] How to keep the same class? In-Reply-To: <0AD9E009-C76B-47FD-A2BB-127DE6A351F1@me.com> References: <0a8d01cc69ab$baab62e0$300228a0$@gmail.com> <0AD9E009-C76B-47FD-A2BB-127DE6A351F1@me.com> Message-ID: <0ae801cc69b9$f72512d0$e56f3870$@gmail.com> Many thanks to all for the reply. I do apologize for bothering the list with a FAQ but I have to confess that, although I read Faq in the past, I did not remember to do it again. Cheers Ed -----Original Message----- From: Marc Schwartz [mailto:marc_schwartz at me.com] Sent: Friday, September 02, 2011 5:16 PM To: Eduardo M. A. 
M.Mendes Cc: r-help at r-project.org Subject: Re: [R] How to keep the same class? On Sep 2, 2011, at 3:05 PM, Eduardo M. A. M.Mendes wrote: > Hello > > > > Please see the example below > > > >> class(testX) > > [1] "matrix" > >> class(testX[1,]) > > [1] "numeric" > > > > Why not matrix? What am I missing here? Is there a way to keep the same > class? > > > > The reason for the question is that I want to implement a k-step ahead > prediction for my own routines and R wrecks does not seem to like [1,] > as shown below. > > > >> predict(fit10,testX[1,]) > Error in knnregTrain(train = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, : > dims of 'test' and 'train differ >> predict(fit10,testX[1:2,]) > [1] 81.00 76.36 > > > > Many thanks > > > > Ed Ed, See: http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-do-my-matrices-lose-dimensi ons_003f and then use: predict(fit10, testX[1, , drop = FALSE]) HTH, Marc Schwartz From tzaihra at alcor.concordia.ca Fri Sep 2 21:33:13 2011 From: tzaihra at alcor.concordia.ca (tzaihra at alcor.concordia.ca) Date: Fri, 2 Sep 2011 15:33:13 -0400 Subject: [R] Hessian Matrix Issue Message-ID: Dear All, I am running a simulation to obtain coverage probability of Wald type confidence intervals for my parameter d in a function of two parameters (mu,d). I am optimizing it using "optim" method "L-BFGS-B" to obtain MLE. As, I want to invert the Hessian matrix to get Standard errors of the two parameter estimates. 
However, my Hessian matrix at times becomes non-invertible that is it is no more positive definite and I get the following error msg:

"Error in solve.default(ac$hessian) : system is computationally singular: reciprocal condition number = 6.89585e-21"

Thank you

Following is the code I am running I would really appreciate your comments and suggestions:

#Start Code
#option to trace /recover error
#options(error = recover)

#Sample Size
n<-30
mu<-5
size<- 2

#true values of parameter d
d.true<-1+mu/size
d.true

#true value of zero inflation index phi= 1+log(d)/(1-d)
z.true<-1+(log(d.true)/(1-d.true))
z.true

# Allocating space for simulation vectors and setting counters for simulation
counter<-0
iter<-10000
lower.d<-numeric(iter)
upper.d<-numeric(iter)

#set.seed(987654321)

#begining of simulation loop########
for (i in 1:iter){
  r.NB<-rnbinom(n, mu = mu, size = size)
  y<-sort(r.NB)
  iter.num<-i
  print(y)
  print(iter.num)

  #empirical estimates or sample moments
  xbar<-mean(y)
  variance<-(sum((y-xbar)^2))/length(y)
  dbar<-variance/xbar

  #sample estimate of proportion of zeros and zero inflation index
  pbar<-length(y[y==0])/length(y)

  ### Simplified function #############################################
  NegBin<-function(th){
    mu<-th[1]
    d<-th[2]
    n<-length(y)
    arg1<-n*mean(y)*ifelse(mu >= 0, log(mu),0)
    #arg1<-n*mean(y)*log(mu)
    #arg2<-n*log(d)*((mean(y))+mu/(d-1))
    arg2<-n*ifelse(d>=0, log(d), 0)*((mean(y))+mu/ifelse((d-1)>= 0, (d-1), 0.0000001))
    aa<-numeric(length(max(y)))
    a<-numeric(length(y))
    for (i in 1:n) {
      for (j in 1:y[i]){
        aa[j]<-ifelse(((j-1)*(d-1))/mu >0,log(1+((j-1)*(d-1))/mu),0)
        #aa[j]<-log(1+((j-1)*(d-1))/mu)
        #print(aa[j])
      }
      a[i]<-sum(aa)
      #print(a[i])
    }
    a
    arg3<-sum(a)
    llh<-arg1+arg2+arg3
    if(! is.finite(llh)) llh<-1e+20
    -llh
  }
  ac<-optim(NegBin,par=c(xbar,dbar),method="L-BFGS-B",hessian=TRUE,lower= c(0,1) )
  ac
  print(ac$hessian)
  muhat<-ac$par[1]
  dhat<-ac$par[2]
  zhat<- 1+(log(dhat)/(1-dhat))
  infor<-solve(ac$hessian)
  var.dhat<-infor[2,2]
  se.dhat<-sqrt(var.dhat)
  var.muhat<-infor[1,1]
  se.muhat<-sqrt(var.muhat)
  var.func<-dhat*muhat
  var.func
  d.prime<-cbind(dhat,muhat)
  se.var.func<-d.prime%*%infor%*%t(d.prime)
  se.var.func
  lower.d[i]<-dhat-1.96*se.dhat
  upper.d[i]<-dhat+1.96*se.dhat
  if(lower.d[i] <= d.true & d.true<= upper.d[i]) counter <-counter+1
}
counter
covg.prob<-counter/iter
covg.prob

From huangsiying at gmail.com Fri Sep 2 22:53:58 2011 From: huangsiying at gmail.com (syhuang) Date: Fri, 2 Sep 2011 13:53:58 -0700 (PDT) Subject: [R] Parameters in Gamma Frailty model Message-ID: <1314996838737-3787013.post@n4.nabble.com>

Dear all,

I'm new to frailty model. I have a question on the output from 'survival' pack. Below is the output. What does gamma1,2,3 refer to? How do I calculate joint hazard function or marginal hazard function using info below? Many thanks!

Call:
coxph(formula = surv ~ as.factor(tibia) + frailty(as.factor(bdcat)), data = try)

  n=877 (1 observation deleted due to missingness)

                          coef  se(coef) se2   Chisq DF   p
as.factor(tibia)2         0.214 0.126    0.125  2.89 1.00 0.0890
frailty(as.factor(bdcat))                      10.24 1.65 0.0038

                  exp(coef) exp(-coef) lower .95 upper .95
as.factor(tibia)2 1.238     0.808      0.968     1.58
gamma:1           0.716     1.398      0.496     1.03
gamma:2           1.036     0.965      0.750     1.43
gamma:3           1.248     0.801      0.901     1.73

Iterations: 10 outer, 27 Newton-Raphson
     Variance of random effect= 0.0648   I-likelihood = -1756.4
Degrees of freedom for terms= 1.0 1.6
Rsquare= 0.016   (max possible= 0.982 )
Likelihood ratio test= 14.3  on 2.64 df,   p=0.00171
Wald test            = 11.5  on 2.64 df,   p=0.00668

-- View this message in context: http://r.789695.n4.nabble.com/Parameters-in-Gamma-Frailty-model-tp3787013p3787013.html Sent from the R help mailing list archive at Nabble.com.
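
For the "Hessian Matrix Issue" post earlier in this digest, one defensive pattern is to wrap `solve()` so that a singular Hessian yields `NA` standard errors instead of stopping the simulation. This is only a sketch under stated assumptions: it uses `dnbinom`'s (mu, size) parameterization rather than the poster's (mu, d), and the helper name `safe_se` is hypothetical.

```r
set.seed(1)
y <- rnbinom(30, mu = 5, size = 2)

# Negative log-likelihood in the (mu, size) parameterization (an assumption;
# the original post parameterizes by (mu, d)).
negll <- function(th) -sum(dnbinom(y, mu = th[1], size = th[2], log = TRUE))

fit <- optim(c(mean(y), 1), negll, method = "L-BFGS-B",
             lower = c(1e-6, 1e-6), hessian = TRUE)

# Hypothetical helper: return NA standard errors when the Hessian
# cannot be inverted or gives non-positive variances.
safe_se <- function(H) {
  na <- rep(NA_real_, nrow(H))
  if (any(!is.finite(H))) return(na)
  V <- tryCatch(solve(H), error = function(e) NULL)
  if (is.null(V)) return(na)
  v <- diag(V)
  v[v <= 0] <- NA_real_
  sqrt(v)
}

se <- safe_se(fit$hessian)
se
```

Inside a coverage simulation, replicates with `NA` standard errors can then be counted and reported separately rather than aborting the loop; how to treat them in the coverage estimate is a modelling decision this sketch does not settle.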
From tewksjj at uw.edu Fri Sep 2 21:51:01 2011 From: tewksjj at uw.edu (Josh Tewksbury) Date: Fri, 2 Sep 2011 12:51:01 -0700 Subject: [R] conditional replacement of character strings in vectors Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From jwelsh at sdibr.org Fri Sep 2 22:05:48 2011 From: jwelsh at sdibr.org (John Welsh) Date: Fri, 2 Sep 2011 13:05:48 -0700 Subject: [R] Platform of image Message-ID: <3c1d344cc8521996097ca48a48d58c41@mail.gmail.com> Dear R users, When I Save Workspace... and then reopen it, my platform switches from 64-bit to 32-bit, i.e. the Gui switches between these: R version 2.13.1 (2011-07-08) Copyright (C) 2011 The R Foundation for Statistical Computing ISBN 3-900051-07-0 Platform: x86_64-pc-mingw32/x64 (64-bit) R version 2.13.1 (2011-07-08) Copyright (C) 2011 The R Foundation for Statistical Computing ISBN 3-900051-07-0 Platform: i386-pc-mingw32/i386 (32-bit) How come? This is a Windows x64 system. John Welsh, Ph.D. Associate Professor Molecular and Cancer Biology Vaccine Research Institute of San Diego 10835 Road to the Cure San Diego, CA 92121 Phone: (858) 581-3960 ex.248 Email: jwelsh at sdibr.org From ferdaous.somrani at gmail.com Fri Sep 2 23:29:40 2011 From: ferdaous.somrani at gmail.com (Doussa) Date: Fri, 2 Sep 2011 14:29:40 -0700 (PDT) Subject: [R] misclassification rate Message-ID: <1314998980979-3787075.post@n4.nabble.com> Hi users, I'm a student who is struggling with basic R programming. Would you please help me with this problem? My English is bad, but I hope that my question is clear: I have a matrix in which there are two columns (yp, yt). yp: predicted values from my model. yt: true values (my dependent variable y is categorical, with 3 levels: 0, 1, 2). I don't know how to proceed to calculate the misclassification rate and the error types.
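
A rough sketch of one way to get those quantities with base R's `table()`. The vectors `yt` and `yp` below are made-up stand-ins for the two columns described in the question, not data from the thread.

```r
set.seed(1)
yt <- sample(0:2, 50, replace = TRUE)                               # made-up true classes
yp <- ifelse(runif(50) < 0.8, yt, sample(0:2, 50, replace = TRUE))  # made-up predictions

# Fix the factor levels so the table is always 3 x 3,
# even if some class is never predicted.
lev <- 0:2
conf <- table(predicted = factor(yp, levels = lev),
              true      = factor(yt, levels = lev))
conf

# Overall misclassification rate: share of off-diagonal counts.
misclass <- 1 - sum(diag(conf)) / sum(conf)

# Per-class error: share of each true class predicted as something else.
class_err <- 1 - diag(conf) / colSums(conf)
misclass
class_err
```

The off-diagonal cells of `conf` show which kinds of errors occur, for example how often a true class 0 is predicted as 1 or 2.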
Thank you for answring Doussa -- View this message in context: http://r.789695.n4.nabble.com/misclassification-rate-tp3787075p3787075.html Sent from the R help mailing list archive at Nabble.com. From michael.weylandt at gmail.com Sat Sep 3 01:51:34 2011 From: michael.weylandt at gmail.com (R. Michael Weylandt) Date: Fri, 2 Sep 2011 19:51:34 -0400 Subject: [R] conditional replacement of character strings in vectors In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dwinsemius at comcast.net Sat Sep 3 01:52:03 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Fri, 2 Sep 2011 19:52:03 -0400 Subject: [R] conditional replacement of character strings in vectors In-Reply-To: References: Message-ID: <9A9D3A8E-D3C5-4EBF-8A59-68427BF0B5CC@comcast.net> On Sep 2, 2011, at 3:51 PM, Josh Tewksbury wrote: > Hello, I have a dataframe that looks like this: > > a b NA Honduras China NA NA Sudan Japan NA NA Mexico NA Mexico > I would like to replace the NA values in column b with the non-NA > values in > column a. I have tried a number of techniques, (if, ifelse) but I > must have > the logic wrong. Mangled data but no matter: dfrm$b[is.na(dfrm$b)] <- dfrm$a[is.na(dfrm$b)] (Learn to post in plain text. This is a plain text list.) David Winsemius, MD West Hartford, CT From andra_isan at yahoo.com Sat Sep 3 02:32:44 2011 From: andra_isan at yahoo.com (Andra Isan) Date: Fri, 2 Sep 2011 17:32:44 -0700 (PDT) Subject: [R] ROCR package question for evaluating two regression models Message-ID: <1315009964.93061.YahooMailClassic@web120609.mail.ne1.yahoo.com> An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From jdnewmil at dcn.davis.ca.us Sat Sep 3 02:33:14 2011 From: jdnewmil at dcn.davis.ca.us (Jeff Newmiller) Date: Fri, 02 Sep 2011 17:33:14 -0700 Subject: [R] Platform of image In-Reply-To: <3c1d344cc8521996097ca48a48d58c41@mail.gmail.com> References: <3c1d344cc8521996097ca48a48d58c41@mail.gmail.com> Message-ID: <01af2655-2092-432a-b134-aba6474f8bb2@email.android.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dwinsemius at comcast.net Sat Sep 3 03:03:29 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Fri, 2 Sep 2011 21:03:29 -0400 Subject: [R] conditional replacement of character strings in vectors In-Reply-To: References: Message-ID: <0494B9E0-995C-46D9-825B-A889A4EB8A85@comcast.net> On Sep 2, 2011, at 7:51 PM, R. Michael Weylandt wrote: > Your data frame didn't come across legibly, try sending it in plain > text > using the dput() command. > > That said, I'd guess you want something like this: > > d[is.na(d$a),"a"] <- d[is.na(d$b),"b"] One of the rare instances where I disagree with Michael. The row index on the right hand side must be the same as the row index on the left hand side. > > The idea is that is.na(d$a) selects only those rows where column "a" > is NA > and then moves b values into a for only those rows. Right, that is the idea. > > Write back with the dput() data frame if this doesn't work. > > Hope this helps, > > Michael Weylandt > > On Fri, Sep 2, 2011 at 3:51 PM, Josh Tewksbury wrote: > >> Hello, I have a dataframe that looks like this: >> >> a b NA Honduras China NA NA Sudan Japan NA NA Mexico NA Mexico >> I would like to replace the NA values in column b with the non-NA >> values in >> column a. I have tried a number of techniques, (if, ifelse) but I >> must >> have >> the logic wrong. >> >> Thanks >> -- >> Josh David Winsemius, MD West Hartford, CT From michael.weylandt at gmail.com Sat Sep 3 04:44:26 2011 From: michael.weylandt at gmail.com (R. 
Michael Weylandt) Date: Fri, 2 Sep 2011 22:44:26 -0400 Subject: [R] conditional replacement of character strings in vectors In-Reply-To: <0494B9E0-995C-46D9-825B-A889A4EB8A85@comcast.net> References: <0494B9E0-995C-46D9-825B-A889A4EB8A85@comcast.net> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From ferdaous.somrani at gmail.com Sat Sep 3 03:46:42 2011 From: ferdaous.somrani at gmail.com (Doussa) Date: Fri, 2 Sep 2011 18:46:42 -0700 (PDT) Subject: [R] confusion matrix Message-ID: <1315014402783-3787363.post@n4.nabble.com> hi users I have a data frame in with there are two colomns real values and predicted ones (for a dichotomic response). How can i obtain a confusion matrix (miscalssification rat and errors)? The costs are egal. Thanks -- View this message in context: http://r.789695.n4.nabble.com/confusion-matrix-tp3787363p3787363.html Sent from the R help mailing list archive at Nabble.com. From maya.d.joshi at gmail.com Sat Sep 3 05:18:54 2011 From: maya.d.joshi at gmail.com (Maya Joshi) Date: Fri, 2 Sep 2011 23:18:54 -0400 Subject: [R] problem in applying function in data subset (with a level) - using plyr or other alternative are also welcome Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dwinsemius at comcast.net Sat Sep 3 06:28:38 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Sat, 3 Sep 2011 00:28:38 -0400 Subject: [R] problem in applying function in data subset (with a level) - using plyr or other alternative are also welcome In-Reply-To: References: Message-ID: On Sep 2, 2011, at 11:18 PM, Maya Joshi wrote: > Dear R experts. > > I might be missing something obvious. I have been trying to fix this > problem > for some weeks. Please help. 
>
> #data
> ped <- c(rep(1, 4), rep(2, 3), rep(3, 3))
> y <- rnorm(10, 8, 2)
>
> # variable set 1
> M1a <- sample (c(1, 2,3), 10, replace= T)
> M1b <- sample (c(1, 2,3), 10, replace= T)
> M1aP1 <- sample (c(1, 2,3), 10, replace= T)
> M1bP2 <- sample (c(1, 2,3), 10, replace= T)
>
> # variable set 2
> M2a <- sample (c(1, 2,3), 10, replace= T)
> M2b <- sample (c(1, 2,3), 10, replace= T)
> M2aP1 <- sample (c(1, 2,3), 10, replace= T)
> M2bP2 <- sample (c(1, 2,3), 10, replace= T)
>
> # variable set 3
> M3a <- sample (c(1, 2,3), 10, replace= T)
> M3b <- sample (c(1, 2,3), 10, replace= T)
> M3aP1 <- sample (c(1, 2,3), 10, replace= T)
> M3bP2 <- sample (c(1, 2,3), 10, replace= T)
>
> mydf <- data.frame (ped, M1a,M1b,M1aP1,M1bP2, M2a,M2b,M2aP1,M2bP2,
>   M3a,M3b,M3aP1,M3bP2, y)
>
> # functions and further calculations
>
> mmat <- matrix(c("M1a","M2a","M3a","M1b","M2b","M3b","M1aP1","M2aP1","M3aP1",
>   "M1bP2","M2bP2","M3bP2"), ncol = 4)
>
> # first function
> myfun <- function(x) {
> x<- as.vector(x)
> ot1 <- ifelse(mydf[x[1]] == mydf[x[3]], 1, -1)

You really ought to explain what you are trying to do. This code will compare two lists. The list from mydf[x[2]] will be the column mydf["M2b"] which will have as its first element the four element vector assigned above. I am guessing that was not what you wanted. Notice this simple case using the "[" function as you are attempting throws an error:

  > list(M2b = c(1,2,3)) == list(M2a = c(1,2,3))
  Error in list(M2b = c(1, 2, 3)) == list(M2a = c(1, 2, 3)) :
    comparison of these types is not implemented

So ... now that your code has failed to explain what you wanted, why don't you try some natural language explanations.

Notice that the "[[" function is generally what people want when extracting from lists:

  > list(M2b = c(1,2,3))[["M2b"]] == list(M2a = c(1,2,3))[["M2a"]]
  [1] TRUE TRUE TRUE

> ot2 <- ifelse(mydf[x[2]] == mydf[x[4]], 1, -1)
> qt <- ot1 + ot2
> return(qt)
> }
> qt <- apply(mmat, 1, myfun)
> ydv <- c((y - mean(y))^2)
> qtd <- data.frame(ped, ydv, qt)
>
> # second function
> myfun2 <- function(dataframe) {
> vydv <- sum(ydv)*0.25
> sumD <- sum(ydv * qt)
> Rt <- vydv / sumD
> return(Rt)
> }
>
> # using plyr
> require(plyr)
> dfsumd1 <- ddply(mydf,.(mydf$ped),myfun2)
>
> Here are 2 issues:
> (1) The output just one, I need the output for all three set of variables
> (as listed above)

An incredibly vague description.

> (2) all three values of dfsumd is returning to same for all level of ped:
> 1,2, 3

I sympathize with those forced to adopt the English language, but that is the standard this decade. So given your apparent difficulties, you need to exert more effort at making explicit what is _supposed_ to be returned.

> Means that the function is applied to whole dataset but only replicated in
> output !!!
>
> I tried with plyr not being lazy but due to my limited R knowledge, If you
> have a different suggestion, you are welcome too.
>
> Thank you in advance...
>
> Maya
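
To make the "[" versus "[[" distinction concrete, here is a toy sketch on a small data frame. The data and the `cols` vector are invented for illustration; they only borrow column names from the question.

```r
df <- data.frame(M1a = c(1, 2, 3), M1aP1 = c(1, 3, 3))
cols <- c("M1a", "M1aP1")

class(df[cols[1]])     # single bracket keeps a one-column data frame
class(df[[cols[1]]])   # double bracket extracts the column as a vector

# Element-wise comparison of two columns extracted as vectors,
# coded as 1 where they agree and -1 where they differ.
ot <- ifelse(df[[cols[1]]] == df[[cols[2]]], 1, -1)
ot
```

Working with plain vectors extracted by "[[" keeps the result of `ifelse()` a plain vector, which is usually easier to combine with other per-row quantities.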
David Winsemius, MD West Hartford, CT From ripley at stats.ox.ac.uk Sat Sep 3 07:48:32 2011 From: ripley at stats.ox.ac.uk (Prof Brian Ripley) Date: Sat, 3 Sep 2011 06:48:32 +0100 (BST) Subject: [R] Platform of image In-Reply-To: <01af2655-2092-432a-b134-aba6474f8bb2@email.android.com> References: <3c1d344cc8521996097ca48a48d58c41@mail.gmail.com> <01af2655-2092-432a-b134-aba6474f8bb2@email.android.com> Message-ID: On Fri, 2 Sep 2011, Jeff Newmiller wrote: > You don't say how you re-open it, but I would guess you are > double-clicking the .RData file and letting Windows look up which > program to run. If so, you can either open the 64-bit GUI directly > and use File Open Data to open your data, or you can change which > version of the R GUI program is linked to .RData files in Windows. See the rw-FAQ Q2.29 for why (and how to change it). > Jeff Newmiller > > John Welsh wrote: > > Dear R users, > > When I Save Workspace... and then reopen it, my platform switches > from 64-bit to 32-bit, i.e. the Gui switches between these: > > R version 2.13.1 (2011-07-08) > Copyright (C) 2011 The R Foundation for Statistical Computing > ISBN 3-900051-07-0 > Platform: x86_64-pc-mingw32/x64 (64-bit) > > R version 2.13.1 (2011-07-08) > Copyright (C) 2011 The R Foundation for Statistical Computing > ISBN 3-900051-07-0 > Platform: i386-pc-mingw32/i386 (32-bit) > > How come? This is a Windows x64 system. > > John Welsh, Ph.D. > Associate Professor > Molecular and Cancer Biology > Vaccine Research Institute of San Diego > 10835 Road to the Cure > San Diego, CA 92121 > Phone: (858) 581-3960 ex.248 > Email: jwelsh at sdibr.org -- Brian D. 
Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 From landronimirc at gmail.com Sat Sep 3 09:14:35 2011 From: landronimirc at gmail.com (Liviu Andronic) Date: Sat, 3 Sep 2011 09:14:35 +0200 Subject: [R] problem with "plm" package In-Reply-To: References: <80098b6d0911270708g216f6b01m5dd57131b199df66@mail.gmail.com> <1314843915121-3782639.post@n4.nabble.com> Message-ID: (please reply to the list, too, as it increases your chances of getting a helpful answer) On Sat, Sep 3, 2011 at 2:51 AM, loth lorien wrote: > Hi Liviu, > > Thank you for your suggestions, but they didn't seem to work. > > I got R to estimate a fixed effects model for me: > formula<-plm(RoE~RoA+NIM+Cash.TL+CAR+NPL.Loans, data=ComBank, > model="within") > > But when I try to estimate a random effects model for a panel data set I > still get the error message:? "missing value where TRUE/FALSE needed" > I am suspecting that something may relate to the data. Do you have persistent variables that change seldom? Maybe that's causing a problem for the 'random' estimator. Is your panel balanced? If not, does it contain very few observations for some ids? To make sure that the functions work as expected on your system, try to replicate some of the examples in the 'plm' vignette. Regards Liviu > There must be something simple that I am missing, but at the moment, I can't > figure out what it is. > > Thanks again, > > Ashraf > >> From: landronimirc at gmail.com >> Date: Thu, 1 Sep 2011 10:37:57 +0200 >> Subject: Re: [R] problem with "plm" package >> To: lothlorien90 at hotmail.com >> CC: r-help at r-project.org >> >> On Thu, Sep 1, 2011 at 4:25 AM, Ash wrote: >> > Hi, >> > >> > I am trying to complete a very simple panel analysis on some bank data. 
>> >
>> > Call:
>> > Formula<-plm(RoE~RoA+CAR+Inc.Dep+Cash.TL+NPL.Loans, data=Banks1,
>> > model="random", index=c("Bank.I.D.","Year"))
>> > summary(Formula)
>> >
>> > I get the following error code:
>> >
>> > ERROR:
>> > missing value where TRUE/FALSE needed
>> >
>> > Does anyone know what true/false field I am missing?
>> >
>> I am not sure what goes wrong, but try first to
>> Banks1.p <- pdata.frame(Banks1, index=c("Bank.I.D.","Year"))
>>
>> and see if that went fine, and then
>> Banks1.fit <- plm(RoE~RoA+CAR+Inc.Dep+Cash.TL+NPL.Loans,
>> data=Banks1.p, model="random")
>>
>> It may help if you posted
>> str(Banks1)
>>
>> Liviu
>

--
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail

From djmuser at gmail.com Sat Sep 3 10:07:26 2011
From: djmuser at gmail.com (Dennis Murphy)
Date: Sat, 3 Sep 2011 01:07:26 -0700
Subject: [R] problem in applying function in data subset (with a level) - using plyr or other alternative are also welcome
In-Reply-To:
References:
Message-ID:

Hi:

I tried to figure out what you were doing...some of it I think I grasped, other parts not so much.

On Fri, Sep 2, 2011 at 8:18 PM, Maya Joshi wrote:
> Dear R experts.
>
> I might be missing something obvious. I have been trying to fix this problem
> for some weeks. Please help.
Let's start by reducing some of your code:

ped <- rep(1:3, c(4, 3, 3))
y <- rnorm(10, 8, 2)
# This replaces all of your sample() statements, and is equivalent:
smat <- matrix(sample(1:3, 120, replace = TRUE), ncol = 12)
colnames(smat) <- c('M1a', 'M1b', 'M1aP1', 'M1bP2', 'M2a', 'M2b', 'M2aP1',
                    'M2bP2', 'M3a', 'M3b', 'M3aP1', 'M3bP2')
mydf <- as.data.frame(cbind(ped, y, smat))

>
> #data
> ped <- c(rep(1, 4), rep(2, 3), rep(3, 3))
> y <- rnorm(10, 8, 2)
>
> # variable set 1
> M1a <- sample (c(1, 2,3), 10, replace= T)
> M1b <- sample (c(1, 2,3), 10, replace= T)
> M1aP1 <- sample (c(1, 2,3), 10, replace= T)
> M1bP2 <- sample (c(1, 2,3), 10, replace= T)
>
> # variable set 2
> M2a <- sample (c(1, 2,3), 10, replace= T)
> M2b <- sample (c(1, 2,3), 10, replace= T)
> M2aP1 <- sample (c(1, 2,3), 10, replace= T)
> M2bP2 <- sample (c(1, 2,3), 10, replace= T)
>
> # variable set 3
> M3a <- sample (c(1, 2,3), 10, replace= T)
> M3b <- sample (c(1, 2,3), 10, replace= T)
> M3aP1 <- sample (c(1, 2,3), 10, replace= T)
> M3bP2 <- sample (c(1, 2,3), 10, replace= T)
>
> mydf <- data.frame (ped, M1a,M1b,M1aP1,M1bP2, M2a,M2b,M2aP1,M2bP2,
> M3a,M3b,M3aP1,M3bP2, y)
>
> # functions and further calculations
>
> mmat <- matrix
> (c("M1a","M2a","M3a","M1b","M2b","M3b","M1aP1","M2aP1","M3aP1",
> "M1bP2","M2bP2","M3bP2"), ncol = 4)

As far as I can tell, you want to compare a with aP1 and b with bP2 for all three M values. Given the way your data are arranged, each row of the matrix smat contains all 12 values. apply() on index 1 generally takes a row vector as its input source. Your function to pass to apply should respect that. My version of myfun() takes the row vector as input, divides it into two subgroups of six, uses the ifelse() function to do the comparisons and reshapes it into a 3 x 2 matrix, and then finally takes the row sums of the matrix, which is the return value from the function.
myfun <- function(x) {
    # Indices of the input vector to be compared
    u <- c(1, 2, 5, 6, 9, 10)
    v <- c(3, 4, 7, 8, 11, 12)
    ot <- matrix(ifelse(x[u] == x[v], 1, -1), ncol = 2, byrow = TRUE)
    rowSums(ot)
}

# Apply myfun() to the matrix of samples (smat)
qt <- t(apply(smat, 1, myfun))
colnames(qt) <- paste('M', 1L:3L, sep = '')
mydf2 <- data.frame(ped, y, as.data.frame(qt), Msum = rowSums(qt))

I wasn't sure what you wanted to do with M1-M3, so I added a row sum variable just in case.

>
> # first function
> myfun <- function(x) {
> x<- as.vector(x)
> ot1 <- ifelse(mydf[x[1]] == mydf[x[3]], 1, -1)
> ot2 <- ifelse(mydf[x[2]] == mydf[x[4]], 1, -1)
> qt <- ot1 + ot2
> return(qt)
> }
> qt <- apply(mmat, 1, myfun)

# Mean deviation vector for y - OK.

> ydv <- c((y - mean(y))^2)
> qtd <- data.frame(ped, ydv, qt)
>

Here's the point where things get more confusing. I'm not sure what you intend from sumD; as it stands, it will, in each row, multiply y by the values in qt, which will generate an n x 3 matrix, n representing the number of rows in the sub-data frame indexed by ped. sumD will then sum the entire n x 3 matrix. Is that what you wanted?

More generally, when you write a function to pass to ddply(), the input argument should be a data frame and the output should be a data frame, especially if you intend to return more than one value. Part of the problem with your function is that the variables inside the body have no reference to the input data frame. Moreover, your mean squared deviation function always uses 4 as the denominator rather than the number of rows of the input data frame. (We'll ignore the bias in the estimator since we're concentrating on the code.)

I have absolutely no idea if the following is what you intended, but I'm showing this to illustrate how to create a function for input into ddply() that can be applied groupwise.
myfun2 <- function(d) {
    # input argument d is a data frame
    # vector of squared mean deviations of y
    ydv <- with(d, (y - mean(y))^2)
    vydv <- mean(ydv)
    # multiply the squared deviations vector by the sum
    # of M1-M3 and then add the cross-products
    sumD <- sum(ydv * d$Msum)
    # or equivalently, as an inner product:
    # sumD <- ydv %*% d$Msum
    # Return the ratio as a data frame
    data.frame(Rt = vydv / sumD)
}

ddply(mydf2, .(ped), myfun2)

## My result:
> ddply(mydf2, .(ped), myfun2)
  ped          Rt
1   1  0.35787388
2   2 -0.17739049
3   3 -0.09958257

I'm reasonably sure that myfun() is OK, but myfun2() contains more than a little guesswork. Assuming it's wrong, what the code can tell you is how to pass the variables in from the input data frame and to output a data frame when called within ddply().

HTH,
Dennis

> # second function
> myfun2 <- function(dataframe) {
> vydv <- sum(ydv)*0.25
> sumD <- sum(ydv * qt)
> Rt <- vydv / sumD
> return(Rt)
> }
>
> # using plyr
> require(plyr)
> dfsumd1 <- ddply(mydf,.(mydf$ped),myfun2)
>
> Here are 2 issues:
> (1) The output just one, I need the output for all three set of variables
> (as listed above)
>
> (2) all three values of dfsumd is returning to same for all level of ped:
> 1,2, 3
> Means that the function is applied to whole dataset but only replicated in
> output !!!
>
> I tried with plyr not being lazy but due to my limited R knowledge, If you
> have a different suggestion, you are welcome too.
>
> Thank you in advance...
>
> Maya
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
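
The groupwise idea discussed in this thread (a function that takes a data frame and returns a data frame, computed only from its argument) can also be sketched without plyr. The example below deliberately swaps ddply() for base R split()/lapply()/rbind() so that it runs with no extra packages; the data frame toy, the function group_fun and the output column names are all made up for illustration and are not the poster's variables.

```r
# Toy data; 'toy', 'group_fun' and the column names are illustrative only.
set.seed(1)
toy <- data.frame(ped = rep(1:3, c(4, 3, 3)), y = rnorm(10, 8, 2))

# The function sees only the rows of one group, passed in as 'd',
# and refers to nothing outside its argument.
group_fun <- function(d) {
  ydv <- (d$y - mean(d$y))^2          # squared deviations within the group
  data.frame(ped = d$ped[1], n = nrow(d), mean_sq_dev = mean(ydv))
}

# split() makes one data frame per level of ped; lapply() applies the
# function to each piece; rbind() stacks the one-row results.
res <- do.call(rbind, lapply(split(toy, toy$ped), group_fun))
res
```

Because group_fun() uses nrow(d) implicitly through mean(), the denominator follows the group size (4, 3, 3 here) instead of a fixed 4.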
>

From patrick.breheny at uky.edu Sat Sep 3 12:03:13 2011
From: patrick.breheny at uky.edu (Patrick Breheny)
Date: Sat, 3 Sep 2011 06:03:13 -0400
Subject: [R] misclassification rate
In-Reply-To: <1314998980979-3787075.post@n4.nabble.com>
References: <1314998980979-3787075.post@n4.nabble.com>
Message-ID: <4E61FB61.90609@uky.edu>

On 09/02/2011 05:29 PM, Doussa wrote:
> I have a matrix in which there are two columns (yp, yt)
> yp: predicted values from my model.
> yt: true values (my dependent variable y is categorical; 3 modalities
> (0,1,2))
> I don't know how to proceed to calculate the misclassification rate and the
> error types.

Suppose your data looks like this:

> yp <- sample(0:2,50,replace=TRUE)
> yt <- sample(0:2,50,replace=TRUE)

You can create a cross-classification table with:

> tab <- table(yp,yt)
> tab
   yt
yp   0  1  2
  0  5  8  5
  1  2 11  9
  2  1  2  7

And the misclassification rate is

> 1-sum(diag(tab))/sum(tab)
[1] 0.54

--
Patrick Breheny
Assistant Professor
Department of Biostatistics
Department of Statistics
University of Kentucky

From maya.d.joshi at gmail.com Sat Sep 3 13:57:59 2011
From: maya.d.joshi at gmail.com (Maya Joshi)
Date: Sat, 3 Sep 2011 07:57:59 -0400
Subject: [R] problem in applying function in data subset (with a level) - using plyr or other alternative are also welcome
In-Reply-To:
References:
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From f.harrell at vanderbilt.edu Sat Sep 3 15:24:35 2011
From: f.harrell at vanderbilt.edu (Frank Harrell)
Date: Sat, 3 Sep 2011 06:24:35 -0700 (PDT)
Subject: [R] ROCR package question for evaluating two regression models
In-Reply-To: <1315009964.93061.YahooMailClassic@web120609.mail.ne1.yahoo.com>
References: <1315009964.93061.YahooMailClassic@web120609.mail.ne1.yahoo.com>
Message-ID: <1315056275932-3787855.post@n4.nabble.com>

It is not possible to have one cutoff point unless you have a very strange utility function.
Nor is there a need for a cutoff when using a probability model. It is not advisable to compare models based on ROC area as this loses power. A likelihood-based approach is recommended.

Frank

Andra Isan wrote:
>
> Hello All,
> I have used logistic regression glm in R and I am evaluating two models
> both learned with glm but with different predictors.
> model1 <- glm (Y ~ x4+ x5+ x6+ x7, data = dat, family = binomial(link=logit))
> model2 <- glm (Y~ x1 + x2 +x3 , data = dat, family = binomial(link=logit))
> and I would like to compare these two models based on the prediction that
> I get from each model:
> pred1 = predict(model1, test.data, type = "response")
> pred2 = predict(model2, test.data, type = "response")
> I have used ROCR package to compare them:
> pr1 = prediction(pred1,test.y)
> pf1 = performance(pr1, measure = "prec", x.measure = "rec")
> plot(pf1)
> which cutoff this plot is based on?
> pr2 = prediction(pred2,test.y)
> pf2 = performance(pr2, measure = "prec", x.measure = "rec")
> pf2_roc = performance(pr2,measure="err")
> plot(pf2)
> First of all, I would like to use cutoff = 0.5 and plot the ROC,
> precision-recall curves based on that cutoff value. In other words, how to
> define a cut off value in performance function? For example, in
> pf2_roc = performance(pr2,measure="err"), when I do plot(pf2_roc), it plots for
> every single cutoff point. I only want to have one cut off point, is there
> any way to do that?
> Second, I would like to see the performance of the two
> models based on the above measures on the same plot so the comparison
> would be easier. In other words, how can I plot (pf1, pf2) and compare
> them together? plot(pf1, pf2) would give me an error as follows:
> Error in as.double(x) : cannot coerce type 'S4' to vector of type 'double'
> Could you please help me with that?
> Thanks a lot,Andra > > > > [[alternative HTML version deleted]] > > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > ----- Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/ROCR-package-question-for-evaluating-two-regression-models-tp3787301p3787855.html Sent from the R help mailing list archive at Nabble.com. From hannah.hlx at gmail.com Sat Sep 3 18:18:28 2011 From: hannah.hlx at gmail.com (li li) Date: Sat, 3 Sep 2011 12:18:28 -0400 Subject: [R] question with uniroot function Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From maya.d.joshi at gmail.com Sat Sep 3 15:21:29 2011 From: maya.d.joshi at gmail.com (Maya Joshi) Date: Sat, 3 Sep 2011 09:21:29 -0400 Subject: [R] problem in applying function in data subset (with a level) - using plyr or other alternative are also welcome In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From noxyport at gmail.com Sat Sep 3 15:57:02 2011 From: noxyport at gmail.com (Pete Pete) Date: Sat, 3 Sep 2011 06:57:02 -0700 (PDT) Subject: [R] Loop with random sampling and write.table Message-ID: <1315058222247-3787895.post@n4.nabble.com> Hi! I need to perform this simple sampling function several hundred times: x1=as.character(rnorm(1000, 100, 15)) x2=as.character(rnorm(1000, 150, 10)) y1=as.data.frame(x1,x2) sample1=as.data.frame(sample(y1$x1, 12, replace = FALSE, prob = NULL)) sample1 write.table(sample1, "sample1.txt", sep=" ",row.names=F,quote=F) My knowledge of loops is quite low. How can I produce 100 loops of the sampling leading to 100 files from sample1.txt to sample100.txt? Thanks for you help! 
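
One possible way to repeat the sampling and write numbered files is a plain for loop that builds each file name from the loop index. This is only a sketch under assumptions not in the post: it keeps the values numeric, builds the data frame with data.frame(x1, x2) so that both columns are kept, and writes into a temporary directory (out_dir is a made-up name) instead of the working directory.

```r
set.seed(1)                        # only so the sketch is repeatable
x1 <- rnorm(1000, 100, 15)
x2 <- rnorm(1000, 150, 10)
y1 <- data.frame(x1, x2)           # two columns: x1 and x2

out_dir <- tempdir()               # assumption: write somewhere disposable

for (i in 1:100) {
  # Draw 12 values of x1 without replacement for this iteration.
  sample_i <- data.frame(x1 = sample(y1$x1, 12, replace = FALSE))
  # File names run from sample1.txt to sample100.txt.
  out_file <- file.path(out_dir, paste("sample", i, ".txt", sep = ""))
  write.table(sample_i, out_file, sep = " ", row.names = FALSE, quote = FALSE)
}
```

Replacing out_dir with getwd() would put the files in the working directory, as in the original single-file call.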
-- View this message in context: http://r.789695.n4.nabble.com/Loop-with-random-sampling-and-write-table-tp3787895p3787895.html Sent from the R help mailing list archive at Nabble.com. From eeadie at unm.edu Sat Sep 3 18:04:35 2011 From: eeadie at unm.edu (Elizabeth C Eadie) Date: Sat, 03 Sep 2011 10:04:35 -0600 Subject: [R] help with glmm.admb Message-ID: R glmmADMB question I am trying to use glmm.admb (the latest alpha version from the R forge website 0.6.4) to model my count data that is overdispersed using a negative binomial family but keep getting the following error message: Error in glmm.admb(data$total_bites_rounded ~ age_class_back, random = ~food.dif.id, : Argument "group" must be a character string specifying the name of the grouping variable (also when "random" is missing) Here is what I have tried so far (along with some similar variations): model_nb<-glmm.admb(data$total_bites_rounded~age_class_back+(1|"subject")+(1|food.dif.id)+offset(log(forage_time)),data=data,family="nbinom") modelnb<-glmm.admb(data$total_bites_rounded~age_class_back, random=~food.dif.id, group="subject", data=data, offset=offset,family="nbinom") I am not sure what I am doing wrong. My model in lmer that seemed to work was: modelc References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From jorgeivanvelez at gmail.com Sat Sep 3 18:37:37 2011 From: jorgeivanvelez at gmail.com (Jorge I Velez) Date: Sat, 3 Sep 2011 12:37:37 -0400 Subject: [R] Loop with random sampling and write.table In-Reply-To: <1315058222247-3787895.post@n4.nabble.com> References: <1315058222247-3787895.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From ligges at statistik.tu-dortmund.de Sat Sep 3 18:50:10 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Sat, 03 Sep 2011 18:50:10 +0200 Subject: [R] rJava Installation Problems: 'cannot open compressed file 'rJava/DESCRIPTION', probable reason 'No such file or directory'' In-Reply-To: References: Message-ID: <4E625AC2.9090700@statistik.tu-dortmund.de> Your internet connection is flaky, your first try just downloaded half the package, on the second try you got almost nothing (your output says 2519 bytes). So this is your connection rather than R or rJava. Uwe Ligges On 01.09.2011 15:37, R. Michael Weylandt wrote: > Good Morning, > > I'm trying to install the rJava package on a local (work) machine and having > some trouble. The following occurred in an RGui session. > >> sessionInfo() > > R version 2.13.0 (2011-04-13) > Platform: i386-pc-mingw32/i386 (32-bit) > > locale: > [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United > States.1252 > [3] LC_MONETARY=English_United States.1252 > LC_NUMERIC=C > [5] LC_TIME=English_United States.1252 > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > loaded via a namespace (and not attached): > [1] tools_2.13.0 > >> install.packages("rJava") # Same error thrown for other CRAN mirrors > --- Please select a CRAN mirror for use in this session --- > trying URL ' > http://cran.sixsigmaonline.org/bin/windows/contrib/2.13/rJava_0.9-1.zip' > Content type 'application/zip' length 654936 bytes (639 Kb) > opened URL > downloaded 338 Kb > > Error in gzfile(file, "r") : cannot open the connection > In addition: Warning messages: > 1: In download.file(url, destfile, method, mode = "wb", ...) 
: > downloaded length 347116 != reported length 654936 > 2: In unzip(zipname, exdir = dest) : error 1 in extracting from zip file > 3: In gzfile(file, "r") : > cannot open compressed file 'rJava/DESCRIPTION', probable reason 'No such > file or directory' > >> install.packages("rJava",repos="http://www.rforge.net") > trying URL 'http://www.rforge.net/bin/windows/contrib/2.13/rJava_0.9-2.zip' > Content type 'text/html; charset=utf-8' length unknown > opened URL > downloaded 2519 bytes > > Error in gzfile(file, "r") : cannot open the connection > In addition: Warning messages: > 1: In unzip(zipname, exdir = dest) : error 1 in extracting from zip file > 2: In gzfile(file, "r") : > cannot open compressed file 'rJava/DESCRIPTION', probable reason 'No such > file or directory' > > > I'm running as a (temporary) local admin on a Windows 7 platform that's not > my own. I'm able to install other packages so I believe the problem is > specific to rJava, but I'm by no means certain of that. It's not super > important so I'd like to avoid the Rtools + .tar.gz route to avoid the wrath > of the IT guys, but if there's something obvious I've missed, any help would > be much appreciated. I've tried to download and look inside the .zip files > manually from the rforge site, but I haven't been able to get them to > download. > > Thank you, > > Michael Weylandt > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
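
The truncated download can be detected before installing by comparing the size of the file on disk with the size the server reported. The helper below is a hypothetical illustration (download_is_complete is not part of R or rJava); the demonstration uses a throwaway file, and the real download and install steps are left as comments because they need a working connection.

```r
# Hypothetical helper: TRUE only when the file exists and has exactly
# the expected number of bytes.
download_is_complete <- function(path, expected_bytes) {
  file.exists(path) && isTRUE(file.info(path)$size == expected_bytes)
}

# Demonstration on a 100-byte throwaway file standing in for a download.
tmp <- tempfile(fileext = ".zip")
writeBin(as.raw(rep(0, 100)), tmp)

ok_small <- download_is_complete(tmp, 100)      # sizes agree
ok_full  <- download_is_complete(tmp, 654936)   # size reported in the log above

# Intended use (not run here; URL and size are taken from the log above):
# url <- "http://cran.sixsigmaonline.org/bin/windows/contrib/2.13/rJava_0.9-1.zip"
# zipfile <- file.path(tempdir(), basename(url))
# download.file(url, zipfile, mode = "wb")
# if (download_is_complete(zipfile, 654936)) install.packages(zipfile, repos = NULL)
```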
From ligges at statistik.tu-dortmund.de Sat Sep 3 18:59:39 2011
From: ligges at statistik.tu-dortmund.de (Uwe Ligges)
Date: Sat, 03 Sep 2011 18:59:39 +0200
Subject: [R] question with uniroot function
In-Reply-To:
References:
Message-ID: <4E625CFB.9000106@statistik.tu-dortmund.de>

On 03.09.2011 18:31, li li wrote:
> Dear all,
> I forgot to mention the values for the parameters in the first email.
>
> u1<- -3
>
> u2<- 4
>
> alpha<- 0.05
>
> p1<- 0.15
>
>
>
> Thank you very much.
>
> 2011/9/3 li li
>
>> Dear all,
>> I have the following problem with the uniroot function. I want to find
>> roots for the function "Fp2" which is defined as below.
>>
>>
>> Fz<- function(z){0.8*pnorm(z)+p1*pnorm(z-u1)+(0.2-p1)*pnorm(z-u2)}
>>
>>
>> Fp<- function(t){(1-Fz(abs(qnorm(1-(t/2)))))+(Fz(-abs(qnorm(1-(t/2)))))}
>>
>>
>> Fp2<- function(t) {Fp(t)-0.8*t/alpha}
>>
>>
>>
>> th<- uniroot(Fp2, lower =0, upper =1,
>>
>> tol = 0.0001)$root
>>
>>
>> The result is 0 as shown below.
>>
>>
>>> th
>> [1] 0

No surprise, since Fp2(0) is 0: uniroot is finished at that point and does not need to evaluate anything else. Use a lower bound starting at some small positive value instead, such as:

uniroot(Fp2, lower=1e-7, upper=1, tol=1e-7)
# 0.009521749

Uwe Ligges

>>
>> However, there should be a root between 0.00952 and 0.00955, since the
>> function values are of opposite signs as below.
>>
>>
>>> Fp2(0.00952)
>> [1] 2.264272e-05
>>> Fp2(0.00955)
>> [1] -0.0003657404
>>
>> Can any one give me a hand here? Thanks a lot.
>> Hannah
>>
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
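
For anyone who wants to reproduce this, the pieces of the thread can be put together into one self-contained script. It is only a sketch that restates the functions and parameter values posted above; the variable names root_at_zero and root are made up here.

```r
# Parameter values from the follow-up email.
u1 <- -3
u2 <- 4
alpha <- 0.05
p1 <- 0.15

# Functions as posted in the question.
Fz  <- function(z) 0.8 * pnorm(z) + p1 * pnorm(z - u1) + (0.2 - p1) * pnorm(z - u2)
Fp  <- function(t) (1 - Fz(abs(qnorm(1 - t / 2)))) + Fz(-abs(qnorm(1 - t / 2)))
Fp2 <- function(t) Fp(t) - 0.8 * t / alpha

# With lower = 0 the search interval starts at a point where Fp2 is
# already zero, which is what the thread reports uniroot returning.
root_at_zero <- uniroot(Fp2, lower = 0, upper = 1, tol = 1e-4)$root

# Moving the lower bound slightly above zero excludes that trivial root
# and leaves the sign change near 0.0095 inside the interval.
root <- uniroot(Fp2, lower = 1e-7, upper = 1, tol = 1e-7)$root
root
```

The second call should land between the two values 0.00952 and 0.00955 at which the original post reports opposite signs.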
From ligges at statistik.tu-dortmund.de Sat Sep 3 19:06:03 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Sat, 03 Sep 2011 19:06:03 +0200 Subject: [R] Hessian Matrix Issue In-Reply-To: References: Message-ID: <4E625E7B.1030604@statistik.tu-dortmund.de> I have not really looked into the details of the lengthy and almost unreadable code below. In any case, there are good reasons why numerics software typically uses Fisher scoring / IWLS in order to fit GLMs..... And if your matrix is that "singular", even the common numerical tricks may not save the day anymore. 7e-21 is very close to exact singularity! Uwe Ligges On 02.09.2011 21:33, tzaihra at alcor.concordia.ca wrote: > Dear All, > > I am running a simulation to obtain coverage probability of Wald type > confidence intervals for my parameter d in a function of two parameters > (mu,d). > > I am optimizing it using "optim" method "L-BFGS-B" to obtain MLE. As, I > want to invert the Hessian matrix to get Standard errors of the two > parameter estimates. 
However, my Hessian matrix at times becomes > non-invertible that is it is no more positive definite and I get the > following error msg: > > "Error in solve.default(ac$hessian) : system is computationally singular: > reciprocal condition number = 6.89585e-21" > Thank you > > Following is the code I am running I would really appreciate your comments > and suggestions: > > #Start Code > #option to trace /recover error > #options(error = recover) > > #Sample Size > n<-30 > mu<-5 > size<- 2 > > #true values of parameter d > d.true<-1+mu/size > d.true > > #true value of zero inflation index phi= 1+log(d)/(1-d) > z.true<-1+(log(d.true)/(1-d.true)) > z.true > > # Allocating space for simulation vectors and setting counters for simulation > counter<-0 > iter<-10000 > lower.d<-numeric(iter) > upper.d<-numeric(iter) > > #set.seed(987654321) > > #begining of simulation loop######## > > for (i in 1:iter){ > r.NB<-rnbinom(n, mu = mu, size = size) > y<-sort(r.NB) > iter.num<-i > print(y) > print(iter.num) > #empirical estimates or sample moments > xbar<-mean(y) > variance<-(sum((y-xbar)^2))/length(y) > dbar<-variance/xbar > #sample estimate of proportion of zeros and zero inflation index > pbar<-length(y[y==0])/length(y) > > ### Simplified function ############################################# > > NegBin<-function(th){ > mu<-th[1] > d<-th[2] > n<-length(y) > > arg1<-n*mean(y)*ifelse(mu>= 0, log(mu),0) > #arg1<-n*mean(y)*log(mu) > > #arg2<-n*log(d)*((mean(y))+mu/(d-1)) > arg2<-n*ifelse(d>=0, log(d), 0)*((mean(y))+mu/ifelse((d-1)>= 0, (d-1), > 0.0000001)) > > aa<-numeric(length(max(y))) > a<-numeric(length(y)) > for (i in 1:n) > { > for (j in 1:y[i]){ > aa[j]<-ifelse(((j-1)*(d-1))/mu>0,log(1+((j-1)*(d-1))/mu),0) > #aa[j]<-log(1+((j-1)*(d-1))/mu) > #print(aa[j]) > } > > a[i]<-sum(aa) > #print(a[i]) > } > a > arg3<-sum(a) > llh<-arg1+arg2+arg3 > if(! 
is.finite(llh)) > llh<-1e+20 > -llh > } > ac<-optim(NegBin,par=c(xbar,dbar),method="L-BFGS-B",hessian=TRUE,lower= > c(0,1) ) > ac > print(ac$hessian) > muhat<-ac$par[1] > dhat<-ac$par[2] > zhat<- 1+(log(dhat)/(1-dhat)) > infor<-solve(ac$hessian) > var.dhat<-infor[2,2] > se.dhat<-sqrt(var.dhat) > var.muhat<-infor[1,1] > se.muhat<-sqrt(var.muhat) > var.func<-dhat*muhat > var.func > d.prime<-cbind(dhat,muhat) > > se.var.func<-d.prime%*%infor%*%t(d.prime) > se.var.func > lower.d[i]<-dhat-1.96*se.dhat > upper.d[i]<-dhat+1.96*se.dhat > > if(lower.d[i]<= d.true& d.true<= upper.d[i]) > counter<-counter+1 > } > counter > covg.prob<-counter/iter > covg.prob > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From ligges at statistik.tu-dortmund.de Sat Sep 3 19:09:50 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Sat, 03 Sep 2011 19:09:50 +0200 Subject: [R] calling R from PHP error In-Reply-To: References: Message-ID: <4E625F5E.30702@statistik.tu-dortmund.de> On 02.09.2011 08:23, Katerina Karayianni wrote: > Hello, > I am having the following error while calling an R script through PHP. > > /usr/local/bin/R: line 227: /kk/Programs/R-2.13.0/etc/ldpaths: Permission > denied > ERROR: R_HOME ('/kk/Programs/R-2.13.0') not found > > I had compiled R from source and placed the generated R shell script in > /usr/local/bin. So you said make install or did you copy it manually? The latter may have been your first glitch. > > Can you give me an insight of how to give permission to access the ldpaths > file and why is the R_HOME tree not found? Does that directory exist? Do you have read/execute permissions? 
Uwe Ligges > Thank you and regards > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From daniel at umd.edu Sat Sep 3 19:34:09 2011 From: daniel at umd.edu (Daniel Malter) Date: Sat, 3 Sep 2011 10:34:09 -0700 (PDT) Subject: [R] Classifying large text corpora using R In-Reply-To: <1314987808606-3786787.post@n4.nabble.com> References: <1314987808606-3786787.post@n4.nabble.com> Message-ID: <1315071249621-3788196.post@n4.nabble.com> Take a look here: http://www.jstatsoft.org/v25/i05/paper HTH, Da. andy1234 wrote: > > Dear everyone, > > I am new to R, and I am looking at doing text classification on a huge > collection of documents (>500,000) which are distributed among 300 classes > (so basically, this is my training data). Would someone please be kind > enough to let me know about the R packages to use and their scalability > (time and space)? > > I am very new to R and do not know of the right packages to use. I started > off by trying to use the tm package (http://cran.r-project.org/package=tm) > for pre-processing and FSelector > (http://cran.r-project.org/web/packages/FSelector/index.html) package for > feature selection - but both of these are incredibly slow and completely > unusable for my task. > > So the question is what are the right packages to use (for pre-processing, > feature selection, and classification)? Please consider the fact that I > may be dealing with data of millions of dimensions which may not even fit > in memory. > > I posted on this issue twice > (http://r.789695.n4.nabble.com/Entropy-based-feature-selection-in-R-td3708056.html > , > http://r.789695.n4.nabble.com/R-s-handling-of-high-dimensional-data-td3741758.html) > but did not get any response. 
This is a very critical piece of my research > and I have been struggling with this issue for a long time. Please > consider helping me out, directly or by pointing me to any other > software/website that you think may be more appropriate. > > Many thanks in advance. > -- View this message in context: http://r.789695.n4.nabble.com/Classifying-large-text-corpora-using-R-tp3786787p3788196.html Sent from the R help mailing list archive at Nabble.com. From nashjc at uottawa.ca Sat Sep 3 19:39:42 2011 From: nashjc at uottawa.ca (John C Nash) Date: Sat, 03 Sep 2011 13:39:42 -0400 Subject: [R] Hessian matrix issue In-Reply-To: References: Message-ID: <4E62665E.2020602@uottawa.ca> Unless you are supplying analytic hessian code, you are almost certainly getting an approximation. Worse, if you do not provide gradients, these are the result of two levels of differencing, so you should expect some loss of precision in the approximate Hessian. Moreover, if your estimate of the optimum is a little bit off, or the optimizer has terminated (algorithms converge, programs terminate) to a point that is not an optimum, there is no reason the Hessian should be positive definite. Package optimx() uses the Jacobian of the gradient if the analytic gradient is available. This drops the differencing to 1 level. Even better is to code the Hessian, but that is messy and tedious in most cases. Best, JN On 09/03/2011 06:00 AM, r-help-request at r-project.org wrote: > Message: 59 > Date: Fri, 2 Sep 2011 15:33:13 -0400 > From: tzaihra at alcor.concordia.ca > To: r-help at r-project.org > Subject: [R] Hessian Matrix Issue > Message-ID: > > Content-Type: text/plain;charset=iso-8859-1 > > Dear All, > > I am running a simulation to obtain coverage probability of Wald type > confidence intervals for my parameter d in a function of two parameters > (mu,d). > > I am optimizing it using "optim" method "L-BFGS-B" to obtain MLE. 
As, I
> want to invert the Hessian matrix to get Standard errors of the two
> parameter estimates. However, my Hessian matrix at times becomes
> non-invertible that is it is no more positive definite and I get the
> following error msg:
>
> "Error in solve.default(ac$hessian) : system is computationally singular:
> reciprocal condition number = 6.89585e-21"
> Thank you
>
> Following is the code I am running I would really appreciate your comments
> and suggestions:
>
> #Start Code
> #option to trace /recover error
> #options(error = recover)
>
> #Sample Size
> n<-30
> mu<-5
> size<- 2
>
> #true values of parameter d
> d.true<-1+mu/size
> d.true
>
> #true value of zero inflation index phi= 1+log(d)/(1-d)
> z.true<-1+(log(d.true)/(1-d.true))
> z.true
>
> # Allocating space for simulation vectors and setting counters for simulation
> counter<-0
> iter<-10000
> lower.d<-numeric(iter)
> upper.d<-numeric(iter)
>
> #set.seed(987654321)
>
> #begining of simulation loop########
>
> for (i in 1:iter){
> r.NB<-rnbinom(n, mu = mu, size = size)
> y<-sort(r.NB)
> iter.num<-i
> print(y)
> print(iter.num)
> #empirical estimates or sample moments
> xbar<-mean(y)
> variance<-(sum((y-xbar)^2))/length(y)
> dbar<-variance/xbar
> #sample estimate of proportion of zeros and zero inflation index
> pbar<-length(y[y==0])/length(y)
>
> ### Simplified function #############################################
>
> NegBin<-function(th){
> mu<-th[1]
> d<-th[2]
> n<-length(y)
>
> arg1<-n*mean(y)*ifelse(mu >= 0, log(mu),0)
> #arg1<-n*mean(y)*log(mu)
>
> #arg2<-n*log(d)*((mean(y))+mu/(d-1))
> arg2<-n*ifelse(d>=0, log(d), 0)*((mean(y))+mu/ifelse((d-1)>= 0, (d-1),
> 0.0000001))
>
> aa<-numeric(length(max(y)))
> a<-numeric(length(y))
> for (i in 1:n)
> {
> for (j in 1:y[i]){
> aa[j]<-ifelse(((j-1)*(d-1))/mu >0,log(1+((j-1)*(d-1))/mu),0)
> #aa[j]<-log(1+((j-1)*(d-1))/mu)
> #print(aa[j])
> }
>
> a[i]<-sum(aa)
> #print(a[i])
> }
> a
> arg3<-sum(a)
> llh<-arg1+arg2+arg3
> if(!
is.finite(llh)) > llh<-1e+20 > -llh > } > ac<-optim(NegBin,par=c(xbar,dbar),method="L-BFGS-B",hessian=TRUE,lower= > c(0,1) ) > ac > print(ac$hessian) > muhat<-ac$par[1] > dhat<-ac$par[2] > zhat<- 1+(log(dhat)/(1-dhat)) > infor<-solve(ac$hessian) > var.dhat<-infor[2,2] > se.dhat<-sqrt(var.dhat) > var.muhat<-infor[1,1] > se.muhat<-sqrt(var.muhat) > var.func<-dhat*muhat > var.func > d.prime<-cbind(dhat,muhat) > > se.var.func<-d.prime%*%infor%*%t(d.prime) > se.var.func > lower.d[i]<-dhat-1.96*se.dhat > upper.d[i]<-dhat+1.96*se.dhat > > if(lower.d[i] <= d.true & d.true<= upper.d[i]) > counter <-counter+1 > } > counter > covg.prob<-counter/iter > covg.prob > > > From hb at biostat.ucsf.edu Sat Sep 3 20:43:00 2011 From: hb at biostat.ucsf.edu (Henrik Bengtsson) Date: Sat, 3 Sep 2011 11:43:00 -0700 Subject: [R] UNC Windows path beginning with backslashes In-Reply-To: References: <4E44091E.7090309@statistik.tu-dortmund.de> Message-ID: [I found this message still sitting in my outbox - better late than never] Hi, it should have been "K:" instead of "K" in those examples. However, forget about what I said, because it (subst) turns out it will not work with UNC paths, cf. http://support.microsoft.com/kb/218740. Sorry for the noise /Henrik On Fri, Aug 19, 2011 at 10:39 AM, Keith Jewell wrote: > Thanks Henrik, but I have 2 reasons for not using that approach: > > A) If I don't map the drive until after R starts the UNC path is already > present in several places I know about and probably some I don't, leading to > the problems I started with. > > So reason 'B' doesn't really matter to me, but as author of R.utils you may > be interested that... > B) On my system those calls don't seem to work. Details here... > -------------------------- >> library("R.utils") > Loading required package: R.oo > Loading required package: R.methodsS3 > R.methodsS3 v1.2.1 (2010-09-18) successfully loaded. See ?R.methodsS3 for > help. > R.oo v1.8.1 (2011-07-10) successfully loaded. 
See ?R.oo for help.
> R.utils v1.7.8 (2011-07-24) successfully loaded. See ?R.utils for help.
>> sessionInfo()
> R version 2.13.1 (2011-07-08)
> Platform: i386-pc-mingw32/i386 (32-bit)
>
> locale:
> [1] LC_COLLATE=English_United Kingdom.1252
> [2] LC_CTYPE=English_United Kingdom.1252
> [3] LC_MONETARY=English_United Kingdom.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United Kingdom.1252
>
> attached base packages:
>  [1] datasets  grDevices splines   graphics  stats     utils     tcltk
>  [8] tools     methods   base
>
> other attached packages:
>  [1] R.utils_1.7.8      R.oo_1.8.1         R.methodsS3_1.2.1  RODBC_1.3-3
>  [5] tree_1.0-29        nlme_3.1-102       MASS_7.3-14
> xlsReadWrite_1.5.4
>  [9] svSocket_0.9-51    TinnR_1.0.3        R2HTML_2.2         Hmisc_3.8-3
> [13] survival_2.36-9
>
> loaded via a namespace (and not attached):
> [1] cluster_1.14.0  grid_2.13.1     lattice_0.19-31 svMisc_0.9-61
> #  It seems to think I have no mapped drives....
>> System$getMappedDrivesOnWindows()
> named character(0)
> # Although I clearly have (in fact I'm running R from Z:), so I can't
> # find a 'spare' drive letter
>> system("net use")
> New connections will not be remembered.
>
> Status       Local     Remote                    Network
> -------------------------------------------------------------------------------
> OK           F:        \\server10\microbiology   Microsoft Windows Network
> OK           L:        \\server23\Stats          Microsoft Windows Network
> OK           M:        \\server10\jewell         Microsoft Windows Network
> OK           Q:        \\server04\pccommon (not backed up)
>                                                  Microsoft Windows Network
> OK           R:        \\server23\Template       Microsoft Windows Network
>              Z:        \\campden\shares\Workgroup\Stats
>                                                  Microsoft Windows Network
>                        \\TSCLIENT\C              Microsoft Terminal Services
>                        \\TSCLIENT\D              Microsoft Terminal Services
>                        \\TSCLIENT\E              Microsoft Terminal Services
>                        \\TSCLIENT\F              Microsoft Terminal Services
>                        \\TSCLIENT\G              Microsoft Terminal Services
>                        \\TSCLIENT\H              Microsoft Terminal Services
>                        \\TSCLIENT\I              Microsoft Terminal Services
>                        \\TSCLIENT\L              Microsoft Terminal Services
>                        \\TSCLIENT\M              Microsoft Terminal Services
>                        \\TSCLIENT\Q              Microsoft Terminal Services
>                        \\TSCLIENT\R              Microsoft Terminal Services
> The command completed successfully.
> #  The commands you cited throw errors...
>> System$mapDriveOnWindows("K", "\\\\campden\\shares\\Workgroup\\Stats")
> Error in list(`System$mapDriveOnWindows("K",
> "\\\\campden\\shares\\Workgroup\\Stats")` = ,  :
>
> [2011-08-19 09:16:28] Exception: Argument 'drive' is not a valid drive (e.g.
> 'Y:'): K
>  at throw(Exception(...))
>  at throw.default("Argument 'drive' is not a valid drive (e.g. 'Y:'): ",
> drive)
>  at throw("Argument 'drive' is not a valid drive (e.g. 'Y:'): ", drive)
>  at method(static, ...)
>  at System$mapDriveOnWindows("K", "\\\\campden\\shares\\Workgroup\\Stats")
>> driveLetters <- System$getMappedDrivesOnWindows()
>> driveLetters
> named character(0)
>> System$unmapDriveOnWindows("K")
> Error in list(`System$unmapDriveOnWindows("K")` = ,
> `method(static, ...)` = ,  :
>
> [2011-08-19 09:29:09] Exception: Argument 'drive' is not a valid drive (e.g.
> 'Y:'): K
>  at throw(Exception(...))
>  at throw.default("Argument 'drive' is not a valid drive (e.g. 'Y:'): ",
> drive)
>  at throw("Argument 'drive' is not a valid drive (e.g. 'Y:'): ", drive)
>  at method(static, ...)
> ?at System$unmapDriveOnWindows("K") > > Thanks for your interest, > > Keith Jewell > --------------------------------------------- > "Henrik Bengtsson" wrote in message > news:CAFDcVCQE3uUkmmqSjJ0fpEVfJgrAbrgBT1g8drCXGpnsJebEHw at mail.gmail.com... > I think you can also do this from within R (e.g. in your .Rprofile) > using the R.utils package; > > library("R.utils") > System$mapDriveOnWindows("K", "\\\\campden\\shares\\Workgroup\\Stats") > driveLetters <- System$getMappedDrivesOnWindows() > System$unmapDriveOnWindows("K") > > These methods utilize 'subst' of MS Windows. > > /Henrik > > On Thu, Aug 18, 2011 at 6:12 PM, Keith Jewell > wrote: >> Just to close this off, in case it helps anyone else in a similar >> situation... >> >> Background: I have R installed on a UNC share with a site library named by >> major and minor version, thus: >> \\campden\shares\Workgroup\Stats 'root' >> \\campden\shares\Workgroup\Stats\R base for R related things >> \\campden\shares\Workgroup\Stats\R\R-2.13.1 one R installation >> \\campden\shares\Workgroup\Stats\R\library site libraries go here >> \\campden\shares\Workgroup\Stats\R\library\2.13 site library for R 2.13.x >> >> I took the hint and mapped a drive from which to start R. 
>> Because I don't have a pre-determined drive letter to use I wrote a little >> .bat file to do the job: >> ------------------ >> REM find or 'net use' a drive mapped to stats share >> set remote=\\campden\shares\Workgroup\Stats >> set drive= >> :check >> for /f "delims=*" %%a in ('net use ^| find "%remote%"') do set drive=%%a >> if not defined drive net use * %remote% /persistent:NO>nul & goto check >> set StatsDrive=%drive:~13,2% >> REM using that drive >> REM a) ensure GTK+ is in the path (for packages such as 'playwith') >> REM b) start 32 bit R >> set path=%StatsDrive%/R/GTK+/bin;%path% >> start %StatsDrive%\R\R-2.13.1\bin\i386\Rgui.exe >> ---------------------------------------- >> That's a bit flakey, depending on the exact format of the output from 'net >> use'. If anyone has a better solution, I'll appreciate it! >> >> Now the site library: >> Putting a UNC path into Renviron.site thus... >> R_LIBS_SITE=//campden/shares/workgroup/stats/R/library/%v >> ... was the cause of my original problem. >> I can't put it in as a mapped drive, because I don't know the drive until >> run time. >> I tried to work up and down from the drive mapped R_HOME... >> R_LIBS_SITE=${R_HOME}/.../library/%v >> ... but that didn't work in Renviron.site. >>> Sys.getenv("R_HOME") >> [1] "Z:/R/R-2.13.1" >> ... which is fine, but... >>> Sys.getenv("R_LIBS_SITE") >> [1] "Z:RR-2.13.1/.../library/2.13" >> I think this may be something to do with this quote from ?Startup... >> "value is then processed in a similar way to a Unix shell: in particular >> the >> outermost level of (single or double) quotes is stripped, and backslashes >> are removed except inside quotes" >> ...but I don't have any control over R_HOME, specifically how and when >> forward- or back-slashes are used or removed. >> >> In the end I used Renviron.site just to pass the version information... 
>> R_Libs_Site=%v >> That directory doesn't exist so doesn't get added to .libPaths() >> In Rprofile.site I worked up and down from R_HOME... >> .libPaths(file.path(dirname(R.home()),"library",Sys.getenv("R_Libs_Site"))) >> ... which seems to do the job. >> >> It isn't pretty, and the .bat file will probably need changing in future >> versions of Windows. >> But by the time R has started there isn't a UNC path in sight. >> I still think I've probably re-invented a wheel and ended up with >> something >> square, but it is going round. >> >> Best regards, >> >> Keith Jewell >> >> "Keith Jewell" wrote in message >> news:j22q11$9u9$1 at dough.gmane.org... >>> Thanks Uwe. >>> >>> I'm aware (and have been forcefully reminded) that using a mapped drive >>> avoids these problems. But there is no single drive letter which I can >>> use >>> site-wide, so I have problems with things like R_LIBS_SITE. As I've >>> outlined I'm exploring a range of solutions, including mapping a drive >>> where I can. >>> >>> I posted in the hope of learning from and perhaps helping those with >>> similar problems. I hope that it is permissible to discuss non-canonical >>> use of R on this list, I certainly did not intend disrespect for the R >>> developers (or to make typing errors). >>> >>> Best regards >>> >>> Keith Jewell >>> >>> "Uwe Ligges" wrote in message >>> news:4E44091E.7090309 at statistik.tu-dortmund.de... >>>> This is extremely tricky since Windows does not always accept "//" >>>> rather >>>> than "\\". Additionally, there is not implemented system call in >>>> Windows, >>>> hence ?Sys.glob tells us a "partial emulation" is provided and "An >>>> attempt is made to handle UNC paths starting with a double backslash." >>>> >>>> As you have seenm this does not work everywhere, therefore it is >>>> advisable to run R from mapped drives - as I am doing in the network of >>>> our university for 13 years without problems now. 
>>>> >>>> Best, >>>> Uwe Ligges >>>> >>>> > > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From Peter.Brecknock at bp.com Sat Sep 3 21:12:13 2011 From: Peter.Brecknock at bp.com (Pete Brecknock) Date: Sat, 3 Sep 2011 12:12:13 -0700 (PDT) Subject: [R] Saving a list as a Matrix In-Reply-To: <1315075057473-3788274.post@n4.nabble.com> References: <1315075057473-3788274.post@n4.nabble.com> Message-ID: <1315077133453-3788327.post@n4.nabble.com> wizykid wrote: > > Hi there. > > I went through the manual but I couldn't find a solution for my problem. > > I have list like this one : >> lst1 > [[1]] > [1] 0 1 2 3 > > [[2]] > [1] 0 1 5 > > [[3]] > [1] 2 3 4 > > and I want to save it as Matrix in Matlab mat format like : > 0 1 2 3 > 0 1 5 0 > 2 3 4 0 > > > can any body help me ? Appreciate your help and thanks in advance. > > Reza > Not pretty but this works ... lst1 = list(c(0,1,2,3),c(0,1,5),c(2,3,4)) t(sapply(lst1, function(x) c(x,rep(0,4-length(x))))) HTH Pete -- View this message in context: http://r.789695.n4.nabble.com/Saving-a-list-as-a-Matrix-tp3788274p3788327.html Sent from the R help mailing list archive at Nabble.com. From alexandra.soberon at unican.es Sat Sep 3 19:22:23 2011 From: alexandra.soberon at unican.es (Soberon Velez, Alexandra Pilar) Date: Sat, 3 Sep 2011 17:22:23 +0000 Subject: [R] automathic bandwidth for multivariate local regression Message-ID: <7546B009C5D8DF4FA41549EE1FBB2EDE23ED2891@mbx01.unican.es> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From wizy.kid at gmail.com Sat Sep 3 20:37:37 2011 From: wizy.kid at gmail.com (wizykid) Date: Sat, 3 Sep 2011 11:37:37 -0700 (PDT) Subject: [R] Saving a list as a Matrix Message-ID: <1315075057473-3788274.post@n4.nabble.com> Hi there. 
I went through the manual but I couldn't find a solution for my problem. I have list like this one : > lst1 [[1]] [1] 0 1 2 3 [[2]] [1] 0 1 5 [[3]] [1] 2 3 4 and I want to save it as Matrix in Matlab mat format like : 0 1 2 3 0 1 5 0 2 3 4 0 can any body help me ? Appreciate your help and thanks in advance. Reza -- View this message in context: http://r.789695.n4.nabble.com/Saving-a-list-as-a-Matrix-tp3788274p3788274.html Sent from the R help mailing list archive at Nabble.com. From salvo_mac at yahoo.com Sat Sep 3 22:14:43 2011 From: salvo_mac at yahoo.com (Salvo Mac) Date: Sat, 3 Sep 2011 13:14:43 -0700 (PDT) Subject: [R] plot.validate.cph Message-ID: <1315080883.68290.YahooMailNeo@web121516.mail.ne1.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From Ulrich.Halekoh at agrsci.dk Sat Sep 3 23:56:35 2011 From: Ulrich.Halekoh at agrsci.dk (Ulrich Halekoh) Date: Sat, 3 Sep 2011 23:56:35 +0200 Subject: [R] glht (multcomp): NA's for confidence intervals using univariate_calpha Message-ID: <9F0721FDD4F12D4B95AD894274F388EC020C644EA636@DJFEXMBX01.djf.agrsci.dk> Hej, Calculation of confidence intervals for means based on a model fitted with lmer using the package multcomp - yields results for calpha=adjusted_calpha - NA's for calpha=univariate_calpha Example: library(lme4) library(multcomp) ### Generate data set.seed(8) d<-expand.grid(treat=1:2,block=1:3) e<-rnorm(3) names(e)<-1:3 d$y<-rnorm(nrow(d)) + e[d$block] d<-transform(d,treat=factor(treat),block=factor(block)) ##### lmer fit Mod<-lmer(y~treat+ (1|block), data=d) ### estimate treatment means L<-cbind(c(1,0),c(0,1)) s<-glht(Mod,linfct=L) ## confidence intervals confint(s,calpha=adjusted_calpha()) #produces NA's for the confidence limits confint(s,calpha=univariate_calpha()) #for models fitted with lm the problem does not occur G<-lm(y~treat+ block, data=d) L<-matrix( c(1,0,1/3,1/3,1,1,1/3,1/3),2,4,byrow=TRUE) s<-glht(G,linfct=L) confint(s,calpha=adjusted_calpha()) 
confint(s,calpha=univariate_calpha()) multcomp version 1.2-7 R:platform i386-pc-mingw32 version.string R version 2.13.1 Patched (2011-08-19 r56767) Regards Ulrich Halekoh Department of Molecular Biology and Genetics, Aarhus University, Denmark Ulrich.Halekoh at agrsci.dk From liov2067 at gmail.com Sun Sep 4 00:13:32 2011 From: liov2067 at gmail.com (=?ISO-8859-1?Q?Luis_Iv=E1n_Ortiz_Valencia?=) Date: Sat, 3 Sep 2011 19:13:32 -0300 Subject: [R] about raw type Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From wizy.kid at gmail.com Sat Sep 3 22:18:55 2011 From: wizy.kid at gmail.com (wizykid) Date: Sat, 3 Sep 2011 13:18:55 -0700 (PDT) Subject: [R] Saving a list as a Matrix In-Reply-To: <1315077133453-3788327.post@n4.nabble.com> References: <1315075057473-3788274.post@n4.nabble.com> <1315077133453-3788327.post@n4.nabble.com> Message-ID: <1315081135579-3788414.post@n4.nabble.com> Thank you so much Pete. Have a good one as you made mine -- View this message in context: http://r.789695.n4.nabble.com/Saving-a-list-as-a-Matrix-tp3788274p3788414.html Sent from the R help mailing list archive at Nabble.com. From davef at otter-rsch.com Sat Sep 3 22:22:54 2011 From: davef at otter-rsch.com (dave fournier) Date: Sat, 03 Sep 2011 13:22:54 -0700 Subject: [R] Hessian Matrix Issue In-Reply-To: References: Message-ID: <4E628C9E.1060200@otter-rsch.com> I wonder if your code is correct? I ran your script until an error was reported. the data set of 30 obs was [1] 0 0 1 3 3 3 4 4 4 4 5 5 5 5 5 7 7 7 7 7 7 8 9 10 11 [26] 12 12 12 15 16 I created a tiny AD Model Builder program to do MLE on it. 
DATA_SECTION init_int nobs init_vector y(1,nobs) PARAMETER_SECTION init_number log_mu init_number log_alpha sdreport_number mu sdreport_number tau objective_function_value f PROCEDURE_SECTION mu=exp(log_mu); tau=1.0+exp(log_alpha); for (int i=1;i<=nobs;i++) { f-=log_negbinomial_density(y(i),mu,tau); } It converged quickly and The eigenvalues of the Hessian were 4.711089774 78.27632341 and the estimates and std devs of the parameters mu and tau were index name value std dev 3 mu 6.6000e+00 7.7318e-01 4 tau 2.7173e+00 7.8944e-01 where tau is the variance divided by the mean. This was all so simple that I suspect your (rather difficult to read) R code is wrong, otherwise R must really suck at this kind of problem. Dave From ferdaous.somrani at gmail.com Sat Sep 3 23:10:10 2011 From: ferdaous.somrani at gmail.com (Doussa) Date: Sat, 3 Sep 2011 14:10:10 -0700 (PDT) Subject: [R] misclassification rate In-Reply-To: <4E61FB61.90609@uky.edu> References: <1314998980979-3787075.post@n4.nabble.com> <4E61FB61.90609@uky.edu> Message-ID: <1315084210277-3788456.post@n4.nabble.com> Thank you very much Patrick. -- View this message in context: http://r.789695.n4.nabble.com/misclassification-rate-tp3787075p3788456.html Sent from the R help mailing list archive at Nabble.com. 
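For comparison with the AD Model Builder fit reported earlier in this digest for the 30 observations, the same negative binomial likelihood can be sketched in plain R. This is only a sketch under assumptions: it uses dnbinom() with the mu/size parametrisation, optimises on the log scale so that both parameters stay positive without box constraints, and takes tau = 1 + mu/size as the variance-to-mean ratio. The names negll, fit, mu_hat, size_hat, tau_hat and se_log are illustrative and not from the original posts.

```r
# The 30 observations quoted earlier in this digest
y <- c(0, 0, 1, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5,
       7, 7, 7, 7, 7, 7, 8, 9, 10, 11, 12, 12, 12, 15, 16)

# Negative log-likelihood; th = (log(mu), log(size)) is an assumed parametrisation
negll <- function(th) {
  -sum(dnbinom(y, mu = exp(th[1]), size = exp(th[2]), log = TRUE))
}

# Start at the sample mean and size = 1; ask optim() for a numerical Hessian
fit <- optim(c(log(mean(y)), 0), negll, method = "BFGS", hessian = TRUE)

mu_hat   <- exp(fit$par[1])
size_hat <- exp(fit$par[2])
tau_hat  <- 1 + mu_hat / size_hat   # variance / mean

# Check the Hessian before inverting it
ev <- eigen(fit$hessian, symmetric = TRUE)$values
if (all(ev > 0)) {
  se_log <- sqrt(diag(solve(fit$hessian)))  # standard errors on the log scale
} else {
  se_log <- rep(NA_real_, 2)                # flag this replicate instead of stopping
}
```

Inside a simulation loop, testing the eigenvalues first lets a bad replicate be recorded and skipped rather than aborting the whole run at solve().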
From dadrivr at gmail.com Sun Sep 4 00:24:40 2011 From: dadrivr at gmail.com (dadrivr) Date: Sat, 3 Sep 2011 15:24:40 -0700 (PDT) Subject: [R] Problem with by statement for spaghetti plots Message-ID: <1315088680053-3788536.post@n4.nabble.com> Hi, I am trying to apply the example at the bottom of the following page to my own data: http://128.97.141.26/stat/R/faq/spagplot.htm http://128.97.141.26/stat/R/faq/spagplot.htm The code from the example is: /tolerance<-read.table("http://www.ats.ucla.edu/stat/R/faq/tolpp.csv",sep=",", header=T) fit <- by(tolerance, tolerance$id,function(x) fitted.values(lm(tolerance ~ time, data=x))) fit1 <- unlist(fit) names(fit1) <- NULL interaction.plot(tolerance$age, tolerance$id, fit1,xlab="time", ylab="tolerance", legend=F)/ Here is my code: /mydata <- read.table("https://www.sugarsync.com/pf/D000507_6529035_6683114", header=TRUE) fit <- by(mydata, mydata$id, function(x) fitted.values(lm(outcome ~ age, data=x))) fit1 <- unlist(fit) names(fit1) <- NULL interaction.plot(mydata$age, mydata$id, fit1,legend=F)/ I get the following error: fit <- by(mydata, mydata$id, function(x) fitted.values(lm(outcome ~ age, data=x))) "/Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 0 (non-NA) cases/" The error suggests that there is an error with the lm statement. I think it may have to do with missing values. Nonetheless, I don't get an error with the following statements: fitted.values(lm(outcome ~ age, data=mydata)) lm(outcome ~ age, data=mydata) If I specify 'data = mydata' instead of 'data = x', there is no error, but the result is not expected (compared to the example code): fit <- by(mydata, mydata$id, function(x) fitted.values(lm(outcome ~ age, data=mydata))) I'm expecting each id value to have 3 fitted values representing each of the three ages. That's not what I get, however --- I get hundreds of fitted values per id value, which is unexpected. I have my data in tall format, just like the example. Any ideas? 
Any help would be greatly appreciated. Thanks! -- View this message in context: http://r.789695.n4.nabble.com/Problem-with-by-statement-for-spaghetti-plots-tp3788536p3788536.html Sent from the R help mailing list archive at Nabble.com. From meddee1000 at gmail.com Sun Sep 4 00:34:46 2011 From: meddee1000 at gmail.com (meddee) Date: Sat, 3 Sep 2011 15:34:46 -0700 (PDT) Subject: [R] Bootstrapping a covariance matrix Message-ID: <1315089286925-3788553.post@n4.nabble.com> Dear all I am a bit new to R so please keep your swords sheathed! I would simply like to bootstrap a covariance matrix from a multivariate gaussian density. At face value that seemed like a very straightforward problem to solve but I somehow could not get the boot package to work and did not really understand the documentation so I tried to do the bootstrap manually. Hence: x<-rmvnorm(n = 5, mean, diag(1,length(mean))) Var<-function(a) var(a) Var(x) sample<-matrix(sample(x,replace=T),ncol=length(mean))#single BS sample Var(sample)# sqr matrix of length(mean) #generate 1000 bootstrap samples boot <- array(NA, c(1000, 3, 3)) #assign the var for bootstrap sample i as the ith element in the vector boot, using a for loop for (i in 1:1000) boot[i,,] <- Var(sample) mean(boot) For output I expect to see a 3x3 covariance matrix but i am getting a single scalar value. So can some person(s) do either (or all) of the following: - point out how I can get the intended result from the above code - point out how the boot function can be used to to solve this problem - point me to further documentation for the boot function p.s: rmvnorm is from the mvtnorm package. Thanks in advance! -- View this message in context: http://r.789695.n4.nabble.com/Bootstrapping-a-covariance-matrix-tp3788553p3788553.html Sent from the R help mailing list archive at Nabble.com. 
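One way to get a matrix back from the bootstrap loop above is to resample whole rows, recompute var() inside the loop, and average over the first array dimension only. A minimal sketch, assuming base R only: the data are simulated with rnorm() instead of mvtnorm::rmvnorm() to avoid the extra dependency, and the names n, p, B, boot_mean and boot_se are illustrative.

```r
set.seed(1)
n <- 50   # observations (rows)
p <- 3    # variables (columns)
x <- matrix(rnorm(n * p), ncol = p)   # stand-in for rmvnorm() output

B <- 1000
boot <- array(NA_real_, c(B, p, p))
for (b in seq_len(B)) {
  idx <- sample(n, replace = TRUE)              # resample row indices, not single cells
  boot[b, , ] <- var(x[idx, , drop = FALSE])    # covariance of this bootstrap sample
}

boot_mean <- apply(boot, c(2, 3), mean)   # p x p bootstrap mean of the covariance
boot_se   <- apply(boot, c(2, 3), sd)     # p x p bootstrap standard errors
```

mean(boot) collapses the whole array to one number, which is why the original attempt printed a scalar; apply(boot, c(2, 3), mean) keeps the 3 x 3 shape. Note also that sample(x, replace = TRUE) on a matrix shuffles individual entries across columns, which destroys the covariance structure.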
From wwwhsd at gmail.com Sun Sep 4 00:45:25 2011
From: wwwhsd at gmail.com (Henrique Dallazuanna)
Date: Sat, 3 Sep 2011 19:45:25 -0300
Subject: [R] Saving a list as a Matrix
In-Reply-To: <1315075057473-3788274.post@n4.nabble.com>
References: <1315075057473-3788274.post@n4.nabble.com>
Message-ID:

Try this:

t(sapply(lst1, '[', 1:max(sapply(lst1, length))))

On Sat, Sep 3, 2011 at 3:37 PM, wizykid wrote:
> Hi there.
>
> I went through the manual but I couldn't find a solution for my problem.
>
> I have list like this one :
>> lst1
> [[1]]
> [1] 0 1 2 3
>
> [[2]]
> [1] 0 1 5
>
> [[3]]
> [1] 2 3 4
>
> and I want to save it as Matrix in Matlab mat format like :
> 0 1 2 3
> 0 1 5 0
> 2 3 4 0
>
> can any body help me ? Appreciate your help and thanks in advance.
>
> Reza
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Saving-a-list-as-a-Matrix-tp3788274p3788274.html
> Sent from the R help mailing list archive at Nabble.com.

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

From andrew.macfarlane at pg.canterbury.ac.nz Sun Sep 4 01:52:51 2011
From: andrew.macfarlane at pg.canterbury.ac.nz (drewmac)
Date: Sat, 3 Sep 2011 16:52:51 -0700 (PDT)
Subject: [R] Lmer plot help
Message-ID: <1315093971791-3788613.post@n4.nabble.com>

Hello all

I'm running the lme4 package on my binomial data, and I'm happy with the model and the resultant plot. However, I'd like to plot my table data, which has: two IVs, and one DV. You can see an example below, where 'attractive' = question (IV), male = condition(IV/predictor) and no/yes = answer (dv). I'm using the table to investigate what questions act differently to the others, so I can better fit my model.
Going through tables of numbers doesn't seem the most efficient way of instantly seeing what questions work differently, and I'd like to plot that. Here is my code: table(finaldata$Voice, finaldata$supportive, finaldata$question) #generates my table# , , = attractive no yes male1 28 35 male2 20 22 female1 21 21 female2 30 19 Any help most appreciated. Drew -- View this message in context: http://r.789695.n4.nabble.com/Lmer-plot-help-tp3788613p3788613.html Sent from the R help mailing list archive at Nabble.com. From dadrivr at gmail.com Sun Sep 4 01:53:33 2011 From: dadrivr at gmail.com (dadrivr) Date: Sat, 3 Sep 2011 16:53:33 -0700 (PDT) Subject: [R] Change properties of line summary in interaction.plot Message-ID: <1315094013620-3788614.post@n4.nabble.com> Is it possible to change the color/thickness of the summary line in an interaction.plot without changing the other individual data lines? I would like to make the line from the summary function (mean) the color red and thicker than the surrounding black lines. How can I do that? Here is a link to interaction.plot: http://stat.ethz.ch/R-manual/R-patched/library/stats/html/interaction.plot.html Thanks! -- View this message in context: http://r.789695.n4.nabble.com/Change-properties-of-line-summary-in-interaction-plot-tp3788614p3788614.html Sent from the R help mailing list archive at Nabble.com. From kaushik.sinha.cs at gmail.com Sun Sep 4 02:00:55 2011 From: kaushik.sinha.cs at gmail.com (novis) Date: Sat, 3 Sep 2011 17:00:55 -0700 (PDT) Subject: [R] sd help Message-ID: <982c0b4f-fd16-4afd-88b9-175cfeec91a0@e34g2000prn.googlegroups.com> Hello, I am getting a strange error while computing standard deviation. If the do the following I get the answer NA NA. Any help??? Thank you so much. a<- cbind() a<- cbind(a, 5) a<- cbind(a, 7) print(sd(a)) If I compute the mean it works fine though i.e., print(mean(a)) works just fine. 
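The NA NA result above can be reproduced without sd() doing anything wrong. A small sketch, assuming the behaviour of R 2.13.x, where sd() on a matrix is applied column by column: a is a 1 x 2 matrix, so each column holds a single value and its standard deviation is NA. Converting to a vector first gives the standard deviation of the two numbers. apply() is used here to make the column-wise calculation explicit; the names col_sd and vec_sd are illustrative.

```r
a <- cbind()
a <- cbind(a, 5)
a <- cbind(a, 7)

dim(a)                       # one row, two columns

col_sd <- apply(a, 2, sd)    # column-wise: each column has a single value
vec_sd <- sd(as.vector(a))   # treat all entries as one sample of size 2
```

If a plain collection of numbers is what is wanted, building it with c(a, 5) rather than cbind(a, 5) avoids creating a matrix in the first place.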
From jwiley.psych at gmail.com Sun Sep 4 02:54:07 2011 From: jwiley.psych at gmail.com (Joshua Wiley) Date: Sat, 3 Sep 2011 17:54:07 -0700 Subject: [R] sd help In-Reply-To: <982c0b4f-fd16-4afd-88b9-175cfeec91a0@e34g2000prn.googlegroups.com> References: <982c0b4f-fd16-4afd-88b9-175cfeec91a0@e34g2000prn.googlegroups.com> Message-ID: Hi, You are not getting an error. You are getting the standard deviation of the columns as clearly described in the documentation. Cheers, Josh On Sat, Sep 3, 2011 at 5:00 PM, novis wrote: > Hello, > I am getting a strange error while computing standard deviation. If > the do the following I get the answer NA NA. Any help??? Thank you so > much. > > a<- cbind() > a<- cbind(a, 5) > a<- cbind(a, 7) > print(sd(a)) > > If I compute the mean it works fine though i.e., print(mean(a)) works > just fine. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Joshua Wiley Ph.D. Student, Health Psychology Programmer Analyst II, ATS Statistical Consulting Group University of California, Los Angeles https://joshuawiley.com/ From r.sookias at gmail.com Sun Sep 4 03:08:52 2011 From: r.sookias at gmail.com (Roland Sookias) Date: Sun, 4 Sep 2011 02:08:52 +0100 Subject: [R] AICc function with gls Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From r.sookias at gmail.com Sun Sep 4 03:16:59 2011 From: r.sookias at gmail.com (Roland Sookias) Date: Sun, 4 Sep 2011 02:16:59 +0100 Subject: [R] AICc function with gls In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From jumlong.ubru at gmail.com Sun Sep 4 03:58:02 2011 From: jumlong.ubru at gmail.com (Jumlong Vongprasert) Date: Sun, 4 Sep 2011 08:58:02 +0700 Subject: [R] Markov Switching GARCH Package Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From jumlong.ubru at gmail.com Sun Sep 4 03:59:07 2011 From: jumlong.ubru at gmail.com (Jumlong Vongprasert) Date: Sun, 4 Sep 2011 08:59:07 +0700 Subject: [R] Test for Random Walk and Makov Process Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From conny-clauss at gmx.de Sun Sep 4 03:51:20 2011 From: conny-clauss at gmx.de (warc) Date: Sat, 3 Sep 2011 18:51:20 -0700 (PDT) Subject: [R] what is wrong with my quicksort? Message-ID: <1315101080660-3788681.post@n4.nabble.com> Hey guys, I tried to program quicksort like this but somethings wrong. please help >partition <- function(x, links, rechts){ > > i <- links > j <- rechts > t <- 0 > pivot <- sample(x[i:j],1) > > while(i <= j){ > > while(x[i] <= pivot){ > i = i+1} > > while(x[j] >= pivot){ > j = j-1} > > if( i <= j){ > > t = x[i] > x[i] = x[j] > x[j] = t > > i=i+1 > j=j-1 > > } > print(pivot) > > > } > #Rekursion > > if(links < j){ > partition(x, links, j)} > if(i < rechts){ > partition(x, i, rechts)} > > return(x) > } > > >quicksort <- function(x){ > > > > partition(x, 1, length(x)) >} thx -- View this message in context: http://r.789695.n4.nabble.com/what-is-wrong-with-my-quicksort-tp3788681p3788681.html Sent from the R help mailing list archive at Nabble.com. 
From listanand at gmail.com Sun Sep 4 03:26:15 2011 From: listanand at gmail.com (andy1234) Date: Sat, 3 Sep 2011 18:26:15 -0700 (PDT) Subject: [R] Classifying large text corpora using R In-Reply-To: <1315071249621-3788196.post@n4.nabble.com> References: <1314987808606-3786787.post@n4.nabble.com> <1315071249621-3788196.post@n4.nabble.com> Message-ID: <1315099575276-3788667.post@n4.nabble.com> Daniel Malter wrote: > > Take a look here: http://www.jstatsoft.org/v25/i05/paper > > HTH, > Da. > > > andy1234 wrote: >> >> Dear everyone, >> >> I am new to R, and I am looking at doing text classification on a huge >> collection of documents (>500,000) which are distributed among 300 >> classes (so basically, this is my training data). Would someone please be >> kind enough to let me know about the R packages to use and their >> scalability (time and space)? >> >> I am very new to R and do not know of the right packages to use. I >> started off by trying to use the tm package >> (http://cran.r-project.org/package=tm) for pre-processing and FSelector >> (http://cran.r-project.org/web/packages/FSelector/index.html) package for >> feature selection - but both of these are incredibly slow and completely >> unusable for my task. >> >> So the question is what are the right packages to use (for >> pre-processing, feature selection, and classification)? Please consider >> the fact that I may be dealing with data of millions of dimensions which >> may not even fit in memory. >> >> I posted on this issue twice >> (http://r.789695.n4.nabble.com/Entropy-based-feature-selection-in-R-td3708056.html >> , >> http://r.789695.n4.nabble.com/R-s-handling-of-high-dimensional-data-td3741758.html) >> but did not get any response. This is a very critical piece of my >> research and I have been struggling with this issue for a long time. >> Please consider helping me out, directly or by pointing me to any other >> software/website that you think may be more appropriate. >> >> Many thanks in advance. 
>> > Hi, Many thanks for your reply. I did in fact mention in my e-mail that I have looked at tm package. It does not scale well at all. Then there are other stages in the pipeline - feature selection, classification etc. and I need to find suitable R packages for those also. Any other thoughts? Thanks. Andy -- View this message in context: http://r.789695.n4.nabble.com/Classifying-large-text-corpora-using-R-tp3786787p3788667.html Sent from the R help mailing list archive at Nabble.com. From jholtman at gmail.com Sun Sep 4 04:50:46 2011 From: jholtman at gmail.com (Jim Holtman) Date: Sat, 3 Sep 2011 22:50:46 -0400 Subject: [R] what is wrong with my quicksort? In-Reply-To: <1315101080660-3788681.post@n4.nabble.com> References: <1315101080660-3788681.post@n4.nabble.com> Message-ID: have you tried to debug it yourself. All you said is that 'it went wrong'. that is not a very clear statement of the problem. If I were to start looking at it, I would put some print statements in it to see what is happening on eachpath and with each set of data. Have you tried this? Sent from my iPad On Sep 3, 2011, at 21:51, warc wrote: > Hey guys, > I tried to program quicksort like this but somethings wrong. > > please help > > > >> partition <- function(x, links, rechts){ >> >> i <- links >> j <- rechts >> t <- 0 >> pivot <- sample(x[i:j],1) >> >> while(i <= j){ >> >> while(x[i] <= pivot){ >> i = i+1} >> >> while(x[j] >= pivot){ >> j = j-1} >> >> if( i <= j){ >> >> t = x[i] >> x[i] = x[j] >> x[j] = t >> >> i=i+1 >> j=j-1 >> >> } >> print(pivot) >> >> >> } >> #Rekursion >> >> if(links < j){ >> partition(x, links, j)} >> if(i < rechts){ >> partition(x, i, rechts)} >> >> return(x) >> } >> >> >> quicksort <- function(x){ >> >> >> >> partition(x, 1, length(x)) >> } > > > > thx > > -- > View this message in context: http://r.789695.n4.nabble.com/what-is-wrong-with-my-quicksort-tp3788681p3788681.html > Sent from the R help mailing list archive at Nabble.com. 
> > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From markleeds2 at gmail.com Sun Sep 4 05:17:32 2011 From: markleeds2 at gmail.com (Mark Leeds) Date: Sat, 3 Sep 2011 23:17:32 -0400 Subject: [R] what is wrong with my quicksort? In-Reply-To: References: <1315101080660-3788681.post@n4.nabble.com> Message-ID: Hi: I looked it up in google because I couldn't remember how quicksort worked. ( i'm getting old ). the java code is below. I didn't translate it but what you are doing incorrectly is calling partition recursively when you need to call the quicksort algorithm recursively. you'll see what I mean if you look at the java code below. there are many links on the internet that explains why it works with examples etc. here's one: http://www.algolist.net/Algorithms/Sorting/Quicksort good luck. #============================================================== int partition(int arr[], int left, int right) { int i = left, j = right; int tmp; int pivot = arr[(left + right) / 2]; while (i <= j) { while (arr[i] < pivot) i++; while (arr[j] > pivot) j--; if (i <= j) { tmp = arr[i]; arr[i] = arr[j]; arr[j] = tmp; i++; j--; } }; return i; } #=============================================================== void quickSort(int arr[], int left, int right) { int index = partition(arr, left, right); if (left < index - 1) quickSort(arr, left, index - 1); if (index < right) quickSort(arr, index, right); } On Sat, Sep 3, 2011 at 10:50 PM, Jim Holtman wrote: > have you tried to debug it yourself. ?All you said is that 'it went wrong'. ?that is not a very clear statement of the problem. ?If I were to start looking at it, I would put some print statements in it to see what is happening on eachpath and with each set of data. ?Have you tried this? 
> > Sent from my iPad > > On Sep 3, 2011, at 21:51, warc wrote: > >> Hey guys, >> I tried to program quicksort like this but somethings wrong. >> >> please help >> >> >> >>> partition <- function(x, links, rechts){ >>> >>> i <- links >>> j <- rechts >>> t <- 0 >>> pivot <- sample(x[i:j],1) >>> >>> while(i <= j){ >>> >>> while(x[i] <= pivot){ >>> i = i+1} >>> >>> while(x[j] >= pivot){ >>> j = j-1} >>> >>> if( i <= j){ >>> >>> t = x[i] >>> x[i] = x[j] >>> x[j] = t >>> >>> i=i+1 >>> j=j-1 >>> >>> } >>> print(pivot) >>> >>> >>> } >>> #Rekursion >>> >>> if(links < j){ >>> partition(x, links, j)} >>> if(i < rechts){ >>> partition(x, i, rechts)} >>> >>> return(x) >>> } >>> >>> >>> quicksort <- function(x){ >>> >>> >>> >>> partition(x, 1, length(x)) >>> } >> >> >> >> thx >> >> -- >> View this message in context: http://r.789695.n4.nabble.com/what-is-wrong-with-my-quicksort-tp3788681p3788681.html >> Sent from the R help mailing list archive at Nabble.com. >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.
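For what it's worth, a rough R translation of the Java above might look like the sketch below. It is only a sketch under a few assumptions that are not in the Java: the helper name qsort_range is made up for illustration, partition() returns a list because an R function cannot modify its caller's vector in place, and the input is assumed to contain no NA values. Since R copies a vector when a function modifies it, each recursive call has to hand the vector back and the caller has to keep the result.

```r
# Partition x[left:right] around the middle element, as in the Java version.
# Returns the modified vector together with the start index of the right part.
partition <- function(x, left, right) {
  i <- left
  j <- right
  pivot <- x[(left + right) %/% 2]
  while (i <= j) {
    while (x[i] < pivot) i <- i + 1   # strict comparison: the pivot stops the scan
    while (x[j] > pivot) j <- j - 1
    if (i <= j) {
      tmp <- x[i]; x[i] <- x[j]; x[j] <- tmp
      i <- i + 1
      j <- j - 1
    }
  }
  list(x = x, index = i)
}

# Hypothetical helper: sorts x[left:right] and returns the whole vector.
qsort_range <- function(x, left, right) {
  p <- partition(x, left, right)
  x <- p$x
  if (left < p$index - 1) x <- qsort_range(x, left, p$index - 1)
  if (p$index < right) x <- qsort_range(x, p$index, right)
  x
}

quicksort <- function(x) {
  if (length(x) < 2) return(x)        # nothing to sort
  qsort_range(x, 1, length(x))
}
```

Calling quicksort(c(3, 1, 2)) should then give the same result as sort(c(3, 1, 2)). Note that it is quicksort (here qsort_range) that recurses, not partition.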
> From vicvoncastle at gmail.com Sun Sep 4 06:06:05 2011 From: vicvoncastle at gmail.com (Ken) Date: Sun, 4 Sep 2011 00:06:05 -0400 Subject: [R] Test for Random Walk and Makov Process In-Reply-To: References: Message-ID: <8B8C64F0-4452-41FE-8C24-2AF4A46A08C7@gmail.com> For random walk, there are entropy based tests (Robinson 1991), or you could empirically test the hypothesis by generating random normal data with the same mean and standard deviation and looking at the distribution of your quantiles. You could make generic statements also about whether or not the data demonstrates autocorrelation function values which are not significant and do not appear to have trend. Further, In a random walk, a binary variable for whether or not values are above and below the mean should follow a binomial distribution of size 1 with a probability of .5, there are tests which do this but also take magnitude into account. I mean to say there are a lot of ways to approach that problem, it depends on the application and how strong you want your conclusions to be. What kind of Markov process? On Sep 3, 2554 BE, at 9:59 PM, Jumlong Vongprasert wrote: > Dear All > I want to test my data for Random Walk or Markov Process. > How I can do this. > Many Thanks > > -- > Jumlong Vongprasert Assist, Prof. > Institute of Research and Development > Ubon Ratchathani Rajabhat University > Ubon Ratchathani > THAILAND > 34000 > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
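To make a couple of these ideas concrete, here is a minimal sketch in base R. The series x is simulated purely for illustration, and I am assuming a driftless walk, so the checks are applied to the increments diff(x) rather than to the levels. It runs a Ljung-Box test for autocorrelation in the increments and a binomial test on the proportion of positive increments. Neither test proves a random walk; they only look for evidence against it.

```r
set.seed(1)
x  <- cumsum(rnorm(500))   # simulated random walk (illustrative data only)
dx <- diff(x)              # increments; uncorrelated under the random-walk hypothesis

# 1. Are the increments autocorrelated? (Ljung-Box test, 10 lags chosen arbitrarily)
lb <- Box.test(dx, lag = 10, type = "Ljung-Box")

# 2. Are up-moves and down-moves equally likely? (assumes no drift)
ups <- sum(dx > 0)
bt  <- binom.test(ups, length(dx), p = 0.5)

lb$p.value
bt$p.value
```

Small p-values in either test would be evidence against a driftless random walk; large ones are merely consistent with it. acf(dx) gives a visual version of the first check.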
From jdnewmil at dcn.davis.ca.us Sun Sep 4 06:15:34 2011 From: jdnewmil at dcn.davis.ca.us (Jeff Newmiller) Date: Sat, 03 Sep 2011 21:15:34 -0700 Subject: [R] about raw type In-Reply-To: References: Message-ID: <31fc90bb-2668-4913-abcf-d8b3abee986b@email.android.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From andresago1 at hotmail.com Sun Sep 4 07:50:09 2011 From: andresago1 at hotmail.com (andre bedon) Date: Sun, 4 Sep 2011 15:50:09 +1000 Subject: [R] Regression coefficient constraints Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From andresago1 at hotmail.com Sun Sep 4 10:12:22 2011 From: andresago1 at hotmail.com (andre bedon) Date: Sun, 4 Sep 2011 18:12:22 +1000 Subject: [R] FW: Regression coefficient constraints In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From liov2067 at gmail.com Sun Sep 4 14:31:34 2011 From: liov2067 at gmail.com (=?ISO-8859-1?Q?Luis_Iv=E1n_Ortiz_Valencia?=) Date: Sun, 4 Sep 2011 09:31:34 -0300 Subject: [R] about raw type In-Reply-To: <31fc90bb-2668-4913-abcf-d8b3abee986b@email.android.com> References: <31fc90bb-2668-4913-abcf-d8b3abee986b@email.android.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From glnbrntt at gmail.com Sun Sep 4 09:40:36 2011 From: glnbrntt at gmail.com (GlenB) Date: Sun, 4 Sep 2011 17:40:36 +1000 Subject: [R] sd help In-Reply-To: <982c0b4f-fd16-4afd-88b9-175cfeec91a0@e34g2000prn.googlegroups.com> References: <982c0b4f-fd16-4afd-88b9-175cfeec91a0@e34g2000prn.googlegroups.com> Message-ID: The sd function is doing *exactly* what it should. the sd of 5 is "NA", and the sd of 7 is "NA" try looking at "a" itself. My guess is that you didn't intend to have two columns each with one value. 
Did you instead intend to have 5 and 7 in one vector, like so: a<-c(5,7) sd(a) Glen From jim.trabas at googlemail.com Sun Sep 4 09:52:34 2011 From: jim.trabas at googlemail.com (Jim Trabas) Date: Sun, 4 Sep 2011 00:52:34 -0700 (PDT) Subject: [R] How to understand the plotting of the cox.zph function Message-ID: <1315122754949-3788886.post@n4.nabble.com> I have a coxph model which gives me HR of about 2.9 for presence of factor B (factors can be A, B, C, A as baseline in the model), with 95% CI 1.8-4.8 , p<0.001. When checking the proportionality assumption there is significant evidence that there is a violation On the link is the results of the plot(cox.zph) of the model for factor A. http://img31.imageshack.us/img31/7213/coxzph.jpg My question is how should I understand the smoothing line of the graph, and what is its relation (and the relation of the values on the y-axis) to the beta estimate the coxph function gives me (2.9 for the above example) IF there was no violation and the line of the cox.zph plot was straight, would the y-value of the line be (in this example) log(2.9)=1.06? If there is no violation of the proportionality assumption, does the "intercept" of the line equal the log of the HR that the coxph outputs? Or is the intercept the delta of the beta? Thank you very much JT -- View this message in context: http://r.789695.n4.nabble.com/How-to-understand-the-plotting-of-the-cox-zph-function-tp3788886p3788886.html Sent from the R help mailing list archive at Nabble.com. From tewksjj at uw.edu Sun Sep 4 14:37:43 2011 From: tewksjj at uw.edu (Josh Tewksbury) Date: Sun, 4 Sep 2011 05:37:43 -0700 Subject: [R] conditional replacement of character strings in vectors In-Reply-To: <9A9D3A8E-D3C5-4EBF-8A59-68427BF0B5CC@comcast.net> References: <9A9D3A8E-D3C5-4EBF-8A59-68427BF0B5CC@comcast.net> Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From renaud.lancelot at gmail.com Sun Sep 4 09:51:31 2011 From: renaud.lancelot at gmail.com (Renaud Lancelot) Date: Sun, 4 Sep 2011 09:51:31 +0200 Subject: [R] AICc function with gls In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From d.rizopoulos at erasmusmc.nl Sun Sep 4 10:55:16 2011 From: d.rizopoulos at erasmusmc.nl (Dimitris Rizopoulos) Date: Sun, 04 Sep 2011 10:55:16 +0200 Subject: [R] Regression coefficient constraints In-Reply-To: References: Message-ID: <4E633CF4.6030209@erasmusmc.nl> One option is to reparameterize to an unconstrained problem using the transformation: exp(betas_i) / sum(exp(betas_i)), with e.g., beta_1 = 0, and then use optim() to maximize with respect to betas. I hope it helps. Best, Dimitris On 9/4/2011 7:50 AM, andre bedon wrote: > > Hi Guys, > Does anyone know how I could constrain my regression coefficients so that they are positive and add up to one? Any help will be greatly appreciated. > Kind Regards, > Andre > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Dimitris Rizopoulos Assistant Professor Department of Biostatistics Erasmus University Medical Center Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands Tel: +31/(0)10/7043478 Fax: +31/(0)10/7043014 Web: http://www.erasmusmc.nl/biostatistiek/ From conny-clauss at gmx.de Sun Sep 4 13:18:00 2011 From: conny-clauss at gmx.de (warc) Date: Sun, 4 Sep 2011 04:18:00 -0700 (PDT) Subject: [R] what is wrong with my quicksort?
In-Reply-To: References: <1315101080660-3788681.post@n4.nabble.com> Message-ID: <1315135080530-3789080.post@n4.nabble.com> the error message I get is: Error in while (x[j] >= pivot) { : Argument has length 0 so either pivot or x[j] is NULL. and it somestimes happens the first time the program enters the recursion, sometimes the 6. or anywhere inbetween. jholtman wrote: > > have you tried to debug it yourself. All you said is that 'it went > wrong'. that is not a very clear statement of the problem. If I were to > start looking at it, I would put some print statements in it to see what > is happening on eachpath and with each set of data. Have you tried this? > > Sent from my iPad > > On Sep 3, 2011, at 21:51, warc <conny-clauss at gmx.de> wrote: > >> Hey guys, >> I tried to program quicksort like this but somethings wrong. >> >> please help >> >> >> >>> partition <- function(x, links, rechts){ >>> >>> i <- links >>> j <- rechts >>> t <- 0 >>> pivot <- sample(x[i:j],1) >>> >>> while(i <= j){ >>> >>> while(x[i] <= pivot){ >>> i = i+1} >>> >>> while(x[j] >= pivot){ >>> j = j-1} >>> >>> if( i <= j){ >>> >>> t = x[i] >>> x[i] = x[j] >>> x[j] = t >>> >>> i=i+1 >>> j=j-1 >>> >>> } >>> print(pivot) >>> >>> >>> } >>> #Rekursion >>> >>> if(links < j){ >>> partition(x, links, j)} >>> if(i < rechts){ >>> partition(x, i, rechts)} >>> >>> return(x) >>> } >>> >>> >>> quicksort <- function(x){ >>> >>> >>> >>> partition(x, 1, length(x)) >>> } >> >> >> >> thx >> >> -- >> View this message in context: >> http://r.789695.n4.nabble.com/what-is-wrong-with-my-quicksort-tp3788681p3788681.html >> Sent from the R help mailing list archive at Nabble.com. >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. 
> > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- View this message in context: http://r.789695.n4.nabble.com/what-is-wrong-with-my-quicksort-tp3788681p3789080.html Sent from the R help mailing list archive at Nabble.com. From rosbreed.pba at gmail.com Sun Sep 4 15:25:02 2011 From: rosbreed.pba at gmail.com (John Clark) Date: Sun, 4 Sep 2011 09:25:02 -0400 Subject: [R] generating multiple dataset and applying function and output multiple output dataset...... Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From jimmaasuk at gmail.com Sun Sep 4 15:20:35 2011 From: jimmaasuk at gmail.com (Jim Maas) Date: Sun, 04 Sep 2011 14:20:35 +0100 Subject: [R] procedure Power in module Math Message-ID: <4E637B23.3080103@uea.ac.uk> Hi All, I'm attempting to compile and run a BUGS model using OpenBUGS from R and getting this message. I've googled around not found much, it may well be a problem with OpenBUGS and not R. Any suggestions or clues as to how to find the problem would be welcome. ****** Sorry something went wrong in procedure Power in module Math ****** Thanks a bunch, Jim From nashjc at uottawa.ca Sun Sep 4 18:55:28 2011 From: nashjc at uottawa.ca (John C Nash) Date: Sun, 04 Sep 2011 12:55:28 -0400 Subject: [R] Gradients in optimx In-Reply-To: <2F9EA67EF9AE1C48A147CB41BE2E15C3061C21@DOM-EB-MAIL2.win.ad.jhu.edu> References: <2F9EA67EF9AE1C48A147CB41BE2E15C3061C21@DOM-EB-MAIL2.win.ad.jhu.edu> Message-ID: <4E63AD80.2@uottawa.ca> I've started to work on this again, and can confirm there seems to be some sort of bug in the gradient test at the beginning of the current R-forge version of optimx. 
It is not something obvious, and looks like a mixup in arguments to functions, which has been an issue since I've been trying to trap NaN and Inf returns. Worse, making the control starttests = FALSE fails because there I inadvertently put the initial function calculation inside the block that does the tests. Sigh. Will try to get something done by end of this week. (This will be the R-forge version.) JN On 08/31/2011 09:31 AM, Ravi Varadhan wrote: > Hi Reuben, > > > > I am puzzled to note that the gradient check in "optimx" does not work for you. Can you > send me a reproducible example so that I can figure this out? > > > > John - I think the best solution for now is to issue a "warning" rather than an error > message, when the numerical gradient is not sufficiently close to the user-specified > gradient. > > > > Best, > > Ravi. > > > > ------------------------------------------------------- > > Ravi Varadhan, Ph.D. > > Assistant Professor, > > Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University > > > > Ph. (410) 502-2619 > > email: rvaradhan at jhmi.edu > > > From descostes at ciml.univ-mrs.fr Sun Sep 4 14:17:18 2011 From: descostes at ciml.univ-mrs.fr (Nico902) Date: Sun, 4 Sep 2011 05:17:18 -0700 (PDT) Subject: [R] mclust: modelName="E" vs modelName="V" Message-ID: <1315138638856-3789167.post@n4.nabble.com> Hi, I'm trying to use the library mclust for gaussian mixture on a numeric vector. The function Mclust(data,G=3) is working fine but the fitting is not optimal and is using modelNames="E". When I'm trying Mclust(data,G=3,modelName="V") I have the following message: Error in if (Sumry$G > 1) ans[c(orderedNames, "z")] else ans[orderedNames] : argument is of length zero In addition: Warning message: In pickBIC(object[as.character(G), modelNames, drop = FALSE], k = 3) : none of the selected models could be fitted Using variable variance would fit my data better, any idea how to do it? Thanks a lot.
-- View this message in context: http://r.789695.n4.nabble.com/mclust-modelName-E-vs-modelName-V-tp3789167p3789167.html Sent from the R help mailing list archive at Nabble.com. From J.Maas at uea.ac.uk Sun Sep 4 20:15:45 2011 From: J.Maas at uea.ac.uk (Maas James Dr (MED)) Date: Sun, 4 Sep 2011 19:15:45 +0100 Subject: [R] procedure Power in module Math Message-ID: <9C2B89830110BF4A845878D9A31F3D925AF4AF4FB4@UEAEXCHMBX.UEA.AC.UK> Hi All, I'm attempting to compile and run a BUGS model using OpenBUGS from R and getting this message. I've googled around not found much, it may well be a problem with OpenBUGS and not R. Any suggestions or clues as to how to find the problem would be welcome. I'm wondering if I'm missing a library on Linux or something? ****** Sorry something went wrong in procedure Power in module Math ****** Thanks a bunch, Jim =============================== Dr. Jim Maas University of East Anglia From ligges at statistik.tu-dortmund.de Sun Sep 4 20:28:58 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Sun, 04 Sep 2011 20:28:58 +0200 Subject: [R] procedure Power in module Math In-Reply-To: <9C2B89830110BF4A845878D9A31F3D925AF4AF4FB4@UEAEXCHMBX.UEA.AC.UK> References: <9C2B89830110BF4A845878D9A31F3D925AF4AF4FB4@UEAEXCHMBX.UEA.AC.UK> Message-ID: <4E63C36A.4060909@statistik.tu-dortmund.de> On 04.09.2011 20:15, Maas James Dr (MED) wrote: > > Hi All, > > I'm attempting to compile and run a BUGS model using OpenBUGS from R and > getting this message. I've googled around not found much, it may well > be a problem with OpenBUGS and not R. It is, since you are just using an interface from R to BUGS. Therefore, please ask on the BUGS mailing list. Best, Uwe Ligges Any suggestions or clues as to > how to find the problem would be welcome. I'm wondering if I'm missing a library on Linux or something? 
> > ****** Sorry something went wrong in procedure Power in module Math ****** > > > Thanks a bunch, > > Jim > > > =============================== > Dr. Jim Maas > University of East Anglia > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From pdalgd at gmail.com Sun Sep 4 20:09:19 2011 From: pdalgd at gmail.com (peter dalgaard) Date: Sun, 4 Sep 2011 20:09:19 +0200 Subject: [R] what is wrong with my quicksort? In-Reply-To: <1315135080530-3789080.post@n4.nabble.com> References: <1315101080660-3788681.post@n4.nabble.com> <1315135080530-3789080.post@n4.nabble.com> Message-ID: <39E2DB22-0BC3-41F9-9659-136808CE06B4@gmail.com> On Sep 4, 2011, at 13:18 , warc wrote: > the error message I get is: > > Error in while (x[j] >= pivot) { : Argument has length 0 > > so either pivot or x[j] is NULL. > and it somestimes happens the first time the program enters the recursion, > sometimes the 6. or anywhere inbetween. > Well, then print out x, j, and pivot just before hitting that test (i.e., before the loop and at the end of it). With sample() in the code, you will naturally get different results at each run. It's your problem, so your debugging, but I'd wager that nothing is keeping j from hitting zero if you sample a pivot equal to min(x). > > > jholtman wrote: >> >> have you tried to debug it yourself. All you said is that 'it went >> wrong'. that is not a very clear statement of the problem. If I were to >> start looking at it, I would put some print statements in it to see what >> is happening on eachpath and with each set of data. Have you tried this? >> >> Sent from my iPad >> >> On Sep 3, 2011, at 21:51, warc <conny-clauss at gmx.de> wrote: >> >>> Hey guys, >>> I tried to program quicksort like this but somethings wrong. 
>>> >>> please help >>> >>> >>> >>>> partition <- function(x, links, rechts){ >>>> >>>> i <- links >>>> j <- rechts >>>> t <- 0 >>>> pivot <- sample(x[i:j],1) >>>> >>>> while(i <= j){ >>>> >>>> while(x[i] <= pivot){ >>>> i = i+1} >>>> >>>> while(x[j] >= pivot){ >>>> j = j-1} >>>> >>>> if( i <= j){ >>>> >>>> t = x[i] >>>> x[i] = x[j] >>>> x[j] = t >>>> >>>> i=i+1 >>>> j=j-1 >>>> >>>> } >>>> print(pivot) >>>> >>>> >>>> } >>>> #Rekursion >>>> >>>> if(links < j){ >>>> partition(x, links, j)} >>>> if(i < rechts){ >>>> partition(x, i, rechts)} >>>> >>>> return(x) >>>> } >>>> >>>> >>>> quicksort <- function(x){ >>>> >>>> >>>> >>>> partition(x, 1, length(x)) >>>> } >>> >>> >>> >>> thx >>> >>> -- >>> View this message in context: >>> http://r.789695.n4.nabble.com/what-is-wrong-with-my-quicksort-tp3788681p3788681.html >>> Sent from the R help mailing list archive at Nabble.com. >>> >>> ______________________________________________ >>> R-help at r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide >>> http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > > -- > View this message in context: http://r.789695.n4.nabble.com/what-is-wrong-with-my-quicksort-tp3788681p3789080.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
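To illustrate the point about j running off the front of the vector, here is a small sketch with made-up data (the vector and the choice of pivot are mine, not from the original post). The scan below adds a guard so it can be run safely; the posted code has no such guard and so goes on to evaluate x[0] >= pivot.

```r
x <- c(5, 3, 8, 3)
pivot <- min(x)              # worst case for the downward scan
j <- length(x)

# Same scan as in the posted code, but guarded so it stops before indexing x[0]
while (j >= 1 && x[j] >= pivot) j <- j - 1
j                            # every element is >= pivot, so j ends up at 0

# Without the guard, the next test would be x[0] >= pivot, which has length zero
cmp <- x[0] >= pivot
length(cmp)

# The upward scan has the mirror-image problem: indexing past the end gives NA
is.na(x[length(x) + 1])
```

Two possible ways out, neither tested against the rest of the posted code: use strict comparisons (< and >) so that the pivot itself stops each scan, or add bounds checks such as j > links and i < rechts to the inner loops.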
-- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com "Døden skal tape!" --- Nordahl Grieg From awesolow at andrew.cmu.edu Sun Sep 4 20:26:23 2011 From: awesolow at andrew.cmu.edu (awesolow) Date: Sun, 4 Sep 2011 11:26:23 -0700 (PDT) Subject: [R] Coloring Dirichlet Tiles Message-ID: <1315160783676-3789746.post@n4.nabble.com> Hi, I have a set of x, y points (longitude/latitude) along with a z value representing an attribute at each point. I want to create a Voronoi/Dirichlet tessellation of these points coloring each tile by the z value. I tried searching for a way to solve this and it was suggested to use the dirichlet() command to get the correct coloring. However, my coloring is not correct. Any thoughts? Thanks in advance. -- View this message in context: http://r.789695.n4.nabble.com/Coloring-Dirichlet-Tiles-tp3789746p3789746.html Sent from the R help mailing list archive at Nabble.com. From michael.weylandt at gmail.com Sun Sep 4 20:46:31 2011 From: michael.weylandt at gmail.com (R. Michael Weylandt) Date: Sun, 4 Sep 2011 13:46:31 -0500 Subject: [R] Bootstrapping a covariance matrix In-Reply-To: <1315089286925-3788553.post@n4.nabble.com> References: <1315089286925-3788553.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From friendly at yorku.ca Sun Sep 4 21:08:25 2011 From: friendly at yorku.ca (Michael Friendly) Date: Sun, 04 Sep 2011 15:08:25 -0400 Subject: [R] Lmer plot help In-Reply-To: <1315093971791-3788613.post@n4.nabble.com> References: <1315093971791-3788613.post@n4.nabble.com> Message-ID: <4E63CCA9.4090706@yorku.ca> On 9/3/2011 7:52 PM, drewmac wrote: > Hello all > > I'm running the lme4 package on my binomial data, and I'm happy with the > model and the resultant plot. However, I'd like to plot my table data, which > has: two IVs, and one DV.
You can see an example below, where 'attractive' > = question (IV), male = condition(IV/predictor) and no/yes = answer (dv). > I'm using the table to investigate what questions act differently to the > others, so I can better fit my model. Going through tables of numbers > doesn't seem the most efficient way of instantly seeing what questions work > differently, and I'd like to plot that. Modulo the lme4 reference, for which you provide no data, details, code or context, you will probably find some suitable visualization methods in the vcd package, with a tutorial vignette and some extensions in the vcdExtra package. These include mosaic plots, fourfold plots, and a variety of specialized plots within the strucplot framework, which have close relations to models for n-way frequency tables. > > Here is my code: > > table(finaldata$Voice, finaldata$supportive, finaldata$question) > #generates my table# > > From your description above and the output below, it is not clear whether you just want to view the associations within this table or to compare the associations across the elided levels of finaldata$question. Maybe somethings like [untested] mytab <- table(finaldata$Voice, finaldata$supportive, finaldata$question) mosaic(Voice, supportive, data=mytab) mosaic(supportive ~ Voice|question, data=mytab) would get you started. Also, you have 4 levels for finaldata$Voice, which seem to imply that these might be a 2x2 combination of Voice.gender and Voice.type or something like that. > > > , , = attractive > > > no yes > male1 28 35 > male2 20 22 > female1 21 21 > female2 30 19 > > Any help most appreciated. > > Drew > > -- > View this message in context: http://r.789695.n4.nabble.com/Lmer-plot-help-tp3788613p3788613.html > Sent from the R help mailing list archive at Nabble.com. > -- Michael Friendly Email: friendly AT yorku DOT ca Professor, Psychology Dept. 
York University Voice: 416 736-5115 x66249 Fax: 416 736-5814 4700 Keele Street Web: http://www.datavis.ca Toronto, ONT M3J 1P3 CANADA From markleeds2 at gmail.com Sun Sep 4 21:53:31 2011 From: markleeds2 at gmail.com (Mark Leeds) Date: Sun, 4 Sep 2011 15:53:31 -0400 Subject: [R] what is wrong with my quicksort? In-Reply-To: <1315135080530-3789080.post@n4.nabble.com> References: <1315101080660-3788681.post@n4.nabble.com> <1315135080530-3789080.post@n4.nabble.com> Message-ID: Hi: I sent this yesterday but it must be out there in hyperspace somewhere because it never appeared on the R-list. Also, I had to look quicksort up because it's been too long. Anyway, your code looks similar to the java code at http://www.algolist.net/Algorithms/Sorting/Quicksort. and I show it below. The difference is that you are calling partition recursively while the code below calls quicksort recursively. that probably makes a difference but I didn't test it. hopefully that's the problem. good luck. # quicksort Java code #================================================================ int partition(int arr[], int left, int right) { int i = left, j = right; int tmp; int pivot = arr[(left + right) / 2]; while (i <= j) { while (arr[i] < pivot) i++; while (arr[j] > pivot) j--; if (i <= j) { tmp = arr[i]; arr[i] = arr[j]; arr[j] = tmp; i++; j--; } }; return i; } void quickSort(int arr[], int left, int right) { int index = partition(arr, left, right); if (left < index - 1) quickSort(arr, left, index - 1); if (index < right) quickSort(arr, index, right); } On Sun, Sep 4, 2011 at 7:18 AM, warc wrote: > the error message I get is: > > ? ? ? ? ? Error in while (x[j] >= pivot) { : Argument has length 0 > > so either pivot or x[j] is NULL. > ?and it somestimes happens the first time the program enters the recursion, > sometimes the 6. or anywhere inbetween. > > > > jholtman wrote: >> >> have you tried to debug it yourself. ?All you said is that 'it went >> wrong'. 
that is not a very clear statement of the problem. If I were to >> start looking at it, I would put some print statements in it to see what >> is happening on each path and with each set of data. Have you tried this? >> >> Sent from my iPad >> >> On Sep 3, 2011, at 21:51, warc <conny-clauss at gmx.de> wrote: >> >>> Hey guys, >>> I tried to program quicksort like this but somethings wrong. >>> >>> please help >>> >>> >>> >>>> partition <- function(x, links, rechts){ >>>> >>>> i <- links >>>> j <- rechts >>>> t <- 0 >>>> pivot <- sample(x[i:j],1) >>>> >>>> while(i <= j){ >>>> >>>> while(x[i] <= pivot){ >>>> i = i+1} >>>> >>>> while(x[j] >= pivot){ >>>> j = j-1} >>>> >>>> if( i <= j){ >>>> >>>> t = x[i] >>>> x[i] = x[j] >>>> x[j] = t >>>> >>>> i=i+1 >>>> j=j-1 >>>> >>>> } >>>> print(pivot) >>>> >>>> >>>> } >>>> #Rekursion >>>> >>>> if(links < j){ >>>> partition(x, links, j)} >>>> if(i < rechts){ >>>> partition(x, i, rechts)} >>>> >>>> return(x) >>>> } >>>> >>>> >>>> quicksort <- function(x){ >>>> >>>> >>>> >>>> partition(x, 1, length(x)) >>>> } >>> >>> >>> >>> thx >>> >>> -- >>> View this message in context: >>> http://r.789695.n4.nabble.com/what-is-wrong-with-my-quicksort-tp3788681p3788681.html >>> Sent from the R help mailing list archive at Nabble.com. >>> >>> ______________________________________________ >>> R-help at r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide >>> http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code.
>> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > > -- > View this message in context: http://r.789695.n4.nabble.com/what-is-wrong-with-my-quicksort-tp3788681p3789080.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From conny-clauss at gmx.de Sun Sep 4 22:08:20 2011 From: conny-clauss at gmx.de (warc) Date: Sun, 4 Sep 2011 13:08:20 -0700 (PDT) Subject: [R] what is wrong with my quicksort? In-Reply-To: <1315101080660-3788681.post@n4.nabble.com> References: <1315101080660-3788681.post@n4.nabble.com> Message-ID: <1315166900062-3789902.post@n4.nabble.com> Hey again and thanks for all the help this is what i have for now but it still doesn't work, the main problem is the random pivot i think (error in while (x[j] >= pivot) { : Argument has length 0) >partition <- function(x, links, rechts){ > > i <- links > j <- rechts > t <- 0 > pivot <- x[sample((links:rechts),1)] > > > while(i <= j){ > > while(x[i] <= pivot){ > i = i+1} > > while(x[j] >= pivot){ > j = j-1} > > if( i <= j){ > > > t = x[i] > x[i] = x[j] > x[j] = t > > i=i+1 > j=j-1 > > } > } > return(pivot) > } > >qsort <- function(x, links, rechts){ > > index <- partition(x, links, rechts) > > if((links < (index+1))&(length(x)>1)){ > qsort(x, links, index+1)} > > > if((index < rechts)&(length(x)>1)){ > qsort(x, index, rechts)} > > return(x) > } > > >quicksort <- function(x){ > > if(length(x) == 0)stop("empty Vector") > > qsort(x, 1, length(x)) >} but whatever i will just 
keep on trying

thank you again

--
View this message in context: http://r.789695.n4.nabble.com/what-is-wrong-with-my-quicksort-tp3788681p3789902.html
Sent from the R help mailing list archive at Nabble.com.

From markleeds2 at gmail.com Sun Sep 4 22:46:38 2011
From: markleeds2 at gmail.com (Mark Leeds)
Date: Sun, 4 Sep 2011 16:46:38 -0400
Subject: [R] what is wrong with my quicksort?
In-Reply-To: <1315166900062-3789902.post@n4.nabble.com>
References: <1315101080660-3788681.post@n4.nabble.com>
 <1315166900062-3789902.post@n4.nabble.com>
Message-ID:

hi: the link i sent presents a data example and steps through it with
beautiful figures for each step. why don't you take their data example, use
it in your code and put browser() at the top of your partition function so
you can step using n. your steps should result in EXACTLY the same steps
given in the link. this way you can see what's happening with a working
example.

On Sun, Sep 4, 2011 at 4:08 PM, warc wrote:
> Hey again and thanks for all the help
>
> this is what i have for now but it still doesn't work, the main problem is
> the random pivot i think
> (error in while (x[j] >= pivot) { : Argument has length 0)
>
>>partition <- function(x, links, rechts){
>>
>>       i <- links
>>       j <- rechts
>>       t <- 0
>>       pivot <- x[sample((links:rechts),1)]
>>
>>
>>       while(i <= j){
>>
>>               while(x[i] <= pivot){
>>                      i = i+1}
>>
>>               while(x[j] >= pivot){
>>                       j = j-1}
>>
>>               if( i <= j){
>>
>>
>>                       t = x[i]
>>                       x[i] = x[j]
>>                       x[j] = t
>>
>>                       i=i+1
>>                       j=j-1
>>
>>                       }
>>               }
>>               return(pivot)
>>               }
>>
>>qsort <- function(x, links, rechts){
>>
>>       index <- partition(x, links, rechts)
>>
>>       if((links < (index+1))&(length(x)>1)){
>>               qsort(x, links, index+1)}
>>
>>
>>       if((index < rechts)&(length(x)>1)){
>>               qsort(x, index, rechts)}
>>
>>       return(x)
>>       }
>>
>>
>>quicksort <- function(x){
>>
>>               if(length(x) == 0)stop("empty Vector")
>>
>>               qsort(x, 1, length(x))
>>}
>
>
>
> but whatever
> i will just keep on trying
>
> thank you again
>
> --
> View this message in context: http://r.789695.n4.nabble.com/what-is-wrong-with-my-quicksort-tp3788681p3789902.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

From baptiste.auguie at googlemail.com Sun Sep 4 22:59:29 2011
From: baptiste.auguie at googlemail.com (baptiste auguie)
Date: Mon, 5 Sep 2011 08:59:29 +1200
Subject: [R] Coloring Dirichlet Tiles
In-Reply-To: <1315160783676-3789746.post@n4.nabble.com>
References: <1315160783676-3789746.post@n4.nabble.com>
Message-ID:

Hi,

Try this,

d <- data.frame(x=runif(1e3, 0, 30), y=runif(1e3, 0, 30))
d$z = (d$x - 15)^2 + (d$y - 15)^2

library(spatstat)
library(maptools)

W <- ripras(d$x, d$y, shape="rectangle")
W <- owin(c(0, 30), c(0, 30))
X <- as.ppp(d, W=W)
Y <- dirichlet(X)
Z <- as(Y, "SpatialPolygons")
plot(Z, col=grey(d$z/max(d$z)))

and also panel.voronoi in latticeExtra.

Unfortunately I do not know of a solution that uses more efficient
algorithms for computing the Dirichlet tessellation and extracting
tiles than those relying on deldir. The Triangle package (r-forge)
looks promising.

HTH,

baptiste

On 5 September 2011 06:26, awesolow wrote:
> Hi,
>
> I have a set of x, y points (longitude/latitude) along with a z value
> representing an attribute at each point.  I want to create a
> Voronoi/Dirichlet tesselation of these points coloring each tile by the z
> value.  I tried searching for a way to solve this and it was suggested to
> use the dirichlet() command to get the correct coloring.  However, my
> coloring is not correct.
>
> Any thoughts?
>
> Thanks in advance.
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Coloring-Dirichlet-Tiles-tp3789746p3789746.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

From chrish at stats.ucl.ac.uk Sun Sep 4 23:14:57 2011
From: chrish at stats.ucl.ac.uk (Christian Hennig)
Date: Sun, 4 Sep 2011 22:14:57 +0100 (BST)
Subject: [R] mclust: modelName="E" vs modelName="V"
In-Reply-To: <1315138638856-3789167.post@n4.nabble.com>
References: <1315138638856-3789167.post@n4.nabble.com>
Message-ID:

This normally happens if the algorithm gets caught in a solution where one
of the components has variance converging to zero.
One way of dealing with this is the use of a prior that penalises too small
variances. This works through the prior argument of Mclust (the
defaultPrior should do the trick but I currently don't have the time to
figure out again how to do this precisely; I have done it before with
success).

Another option is to have a look at the flexmix package.

Best regards,
Christian

On Sun, 4 Sep 2011, Nico902 wrote:

> Hi,
>
> I'm trying to use the library mclust for gaussian mixture on a numeric
> vector. The function Mclust(data,G=3) is working fine but the fitting is not
> optimal and is using modelNames="E".
When I'm trying > Mclust(data,G=3,modelName="V") I have the following message: > > Error in if (Sumry$G > 1) ans[c(orderedNames, "z")] else ans[orderedNames] : > argument is of length zero > In addition: Warning message: > In pickBIC(object[as.character(G), modelNames, drop = FALSE], k = 3) : > none of the selected models could be fitted > > > Using variable variance would fit my data better, any idea how to do it? > > Thanks a lot. > > -- > View this message in context: http://r.789695.n4.nabble.com/mclust-modelName-E-vs-modelName-V-tp3789167p3789167.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > *** --- *** Christian Hennig University College London, Department of Statistical Science Gower St., London WC1E 6BT, phone +44 207 679 1698 chrish at stats.ucl.ac.uk, www.homepages.ucl.ac.uk/~ucakche From rock.ouimet at gmail.com Sun Sep 4 22:38:02 2011 From: rock.ouimet at gmail.com (RockO) Date: Sun, 4 Sep 2011 13:38:02 -0700 (PDT) Subject: [R] ROCR package question for evaluating two regression models In-Reply-To: <1315009964.93061.YahooMailClassic@web120609.mail.ne1.yahoo.com> References: <1315009964.93061.YahooMailClassic@web120609.mail.ne1.yahoo.com> Message-ID: <1315168682507-3789946.post@n4.nabble.com> Hi Andra, I have been doing some ROC analysis for a new diagnosis test. I used the pROC package to assess thresholds and compare different diagnosis tests to a "gold standard". In your case, let say the gold standard are the observed values y0. 
Here is an example:

y0 <- sample(0:1,50,replace=TRUE)     # Make observed binomial values
test1 <- sample(0:100,50,replace=TRUE)/100
y1 <- ifelse(y0==0,test1,1-test1)     # Make first predicted model values
test2 <- sample(0:100,50,replace=TRUE)/100
y2 <- ifelse(y0==0,test2,1-test2)     # Make 2nd predicted model values

library(pROC)
i1 <- roc(response=y0,predictor=y1,percent=TRUE, plot=TRUE,
          of="thresholds",ci=TRUE, lwd=1,lty=2,thresholds="best", asp=1)
i2 <- roc(response=y0,predictor=y2,percent=TRUE, plot=TRUE,
          of="thresholds",ci=TRUE, lwd=1,lty=3,thresholds="best", add=TRUE)

coords(i1,x="best",best.method="youden")  # Best threshold of y1 with the Youden index
coords(i2,x="best",best.method="youden")  # Best threshold of y2 with the Youden index

roc.test(i1,i2)   # Compare the two ROC curves (by default, their AUCs)

See ?pROC for more details.

Hope this helps,

Rock

--
View this message in context: http://r.789695.n4.nabble.com/ROCR-package-question-for-evaluating-two-regression-models-tp3787301p3789946.html
Sent from the R help mailing list archive at Nabble.com.

From sharma.ram.h at gmail.com Sun Sep 4 23:20:35 2011
From: sharma.ram.h at gmail.com (Ram H. Sharma)
Date: Sun, 4 Sep 2011 17:20:35 -0400
Subject: [R] output and save multiple dataset from a function: sorry I could
 not figure out this....
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From andrew.macfarlane at pg.canterbury.ac.nz Sun Sep 4 23:50:28 2011
From: andrew.macfarlane at pg.canterbury.ac.nz (Andrew MacFarlane)
Date: Mon, 05 Sep 2011 09:50:28 +1200
Subject: [R] Lmer plot help
References: <1315093971791-3788613.post@n4.nabble.com>
 <4E63CCA9.4090706@yorku.ca>
Message-ID: <320E04E2113DE340B5ACC6B48492CB29162533@ucexchange7.canterbury.ac.nz>

Thanks for that Michael. My model code, and resultant output, is:

model = lmer(supportive ~ Voice + question +(1|participant), data=voice,
family="binomial")

#Fixed effects:
#            Estimate Std.
Error z value Pr(>|z|) #(Intercept) 1.3981 0.2140 6.534 6.42e-11 *** #Voice1 0.5781 0.2340 2.470 0.0135 * #Voice2 0.5189 0.2333 2.225 0.0261 * #Voice3 0.2275 0.2205 1.032 0.3022 #Voice4 0.3760 0.2215 1.698 0.0896 . #question1 -2.2065 0.2213 -9.970 < 2e-16 *** #question2 -1.6641 0.2188 -7.605 2.86e-14 *** #question3 -1.1896 0.2211 -5.380 7.46e-08 *** plotLMER.fnc(model, pred="Voice") If I add question to that plot (e.g.plotLMER.fnc(model, pred="Voice, "question"") , then it looks very messy and essentially unreadable. I'm looking at how the voice 1) influences support/non-support for the questions. I have already excluded sex/age/ethnicity from my analysis to better fit the model. I'm not near R just now, but look forward to trying your suggestions. While it's in my head, do you know a method for asking the lme to list *ALL* the IVs? I have 5 voices (and 4 questions), but it lists the effects of 4 voices/3questions, similarly if I run summary (model). Best Andrew E. MacFarlane PhD student New Zealand Institute of Language, Brain and Behaviour University of Canterbury | Private Bag 4800 Christchurch | New Zealand 8140 http://www.nzilbb.canterbury.ac.nz/macfarlane.shtml ________________________________ From: Michael Friendly [mailto:friendly at yorku.ca] Sent: Mon 5/09/2011 7:08 a.m. To: Andrew MacFarlane Cc: r-help at r-project.org Subject: Re: Lmer plot help On 9/3/2011 7:52 PM, drewmac wrote: > Hello all > > I'm running the lme4 package on my binomial data, and I'm happy with the > model and the resultant plot. However, I'd like to plot my table data, which > has: two IVs, and one DV. You can see an example below, where 'attractive' > = question (IV), male = condition(IV/predictor) and no/yes = answer (dv). > I'm using the table to investigate what questions act differently to the > others, so I can better fit my model. 
Going through tables of numbers > doesn't seem the most efficient way of instantly seeing what questions work > differently, and I'd like to plot that. Modulo the lme4 reference, for which you provide no data, details, code or context, you will probably find some suitable visualization methods in the vcd package, with a tutorial vignette and some extensions in the vcdExtra package. These include mosaic plots, fourfold plots, and a variety of specialized plots within the strucplot framework, which have close relations to models for n-way frequency tables. > > Here is my code: > > table(finaldata$Voice, finaldata$supportive, finaldata$question) > #generates my table# > > From your description above and the output below, it is not clear whether you just want to view the associations within this table or to compare the associations across the elided levels of finaldata$question. Maybe somethings like [untested] mytab <- table(finaldata$Voice, finaldata$supportive, finaldata$question) mosaic(Voice, supportive, data=mytab) mosaic(supportive ~ Voice|question, data=mytab) would get you started. Also, you have 4 levels for finaldata$Voice, which seem to imply that these might be a 2x2 combination of Voice.gender and Voice.type or something like that. > > > , , = attractive > > > no yes > male1 28 35 > male2 20 22 > female1 21 21 > female2 30 19 > > Any help most appreciated. > > Drew > > -- > View this message in context: http://r.789695.n4.nabble.com/Lmer-plot-help-tp3788613p3788613.html > Sent from the R help mailing list archive at Nabble.com. > -- Michael Friendly Email: friendly AT yorku DOT ca Professor, Psychology Dept. York University Voice: 416 736-5115 x66249 Fax: 416 736-5814 4700 Keele Street Web: http://www.datavis.ca Toronto, ONT M3J 1P3 CANADA This email may be confidential and subject to legal privilege, it may not reflect the views of the University of Canterbury, and it is not guaranteed to be virus free. 
If you are not an intended recipient, please notify the sender immediately and erase all copies of the message and any attachments. Please refer to http://www.canterbury.ac.nz/emaildisclaimer for more information. From joonsumin at gmail.com Mon Sep 5 00:25:41 2011 From: joonsumin at gmail.com (joonsum) Date: Sun, 4 Sep 2011 15:25:41 -0700 (PDT) Subject: [R] combining rows Message-ID: <1315175141964-3790068.post@n4.nabble.com> First time using R and have so many basic questions. The problem that I have confronted is combining rows. I have a data frame that contains daily rain falls from 60 to 80. There are 27 columns which are Year,month, day, and record in hours. I am trying to combine the 4th column to the 27th to get daily rain fall data. rowSums() works in the case of merging all rows but in my case, I need to be selective. How should I start? -- View this message in context: http://r.789695.n4.nabble.com/combining-rows-tp3790068p3790068.html Sent from the R help mailing list archive at Nabble.com. From Bettina.Gruen at jku.at Mon Sep 5 00:44:08 2011 From: Bettina.Gruen at jku.at (Bettina Gruen) Date: Mon, 05 Sep 2011 08:44:08 +1000 Subject: [R] betareg question - keeping the mean fixed? In-Reply-To: <1314955240369-3785683.post@n4.nabble.com> References: <1314876482055-3783303.post@n4.nabble.com> <4E5F8793.1090308@jku.at> <1314955240369-3785683.post@n4.nabble.com> Message-ID: <4E63FF38.9090709@jku.at> On 09/02/2011 07:20 PM, betty_d wrote: > Thanks for your response, that does work, however, it is still not quite what > want. I would like to tell betareg what the mean is (in my case, 0.5) and > force it to use that value. Is this possible? AFAIK package betareg currently does not allow you to fix the mean and only estimate the precision parameters. 
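
[Editor's sketch] For the intercept-only case discussed above (mean known to be 0.5, only the precision unknown), one possible workaround outside betareg is to maximise the beta likelihood directly in base R. This is a minimal sketch under stated assumptions, not a betareg feature: the data vector y is simulated, and the names mu, negll and phi_hat are hypothetical. It uses the usual mean/precision parameterisation, shape1 = mu * phi and shape2 = (1 - mu) * phi.

```r
# Hypothetical data: proportions in (0, 1), simulated so the true mean is 0.5
set.seed(1)
y <- rbeta(200, shape1 = 4, shape2 = 4)   # true precision phi = 8

mu <- 0.5   # mean held fixed, not estimated

# Negative log-likelihood as a function of log(phi);
# working on the log scale keeps phi positive.
negll <- function(logphi) {
  phi <- exp(logphi)
  -sum(dbeta(y, shape1 = mu * phi, shape2 = (1 - mu) * phi, log = TRUE))
}

# One-dimensional minimisation over log(phi)
fit <- optimize(negll, interval = c(-5, 10))
phi_hat <- exp(fit$minimum)
phi_hat
```

With covariates on the precision, the same idea would need optim() and a linear predictor for log(phi); that extension is not shown here.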
Best,
Bettina

--
-------------------------------------------------------------------
Bettina Grün
Institut für Angewandte Statistik / IFAS
Johannes Kepler Universität Linz
Altenbergerstraße 69
4040 Linz, Austria

Tel: +43 732 2468-5889
Fax: +43 732 2468-9846
E-Mail: Bettina.Gruen at jku.at
www.ifas.jku.at

From kang.tu.rfan at gmail.com Mon Sep 5 02:16:31 2011
From: kang.tu.rfan at gmail.com (Kang Tu)
Date: Sun, 04 Sep 2011 17:16:31 -0700
Subject: [R] combining rows
In-Reply-To: <1315175141964-3790068.post@n4.nabble.com>
References: <1315175141964-3790068.post@n4.nabble.com>
Message-ID: <1315181791.9235.3.camel@K-studio>

Would you mind showing us a simple example of your data? It is hard to
understand your request directly from your text.

If you just want to combine the data, you can try the cbind() function
directly, or you can use the subset() function to get a subset of your
data.frame. If you want to selectively aggregate some statistics you can
try the aggregate() function. If you want a more complex aggregation, you
may want to try ddply() in the 'plyr' package.

On Sun, 2011-09-04 at 15:25 -0700, joonsum wrote:
> First time using R and have so many basic questions.
>
> The problem that I have confronted is combining rows. I have a data frame
> that contains daily rain falls from 60 to 80. There are 27 columns which are
> Year,month, day, and record in hours.
>
> I am trying to combine the 4th column to the 27th to get daily rain fall
> data.
> rowSums() works in the case of merging all rows but in my case, I need to be
> selective. How should I start?
>
>
> --
> View this message in context: http://r.789695.n4.nabble.com/combining-rows-tp3790068p3790068.html
> Sent from the R help mailing list archive at Nabble.com.
> > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From carl at witthoft.com Mon Sep 5 02:23:04 2011 From: carl at witthoft.com (Carl Witthoft) Date: Sun, 04 Sep 2011 20:23:04 -0400 Subject: [R] question with uniroot function Message-ID: <4E641668.4030405@witthoft.com> As others will tell you: you need to provide a reproducible example. What are p1, u1, u2 ? Dear all, I have the following problem with the uniroot function. I want to find roots for the fucntion "Fp2" which is defined as below. Fz <- function(z){0.8*pnorm(z)+p1*pnorm(z-u1)+(0.2-p1)*pnorm(z-u2)} Fp <- function(t){(1-Fz(abs(qnorm(1-(t/2)))))+(Fz(-abs(qnorm(1-(t/2)))))} Fp2 <- function(t) {Fp(t)-0.8*t/alpha} th <- uniroot(Fp2, lower =0, upper =1, tol = 0.0001)$root The result is 0 as shown below. > th [1] 0 However, there should be a root between 0.00952 and 0.00955, since the function values are of opposite signs as below. > Fp2(0.00952) [1] 2.264272e-05 > Fp2(0.00955) [1] -0.0003657404 Can any one give me a hand here? Thanks a lot. Hannah -- ----- Sent from my Cray XK6 From michael.weylandt at gmail.com Mon Sep 5 02:35:57 2011 From: michael.weylandt at gmail.com (R. Michael Weylandt) Date: Sun, 4 Sep 2011 19:35:57 -0500 Subject: [R] combining rows In-Reply-To: <1315181791.9235.3.camel@K-studio> References: <1315175141964-3790068.post@n4.nabble.com> <1315181791.9235.3.camel@K-studio> Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available
URL:

From dwinsemius at comcast.net Mon Sep 5 04:13:46 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Sun, 4 Sep 2011 22:13:46 -0400
Subject: [R] combining rows
In-Reply-To:
References: <1315175141964-3790068.post@n4.nabble.com>
 <1315181791.9235.3.camel@K-studio>
Message-ID: <55B016CE-E5E1-4F1B-9990-492213BBE71C@comcast.net>

On Sep 4, 2011, at 8:35 PM, R. Michael Weylandt wrote:

> Kang Tu is right and you will certainly need to learn the techniques being
> suggested to do more advanced data analysis, but it sounds like for your
> immediate problem the following might work:
>
> # Suppose df is your data frame
>
>
> This takes a sub-data-frame consisting only of columns 4 through 27 and does
> rowSums on them.

And to take this a bit further you could record those results in the
same dataframe with:

df$dailySums <- rowSums(df[,4:27])

And print them out with:

df[ , c("Year", "Month", "day", "dailySums") ]

Or assign them to a daySummary

dayRainSummary <- df[ , c("Year", "Month", "day", "dailySums") ]
dayRainSummary$date <- with(dayRainSummary,
     as.POSIXct(paste(Year, Month, day, sep="-"), origin="1970-01-01") )
with(dayRainSummary, plot(date, dailySums) )
save(dayRainSummary, file="daySummary.rda")

>
> Hope this helps,
>
> Michael Weylandt
>
> On Sun, Sep 4, 2011 at 7:16 PM, Kang Tu wrote:
>
>> Would you mind showing us a simple example of your data? It is hard to
>> understand your request directly from your text.
>>
>> If you just want to combine the data, you can try cbind() function
>> directly, or you can use subset() function to get a subset of your
>> data.frame. If you want to selectively aggregate some statistics you can
>> try aggregate() function. If you want a more complex aggregation, you
>> may want to try ddply() in 'plyr' package.
>>
>> On Sun, 2011-09-04 at 15:25 -0700, joonsum wrote:
>>> First time using R and have so many basic questions.
>>>
>>> The problem that I have confronted is combining rows.
I have a >>> data frame >>> that contains daily rain falls from 60 to 80. There are 27 columns >>> which >> are >>> Year,month, day, and record in hours. >>> >>> I am trying to combine the 4th column to the 27th to get daily >>> rain fall >>> data. >>> rowSums() works in the case of merging all rows but in my case, I >>> need to >> be >>> selective. How should I start? >>> >>> >>> -- >>> View this message in context: >> http://r.789695.n4.nabble.com/combining-rows-tp3790068p3790068.html >>> Sent from the R help mailing list archive at Nabble.com. >>> >>> ______________________________________________ >>> R-help at r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT From jwelsh at sdibr.org Mon Sep 5 04:02:56 2011 From: jwelsh at sdibr.org (John Welsh) Date: Sun, 4 Sep 2011 19:02:56 -0700 (PDT) Subject: [R] savePlot with % in character string Message-ID: <1315188176429-3790227.post@n4.nabble.com> This occurred after I installed R x64 2.13.1 on Windows: savePlot("95%.winners.wmf") saves the file as: "951nners.wmf" Is this the correct behavior, or have I bungled something? John Welsh, Ph.D. 
Associate Professor Molecular and Cancer Biology Vaccine Research Institute of San Diego 10835 Road to the Cure San Diego, CA 92121 Phone: (858) 581-3960 ex.248 Email: jwelsh at sdibr.org -- View this message in context: http://r.789695.n4.nabble.com/savePlot-with-in-character-string-tp3790227p3790227.html Sent from the R help mailing list archive at Nabble.com. From rolf.turner at xtra.co.nz Mon Sep 5 02:51:14 2011 From: rolf.turner at xtra.co.nz (Rolf Turner) Date: Mon, 05 Sep 2011 12:51:14 +1200 Subject: [R] Coloring Dirichlet Tiles In-Reply-To: <1315160783676-3789746.post@n4.nabble.com> References: <1315160783676-3789746.post@n4.nabble.com> Message-ID: <4E641D02.6070107@xtra.co.nz> On 05/09/11 06:26, awesolow wrote: > Hi, > > I have a set of x, y points (longitude/latitude) along with a z value > representing an attribute at each point. I want to create a > Voronoi/Dirichlet tesselation of these points coloring each tile by the z > value. I tried searching for a way to solve this and it was suggested to > use the dirichlet() command to get the correct coloring. However, my > coloring is not correct. > > Any thoughts? Does something like this do what you want? require(deldir) set.seed(42) x <- runif(20) y <- runif(20) z <- sample(1:6,20,TRUE) d <- deldir(x,y) td <- tile.list(d) plot(td,polycol=z,close=TRUE) Note that the tiling assumes planar Euclidean distance, which might create distortion in dealing with points on a sphere if the region in which you observe the points is large. cheers, Rolf Turner From jholtman at gmail.com Mon Sep 5 06:22:37 2011 From: jholtman at gmail.com (jim holtman) Date: Mon, 5 Sep 2011 00:22:37 -0400 Subject: [R] savePlot with % in character string In-Reply-To: <1315188176429-3790227.post@n4.nabble.com> References: <1315188176429-3790227.post@n4.nabble.com> Message-ID: Try putting "%%" (two percent signs). 
A feature of the plots is that they use "%" for the plot number; e.g.,

win.metafile("Rplot%02d.wmf", pointsize = 10)

look at the help page for 'tiff'

On Sun, Sep 4, 2011 at 10:02 PM, John Welsh wrote:
>
> This occurred after I installed R x64 2.13.1 on Windows:
>
> savePlot("95%.winners.wmf")
>
> saves the file as:
>
> "951nners.wmf"
>
> Is this the correct behavior, or have I bungled something?
>
>
> John Welsh, Ph.D.
> Associate Professor
> Molecular and Cancer Biology
> Vaccine Research Institute of San Diego
> 10835 Road to the Cure
> San Diego, CA 92121
> Phone: (858) 581-3960 ex.248
> Email: jwelsh at sdibr.org
>
> --
> View this message in context: http://r.789695.n4.nabble.com/savePlot-with-in-character-string-tp3790227p3790227.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?

From jholtman at gmail.com Mon Sep 5 06:29:39 2011
From: jholtman at gmail.com (jim holtman)
Date: Mon, 5 Sep 2011 00:29:39 -0400
Subject: [R] output and save multiple dataset from a function: sorry I could
 not figure out this....
In-Reply-To:
References:
Message-ID:

try this:

#my data
myseed <- c(1001:1030)
gend <- function(x){
    set.seed(x)
    var <- rep(1:4, c(rep(4, 4)))
    vary <- rnorm(length(var), 50, 10)
    mat <- matrix(sample(c(-1,0,1), c(10*length(var)), replace = TRUE), ncol = 10)
    mydat <- data.frame(var, vary, mat)
    filename = paste("file", x, ".RData", sep="")
    save(mydat, file = filename)
}
lapply (myseed, gend)

On Sun, Sep 4, 2011 at 5:20 PM, Ram H. Sharma wrote:
> Dear list:
> Before going into my problem, R list has been awesome for me ...thank you
> for the help. I have a simple problem, however I could get a answer to it...
> #my data
> myseed <- c(1001:1030)
> gend <- function(x){
> set.seed(x)
> var <- rep(1:4, c(rep(4, 4)))
> vary <- rnorm(length(var), 50, 10)
> mat <- matrix(sample(c(-1,0,1), c(10*length(var)), replace = TRUE), ncol =
> 10)
> mydat <- data.frame(var, vary, mat)
> #filename = paste("file", x, ".txt", sep="")
> #save(mydat, list = filename, file = "my.Rdata")
> }
> lapply (myseed,  gend)
> This works and I can create 30 random data set in the list. So far so good.
> My individual data are huge ( 1 million datapoints each)  and I need to
> generate > 2000 data set. So it may not be good strategy to write as *.csv
> file and read with take a lot of computing time.
> My attempt is to output and save individual Rdata sets and load it using
> load ("my.Rdata") function. It is more fast this way. But could not figure
> out how to do it !!!!!!!!!!!!!!!
> In above function I have put # some of my attempt to solve this problem....
> Any suggestions ....thank you...
>
> -
>
> Ram H
>

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?

From Achim.Zeileis at uibk.ac.at Mon Sep 5 08:20:43 2011
From: Achim.Zeileis at uibk.ac.at (Achim Zeileis)
Date: Mon, 5 Sep 2011 08:20:43 +0200 (CEST)
Subject: [R] betareg question - keeping the mean fixed?
In-Reply-To: <4E63FF38.9090709@jku.at>
References: <1314876482055-3783303.post@n4.nabble.com>
 <4E5F8793.1090308@jku.at> <1314955240369-3785683.post@n4.nabble.com>
 <4E63FF38.9090709@jku.at>
Message-ID:

On Mon, 5 Sep 2011, Bettina Gruen wrote:

> On 09/02/2011 07:20 PM, betty_d wrote:
>> Thanks for your response, that does work, however, it is still not quite
>> what
>> want. I would like to tell betareg what the mean is (in my case, 0.5) and
>> force it to use that value. Is this possible?
>
> AFAIK package betareg currently does not allow you to fix the mean and only
> estimate the precision parameters.

That is also my impression.
When I wrote the code, I had in mind that
there should be at least one parameter to estimate in each of the
components (mean/precision). I'll have a look if that can be changed
easily.

Best,
Z

> Best,
> Bettina
>
> --
> -------------------------------------------------------------------
> Bettina Grün
> Institut für Angewandte Statistik / IFAS
> Johannes Kepler Universität Linz
> Altenbergerstraße 69
> 4040 Linz, Austria
>
> Tel: +43 732 2468-5889
> Fax: +43 732 2468-9846
> E-Mail: Bettina.Gruen at jku.at
> www.ifas.jku.at
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

From rroa at azti.es Mon Sep 5 08:58:53 2011
From: rroa at azti.es (Rubén Roa)
Date: Mon, 5 Sep 2011 08:58:53 +0200
Subject: [R] Gradients in optimx
In-Reply-To: <4E63AD80.2@uottawa.ca>
References: <2F9EA67EF9AE1C48A147CB41BE2E15C3061C21@DOM-EB-MAIL2.win.ad.jhu.edu>
 <4E63AD80.2@uottawa.ca>
Message-ID: <5CD78996B8F8844D963C875D3159B94A028C0E79@DSRCORREO.azti.local>

Well, I guess this makes it unnecessary for me to prepare a report with the
CatDyn package.
However, I am available to test a new optimx R-forge version with my package.

Cheers

Rubén

--
Dr. Ruben H. Roa-Ureta
Senior Researcher, AZTI Tecnalia,
Marine Research Division,
Txatxarramendi Ugartea z/g, 48395, Sukarrieta,
Bizkaia, Spain

-----Original Message-----
From: John C Nash [mailto:nashjc at uottawa.ca]
Sent: Sunday, 04 September 2011 18:55
To: Ravi Varadhan
CC: Rubén Roa; 'kathryn.lord2000 at gmail.com'; 'r-help at r-project.org'
Subject: Re: Gradients in optimx

I've started to work on this again, and can confirm there seems to be some
sort of bug in the gradient test at the beginning of the current R-forge
version of optimx.
It is not something obvious, and looks like a mixup in arguments to functions, which have been an issue since I've been trying to trap NaN and Inf returns. Worse, making the control starttests = FALSE fails because there I inadvertently put the initial function calculation inside the block that does the tests. Sigh. Will try to get something done by end of this week. (This will be R-forge version.) JN On 08/31/2011 09:31 AM, Ravi Varadhan wrote: > Hi Reuben, > > > > I am puzzled to note that the gradient check in "optimx" does not work > for you. Can you send me a reproducible example so that I can figure this out? > > > > John - I think the best solution for now is to issue a "warning" > rather than an error message, when the numerical gradient is not > sufficiently close to the user-specified gradient. > > > > Best, > > Ravi. > > > > ------------------------------------------------------- > > Ravi Varadhan, Ph.D. > > Assistant Professor, > > Division of Geriatric Medicine and Gerontology School of Medicine > Johns Hopkins University > > > > Ph. (410) 502-2619 > > email: rvaradhan at jhmi.edu > > > From yvonnick.noel at uhb.fr Mon Sep 5 09:25:32 2011 From: yvonnick.noel at uhb.fr (Yvonnick Noel) Date: Mon, 05 Sep 2011 09:25:32 +0200 Subject: [R] Hysteresis modeling and simulation Message-ID: <4E64796C.2040107@uhb.fr> Hi Bill, I once modelled a hysteresis phenomenon (on binary data) with a simple logistic model. I am not sure I understand how this pattern appears in your data, but in my previous analyses, it appeared as an order effect: The response increased in probability later with increasing than with decreasing values of the predictor. I then simply created a binary variable for the decreasing and increasing conditions, and the coefficient on this variable was a direct and testable measure of hysteresis. In some cases, you can directly model the bimodal conditional distribution of the response. 
This is what I did here with a beta distribution for continuous bounded responses: http://webcolleges.uva.nl/mediasite/Viewer/?peid=c7a7b041327f4db09dc2fc3a7872aa5a1d HTH, Best, Yvonnick Noel University of Brittany, Rennes 2 France From emailvessel at gmail.com Mon Sep 5 09:40:11 2011 From: emailvessel at gmail.com (Katerina Karayianni) Date: Mon, 5 Sep 2011 10:40:11 +0300 Subject: [R] calling R from PHP error In-Reply-To: <4E625F5E.30702@statistik.tu-dortmund.de> References: <4E625F5E.30702@statistik.tu-dortmund.de> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From c.timmermann at yahoo.de Mon Sep 5 10:04:15 2011 From: c.timmermann at yahoo.de (Christian Timmermann) Date: Mon, 5 Sep 2011 09:04:15 +0100 (BST) Subject: [R] Stemming functions only work on the last word of plain text documents Message-ID: <1315209855.71545.YahooMailNeo@web27101.mail.ukl.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From deepayan.sarkar at gmail.com Mon Sep 5 11:26:10 2011 From: deepayan.sarkar at gmail.com (Deepayan Sarkar) Date: Mon, 5 Sep 2011 14:56:10 +0530 Subject: [R] Getting the values out of histogram (lattice) In-Reply-To: <4E5F112B.5090803@xtra.co.nz> References: <201108312301.p7VN1O2N007997@mail12.tpg.com.au> <4E5F112B.5090803@xtra.co.nz> Message-ID: On Thu, Sep 1, 2011 at 10:29 AM, Rolf Turner wrote: > > 'Scuse me, but I don't see anything in your example relating to what the OP > asked for. She wanted to get at the ``actual data defining the histogram'', > which > I interpret as meaning the bar heights (the percentages, density values, or > counts, > depending on "type"). These do not appeared to be stored in the object > returned > by histogram(). A couple of additional comments: 1. 
The `official' way to get panel arguments is trellis.panelArgs(); e.g.,

> p <- histogram(~rnorm(100) | gl(2, 50), type = "density")
> str(trellis.panelArgs(p, 2))
List of 5
 $ x           : num [1:50] 0.277 1.144 1.13 -0.912 -0.892 ...
 $ breaks      : num [1:9] -2.561 -1.979 -1.398 -0.816 -0.234 ...
 $ type        : chr "density"
 $ equal.widths: logi TRUE
 $ nint        : num 8

2. hist.constructor() is needed for technical reasons, and can be
considered to be the same as hist() for this purpose. So the computations
performed by panel.histogram() can be reduced to

histogram.computations <-
    function(x, breaks, equal.widths = TRUE,
             type = "density", nint, ...)
{
    if (is.null(breaks)) {
        breaks <-
            if (is.factor(x)) seq_len(1 + nlevels(x)) - 0.5
            else if (equal.widths)
                do.breaks(range(x, finite = TRUE), nint)
            else quantile(x, 0:nint/nint, na.rm = TRUE)
    }
    hist(x, breaks = breaks, plot = FALSE)
}

which may be used as follows to get the ``actual data defining the
histogram'':

> a <- trellis.panelArgs(p, 2)
> h <- do.call(histogram.computations, a)
> str(h)
List of 7
 $ breaks     : num [1:9] -2.561 -1.979 -1.398 -0.816 -0.234 ...
 $ counts     : int [1:8] 1 4 6 14 7 8 6 4
 $ intensities: num [1:8] 0.0344 0.1375 0.2062 0.4812 0.2406 ...
 $ density    : num [1:8] 0.0344 0.1375 0.2062 0.4812 0.2406 ...
 $ mids       : num [1:8] -2.2704 -1.6885 -1.1065 -0.5246 0.0573 ...
$ xname : chr "x" $ equidist : logi TRUE - attr(*, "class")= chr "histogram" -Deepayan > > cheers, > > Rolf Turner > > On 01/09/11 10:59, Duncan Mackay wrote: >> >> Hi Monica >> >> An example abbreviated from ?histogram >> >> x = histogram( ~ height, data = singer) >> >> names(x) >> # to see what is there >> str(x) >> >> # information >> x$panel.args.common >> $breaks >> [1] 59.36 61.28 63.20 65.12 67.04 68.96 70.88 72.80 74.72 76.64 >> >> $type >> [1] "percent" >> >> $equal.widths >> [1] TRUE >> >> $nint >> [1] 9 >> >> # x$panel.args: name as number >> x[[35]] >> [[1]] >> [[1]]$x >> [1] 64 62 66 65 60 61 65 66 65 63 67 65 62 65 68 65 63 65 62 65 66 62 65 >> 63 65 66 65 62 65 66 65 61 65 66 65 62 63 67 60 67 66 62 65 62 >> [45] 61 62 66 60 65 65 61 64 68 64 63 62 64 62 64 65 60 65 70 63 67 66 65 >> 62 68 67 67 63 67 66 63 72 62 61 66 64 60 61 66 66 66 62 70 65 >> [89] 64 63 65 69 61 66 65 61 63 64 67 66 68 70 65 65 65 64 66 64 70 63 70 >> 64 63 67 65 63 66 66 64 64 70 70 66 66 66 69 67 65 69 72 71 66 >> [133] 76 74 71 66 68 67 70 65 72 70 68 64 73 66 68 67 64 68 73 69 71 69 76 >> 71 69 71 66 69 71 71 71 69 70 69 68 70 68 69 72 70 72 69 73 71 >> [177] 72 68 68 71 66 68 71 73 73 70 68 70 75 68 71 70 74 70 75 75 69 72 71 >> 70 71 68 70 75 72 66 72 70 69 72 75 67 75 74 72 72 74 72 72 74 >> [221] 70 66 68 75 68 70 72 67 70 70 69 72 71 74 75 >> >> etc to suite your requirements >> >> HTH >> >> Regards >> >> Duncan >> >> >> Duncan Mackay >> Department of Agronomy and Soil Science >> University of New England >> ARMIDALE NSW 2351 >> Email: home mackay at northnet.com.au >> >> >> >> At 23:50 31/08/2011, you wrote: >> >> >> >>> Hi, >>> >>> >>> >>> I have a relatively big dataset and I want to construct >>> some histograms using the histogram function in lattice. One thing I am >>> interested in is to look at differences between density and percent. 
I >>> know I can >>> use the hist function but it seems that this function gives sometimes >>> some >>> wrong answers and the density is actually a percent since it is >>> calculated as counts in the bin divided by the total no. of points. Let me >>> explain. >>> >>> >>> >>> If I let the hist function to decide the breaks, or I use >>> a small number, or one of the pre-determined methods to select breaks >>> then >>> everything seems to be in order. But if I decide to use ? for example ? >>> 100 as >>> a breaks (I have over 90000 data points so the number of breaks is not >>> necessarily too large I would think) the density for the first bin is >>> over 1, >>> although for all the other breaks the density is actually a percent since >>> it is >>> the count for that bin divided by the total no. of points I have. So ?. >>> Here it >>> is something wrong or most probably I am doing something wrong. >>> >>> >>> >>> If I use the function histogram from lattice it is >>> obvious that there is a difference between the percent param and the >>> density >>> param. I looked at the function code and I didn't understand it ? to be >>> honest. >>> It seems it calls inside the hist function, or a slightly modify variant >>> of >>> hist. Reading about the object trellis I saw I can access different info >>> about >>> the graph it generates but nothing about the actual data that goes into >>> defining the histogram. How can I access the data from it? >>> >>> >>> >>> I am not sure if my problem is platform specific ? it should >>> not be ? but I have Rx64 2.13.1 on windows machine, in case it counts. >>> >>> >>> >>> I appreciate your help, thanks, >>> >>> >>> >>> Monica >>> >>> >>> ______________________________________________ >>> R-help at r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide >>> http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. 
>> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From ligges at statistik.tu-dortmund.de Mon Sep 5 11:40:27 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Mon, 05 Sep 2011 11:40:27 +0200 Subject: [R] calling R from PHP error In-Reply-To: References: <4E625F5E.30702@statistik.tu-dortmund.de> Message-ID: <4E64990B.5050903@statistik.tu-dortmund.de> On 05.09.2011 09:40, Katerina Karayianni wrote: > Yes, you were right about the make install command for R, now it is ok. > > The new issue is that when the R script is trying to generate a plot > file, the following error occurs > > Error in function (file = ifelse(onefile, "Rplots.pdf", > "Rplot%03d.pdf"), : > cannot open file 'Rplots.pdf' > Calls: plot ... plot.default -> plot.new -> -> .External > Execution halted > > I have changed permissions in the relative folder but this doesn't seem > to be the issue. I'm still working on that, any ideas ? Just check if the folder really exists and is really the one you thought it is so far and that you really have both read and write permissions within the folder. Uwe Ligges > > > 2011/9/3 Uwe Ligges > > > > > On 02.09.2011 08:23, Katerina Karayianni wrote: > > Hello, > I am having the following error while calling an R script > through PHP. 
>
>     /usr/local/bin/R: line 227: /kk/Programs/R-2.13.0/etc/ldpaths:
>     Permission denied
>     ERROR: R_HOME ('/kk/Programs/R-2.13.0') not found
>
>     I had compiled R from source and placed the generated R shell
>     script in /usr/local/bin.
>
> So you said
>
> make install
>
> or did you copy it manually? The latter may have been your first glitch.
>
>     Can you give me an insight of how to give permission to access
>     the ldpaths file and why is the R_HOME tree not found?
>
> Does that directory exist?
> Do you have read/execute permissions?
>
> Uwe Ligges
>
>     Thank you and regards
>
>     [[alternative HTML version deleted]]
>

From samuoko at yahoo.com Mon Sep 5 11:47:33 2011
From: samuoko at yahoo.com (Samuel Okoye)
Date: Mon, 5 Sep 2011 10:47:33 +0100 (BST)
Subject: [R] glm
Message-ID: <1315216053.33410.YahooMailClassic@web29308.mail.ird.yahoo.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From jholtman at gmail.com Mon Sep 5 12:43:10 2011
From: jholtman at gmail.com (Jim Holtman)
Date: Mon, 5 Sep 2011 06:43:10 -0400
Subject: [R] about raw type
In-Reply-To:
References: <31fc90bb-2668-4913-abcf-d8b3abee986b@email.android.com>
Message-ID: <602042A2-DB05-4FD0-AE36-14D1DA91EC4A@gmail.com>

It is useful if you need to process your data 'byte by byte'. I have used
it to read in a large file and then translate one byte value to another.
You can learn by creating some small scripts and processing example data
with them. Do you have a specific application that you want to use it on?
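
As a small illustration of the byte-by-byte processing described above, here
is a sketch that uses only base R. The temporary file, its contents, and the
particular byte substitution are invented for the example and are not taken
from any real application.

```r
# Write a few bytes to a temporary file (hypothetical example data)
tmp <- tempfile()
writeBin(charToRaw("hello world"), tmp)

# Read the whole file back as a raw vector, one element per byte
bytes <- readBin(tmp, what = "raw", n = file.info(tmp)$size)

# Translate one byte value to another: 0x6c ("l") becomes 0x4c ("L")
bytes[bytes == as.raw(0x6c)] <- as.raw(0x4c)

# Convert back to a character string to inspect the result
translated <- rawToChar(bytes)
translated   # "heLLo worLd"

unlink(tmp)
```

For a genuinely large file the same idea applies, although reading in chunks
through a connection may be preferable to reading everything at once.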
Sent from my iPad

On Sep 4, 2011, at 8:31, Luis Iván Ortiz Valencia wrote:

> Ok
>
> I mean Current values are the vector types "logical", "integer", "double",
> "complex", "character", "raw".
>
> I want to know more about the raw type, I know that It holds bytes.
>
> Atte
>
> IVAN
>
> 2011/9/4 Jeff Newmiller
>
>> "Raw data" could mean many different things to different people. You need
>> to be more specific about what YOU mean when you use that term.
>> ---------------------------------------------------------------------------
>> Jeff Newmiller                        The     .....       .....  Go Live...
>> DCN:                                  Basics: ##.#.       ##.#.  Live Go...
>>                                       Live:   OO#.. Dead: OO#..  Playing
>> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
>> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
>> ---------------------------------------------------------------------------
>>
>> Sent from my phone. Please excuse my brevity.
>>
>> "Luis Iván Ortiz Valencia" wrote:
>>
>>> Dears
>>>
>>> I am searching information about how to use raw data, when it is used,
>>> didn't find to much information on R help. Any suggestion on links about
>>> this theme?
>>>
>>> atte
>>>
>>> --
>>> Luis Iván Ortiz Valencia
>>> Doutorando Saúde Pública - Epidemiologia, IESC, UFRJ
>>> Estatístico Msc.
>>> Spatial Analyst Msc.
>>>
>>> [[alternative HTML version deleted]]
>>>
>
> --
> Luis Iván Ortiz Valencia
> Doutorando Saúde Pública - Epidemiologia, IESC, UFRJ
> Estatístico Msc.
> Spatial Analyst Msc.
>
> [[alternative HTML version deleted]]
>

From JRadinger at gmx.at Mon Sep 5 13:07:00 2011
From: JRadinger at gmx.at (Johannes Radinger)
Date: Mon, 05 Sep 2011 13:07:00 +0200
Subject: [R] plot regression line, log-transformation
Message-ID: <20110905110700.49050@gmx.net>

Hello UseRs,

I have some fairly general questions. I've got a dataset which shows signs
of heteroscedasticity and non-normality in errors if I do a normal linear
regression of the form Y~X. So two things came to mind: either
transforming the variables (log or log10) or using robust regression.

So my first question: How can I decide which is the better method? Either
lm(log(Y)~log(X)) or rlm(Y~X)? Or is it even necessary to log transform for
the robust regression?

Another question has to do with the plotting: I can do a simple scatterplot
with plot(Y~X), but that doesn't give a good picture, as a lot of the
points are clumped in the lower left corner. So I thought I could use
either plot(Y~X,log="xy") or plot(log(Y)~log(X)), but then I have problems
if I also want to plot the abline from the robust regression (which is then
probably not a straight line anymore). How do you deal with such cases
where the plot uses a different scaling (log) than the regression (and
therefore the abline)?

Thank you very much!

best regards,
Johannes
--

From cecilia.carmo at ua.pt Mon Sep 5 13:17:58 2011
From: cecilia.carmo at ua.pt (Cecilia Carmo)
Date: Mon, 5 Sep 2011 12:17:58 +0100
Subject: [R] plm package, R squared, dummies in panel data
Message-ID: <000601cc6bbd$77f3d040$67db70c0$@carmo@ua.pt>

An embedded and charset-unspecified text was scrubbed...
Name: not available URL: From Kay.Cichini at uibk.ac.at Mon Sep 5 13:52:27 2011 From: Kay.Cichini at uibk.ac.at (Kay Cecil Cichini) Date: Mon, 05 Sep 2011 13:52:27 +0200 Subject: [R] help with installing tar.gz package Message-ID: <20110905135227.73056zpwc8alwa4o@web-mail.uibk.ac.at> hi, i'd like to install the package "RGoogleDocs ". i downloaded to path "E:/R/R-2.13.0/library/RCurl_0.91-0.tar.gz" i run R from an usb-stick and can't get the install.packages() prompt to run correctly - can anyone help with this? thanks, kay > sessionInfo() R version 2.13.0 (2011-04-13) Platform: i386-pc-mingw32/i386 (32-bit) locale: [1] LC_COLLATE=German_Austria.1252 LC_CTYPE=German_Austria.1252 LC_MONETARY=German_Austria.1252 [4] LC_NUMERIC=C LC_TIME=German_Austria.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base loaded via a namespace (and not attached): [1] tools_2.13.0 From ligges at statistik.tu-dortmund.de Mon Sep 5 14:02:50 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Mon, 05 Sep 2011 14:02:50 +0200 Subject: [R] help with installing tar.gz package In-Reply-To: <20110905135227.73056zpwc8alwa4o@web-mail.uibk.ac.at> References: <20110905135227.73056zpwc8alwa4o@web-mail.uibk.ac.at> Message-ID: <4E64BA6A.6090103@statistik.tu-dortmund.de> On 05.09.2011 13:52, Kay Cecil Cichini wrote: > hi, > > i'd like to install the package "RGoogleDocs ". > i downloaded to path "E:/R/R-2.13.0/library/RCurl_0.91-0.tar.gz" > > i run R from an usb-stick and can't get the install.packages() prompt to > run correctly - can anyone help with this? 1. Why not use install.packages() to install it from the net? 2. You can download the binary package (zip) file, if network access is not available for you on the target machine. 3. If you want to install from sources (the tar.gz), please read the manual R Installation and Administration. 
Uwe Ligges > > thanks, > kay > > >> sessionInfo() > R version 2.13.0 (2011-04-13) > Platform: i386-pc-mingw32/i386 (32-bit) > > locale: > [1] LC_COLLATE=German_Austria.1252 LC_CTYPE=German_Austria.1252 > LC_MONETARY=German_Austria.1252 > [4] LC_NUMERIC=C LC_TIME=German_Austria.1252 > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > loaded via a namespace (and not attached): > [1] tools_2.13.0 > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From Kay.Cichini at uibk.ac.at Mon Sep 5 14:52:48 2011 From: Kay.Cichini at uibk.ac.at (Kay Cecil Cichini) Date: Mon, 05 Sep 2011 14:52:48 +0200 Subject: [R] help with installing tar.gz package In-Reply-To: <4E64BA6A.6090103@statistik.tu-dortmund.de> References: <20110905135227.73056zpwc8alwa4o@web-mail.uibk.ac.at> <4E64BA6A.6090103@statistik.tu-dortmund.de> Message-ID: <20110905145248.41532z3ix132pla8@web-mail.uibk.ac.at> internet connection exists - dont' know why, but i can't set to CRAN mirror - maybe because i run R from an usb-stick, firewall or whatever. thus, install.packages("RCurl_0.91-0.tar.gz") won't work. a zip-file does not exist, only the tar.gz file. so i tried to install from the source that i downloaded to: "E:/R/R-2.13.0/library/RCurl_0.91-0.tar.gz" i tried to apply the manual for R Installation and Administration but admittedly doesn't grasp it - any help how to achieve this would be greatly appreciated. kay Zitat von Uwe Ligges : > > > On 05.09.2011 13:52, Kay Cecil Cichini wrote: >> hi, >> >> i'd like to install the package "RGoogleDocs ". >> i downloaded to path "E:/R/R-2.13.0/library/RCurl_0.91-0.tar.gz" >> >> i run R from an usb-stick and can't get the install.packages() prompt to >> run correctly - can anyone help with this? 
> > > 1. Why not use install.packages() to install it from the net? > 2. You can download the binary package (zip) file, if network access > is not available for you on the target machine. > 3. If you want to install from sources (the tar.gz), please read the > manual R Installation and Administration. > > Uwe Ligges > > > >> >> thanks, >> kay >> >> >>> sessionInfo() >> R version 2.13.0 (2011-04-13) >> Platform: i386-pc-mingw32/i386 (32-bit) >> >> locale: >> [1] LC_COLLATE=German_Austria.1252 LC_CTYPE=German_Austria.1252 >> LC_MONETARY=German_Austria.1252 >> [4] LC_NUMERIC=C LC_TIME=German_Austria.1252 >> >> attached base packages: >> [1] stats graphics grDevices utils datasets methods base >> >> loaded via a namespace (and not attached): >> [1] tools_2.13.0 >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > From Torsten.Hothorn at R-project.org Mon Sep 5 15:02:37 2011 From: Torsten.Hothorn at R-project.org (Torsten Hothorn) Date: Mon, 5 Sep 2011 15:02:37 +0200 (CEST) Subject: [R] glht (multcomp): NA's for confidence intervals using univariate_calpha (fwd) In-Reply-To: References: Message-ID: fixed @ R-forge. New version should appear on CRAN soon. Thanks for the report! 
Torsten

>
> ---------- Forwarded message ----------
> Date: Sat, 3 Sep 2011 23:56:35 +0200
> From: Ulrich Halekoh
> To: "r-help at r-project.org"
> Subject: [R] glht (multcomp): NA's for confidence intervals using
>     univariate_calpha
>
> Hej,
>
> Calculation of confidence intervals for means
> based on a model fitted with lmer
> using the package multcomp
>
> - yields results for calpha=adjusted_calpha
> - NA's for calpha=univariate_calpha
>
> Example:
> library(lme4)
> library(multcomp)
> ### Generate data
> set.seed(8)
> d<-expand.grid(treat=1:2,block=1:3)
> e<-rnorm(3)
> names(e)<-1:3
> d$y<-rnorm(nrow(d)) + e[d$block]
> d<-transform(d,treat=factor(treat),block=factor(block))
> ##### lmer fit
> Mod<-lmer(y~treat+ (1|block), data=d)
> ### estimate treatment means
> L<-cbind(c(1,0),c(0,1))
> s<-glht(Mod,linfct=L)
> ## confidence intervals
> confint(s,calpha=adjusted_calpha())
> #produces NA's for the confidence limits
> confint(s,calpha=univariate_calpha())
>
> #for models fitted with lm the problem does not occur
> G<-lm(y~treat+ block, data=d)
> L<-matrix( c(1,0,1/3,1/3,1,1,1/3,1/3),2,4,byrow=TRUE)
> s<-glht(G,linfct=L)
> confint(s,calpha=adjusted_calpha())
> confint(s,calpha=univariate_calpha())
>
> multcomp version 1.2-7
> R:platform i386-pc-mingw32
> version.string R version 2.13.1 Patched (2011-08-19 r56767)
>
> Regards
>
> Ulrich Halekoh
> Department of Molecular Biology and Genetics,
> Aarhus University, Denmark
> Ulrich.Halekoh at agrsci.dk
> From ligges at statistik.tu-dortmund.de Mon Sep 5 15:05:36 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Mon, 05 Sep 2011 15:05:36 +0200 Subject: [R] help with installing tar.gz package In-Reply-To: <20110905145248.41532z3ix132pla8@web-mail.uibk.ac.at> References: <20110905135227.73056zpwc8alwa4o@web-mail.uibk.ac.at> <4E64BA6A.6090103@statistik.tu-dortmund.de> <20110905145248.41532z3ix132pla8@web-mail.uibk.ac.at> Message-ID: <4E64C920.8060709@statistik.tu-dortmund.de> On 05.09.2011 14:52, Kay Cecil Cichini wrote: > internet connection exists - dont' know why, but i can't set to CRAN > mirror - maybe because i run R from an usb-stick, firewall or whatever. Probably not correctly configured proxy setting? > > thus, install.packages("RCurl_0.91-0.tar.gz") won't work. install.packages("RCurl"), to be precise > > a zip-file does not exist, only the tar.gz file. It exists. The most recent version of RCurl for Windows for R-2.13.x is available from "CRAN extras" which expands to: http://www.stats.ox.ac.uk/pub/RWin/bin/windows/contrib/2.13/RCurl_1.6-10.1.zip I wonder where you got that ancient version of RCurl (0.91-0) from? > so i tried to install from the source that i downloaded to: > "E:/R/R-2.13.0/library/RCurl_0.91-0.tar.gz" > > i tried to apply the manual for R Installation and Administration but > admittedly doesn't grasp it - any help how to achieve this would be greatly > appreciated. After installing the prerequisites as mentioned in that manual, you could do install.packages("/path/to/RCurl_0.91-0.tar.gz", repos=NULL, type="source") but RCurl won't install out of the box (therefore the version on "CRAN extras" rather than an auto-built one on regular CRAN - the former is manually tweaked by Brian Ripley). Uwe Ligges > kay > > > Zitat von Uwe Ligges : > >> >> >> On 05.09.2011 13:52, Kay Cecil Cichini wrote: >>> hi, >>> >>> i'd like to install the package "RGoogleDocs ". 
>>> i downloaded to path "E:/R/R-2.13.0/library/RCurl_0.91-0.tar.gz" >>> >>> i run R from an usb-stick and can't get the install.packages() prompt to >>> run correctly - can anyone help with this? >> >> >> 1. Why not use install.packages() to install it from the net? >> 2. You can download the binary package (zip) file, if network access >> is not available for you on the target machine. >> 3. If you want to install from sources (the tar.gz), please read the >> manual R Installation and Administration. >> >> Uwe Ligges >> >> >> >>> >>> thanks, >>> kay >>> >>> >>>> sessionInfo() >>> R version 2.13.0 (2011-04-13) >>> Platform: i386-pc-mingw32/i386 (32-bit) >>> >>> locale: >>> [1] LC_COLLATE=German_Austria.1252 LC_CTYPE=German_Austria.1252 >>> LC_MONETARY=German_Austria.1252 >>> [4] LC_NUMERIC=C LC_TIME=German_Austria.1252 >>> >>> attached base packages: >>> [1] stats graphics grDevices utils datasets methods base >>> >>> loaded via a namespace (and not attached): >>> [1] tools_2.13.0 >>> >>> ______________________________________________ >>> R-help at r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide >>> http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >> >> > > > From sarah.goslee at gmail.com Mon Sep 5 15:40:51 2011 From: sarah.goslee at gmail.com (Sarah Goslee) Date: Mon, 5 Sep 2011 09:40:51 -0400 Subject: [R] generating multiple dataset and applying function and output multiple output dataset...... In-Reply-To: References: Message-ID: Hi, On Sun, Sep 4, 2011 at 9:25 AM, John Clark wrote: > Dear R experts: > > Here is my problem, just hard for me... > > I want to generate multiple datasets, then apply a function to these > datasets and output corresponding output in single or multiple dataset > (whatever possible)... 
>
> # my example, although I need to generate a large number of variables and
> datasets
>
> seed <- round(runif(10)*1000000)
>
> datagen <- function(x){
> set.seed(x)
> var <- rep(1:3, c(rep(3, 3)))
> yvar <- rnorm(length(var), 50, 10)
> matrix <- matrix(sample(1:10, c(10*length(var)), replace = TRUE), ncol = 10)
> mydata <- data.frame(var, yvar, matrix)
> }
>
> gdt <- lapply (seed, datagen)
>
> # resulting list (I believe is correct term) has 10 dataframes: gdt[1]
> .......to gdt[10]

Yes, that's a list of dataframes, though the correct reference is gdt[[1]]

> # my function, this will perform anova in every component data frames and
> output probability coefficients...
> anovp <- function(x){
>          ind <- 3:ncol(x)
>          out <- lm(gdt[x]$yvar ~ gdt[x][, ind[ind]])
>          pval <- out$coefficients[,4][2]
>          pval <- do.call(rbind,pval)
>         }
>
> plist <- lapply (gdt, anovp)
>
> Error in gdt[x] : invalid subscript type 'list'

It's not a matter of your use of lapply(), which is fine. It's that your
anovp() function just plain doesn't work. You need to debug it with ONE
dataframe before you try to lapply it to a whole bunch.

> anovp(gdt[[1]])
Error in gdt[x] : invalid subscript type 'list'

This suggests to me that x should be a matrix rather than a list (a
dataframe is a type of list), so I tried:

> anovp(as.matrix(gdt[[1]]))
Error in gdt[x][, ind[ind]] : incorrect number of dimensions

But as you see there are still problems. You'll need to solve those first:
if anovp() doesn't work for one dataframe, it won't work on a list of them.

> This is not working, I tried different options. But could not figure
> out...finally decided to bother experts, sorry for that...
>
> My questions are:
>
> (1) Is this possible to handle such situation in this way or there are other
> alternatives to handle such multiple datasets created?
>
> (2) If this is right way, how can I do it?
>
> Thank you for attention and I will appreciate your help...
> > > JC > -- Sarah Goslee http://www.functionaldiversity.org From Kay.Cichini at uibk.ac.at Mon Sep 5 15:51:43 2011 From: Kay.Cichini at uibk.ac.at (Kay Cecil Cichini) Date: Mon, 05 Sep 2011 15:51:43 +0200 Subject: [R] help with installing tar.gz package In-Reply-To: <4E64C920.8060709@statistik.tu-dortmund.de> References: <20110905135227.73056zpwc8alwa4o@web-mail.uibk.ac.at> <4E64BA6A.6090103@statistik.tu-dortmund.de> <20110905145248.41532z3ix132pla8@web-mail.uibk.ac.at> <4E64C920.8060709@statistik.tu-dortmund.de> Message-ID: <20110905155143.11001yftyfw34hdw@web-mail.uibk.ac.at> i finally found the XML.zip: http://www.stats.ox.ac.uk/pub/RWin/bin/windows/contrib/2.14/ and thanks to you pointing me at the URL of the RCurl.zip: http://www.stats.ox.ac.uk/pub/RWin/bin/windows/contrib/2.13/RCurl_1.6-10.1.zip i was able to to install the wanted RGooglesDoc package successfully by call install.packages("E:/R/R-2.13.0/library/RGoogleDocs_0.5-0.tar.gz", repos=NULL, type="source") thanks and sorry for my clumsiness, kay Zitat von Uwe Ligges : > > > On 05.09.2011 14:52, Kay Cecil Cichini wrote: >> internet connection exists - dont' know why, but i can't set to CRAN >> mirror - maybe because i run R from an usb-stick, firewall or whatever. > > Probably not correctly configured proxy setting? >> >> thus, install.packages("RCurl_0.91-0.tar.gz") won't work. > > install.packages("RCurl"), to be precise > >> >> a zip-file does not exist, only the tar.gz file. > > It exists. The most recent version of RCurl for Windows for R-2.13.x > is available from "CRAN extras" which expands to: > > http://www.stats.ox.ac.uk/pub/RWin/bin/windows/contrib/2.13/RCurl_1.6-10.1.zip > > I wonder where you got that ancient version of RCurl (0.91-0) from? 
> > > >> so i tried to install from the source that i downloaded to: >> "E:/R/R-2.13.0/library/RCurl_0.91-0.tar.gz" >> >> i tried to apply the manual for R Installation and Administration but >> admittedly doesn't grasp it - any help how to achieve this would be greatly >> appreciated. > > > After installing the prerequisites as mentioned in that manual, you could do > > install.packages("/path/to/RCurl_0.91-0.tar.gz", repos=NULL, type="source") > > but RCurl won't install out of the box (therefore the version on > "CRAN extras" rather than an auto-built one on regular CRAN - the > former is manually tweaked by Brian Ripley). > > Uwe Ligges > > >> kay >> >> >> Zitat von Uwe Ligges : >> >>> >>> >>> On 05.09.2011 13:52, Kay Cecil Cichini wrote: >>>> hi, >>>> >>>> i'd like to install the package "RGoogleDocs ". >>>> i downloaded to path "E:/R/R-2.13.0/library/RCurl_0.91-0.tar.gz" >>>> >>>> i run R from an usb-stick and can't get the install.packages() prompt to >>>> run correctly - can anyone help with this? >>> >>> >>> 1. Why not use install.packages() to install it from the net? >>> 2. You can download the binary package (zip) file, if network access >>> is not available for you on the target machine. >>> 3. If you want to install from sources (the tar.gz), please read the >>> manual R Installation and Administration. 
>>> >>> Uwe Ligges >>> >>> >>> >>>> >>>> thanks, >>>> kay >>>> >>>> >>>>> sessionInfo() >>>> R version 2.13.0 (2011-04-13) >>>> Platform: i386-pc-mingw32/i386 (32-bit) >>>> >>>> locale: >>>> [1] LC_COLLATE=German_Austria.1252 LC_CTYPE=German_Austria.1252 >>>> LC_MONETARY=German_Austria.1252 >>>> [4] LC_NUMERIC=C LC_TIME=German_Austria.1252 >>>> >>>> attached base packages: >>>> [1] stats graphics grDevices utils datasets methods base >>>> >>>> loaded via a namespace (and not attached): >>>> [1] tools_2.13.0 >>>> >>>> ______________________________________________ >>>> R-help at r-project.org mailing list >>>> https://stat.ethz.ch/mailman/listinfo/r-help >>>> PLEASE do read the posting guide >>>> http://www.R-project.org/posting-guide.html >>>> and provide commented, minimal, self-contained, reproducible code. >>> >>> >> >> >> > > From kehld at ktk.pte.hu Mon Sep 5 16:08:16 2011 From: kehld at ktk.pte.hu (=?ISO-8859-2?Q?Kehl_D=E1niel?=) Date: Mon, 05 Sep 2011 16:08:16 +0200 Subject: [R] sorry, [WinBUGS] question Message-ID: <4E64D7D0.6050601@ktk.pte.hu> Dear Community, I know this is not the place to ask WinBUGS questions, but I did not get any answers on other lists. I am rather new to the BUGS language and to bayesian modeling, excuse me for probably simple questions. I have to conduct a bayesian meta-analysis of some data. We have collected observational and randomized studies related to a certain field of interest. The idea is to analyse the randomized studies with two different priors. One is non-informative, the other is calculated from the observational ones. We also want to use a sceptical prior. 
The code I used for the non-informative prior analysis and to get the other
prior is the following:

model {
    for( i in 1 : Num ) {
        rc[i] ~ dbin(pc[i], nc[i])
        rt[i] ~ dbin(pt[i], nt[i])
        log(pc[i]) <- mu[i]
        log(pt[i]) <- mu[i] + delta[i]
        mu[i] ~ dnorm(0,1.0E-5)
        delta[i] ~ dnorm(d, tau)
    }
    d ~ dnorm(0,1.0E-6)
    tau ~ dgamma(0.001,0.001)
    sigma <- 1 / sqrt(tau)
    relr <- exp(d)
}

which appears to work fine after loading data and initials.
(There was a study with 0 treated and 0 control cases; I had to exclude
that one for some reasons. Is there a solution for this?)
If I understand right, I can interpret "relr" as the Bayesian estimate of
the relative risk, with credible interval etc.

I have some questions in connection with the informative prior analysis:
- after running this same code for the observational data, how do I
  change the specification of d and tau?
- how can I get posterior probabilities like relr>1?
- usually how many iterations, thin etc. do we use?
- can I get nice graphics with both priors and posteriors on it?

I do have to learn everything on my own, so any help is greatly
appreciated. I know R and the BUGS package are able to communicate; if
anybody can help to solve the task through the R interface, that would be
great.

Thank you for your answer or any kind of help:
Daniel

From Tom.Wilding at sams.ac.uk Mon Sep 5 16:17:28 2011
From: Tom.Wilding at sams.ac.uk (Tom Wilding)
Date: Mon, 05 Sep 2011 15:17:28 +0100
Subject: [R] Power analysis in hierarchical models
Message-ID: <4E64E7DC.5B29.0082.0@sams.ac.uk>

Dear All

I am attempting some power analyses, based on simulated data.

My experimental set up is thus:
Bleach: main effect, three levels (control, med, high), Fixed.
Temp: main effect, two levels (cold, hot), Fixed.
Main effect interactions, six levels (fixed)

For each main-effect combination I have three replicates.
Within each replicate I can take varying numbers of measurements (response
variable = Growth (of marine worms)) but, for this example, assume eight.
(I'm interested in changing this to see if the experimental power changes
much).

Total size = 3 x 2 x 3 x 8 = 144

The script thus far goes:

=========== start of script =================
library(lme4)
#Data frame structure
Bleach=rep(c("Control","Med","High"),each=48)
Temp= rep(rep(c("Cold","Hot"),each=24),3)
Rep= (rep(rep(rep(c("1","2","3"),each=8),2),3))
Ind= (rep(rep(rep(c(1:8),3),2),3))#not required for stats
#Fake data (based on pilot studies), only showing a single main effect (bleach)
Growth=c( rnorm(48,3.27,0.77),rnorm(48,3.21,0.77),rnorm(48,3.64,1.17))
fake2=data.frame(Bleach,Temp,Rep,Ind,Growth);head(fake2)
#generate factor level for lmer as per Crawley, page 649
fake2$rep=fake2$Bleach:fake2$Temp:fake2$Rep#rep is used in the lmer model
with(fake2,table(rep))#check that each rep contains 8 measurements
# run alternative (?equivalent) models
model1=aov(Growth~Bleach*Temp+Error(Bleach*Temp/Rep),data=fake2);summary(model1)
model2=lmer(Growth~Bleach*Temp+(1|rep),data=fake2);summary(model2)#note: see above, rep<>Rep!
============ end of script ==========

I'd like to get familiar with using lme4 because it is likely that the
final results of the experiment will be unbalanced (which precludes the
use of aov I think). The df given by model1 seem to make sense.

Any guidance on any of the following would be much appreciated:
1. Are model1 and model2 equivalent?
2. For model1 - is the random component correctly specified and is there
a (simple) mechanism to get the appropriate F ratios and P values?
3. For model2 - again, is the random component correct (probably not) and
why is the random effect (rep) variance and standard deviations so low
(zero in most iterations)?
4. For both models - how do I isolate (so I can tabulate and create
histograms) the appropriate P and/or t values? (for model2 - the 'mer'
object doesn't seem to contain the t values but maybe I'm missing
something).
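On question 4, one way to collect P values over many simulated data sets can be sketched as follows. This is only a sketch under stated assumptions: the design is balanced, so averaging the measurements within each replicate and fitting an ordinary aov to the 18 replicate means should give the same F test for Bleach as testing against the between-replicate stratum. The helper name sim_p_bleach, the between-replicate standard deviation sd_rep and the number of simulations are illustrative choices, not values from the pilot data.

```r
# Simulate one balanced data set and return the P value for the Bleach main effect.
# Hypothetical settings: sd_rep is the between-replicate SD, sd_within the
# within-replicate SD; the treatment means follow the fake data in the script.
sim_p_bleach <- function(n_meas = 8, sd_rep = 0.3, sd_within = 0.77) {
  design <- expand.grid(Rep = factor(1:3),
                        Temp = factor(c("Cold", "Hot")),
                        Bleach = factor(c("Control", "Med", "High")))
  bleach_mean <- c(Control = 3.27, Med = 3.21, High = 3.64)
  # True mean of each replicate = treatment mean + random replicate effect
  rep_mean <- unname(bleach_mean[as.character(design$Bleach)]) +
    rnorm(nrow(design), 0, sd_rep)
  # Observed replicate mean = average of n_meas noisy measurements
  design$Growth <- sapply(rep_mean,
                          function(m) mean(rnorm(n_meas, m, sd_within)))
  fit <- aov(Growth ~ Bleach * Temp, data = design)
  # Row 1 of the ANOVA table is Bleach, the first term in the formula
  summary(fit)[[1]][["Pr(>F)"]][1]
}

set.seed(1)
p_values <- replicate(200, sim_p_bleach())
power_bleach <- mean(p_values < 0.05)  # share of simulations rejecting at the 5% level
hist(p_values)                         # distribution of the simulated P values
```

Changing n_meas in the call (for example sim_p_bleach(n_meas = 4)) would show how the estimated power responds to the number of measurements per replicate. For unbalanced data this shortcut no longer applies and a mixed model would be needed instead.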
Direction to any more generic sources of information regarding power analysis in hierarchical models would be gladly received. Thank you Tom. ------------------------------------------------------------------------- Tom Wilding, MSc, PhD, Dip. Stat. Scottish Association for Marine Science, Scottish Marine Institute, OBAN Argyll. PA37 1QA United Kingdom. Phone (+44) (0) 1631 559214 Fax (+44) (0) 1631 559001 ------------------------------------------------------------------------ +++++++++++++++++++++++++++++++ The Scottish Association for Marine Science (SAMS) is registered in Scotland as a Company Limited by Guarantee (SC009292) and is a registered charity (9206). SAMS has an actively trading wholly owned subsidiary company: SAMS Research Services Ltd a Limited Company (SC224404). All Companies in the group are registered in Scotland and share a registered office at Scottish Marine Institute, Oban Argyll PA37 1QA. The content of this message may contain personal views which are not the views of SAMS unless specifically stated. Please note that all email traffic is monitored for purposes of security and spam filtering. As such individual emails may be examined in more detail. +++++++++++++++++++++++++++++++ From xkziloj at gmail.com Mon Sep 5 16:27:57 2011 From: xkziloj at gmail.com (. .) Date: Mon, 5 Sep 2011 11:27:57 -0300 Subject: [R] Alternatives to integrate? In-Reply-To: References: <1314885193156-3783645.post@n4.nabble.com> Message-ID: Hi, continuing the improvements... I've prepared a new code: ddae <- function(individuals, frac, sad, samp="pois", trunc=0, ...) { dots <- list(...) Compound <- function(individuals, frac, n.species, sad, samp, dots) { print(c("Size:", length(individuals), "Compound individuals:", individuals, "End.")) RegDist <- function(n.species, sad, dots) { # "RegDist" may be Exponential, Gamma, etc. 
      dcom <- paste("d", as.name(sad), sep="")
      dots <- as.list(c(n.species, dots))
      ans <- do.call(dcom, dots)
      return(ans)
    }
    SampDist <- function(individuals, frac, n.species, samp) { # "SampDist" may be Poisson or Negative Binomial
      dcom <- paste("d", samp, sep="")
      lambda <- frac * n.species
      dots <- as.list(c(individuals, lambda))
      ans <- do.call(dcom, dots)
      return(ans)
    }
    ans <- RegDist(n.species, sad, dots) * SampDist(individuals, frac, n.species, samp)
    return(ans)
  }
  IntegrateScheme <- function(Compound, individuals, frac, sad, samp, dots) {
    print(c("Size:", length(individuals), "Integrate individuals:", individuals))
    ans <- integrate(Compound, 0, 2000, individuals, frac, sad, samp, dots)$value
    return(ans)
  }
  ans <- IntegrateScheme(Compound, individuals, frac, sad, samp, dots)
  return(ans)
}
ddae(2, 0.05, "exp")

Now I can't understand what happens to "individuals": why is it changing
in value and size? I've tried "traceback()" and "debug()", but I was
not smart enough to understand what is going on. Could you, please,
give some more help?

Thanks in advance.

On Thu, Sep 1, 2011 at 10:41 PM, R. Michael Weylandt wrote:
> Actually, it's very easy to integrate a function of two variables in a
> single variable for a given value of the other variable.
>
> Using your example:
>
> MySum <- function(x,y) {
>     ans = x + y
>     return(ans)
> }
>
> Note a few things about how I wrote this. One, I broke the function out and used
> curly braces to enclose the body of the expression; secondly, I kept the
> body of the function at a constant indent level using spaces, not hard tabs;
> thirdly, I gave it a meaningful (if somewhat silly) name. (There are so many
> things that have names like "func" or "f" in R that you really don't want to
> risk overloading something important) Finally, I used the (technically
> unnecessary) return() command to say specifically what values my function
> will return.
The use of "ans" is a personal preference, but I think it
> makes clear what the function is aiming at.
>
> Suppose I want to integrate this over [0,1] with y = 3. This can be coded
>
> R> integrate(MySum, 0, 1, 3)
> 3.5
>
> If you read the documentation for integrate (?integrate) you'll see that
> there is an optional "..." argument that allows further parameters to be
> passed to the integrand. Here, this is only the value of y.
>
> Now suppose I want to define a function that integrates over that same unit
> interval but takes y as an argument. This can be done as
>
> BadIntegrateMySum <- function(y) {
>     ans = integrate(MySum, 0, 1, y)
>     return(ans)
> }
>
> However, this is a potentially dangerous thing to do because it requires
> MySum to just show up inside of BadIntegrateMySum. R is able to try to help
> you out, but really it's very dangerous so don't rely on it. Rather, define
> MySum inside of the first function as a helper inside of the larger
> function:
>
> GoodIntegrateMySum <- function(y) {
>
>     MySumHelper <- function(x,y) {
>         ans = x + y
>         return(ans)
>     }
>
>     ans = integrate(MySumHelper, 0, 1, y)
>     return(ans)
> }
>
> Hopefully this is much clearer. There's a slightly contentious stylistic
> point here -- whether it's ok to use y in the definition of the helper and
> in the bigger function -- but I think it's ok in this circumstance because
> the two instances specifically correspond to each other.
>
> A more general form of this could take in "MySumHelper" as an argument (yes
> functions can be passed like that)
>
> # MySum as above
>
> GoodIntegrateUnitInterval <- function(xIntegrand, yParameter) {
>     # Requires xIntegrand to be a function of two variables x,y
>     # You can actually do this in the code, but for now let's just assume no
>     # user error and that xIntegrand is the right sort of thing.
>     ans = integrate(xIntegrand, 0, 1, yParameter)
>     return(ans)
> }
>
> R> GoodIntegrateUnitInterval(MySum, 3)
> 3.5
>
> as before.
>
> There's nothing wrong with using "result" like I've used "ans," but I do
> hesitate to see it used as a function rather than a variable. A good rule of
> thumb is to check if a variable is already defined as a function name using
> the apropos() command.
>
> I don't have time or inclination to rework your whole code right now, but
> take a stab at formatting it with consistent+informative variable and
> function names, a well reasoned use of scoping, and appropriate use of
> integrate() and I'll happily comment on it.
>
> Hope this helps,
>
> Michael Weylandt
>
> On Thu, Sep 1, 2011 at 8:57 PM, . . wrote:
>>
>> Thanks for your reply Michael, it seems I have a lot of things to
>> learn yet but for sure, your response is being very helpful in this
>> process. I will try to explore every point you said:
>>
>> A doubt I have is, if I define "func <- function(x,y) x + y" how can I
>> integrate it only in "x"? My solution for this would be to define
>> "func <- function(x) x + y". Is that not ok?
>>
>> Also, with respect to the helper functions I'd created, I am wondering
>> if you can see a better organization for my code. It is so because
>> this is the only way I can see. Particularly I do not like how I am
>> using "results", but I can not think of another form.
>>
>> Thanks in advance.
>>
>> On Thu, Sep 1, 2011 at 2:44 PM, R. Michael Weylandt wrote:
>> > Leaving aside some other issues that this whole email chain has opened up,
>> >
>> > I'd guess that your most immediate problem is that you are trying to
>> > numerically integrate the PMF of a discrete distribution but you are
>> > treating it as a continuous distribution. If you took the time to properly
>> > debug (as you were instructed yesterday) you'd probably find that whenever
>> > you call dpois(x, lambda) for x not an integer you get a warning message.
>> > >> > Specifically, check this out >> > >> >> integrate(dpois,0,Inf,1) >> > 9.429158e-13 with absolute error < 1.7e-12 >> > >> >> n = 0:1000; sum(dpois(n,1)) >> > 1 >> > >> > I could be entirely off base here, but I'm guessing that many of your >> > problems derive from this. >> > >> > >> > >> > On another basis, please, please read this: >> > http://google-styleguide.googlecode.com/svn/trunk/google-r-style.html >> > or this >> > http://had.co.nz/stat405/resources/r-style-guide.html >> > >> > And, perhaps most importantly, don't rely on the black magic of values >> > moving in and out of functions (lexical scoping). Seriously, just don't >> > do >> > it. >> > >> > If you have helper functions that need values, actively pass them: you >> > will >> > save yourself hours of trouble when (not if) you debug your functions. >> > I'm >> > looking, for example, at g() in the first big block of code you >> > provided. >> > Call it g(a,n) and spend the extra 4 keystrokes to pass the values. It >> > makes >> > everyone happier. >> > >> > Michael >> > >> > On Thu, Sep 1, 2011 at 12:37 PM, . . wrote: >> >> >> >> So, please excuse me Michael, you are completely sure. I will try >> >> describe I am trying to do, please let me know if I can provide more >> >> info. >> >> >> >> The idea is provide to "func" two probability density functions(PDFs) >> >> and obtain another PDF that is a compound of them. In a final analysis >> >> this characterize an abundance distribution for me. The two PDFs are >> >> provided through "f" and "g" and there is some manipulation here >> >> because I need flexibility to easily change this two funcions. >> >> >> >> In the code provided, "f" is the Exponential distribution and "g" is >> >> the Poisson distribution. For this case, I have the analytical >> >> solution, below. This way I can check the result. But I am also >> >> considering other combinations of ?"f" and "g" that have difficult, or >> >> even does not have analitical solution. 
This is the reason why I am >> >> trying to develop "func". >> >> >> >> func2 <- function(y, frac, rate, trunc=0, log=FALSE) { >> >> ? ?is.wholenumber <- function(x, tol = .Machine$double.eps^0.5) >> >> ? ? ? ?abs(x - round(x)) < tol >> >> ? ?if(FALSE %in% sapply(y,is.wholenumber)) >> >> ? ? ? ?print("y must be integer because dpoix is a discrete PDF.") >> >> ? ?else { >> >> ? ? ? ?f <- function(y){ >> >> ? ? ? ? ? ?b <- y*log(frac) >> >> ? ? ? ? ? ?m <- log(rate) >> >> ? ? ? ? ? ?n <- (y+1)*log(rate+frac) >> >> ? ? ? ? ? ?if(log)b+m-n else exp(b+m-n) >> >> ? ? ? ?} >> >> ? ? ? ?f(y)/(1-f(trunc)) >> >> ? ?} >> >> } >> >> > func2(200,0.05,0.001) >> >> [1] 0.000381062 >> >> >> >> In theory, the interval of integration is 0 to Inf, but for some tests >> >> I did, go up to 2000 may still provide reasonable results. >> >> >> >> Also, as it seems, I am still writing my first functions in R and >> >> suggestions are welcome, please. >> >> >> >> Again, appologies for my previous mistake. It was not my intention to >> >> blame about "integrate". >> >> >> >> On Thu, Sep 1, 2011 at 11:49 AM, R. Michael Weylandt >> >> wrote: >> >> > I'm going to try to put this nicely: >> >> > >> >> > What you provided is not a problem with integrate. Instead, you >> >> > provided >> >> > a >> >> > rather unintelligible and badly-written piece of code that >> >> > (miraculously) >> >> > seems to work, though it's not well documented so I have no idea if >> >> > 1.3e-21 >> >> > is what you want to get. >> >> > >> >> > Let's try this again: per your original request, what is the problem >> >> > with >> >> > integrate? >> >> > >> >> > If instead you feel there's something wrong with your code, might I >> >> > suggest >> >> > you just say that and ask for help, rather than passing the blame >> >> > onto a >> >> > perfectly useful base function. >> >> > >> >> > Oh, and since you asked that I propose something: comment your code. 
>> >> > >> >> > Michael >> >> > >> >> > On Thu, Sep 1, 2011 at 10:33 AM, . . wrote: >> >> >> >> >> >> Hi Michael, >> >> >> >> >> >> This is the problem: >> >> >> >> >> >> func <- Vectorize(function(x, a, sad, samp="pois", trunc=0, ...) { >> >> >> ?result <- function(x) { >> >> >> ? ?f1 <- function(n) { >> >> >> ? ? ? ? ? ? ? ? ? ? ? ?f <- function() { >> >> >> ? ? ? ?dcom <- paste("d", sad, sep="") >> >> >> ? ? ? ?dots <- c(as.name("n"), list(...)) >> >> >> ? ? ? ?do.call(dcom, dots) >> >> >> ? ? ? ? ? ? ? ? ? ? ? ?} >> >> >> ? ? ?g <- function() { >> >> >> ? ? ? ?dcom <- paste("d", samp, sep="") >> >> >> ? ? ? ?lambda <- a * n >> >> >> ? ? ? ?dots <- c(as.name("x"), as.name("lambda")) >> >> >> ? ? ? ?do.call(dcom, dots) >> >> >> ? ? ?} >> >> >> ? ? ?f() * g() >> >> >> ? ?} >> >> >> ? ?integrate(f1,0,2000)$value >> >> >> # ? ? adaptIntegrate(f1,0,2000)$integral >> >> >> >> >> >> # ? ? n <- 0:2000 >> >> >> # ? ? trapz(n,f1(n)) >> >> >> >> >> >> # ? ? area(f1, 0, 2000, limit=10000, eps=1e-100) >> >> >> ?} >> >> >> ?return(result(x) / (1 - result(trunc))) >> >> >> }, "x") >> >> >> func(200, 0.05, "exp", rate=0.001) >> >> >> >> >> >> If you could propose something I will be gratefull. >> >> >> >> >> >> Thanks in advance. >> >> >> >> >> >> On Thu, Sep 1, 2011 at 10:55 AM, R. Michael Weylandt >> >> >> wrote: >> >> >> > Mr ". .", >> >> >> > >> >> >> > MASS::area comes to mind but it may be more helpful if you could >> >> >> > say >> >> >> > what >> >> >> > you are looking for / why integrate is not appropriate it is for >> >> >> > whatever >> >> >> > you are doing. >> >> >> > >> >> >> > Strictly speaking, I suppose there are all sorts of "alternatives" >> >> >> > to >> >> >> > integrate() if you are willing to be really creative and build >> >> >> > something >> >> >> > from scratch: diff(), cumsum(), lm(), hist(), t(), c(), .... 
>> >> >> > >> >> >> > Michael Weylandt >> >> >> > >> >> >> > On Thu, Sep 1, 2011 at 9:53 AM, B77S wrote: >> >> >> >> >> >> >> >> package "caTools" >> >> >> >> see ?trapz >> >> >> >> >> >> >> >> >> >> >> >> . wrote: >> >> >> >> > >> >> >> >> > Hi all, >> >> >> >> > >> >> >> >> > is there any alternative to the function integrate? >> >> >> >> > >> >> >> >> > Any comments are welcome. >> >> >> >> > >> >> >> >> > Thanks in advance. >> >> >> >> > >> >> >> >> > ______________________________________________ >> >> >> >> > R-help at r-project.org mailing list >> >> >> >> > https://stat.ethz.ch/mailman/listinfo/r-help >> >> >> >> > PLEASE do read the posting guide >> >> >> >> > http://www.R-project.org/posting-guide.html >> >> >> >> > and provide commented, minimal, self-contained, reproducible >> >> >> >> > code. >> >> >> >> > >> >> >> >> >> >> >> >> -- >> >> >> >> View this message in context: >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> http://r.789695.n4.nabble.com/Alternatives-to-integrate-tp3783624p3783645.html >> >> >> >> Sent from the R help mailing list archive at Nabble.com. >> >> >> >> >> >> >> >> ______________________________________________ >> >> >> >> R-help at r-project.org mailing list >> >> >> >> https://stat.ethz.ch/mailman/listinfo/r-help >> >> >> >> PLEASE do read the posting guide >> >> >> >> http://www.R-project.org/posting-guide.html >> >> >> >> and provide commented, minimal, self-contained, reproducible >> >> >> >> code. >> >> >> > >> >> >> > >> >> > >> >> > >> > >> > > > From xkziloj at gmail.com Mon Sep 5 16:44:51 2011 From: xkziloj at gmail.com (. .) Date: Mon, 5 Sep 2011 11:44:51 -0300 Subject: [R] Alternatives to integrate? In-Reply-To: References: <1314885193156-3783645.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From privatn at yahoo.fr Mon Sep 5 11:33:09 2011 From: privatn at yahoo.fr (Quentin) Date: Mon, 5 Sep 2011 02:33:09 -0700 (PDT) Subject: [R] Implent the function vglm in C++ Message-ID: <1315215189791-3790813.post@n4.nabble.com> Hi, I'm working on multiple logistic regression. I used the function vglm (Package VGAM) in R. Now, i'd like to implent this function (vglm) in C++. Could someone help me or send me the algorithm. Thanks in advance, Quentin -- View this message in context: http://r.789695.n4.nabble.com/Implent-the-function-vglm-in-C-tp3790813p3790813.html Sent from the R help mailing list archive at Nabble.com. From ashz at walla.co.il Mon Sep 5 11:59:33 2011 From: ashz at walla.co.il (ashz) Date: Mon, 5 Sep 2011 02:59:33 -0700 (PDT) Subject: [R] ggplot2-grid/viewport and PNG Message-ID: <1315216773768-3790866.post@n4.nabble.com> Dear All, The following code save my graphs as pdf: pdf("j:/mix.pdf", width = 18, height = 16) grid.newpage() pushViewport(viewport(layout = grid.layout(3,1))) vplayout <- function(x, y) viewport(layout.pos.row = x, layout.pos.col = y) print(Aplot, vp = vplayout(1, 1)) print(Bplot, vp = vplayout(2, 1)) print(Cplot, vp = vplayout(3, 1)) dev.off() How can I save it in PNG and maintain the same graph structure? Thanks -- View this message in context: http://r.789695.n4.nabble.com/ggplot2-grid-viewport-and-PNG-tp3790866p3790866.html Sent from the R help mailing list archive at Nabble.com. From pbarapatre at gmail.com Mon Sep 5 12:04:05 2011 From: pbarapatre at gmail.com (Pariksheet) Date: Mon, 5 Sep 2011 03:04:05 -0700 (PDT) Subject: [R] How to create R executable? Message-ID: <1315217045310-3790883.post@n4.nabble.com> Hi , I have created .R file which connects to Teradata database and then does some manipulation and produces the output graph. How to create executable for .R file? 
Thanks
Pariksheet

--
View this message in context: http://r.789695.n4.nabble.com/How-to-create-R-executable-tp3790883p3790883.html
Sent from the R help mailing list archive at Nabble.com.

From adamofthehesse at gmail.com  Mon Sep  5 12:21:53 2011
From: adamofthehesse at gmail.com (Adam Hesse)
Date: Mon, 5 Sep 2011 22:21:53 +1200
Subject: [R] Dealing with NA's in a data matrix
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From privatn at yahoo.fr  Mon Sep  5 12:46:18 2011
From: privatn at yahoo.fr (privat NDOUTOUME)
Date: Mon, 5 Sep 2011 11:46:18 +0100 (BST)
Subject: [R] Need more information about VGLM
Message-ID: <1315219578.34961.YahooMailNeo@web26007.mail.ukl.yahoo.com>

An embedded text encoded in an unknown character set was scrubbed...
Name: not available
URL:

From billy.requena at gmail.com  Mon Sep  5 14:13:23 2011
From: billy.requena at gmail.com (Billy)
Date: Mon, 5 Sep 2011 09:13:23 -0300
Subject: [R] Bayesian functions for mle2 object
In-Reply-To:
References: <1314630180643-3776442.post@n4.nabble.com>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From alexandra.soberon at unican.es  Mon Sep  5 15:40:03 2011
From: alexandra.soberon at unican.es (Soberon Velez, Alexandra Pilar)
Date: Mon, 5 Sep 2011 13:40:03 +0000
Subject: [R] KernSmooth: dpill
Message-ID: <7546B009C5D8DF4FA41549EE1FBB2EDE23ED32D7@mbx01.unican.es>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From colstat at gmail.com  Mon Sep  5 16:30:24 2011
From: colstat at gmail.com (colstat)
Date: Mon, 5 Sep 2011 07:30:24 -0700 (PDT)
Subject: [R] In optim() function, second parameter in par() missing
Message-ID: <1315233024601-3791391.post@n4.nabble.com>

Hi,
First time using the optim(), can someone please tell me what I am doing
wrong?
The error looks like this Error in .Internal(pnorm(q, mean, sd, lower.tail, log.p)) : 'sd' is missing An example of the error dat = c(20, 19, 9, 8, 7, 4, 3, 2) dat_mu=mean(dat) dat_s=sd(dat) max.func = function(dat, mu, sd) { pnorm(dat, mu, sd) } optim(fn=max.func, dat=dat, par=c(mu=dat_mu, s=dat_s)) I get sd is missing error. If I wrote par=c(s=dat_s, mu=dat_mu) , then it tells me mu is missing. Can someone please help? Thanks! Colin -- View this message in context: http://r.789695.n4.nabble.com/In-optim-function-second-parameter-in-par-missing-tp3791391p3791391.html Sent from the R help mailing list archive at Nabble.com. From eran at taykey.com Mon Sep 5 17:09:53 2011 From: eran at taykey.com (Eran Eidinger) Date: Mon, 5 Sep 2011 18:09:53 +0300 Subject: [R] capturing a figure to PDF or Image Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From eran at taykey.com Mon Sep 5 17:14:18 2011 From: eran at taykey.com (Eran Eidinger) Date: Mon, 5 Sep 2011 18:14:18 +0300 Subject: [R] capturing a figure to PDF or Image In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From sarah.goslee at gmail.com Mon Sep 5 17:20:44 2011 From: sarah.goslee at gmail.com (Sarah Goslee) Date: Mon, 5 Sep 2011 11:20:44 -0400 Subject: [R] capturing a figure to PDF or Image In-Reply-To: References: Message-ID: Hi Eran, To be able to help, we need at a minimum a reproducible example and the output of sessionInfo(). Sarah On Mon, Sep 5, 2011 at 11:14 AM, Eran Eidinger wrote: > Sorry guys, just removed, the "at" prameters. The warnings are gone now, but > I still get an empty image, no matter which function i use (pdf, jpeg, > bmp...) > Any clue? > > Thanks, > Eran. > > On Mon, Sep 5, 2011 at 6:09 PM, Eran Eidinger wrote: > >> Hello, >> >> I've been using jpeg(), bmp() and pdf() to capture plots. >> I've used the parameter "at" in a plot, to define the tickmarks. 
>> It works fine on screen, but when I try to print it to a file, it gives a >> warning: >> >> "at" is not a graphical parameter >> >> >> and prints an empty figure. Can you help? >> >> >> Thanks, >> >> Eran. >> >> > > -- Sarah Goslee http://www.functionaldiversity.org From bbolker at gmail.com Mon Sep 5 17:21:59 2011 From: bbolker at gmail.com (Ben Bolker) Date: Mon, 5 Sep 2011 15:21:59 +0000 Subject: [R] help with glmm.admb References: Message-ID: Elizabeth C Eadie unm.edu> writes: > > > R glmmADMB question > I am trying to use glmm.admb (the latest alpha version > from the R forge website 0.6.4) to model my count data > that is overdispersed using a negative binomial family but > keep getting the following error message: > > Error in glmm.admb(data$total_bites_rounded ~ > age_class_back, random = ~food.dif.id, : > Argument "group" must be a character string specifying > the name of the grouping variable (also when "random" is > missing) This is an error message from the old version of glmmADMB, so you must somehow (?) have failed to install / start the right version of glmmADMB. What is the result of sessionInfo()? > > Here is what I have tried so far (along with some similar > variations): > model_nb<-glmm.admb(data$total_bites_rounded~age_class_back+ > (1|"subject")+ > (1|food.dif.id)+offset(log(forage_time)), > data=data,family="nbinom") > The primary function of glmmADMB has been renamed "glmmadmb" (more confirmation that you are still using the old version) Also, you shouldn't need the "data$" stuff there because you have specified the 'data' argument. I would say: glmmadmb(total_bites_rounded~age_class_back+(1|focal_individual)+ (1|food.dif.id)+offset(log(forage_time)), data=data,family="nbinom") should work. > > I am not sure what I am doing wrong. 
My model in lmer that > seemed to work was: > modelc (1|data$focal_individual)+(1|food.dif.id)+ > offset(log(forage_time)),family=poisson) > > Where age class is my one fixed variable and focal > individual (=subject) and food id are my two random > variables. I have tried a number of different things in > glmm.admb like making subject a group and food id the > random variable, and trying to write the commands in the > lme format instead of the lmer format, but always get the > same message. The message is confusing because I think > that I do have a random variable as well as a group > argument that is a character string. If anyone can see > what I am doing wrong or has any suggestions I would > really appreciate your thoughts. > From ligges at statistik.tu-dortmund.de Mon Sep 5 17:26:08 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Mon, 05 Sep 2011 17:26:08 +0200 Subject: [R] sorry, [WinBUGS] question In-Reply-To: <4E64D7D0.6050601@ktk.pte.hu> References: <4E64D7D0.6050601@ktk.pte.hu> Message-ID: <4E64EA10.6080602@statistik.tu-dortmund.de> On 05.09.2011 16:08, Kehl D?niel wrote: > Dear Community, > > I know this is not the place to ask WinBUGS questions, but I did not get > any answers on other lists. > I am rather new to the BUGS language and to bayesian modeling, excuse me > for probably simple questions. > I have to conduct a bayesian meta-analysis of some data. We have > collected observational and randomized studies related to a certain > field of interest. > The idea is to analyse the randomized studies with two different priors. > One is non-informative, the other is calculated from the observational > ones. We also want to use a sceptical prior. 
> The code I used for the non-informative prior analysis and to get the > other prior is following: > > model > { > for( i in 1 : Num ) { > rc[i] ~ dbin(pc[i], nc[i]) > rt[i] ~ dbin(pt[i], nt[i]) > log(pc[i]) <- mu[i] > log(pt[i]) <- mu[i] + delta[i] > mu[i] ~ dnorm(0,1.0E-5) > delta[i] ~ dnorm(d, tau) > } > d ~ dnorm(0,1.0E-6) > tau ~ dgamma(0.001,0.001) > sigma <- 1 / sqrt(tau) > relr <- exp(d) > } > > which appears to work fine after loading data and initials. (there was a > study with 0 treated and 0 control cases, I had to exclude that one for > some reasons, is there a solution for this?) > If I understand right, I can interpret the "relr" as bayesian estimate > of relative risk, with credible interval etc. > I have some questions in connection with the informative prior analysis: > - after running this same code for the observational data, how do I > change the specification of d and tau? > - how can I get posterior probabilities like relr>1? > - usually how many iterations, thin etc. do we use? > - can I get nice graphics with both priors and posteriors on it? > > I do have to learn everything on my own, so any help is greatly > appreciated. > I know R and the BUGS package are able to communicate, is anybody can > help to solve the task through the R interface would be great. For model bulding and verification, I recommend to use BUGS directly. The interface is nice for running estimation processes and comparing models, not for building them. Uwe Ligges > Thank you for you answer or any kind of help: > Daniel > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
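On the quoted question about posterior probabilities such as relr > 1: once the sampled values of d are available in R (for example through one of the R-to-BUGS interface packages), the probability is simply the proportion of draws that satisfy the condition. A minimal sketch, assuming the draws are held in a numeric vector; d_draws below is simulated stand-in data, not output from the model above.

```r
# Stand-in for MCMC draws of d (the log relative risk); NOT output from the model above.
set.seed(42)
d_draws <- rnorm(5000, mean = 0.2, sd = 0.15)

relr_draws <- exp(d_draws)             # corresponding draws of the relative risk
p_relr_gt1 <- mean(relr_draws > 1)     # Monte Carlo estimate of P(relr > 1 | data)
ci_relr <- quantile(relr_draws, c(0.025, 0.5, 0.975))  # median and 95% credible interval
```

Inside the BUGS model itself, a node such as p.gt1 <- step(d) (the node name is hypothetical) should give the same quantity as its posterior mean, since relr > 1 exactly when d > 0.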
From bbolker at gmail.com Mon Sep 5 17:37:54 2011 From: bbolker at gmail.com (Ben Bolker) Date: Mon, 5 Sep 2011 15:37:54 +0000 Subject: [R] Hessian Matrix Issue References: <4E625E7B.1030604@statistik.tu-dortmund.de> Message-ID: Uwe Ligges statistik.tu-dortmund.de> writes: > > I have not really looked into the details of the lengthy and almost > unreadable code below. In any case, there are good reasons why numerics > software typically uses Fisher scoring / IWLS in order to fit GLMs..... > > And if your matrix is that "singular", even the common numerical tricks > may not save the day anymore. 7e-21 is very close to exact singularity! > > Uwe Ligges > Your problem is with the strategy you use to try to deal with non-finite values, i.e. setting the negative log-likelihood to 10^20 if the calculated values are not finite. What happens is that, rather than just pushing the optimization away from a bad value, you get stuck there, which leads to a "solution" to the optimization, which is completely flat (because the objective function is 1e20 for any value near the solution), which leads to an uninvertible hessian. More specifically, inserting a browser() call at the point after the "if (!is.finite())" call and inspecting the results when the objective function is not finite shows that when d=1 the ifelse((d-1)>=0, ...) clause returns (d-1) as a denominator ... Beyond that, I can't spend any more time picking through this ... Ben Bolker From dwinsemius at comcast.net Mon Sep 5 17:41:58 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Mon, 5 Sep 2011 11:41:58 -0400 Subject: [R] In optim() function, second parameter in par() missing In-Reply-To: <1315233024601-3791391.post@n4.nabble.com> References: <1315233024601-3791391.post@n4.nabble.com> Message-ID: <42D0A3CB-C01F-469B-BC2C-66F70869B5F2@comcast.net> On Sep 5, 2011, at 10:30 AM, colstat wrote: > Hi, > First time using the optim(), can someone please tell me what I am > doing > wrong? 
The error looks like this
>
> Error in .Internal(pnorm(q, mean, sd, lower.tail, log.p)) :
>  'sd' is missing
>

You should be using a textbook. Which one are you consulting?

>
> An example of the error
> dat = c(20, 19, 9, 8, 7, 4, 3, 2)
> dat_mu=mean(dat)
> dat_s=sd(dat)
>
> max.func = function(dat, mu, sd) {
> pnorm(dat, mu, sd)
> }
>
> optim(fn=max.func, dat=dat, par=c(mu=dat_mu, s=dat_s))
>
> I get sd is missing error. If I wrote par=c(s=dat_s, mu=dat_mu) , then it
> tells me mu is missing.

You constructed a function with three arguments but optim only passes two
arguments to the objective function: the whole parameter vector as one
argument, plus dat. Build your objective function a) so that it accepts
two arguments and b) so that it returns one value. At the moment it does
not do either.

> Can someone please help?
>

--
David Winsemius, MD
West Hartford, CT

From rob.t.slider at gmail.com  Mon Sep  5 17:30:43 2011
From: rob.t.slider at gmail.com (RTSlider)
Date: Mon, 5 Sep 2011 08:30:43 -0700 (PDT)
Subject: [R] p values greater than 1 from lme4
Message-ID: <1315236643683-3791526.post@n4.nabble.com>

Hello,
I'm running linear regressions using the following script where I have
separated out species using the "IDtotsInLn" identifier

x<-read.csv('tbl02TOTSInLn_ENV.csv', header=T)
x
attach (x)
library(lme4)
rInLn<-lmList(InLn~pMoist | IDtotsInLn, x, pool=F)
write.table(summary(rInLn)$coefficients, "rInLnPlots.csv")
write.table(summary(rInLn)$r.squared, append=T, "rInLnPlots.csv")
write.table(summary(rInLn)$df, append=T, "rInLnPlots.csv")

The script seems to be working for most of the species, but for some it is
returning a p value of greater than 1 (e.g. 20).
I thought this might be for the few cases where the independent variable
remained constant, but found other species where this was not the case and
the p value was still much greater than 1.
Any help would be appreciated -RTS -- View this message in context: http://r.789695.n4.nabble.com/p-values-greater-than-1-from-lme4-tp3791526p3791526.html Sent from the R help mailing list archive at Nabble.com. From L.J.Bonnett at liverpool.ac.uk Mon Sep 5 17:42:25 2011 From: L.J.Bonnett at liverpool.ac.uk (Bonnett, Laura) Date: Mon, 5 Sep 2011 15:42:25 +0000 Subject: [R] SAS code in R Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dwinsemius at comcast.net Mon Sep 5 17:58:35 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Mon, 5 Sep 2011 11:58:35 -0400 Subject: [R] capturing a figure to PDF or Image In-Reply-To: References: Message-ID: On Sep 5, 2011, at 11:14 AM, Eran Eidinger wrote: > Sorry guys, I just removed the "at" parameters. The warnings are gone > now, but > I still get an empty image, no matter which function I use (pdf, jpeg, > bmp...) > Any clue? The most common error is failing to call dev.off() after the plotting is done. The second most common is not reading the FAQ entry on why lattice and ggplot2 plotting functions don't plot. -- David. > > Thanks, > Eran. > > On Mon, Sep 5, 2011 at 6:09 PM, Eran Eidinger wrote: > >> Hello, >> >> I've been using jpeg(), bmp() and pdf() to capture plots. >> I've used the parameter "at" in a plot, to define the tickmarks. >> It works fine on screen, but when I try to print it to a file, it >> gives a >> warning: >> >> "at" is not a graphical parameter >> >> >> and prints an empty figure. Can you help? >> >> >> Thanks, >> >> Eran. >> >> > > > -- > * > Eran Eidinger | Taykey Ltd | +972-54-5908077 | www.taykey.com > > > *
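A minimal sketch of the dev.off() point made in the reply above, assuming base graphics. The file name, data and tick positions are made up for illustration. The sketch also assumes the original warning arose because "at" was passed to plot(); "at" is an argument of axis(), so the default axis is suppressed and drawn separately.

```r
# Hypothetical output file; tempfile() keeps the sketch self-contained.
out <- tempfile(fileext = ".pdf")

pdf(out)                              # open the file device
plot(1:10, (1:10)^2, xaxt = "n",      # suppress the default x axis
     xlab = "x", ylab = "x squared")
axis(1, at = c(1, 4, 7, 10))          # custom tick marks go through axis()
dev.off()                             # close the device; the file is only complete after this

file.exists(out)                      # TRUE
```

The same open, plot, dev.off() pattern should apply to jpeg() and bmp(). For lattice or ggplot2 objects created inside a function or a sourced script, an explicit print() of the plot object before dev.off() is usually what the FAQ entry asks for.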
David Winsemius, MD West Hartford, CT From ligges at statistik.tu-dortmund.de Mon Sep 5 18:06:23 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Mon, 05 Sep 2011 18:06:23 +0200 Subject: [R] How to create R executable? In-Reply-To: <1315217045310-3790883.post@n4.nabble.com> References: <1315217045310-3790883.post@n4.nabble.com> Message-ID: <4E64F37F.9010700@statistik.tu-dortmund.de> On 05.09.2011 12:04, Pariksheet wrote: > Hi , > > I have created .R file which connects to Teradata database and then does > some manipulation and produces the output graph. > > How to create executable for .R file? You cannot. Or do you mean you want to create an R package? In any case, R code is always interpreted. Uwe Ligges > > Thanks > Pariksheet > > -- > View this message in context: http://r.789695.n4.nabble.com/How-to-create-R-executable-tp3790883p3790883.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From dwinsemius at comcast.net Mon Sep 5 18:21:38 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Mon, 5 Sep 2011 12:21:38 -0400 Subject: [R] In optim() function, second parameter in par() missing In-Reply-To: <42D0A3CB-C01F-469B-BC2C-66F70869B5F2@comcast.net> References: <1315233024601-3791391.post@n4.nabble.com> <42D0A3CB-C01F-469B-BC2C-66F70869B5F2@comcast.net> Message-ID: On Sep 5, 2011, at 11:41 AM, David Winsemius wrote: > > On Sep 5, 2011, at 10:30 AM, colstat wrote: > >> Hi, >> First time using the optim(), can someone please tell me what I am >> doing >> wrong? The error looks like this >> >> Error in .Internal(pnorm(q, mean, sd, lower.tail, log.p)) : >> 'sd' is missing >> > > You should be using a textbook. Which one are you consulting? 
> >> >> An example of the error >> dat = c(20, 19, 9, 8, 7, 4, 3, 2) >> dat_mu=mean(dat) >> dat_s=sd(dat) >> >> max.func = function(dat, mu, sd) { >> pnorm(dat, mu, sd) >> } >> >> optim(fn=max.func, dat=dat, par=c(mu=dat_mu, s=dat_s)) >> >> I get sd is missing error. If I wrote par=c(s=dat_s, mu=dat_mu) , >> then it >> tells me mu is missing. > > You constructed a function with two < I meant to say that the function had three arguments (none of them "par") and you were incorrectly assuming the function would be able to find what was inside par despite not passing it.> > arguments but optim only passes two argument to the objective > function. Build your objective function a) so that it accepts two > arguments > and b) so that it returns one value. At the moment it does not do > either. > > >> Can someone please help? >> > > -- > > David Winsemius, MD > West Hartford, CT > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT From giacomo.bezzi at gmail.com Mon Sep 5 19:13:29 2011 From: giacomo.bezzi at gmail.com (Giacomo Bezzi) Date: Mon, 5 Sep 2011 10:13:29 -0700 (PDT) Subject: [R] package gmp installation problem Message-ID: <1315242809139-3791717.post@n4.nabble.com> Hello everybody, Trying to install the package gmp I get the following errors and fail to install: Error in dyn.load(file, DLLpath = DLLpath, ...) : unable to load shared object '/usr/local/lib/R/site-library/gmp/libs/gmp.so': libgmp.so.10: cannot open shared object file: No such file or directory Warning in eval(expr, envir, enclos) : Data for RFC 2409 Oakley groups requires package 'gmp' Error in dyn.load(file, DLLpath = DLLpath, ...) 
: unable to load shared object '/usr/local/lib/R/site-library/gmp/libs/gmp.so': libgmp.so.10: cannot open shared object file: No such file or directory Warning in eval(expr, envir, enclos) : Data for RFC 2409 Oakley groups requires package 'gmp' ** testing if installed package can be loaded Error in dyn.load(file, DLLpath = DLLpath, ...) : unable to load shared object '/usr/local/lib/R/site-library/gmp/libs/gmp.so': libgmp.so.10: cannot open shared object file: No such file or directory ERROR: loading failed * removing ‘/usr/local/lib/R/site-library/gmp’ Any clues on how I need to proceed to install the package? Best regards Giacomo -- View this message in context: http://r.789695.n4.nabble.com/package-gmp-installation-problem-tp3791717p3791717.html Sent from the R help mailing list archive at Nabble.com. From S.Ellison at LGCGroup.com Mon Sep 5 18:44:27 2011 From: S.Ellison at LGCGroup.com (S Ellison) Date: Mon, 5 Sep 2011 17:44:27 +0100 Subject: [R] How to create R executable? In-Reply-To: <1315217045310-3790883.post@n4.nabble.com> References: <1315217045310-3790883.post@n4.nabble.com> Message-ID: Do you mean 'how do you execute the script?'? If so, the executable is called R. You might mean R -f .R , assuming R is on your path. S > -----Original Message----- > From: r-help-bounces at r-project.org > [mailto:r-help-bounces at r-project.org] On Behalf Of Pariksheet > Sent: 05 September 2011 11:04 > To: r-help at r-project.org > Subject: [R] How to create R executable? > > Hi , > > I have created .R file which connects to Teradata database > and then does some manipulation and produces the output graph. > > How to create executable for .R file? > > Thanks > Pariksheet > > -- > View this message in context: > http://r.789695.n4.nabble.com/How-to-create-R-executable-tp3790883p3790883.html > Sent from the R help mailing list archive at Nabble.com.
From djnordlund at frontier.com Mon Sep 5 20:28:37 2011 From: djnordlund at frontier.com (Daniel Nordlund) Date: Mon, 5 Sep 2011 11:28:37 -0700 Subject: [R] what is wrong with my quicksort? In-Reply-To: <39E2DB22-0BC3-41F9-9659-136808CE06B4@gmail.com> References: <1315101080660-3788681.post@n4.nabble.com> <1315135080530-3789080.post@n4.nabble.com> <39E2DB22-0BC3-41F9-9659-136808CE06B4@gmail.com> Message-ID: <581CAA2434E44077A8E8D8540C6439A7@Gandalf> > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] > On Behalf Of peter dalgaard > Sent: Sunday, September 04, 2011 11:09 AM > To: warc > Cc: r-help at r-project.org > Subject: Re: [R] what is wrong with my quicksort? > > > On Sep 4, 2011, at 13:18 , warc wrote: > > > the error message I get is: > > > > Error in while (x[j] >= pivot) { : Argument has length 0 > > > > so either pivot or x[j] is NULL. > > and it sometimes happens the first time the program enters the > recursion, > > sometimes the 6th time, or anywhere in between. > > > > Well, then print out x, j, and pivot just before hitting that test (i.e., > before the loop and at the end of it). > > With sample() in the code, you will naturally get different results at > each run. > > It's your problem, so your debugging, but I'd wager that nothing is > keeping j from hitting zero if you sample a pivot equal to min(x). > I think Peter is correct that there is a problem with the stopping rules.
However, there is also a problem in that quick sort is designed to sort in place, and therefore the recursive calls must pass the vector to be sorted by reference (and R passes by value). Otherwise, changes are only being made to copies and will be lost when returning from the partition function. I am reluctant to provide a solution, because (1) R already has a sort routine, therefore (2) this may be homework, and (3) I am not a skilled R programmer. However, I did successfully write a quick sort routine using a standard algorithm with 2 changes: 1. don't pass the vector to be sorted to the partition function, 2. use the <<- operator when swapping elements to make changes in place If appropriate, I would be willing to post my solution for discussion. I could probably benefit from such a discussion myself. Hope this is helpful, Dan Daniel Nordlund Bothell, WA USA > > > > > > > jholtman wrote: > >> > >> have you tried to debug it yourself. All you said is that 'it went > >> wrong'. that is not a very clear statement of the problem. If I were > to > >> start looking at it, I would put some print statements in it to see > what > >> is happening on eachpath and with each set of data. Have you tried > this? > >> > >> Sent from my iPad > >> > >> On Sep 3, 2011, at 21:51, warc <conny-clauss at gmx.de> wrote: > >> > >>> Hey guys, > >>> I tried to program quicksort like this but somethings wrong. 
> >>> > >>> please help > >>> > >>> > >>> > >>>> partition <- function(x, links, rechts){ > >>>> > >>>> i <- links > >>>> j <- rechts > >>>> t <- 0 > >>>> pivot <- sample(x[i:j],1) > >>>> > >>>> while(i <= j){ > >>>> > >>>> while(x[i] <= pivot){ > >>>> i = i+1} > >>>> > >>>> while(x[j] >= pivot){ > >>>> j = j-1} > >>>> > >>>> if( i <= j){ > >>>> > >>>> t = x[i] > >>>> x[i] = x[j] > >>>> x[j] = t > >>>> > >>>> i=i+1 > >>>> j=j-1 > >>>> > >>>> } > >>>> print(pivot) > >>>> > >>>> > >>>> } > >>>> #Rekursion > >>>> > >>>> if(links < j){ > >>>> partition(x, links, j)} > >>>> if(i < rechts){ > >>>> partition(x, i, rechts)} > >>>> > >>>> return(x) > >>>> } > >>>> > >>>> > >>>> quicksort <- function(x){ > >>>> > >>>> > >>>> > >>>> partition(x, 1, length(x)) > >>>> } > >>> > >>> > >>> > >>> thx > >>> > >>> -- > >>> View this message in context: > >>> http://r.789695.n4.nabble.com/what-is-wrong-with-my-quicksort- > tp3788681p3788681.html > >>> Sent from the R help mailing list archive at Nabble.com. > >>> > >>> ______________________________________________ > >>> R-help at r-project.org mailing list > >>> https://stat.ethz.ch/mailman/listinfo/r-help > >>> PLEASE do read the posting guide > >>> http://www.R-project.org/posting-guide.html > >>> and provide commented, minimal, self-contained, reproducible code. > >> > >> ______________________________________________ > >> R-help at r-project.org mailing list > >> https://stat.ethz.ch/mailman/listinfo/r-help > >> PLEASE do read the posting guide > >> http://www.R-project.org/posting-guide.html > >> and provide commented, minimal, self-contained, reproducible code. > >> > > > > > > -- > > View this message in context: http://r.789695.n4.nabble.com/what-is- > wrong-with-my-quicksort-tp3788681p3789080.html > > Sent from the R help mailing list archive at Nabble.com. 
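For reference, a hedged sketch of how the quoted partition() could be reworked along the lines discussed in this thread. It is not Dan's unposted solution. It assumes a vector without NAs, keeps the vector in the enclosing function instead of passing it to the recursive helper, swaps with <<- so the changes persist, and uses strict < and > in the inner loops so that i and j cannot run past the pivot. The helper name sort_range is hypothetical, and the pivot is the middle element of the range rather than the random pivot of the original.

```r
quicksort <- function(x) {
  # sort_range() works on the x of the enclosing function; it is never
  # passed x itself, so no copies are sorted and then thrown away.
  sort_range <- function(lo, hi) {
    if (lo >= hi) return(invisible(NULL))
    pivot <- x[lo + (hi - lo) %/% 2]    # middle element (assumption; original used sample())
    i <- lo
    j <- hi
    while (i <= j) {
      while (x[i] < pivot) i <- i + 1   # strict comparison: stops at the pivot value
      while (x[j] > pivot) j <- j - 1
      if (i <= j) {
        tmp <- x[i]
        x[i] <<- x[j]                   # <<- modifies x in quicksort()'s frame
        x[j] <<- tmp
        i <- i + 1
        j <- j - 1
      }
    }
    if (lo < j) sort_range(lo, j)
    if (i < hi) sort_range(i, hi)
    invisible(NULL)
  }
  sort_range(1, length(x))
  x
}

quicksort(c(5, 2, 9, 1, 5, 6))   # 1 2 5 5 6 9
```

An alternative that avoids <<- would be to have the helper return the modified vector and assign the result of each recursive call back to x. Either way, the built-in sort() remains the practical choice outside of an exercise.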
> -- > Peter Dalgaard, Professor, > Center for Statistics, Copenhagen Business School > Solbjerg Plads 3, 2000 Frederiksberg, Denmark > Phone: (+45)38153501 > Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com > "Døden skal tape!" --- Nordahl Grieg From mailinglist.honeypot at gmail.com Mon Sep 5 20:48:35 2011 From: mailinglist.honeypot at gmail.com (Steve Lianoglou) Date: Mon, 5 Sep 2011 14:48:35 -0400 Subject: [R] Need more information about VGLM In-Reply-To: <1315219578.34961.YahooMailNeo@web26007.mail.ukl.yahoo.com> References: <1315219578.34961.YahooMailNeo@web26007.mail.ukl.yahoo.com> Message-ID: Hi, On Mon, Sep 5, 2011 at 6:46 AM, privat NDOUTOUME wrote: > Hi, > > I'm working on multiple logistic regression. I used the function vglm (Package VGAM) in R. Now, I'd like to implement this function (vglm) in C++. > Could someone help me or send me the algorithm. Download and extract the source version of the package from its page on CRAN: http://cran.r-project.org/web/packages/VGAM/index.html Look for the "Package source" link. It's not clear what part you want to re-implement, but the code for the entire package is in there. I reckon you'll be able to fish out the parts of whatever you are looking for yourself.
-steve -- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact From bbolker at gmail.com Mon Sep 5 23:28:17 2011 From: bbolker at gmail.com (Ben Bolker) Date: Mon, 5 Sep 2011 21:28:17 +0000 Subject: [R] p values greater than 1 from lme4 References: <1315236643683-3791526.post@n4.nabble.com> Message-ID: RTSlider gmail.com> writes: > > Hello, > I'm running linear regressions using the following script where I have > separated out species using the "IDtotsInLn" identifier > > x<-read.csv('tbl02TOTSInLn_ENV.csv', header=T) > x > attach (x) > library(lme4) > > rInLn<-lmList(InLn~pMoist | IDtotsInLn, x, pool=F) > write.table(summary(rInLn)$coefficients, "rInLnPlots.csv") > write.table(summary(rInLn)$r.squared, append=T, "rInLnPlots.csv") > write.table(summary(rInLn)$df, append=T, "rInLnPlots.csv") > > The script seems to be working for most of the species, but for some it is > returning a p value of greater than 1 (e.g. 20). I thought this might be for > the few cases where the independent variable remained constant, but found > other species where this was not the case and the p value was still much > greater than 1. > Any help would be appreciated > -RTS This is very interesting but practically impossible to solve because it's not reproducible; is there any chance that you can make the data available? You can send it directly to me (Ben Bolker -- my e-mail is pretty easy to find on the web) if you like. Ben Bolker From rolf.turner at xtra.co.nz Mon Sep 5 23:32:31 2011 From: rolf.turner at xtra.co.nz (Rolf Turner) Date: Tue, 06 Sep 2011 09:32:31 +1200 Subject: [R] Getting the values out of histogram (lattice) In-Reply-To: References: <201108312301.p7VN1O2N007997@mail12.tpg.com.au> <4E5F112B.5090803@xtra.co.nz> Message-ID: <4E653FEF.6070009@xtra.co.nz> On 05/09/11 21:26, Deepayan Sarkar wrote: > 1.
The `official' way to get panel arguments is trellis.panelArgs(); e.g., > >> p<- histogram(~rnorm(100) | gl(2, 50), type = "density") >> str(trellis.panelArgs(p, 2)) > List of 5 > $ x : num [1:50] 0.277 1.144 1.13 -0.912 -0.892 ... > $ breaks : num [1:9] -2.561 -1.979 -1.398 -0.816 -0.234 ... > $ type : chr "density" > $ equal.widths: logi TRUE > $ nint : num 8 > > 2. hist.constructor() is needed for technical reasons, and can be > considered to be the same as hist() for this purpose. So the > computations performed by panel.histogram() can be reduced to > > > histogram.computations<- > function(x, breaks, equal.widths = TRUE, > type = "density", nint, ...) > { > if (is.null(breaks)) > { > breaks<- > if (is.factor(x)) seq_len(1 + nlevels(x)) - 0.5 > else if (equal.widths) do.breaks(range(x, finite = TRUE), nint) > else quantile(x, 0:nint/nint, na.rm = TRUE) > } > hist(x, breaks = breaks, plot = FALSE) > } > > which may be used as follows to get the ``actual data defining the histogram'': > >> a<- trellis.panelArgs(p, 2) >> h<- do.call(histogram.computations, a) >> str(h) > List of 7 > $ breaks : num [1:9] -2.561 -1.979 -1.398 -0.816 -0.234 ... > $ counts : int [1:8] 1 4 6 14 7 8 6 4 > $ intensities: num [1:8] 0.0344 0.1375 0.2062 0.4812 0.2406 ... > $ density : num [1:8] 0.0344 0.1375 0.2062 0.4812 0.2406 ... > $ mids : num [1:8] -2.2704 -1.6885 -1.1065 -0.5246 0.0573 ... > $ xname : chr "x" > $ equidist : logi TRUE > - attr(*, "class")= chr "histogram" As usual: Clear, concise, and useful! Thanks! cheers, Rolf From bbolker at gmail.com Thu Sep 1 03:02:31 2011 From: bbolker at gmail.com (Ben Bolker) Date: Thu, 1 Sep 2011 03:02:31 +0200 Subject: [R] Bayesian functions for mle2 object In-Reply-To: References: <1314630180643-3776442.post@n4.nabble.com> Message-ID: <4E5ED9A7.5090505@gmail.com> On 11-09-05 02:13 PM, Billy wrote: > Dear Dr. Ben Bolker and 'JLucke', > > Thanks for your comments, but I'm still facing some problems. 
> For example, using the gls() function, I receive an error message and > I'm not sure I'm writing the arguments in the right way. Well, (if you still need help with this -- your comments below seem like you're making progress with a different approach) what did you try, and what is the error message? We can't help you without details ... > Instead, I thought about my original models and realized that I was > modelling variance as a linear function of the predictor variable, which > could drop off to zero values. Changing > > sd = c0 + c1*x > > to > > sd = c0*x^c1 > > avoids the zero values and all problematic models have converged. Paying > attention on the estimates, they also make sense. Good. > > The new problem now (and probably due to my weak Mathematical skills) is > that one set o candidate models includes models that consider the effect > of not only one, but two predictor variables on the response (y). > > How could be the right way to model that? > > sd = c0 * (x ^ c1) * (w ^ c2) > > or > > sd = (c0 * x ^ c1) + (c0 * w ^ c2) ? > > In which c0, c1, and c2 are constant parameters, and x and w are > different predictor variables. > > Thanks again > > Billy It's not clear that either of them is necessarily more "right" than the others, but you could (much) more easily implement the first in gls(); take a look at ?varComb ... you would use something like weights=varComb(varPower(form=~x),varPower(form=~w)) > > On Mon, Aug 29, 2011 at 3:50 PM, Ben Bolker > wrote: > > Billy.Requena gmail.com > writes: > > > > > Hi everybody, > > > > I'm interested in evaluating the effect of a continuous variable > on the mean > > and/or the variance of my response variable. 
I have built functions > > expliciting these and used the 'mle2' function to estimate the > coefficients, > > as follows: > > > > func.1 <- function(m=62.9, c0=8.84, c1=-1.6) > > { > > s <- c0+c1*(x) > > -sum(dnorm(y, mean=m, sd=s,log=T)) > > } > > > > m1 <- mle2(func.1, method="SANN") > > > > However, the estimation of the effect of x on the variance of y > usually has > > dealt some troubles, resulting in no convergencies or sd of estimates > > extremely huge. I tried using different optimizers, but I still > faced the > > some problems. > > > > When I had similar troubles in 'GLMM' statistical universe, I used > bayesian > > functions to solve this problem, enjoyning the flexibility of > different > > start points to reach the maximum likelihood estimates. However, I > have no > > idea which package or which function to use to solve the specific > problem > > I'm facing now. > > Does anyone have a clue? > > Thanks in advance > > Unless I'm missing something, you can fit this model > (more easily) in gls() from the nlme package, which allows models > for heteroscedasticity. See ?nlme::varConstPower > > gls(y~1,weights=varPower(power=1,form=~x),data) > > This gives you a standard deviation proportional to (t1+|v|); > that is, if the baseline residual standard deviation is S, then > the standard deviation is S*(t1+|v|), so S would correspond to > your c1 and S*t1 would correspond to your c0. > > Ben Bolker > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
> > > > > -- > Gustavo Requena > PhD student - Laboratory of Arthropod Behavior and Evolution > Universidade de São Paulo > Correspondence address: > a/c Glauco Machado > Departamento de Ecologia - IBUSP > Rua do Matão - Travessa 14 no 321 Cidade Universitária, São Paulo - SP, > Brasil > CEP 05508-900 > Phone number: 55 11 3091-7488 > > http://ecologia.ib.usp.br/opilio/gustavo.html From alexandra.soberon at unican.es Mon Sep 5 23:48:34 2011 From: alexandra.soberon at unican.es (Soberon Velez, Alexandra Pilar) Date: Mon, 5 Sep 2011 21:48:34 +0000 Subject: [R] multivariate bandwidth: regband Message-ID: <7546B009C5D8DF4FA41549EE1FBB2EDE23ED3556@mbx01.unican.es> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dbrownjr at umn.edu Mon Sep 5 22:56:21 2011 From: dbrownjr at umn.edu (David Brown) Date: Mon, 5 Sep 2011 15:56:21 -0500 Subject: [R] Receive "unable to load shared object RNetCDF.o" during R INSTALL of RNetCDF Message-ID: On a Red Hat Linux cluster I am seeing the following after multiple other packages were successfully installed. The error seems to suggest that RNetCDF.o was not copied to the appropriate lib folder. The admin user performing the install has the required privileges to perform the install. Words of wisdom are greatly appreciated: R CMD INSTALL --configure-args="--with-netcdf-include='/soft/local/netcdf/netcdf-3.6.2/include/' --with-netcdf-lib='/soft/local/netcdf/netcdf-3.6.2/libso' --with-udunits-include='/soft/local/udunits-2.1.23/include' --with-udunits-lib='/soft/local/udunits-2.1.23/lib'" RNetCDF_1.5.2-2.tar.gz * installing to library ‘/soft/local/r/R-2.13.1/lib64/R/library’ * installing *source* package ‘RNetCDF’ ... checking for gcc... gcc -std=gnu99 checking for C compiler default output file name... a.out checking whether the C compiler works... yes checking whether we are cross compiling... no checking for suffix of executables... checking for suffix of object files...
o checking whether we are using the GNU C compiler... yes checking whether gcc -std=gnu99 accepts -g... yes checking for gcc -std=gnu99 option to accept ISO C89... none needed checking for nc_open in -lnetcdf... yes checking for utInit in -ludunits2... no checking for utScan in -ludunits2... yes checking how to run the C preprocessor... gcc -std=gnu99 -E checking for grep that handles long lines and -e... /bin/grep checking for egrep... /bin/grep -E checking for ANSI C header files... no checking for sys/types.h... yes checking for sys/stat.h... yes checking for stdlib.h... yes checking for string.h... yes checking for memory.h... yes checking for strings.h... yes checking for inttypes.h... yes checking for stdint.h... yes checking for unistd.h... yes checking netcdf.h usability... yes checking netcdf.h presence... yes checking for netcdf.h... yes checking udunits.h usability... yes checking udunits.h presence... yes checking for udunits.h... yes configure: creating ./config.status config.status: creating R/load.R config.status: creating src/Makevars ** libs gcc -std=gnu99 -I/soft/local/r/R-2.13.1/lib64/R/include -I/soft/local/udunits-2.1.23/include -I/soft/local/netcdf/netcdf-3.6.2/include/ -I/usr/local/include -fpic -g -O2 -c RNetCDF.c -o RNetCDF.o gcc -std=gnu99 -shared -L/usr/local/lib64 -o RNetCDF.so RNetCDF.o -ludunits2 -lnetcdf -L/soft/local/udunits-2.1.23/lib -L/soft/local/netcdf/netcdf-3.6.2/libso -lexpat installing to /soft/local/r/R-2.13.1/lib64/R/library/RNetCDF/libs ** R ** preparing package for lazy loading ** help *** installing help indices ** building package indices ... ** testing if installed package can be loaded Error in dyn.load(file, DLLpath = DLLpath, ...) 
: unable to load shared object '/soft/local/r/R-2.13.1/lib64/R/library/RNetCDF/libs/RNetCDF.so': libudunits2.so.0: cannot open shared object file: No such file or directory Error: loading failed Execution halted ERROR: loading failed * removing ‘/soft/local/r/R-2.13.1/lib64/R/library/RNetCDF’ [swadm at boar01]/soft/local/temp% From igors.lahanciks at gmail.com Mon Sep 5 23:58:27 2011 From: igors.lahanciks at gmail.com (Igors) Date: Mon, 5 Sep 2011 14:58:27 -0700 (PDT) Subject: [R] function censReg in panel data setting Message-ID: <1315259907768-3792227.post@n4.nabble.com> Hello all, I have a problem estimating a random effects model using the censReg function. A small part of the code: UpC <- censReg(Power ~ Windspeed, left = -Inf, right = 2000,data=PData_In,method="BHHH",nGHQ = 4) Error in maxNRCompute(fn = logLikAttr, fnOrig = fn, gradOrig = grad, hessOrig = hess, : NA in the initial gradient ...then I tried to set starting values myself and here is the error that I got: UpC <- censReg(Power ~ Windspeed, left = -Inf, right = 2000,data=PData_In,method="BHHH",nGHQ = 4,start=c(-691,189,5)) Error in names(start) <- c(colnames(xMat), "logSigmaMu", "logSigmaNu") : 'names' attribute [4] must be the same length as the vector [3] How can I solve any of these errors? Thanks in advance! Best, Igors -- View this message in context: http://r.789695.n4.nabble.com/function-censReg-in-panel-data-setting-tp3792227p3792227.html Sent from the R help mailing list archive at Nabble.com. From rob.t.slider at gmail.com Tue Sep 6 00:09:44 2011 From: rob.t.slider at gmail.com (RTSlider) Date: Mon, 5 Sep 2011 15:09:44 -0700 (PDT) Subject: [R] p values greater than 1 from lme4 In-Reply-To: References: <1315236643683-3791526.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed...
Name: not available URL: From sjkiss at gmail.com Mon Sep 5 23:48:57 2011 From: sjkiss at gmail.com (Simon Kiss) Date: Mon, 5 Sep 2011 17:48:57 -0400 Subject: [R] htmlParse hangs or crashes Message-ID: Dear colleagues, each time I use htmlParse, R crashes or hangs. The URL I'd like to parse is included below, as are the results of a series of basic commands that describe what I'm experiencing. The results of sessionInfo() are attached at the bottom of the message. The thing is, htmlTreeParse appears to work just fine, although it doesn't appear to contain the information I need (the URLs of the articles linked to on this search page). Regardless, I'd still like to understand why htmlParse doesn't work. Thank you for any insight. Yours, Simon Kiss myurl<-c("http://timesofindia.indiatimes.com/searchresult.cms?sortorder=score&searchtype=2&maxrow=10&startdate=2001-01-01&enddate=2011-08-25&article=2&pagenumber=1&isphrase=no&query=IIM&searchfield=&section=&kdaterange=30&date1mm=01&date1dd=01&date1yyyy=2001&date2mm=08&date2dd=25&date2yyyy=2011") .x<-htmlParse(myurl) class(.x) #returns "HTMLInternalDocument" "XMLInternalDocument" .x #returns *** caught segfault *** address 0x1398754, cause 'memory not mapped' Traceback: 1: .Call("RS_XML_dumpHTMLDoc", doc, as.integer(indent), as.character(encoding), as.logical(indent), PACKAGE = "XML") 2: saveXML(from) 3: saveXML(from) 4: asMethod(object) 5: as(x, "character") 6: cat(as(x, "character"), "\n") 7: print.XMLInternalDocument() 8: print() Possible actions: 1: abort (with core dump, if enabled) 2: normal R exit 3: exit R without saving workspace 4: exit R saving workspace sessionInfo() R version 2.13.0 (2011-04-13) Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit) locale: [1] en_CA.UTF-8/en_CA.UTF-8/C/C/en_CA.UTF-8/en_CA.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] XML_3.4-0 RCurl_1.5-0 bitops_1.0-4.1 ********************************* Simon J.
Kiss, PhD Assistant Professor, Wilfrid Laurier University 73 George Street Brantford, Ontario, Canada N3T 2C9 Cell: +1 905 746 7606 From colstat at gmail.com Tue Sep 6 02:19:26 2011 From: colstat at gmail.com (colstat) Date: Mon, 5 Sep 2011 17:19:26 -0700 (PDT) Subject: [R] In optim() function, second parameter in par() missing In-Reply-To: References: <1315233024601-3791391.post@n4.nabble.com> <42D0A3CB-C01F-469B-BC2C-66F70869B5F2@comcast.net> Message-ID: <1315268366193-3792356.post@n4.nabble.com> Thanks, David. Your suggestion worked very well. The par argument of optim() is handed to the objective function as a single argument, so I combined the parameters into one vector. Now, I will run it with my actual code and see what happens. Colin -- View this message in context: http://r.789695.n4.nabble.com/In-optim-function-parameter-in-par-missing-tp3791391p3792356.html Sent from the R help mailing list archive at Nabble.com. From arne.henningsen at googlemail.com Tue Sep 6 06:41:24 2011 From: arne.henningsen at googlemail.com (Arne Henningsen) Date: Tue, 6 Sep 2011 06:41:24 +0200 Subject: [R] function censReg in panel data setting In-Reply-To: <1315259907768-3792227.post@n4.nabble.com> References: <1315259907768-3792227.post@n4.nabble.com> Message-ID: Dear Igors On 5 September 2011 23:58, Igors wrote: > I have a problem estimating a random effects model using the censReg function. > > A small part of the code: > > UpC <- censReg(Power ~ Windspeed, left = -Inf, right = > 2000,data=PData_In,method="BHHH",nGHQ = 4) > > Error in maxNRCompute(fn = logLikAttr, fnOrig = fn, gradOrig = grad, > hessOrig = hess, : > NA in the initial gradient > > > ...then I tried to set starting values myself and here is the error that I > got: > > UpC <- censReg(Power ~ Windspeed, left = -Inf, right = > 2000,data=PData_In,method="BHHH",nGHQ = 4,start=c(-691,189,5)) > > Error in names(start) <- c(colnames(xMat), "logSigmaMu", "logSigmaNu") : > 'names' attribute [4] must be the same length as the vector [3] > > > How can I solve any of these errors?
You chose a suitable solution for the first problem (NA in initial gradient). Unfortunately, the documentation of "censReg" is not very clear regarding starting values of panel tobit models. Please note that you have to specify 4 starting values: intercept, slope parameter, variance of the individual effects ("mu"), and variance of the general error term ("nu"). http://cran.r-project.org/web/packages/censReg/vignettes/censReg.pdf Best wishes from Copenhagen, Arne -- Arne Henningsen http://www.arne-henningsen.name From daniel at umd.edu Tue Sep 6 06:43:02 2011 From: daniel at umd.edu (Daniel Malter) Date: Mon, 5 Sep 2011 21:43:02 -0700 (PDT) Subject: [R] plm package, R squared, dummies in panel data In-Reply-To: <000601cc6bbd$77f3d040$67db70c0$@carmo@ua.pt> References: <000601cc6bbd$77f3d040$67db70c0$@carmo@ua.pt> Message-ID: <1315284182071-3792582.post@n4.nabble.com> Hi, I answered this question before in this post: http://r.789695.n4.nabble.com/Regressions-with-fixed-effect-in-R-td2173314.html , specifically in my message from May 11, 2010; 4:30pm. However, I believe the newer version of plm shows an R-squared, which should be the within R-squared. Why the programmers of the package decided to not show the others or to not provide a test of the significance of the a(i)s, I don't know. As for your second question, the plm function has an option as to whether the FEs should be "id", "time", or "twoways", i.e., for id and time. I would guess that then the time FEs are not estimated either but differenced out, too. This should affect the within/between variance and therefore the within and between R-squareds. HTH, Daniel Cecilia Carmo wrote: > > Hi R-helpers, > > > > I have two questions I hope you could help me with them: > > > > In the plm package how can I calculate the R2 within, R2 between and R2 > overall? Is there any special reason to not display these values? 
> > > > When using first differences, do I need to take special care with > dummies (both year dummies and industry dummies)? > > (A friend who works with Stata told me that it is necessary to > construct > some "accumulated dummies".) > > > > Thank you very much, > > > > Cecília Carmo > > (Universidade de Aveiro - Portugal) > > > > [[alternative HTML version deleted]] > -- View this message in context: http://r.789695.n4.nabble.com/plm-package-R-squared-dummies-in-panel-data-tp3791018p3792582.html Sent from the R help mailing list archive at Nabble.com. From eran at taykey.com Tue Sep 6 07:53:23 2011 From: eran at taykey.com (Eran Eidinger) Date: Tue, 6 Sep 2011 08:53:23 +0300 Subject: [R] capturing a figure to PDF or Image In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From eran at taykey.com Tue Sep 6 07:54:25 2011 From: eran at taykey.com (Eran Eidinger) Date: Tue, 6 Sep 2011 08:54:25 +0300 Subject: [R] capturing a figure to PDF or Image In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From knifeboot at 163.com Tue Sep 6 08:01:00 2011 From: knifeboot at 163.com (KnifeBoot) Date: Tue, 6 Sep 2011 14:01:00 +0800 (CST) Subject: [R] r-help volcano plot Message-ID: <146393c.e200.1323d534767.Coremail.knifeboot@163.com> An embedded and charset-unspecified text was scrubbed...
Name: not available URL: From hasan.diwan at gmail.com Tue Sep 6 08:51:09 2011 From: hasan.diwan at gmail.com (Hasan Diwan) Date: Tue, 6 Sep 2011 08:51:09 +0200 Subject: [R] r-help volcano plot In-Reply-To: <146393c.e200.1323d534767.Coremail.knifeboot@163.com> References: <146393c.e200.1323d534767.Coremail.knifeboot@163.com> Message-ID: On 6 September 2011 08:01, KnifeBoot wrote: > Can't install package maDB or limma. Which R version, and what platform are you using? -- Sent from my mobile device From daniel at umd.edu Tue Sep 6 09:00:09 2011 From: daniel at umd.edu (Daniel Malter) Date: Tue, 6 Sep 2011 00:00:09 -0700 (PDT) Subject: [R] glm In-Reply-To: <1315216053.33410.YahooMailClassic@web29308.mail.ird.yahoo.com> References: <1315216053.33410.YahooMailClassic@web29308.mail.ird.yahoo.com> Message-ID: <1315292409109-3792732.post@n4.nabble.com> y is the dependent variable, not a predictor or independent variable. Since this is a binomial-family model, y should be 0/1 or, less typically, a proportion between 0 and 1. HTH, Daniel Samuel Okoye wrote: > > Dear all, > > I am using glm with quasibinomial. What does the following error message > mean: > > Error in eval(expr, envir, enclos) : y values must be 0 <= y <= 1 > > Does it mean that the predictor variable should only have zero and one or > is it also possible to have continuous values between zero and one? > > Many thanks, > Samuel > > [[alternative HTML version deleted]] > -- View this message in context: http://r.789695.n4.nabble.com/glm-tp3790845p3792732.html Sent from the R help mailing list archive at Nabble.com.
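A minimal sketch of the point made in the glm reply above, using simulated data (the variable names and numbers are hypothetical, not Samuel's): a quasibinomial glm accepts a response that is 0/1 or a proportion in [0, 1], and the quoted error appears once the response leaves that range.

```r
set.seed(42)
x <- runif(100)

# Hypothetical response: proportions strictly between 0 and 1
y_prop <- plogis(-1 + 2 * x + rnorm(100, sd = 0.5))

# A proportion as the response is accepted by the quasibinomial family
fit <- glm(y_prop ~ x, family = quasibinomial)
print(coef(fit))

# A response outside [0, 1] triggers the error from the question
y_bad <- y_prop * 10
msg <- tryCatch({
  glm(y_bad ~ x, family = quasibinomial)
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)
```

If the raw response is a count or a percentage, rescaling it to a proportion (or supplying successes and failures via cbind()) before fitting is one way to stay inside the allowed range.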
From daniel at umd.edu Tue Sep 6 09:03:34 2011 From: daniel at umd.edu (Daniel Malter) Date: Tue, 6 Sep 2011 00:03:34 -0700 (PDT) Subject: [R] Dealing with NA's in a data matrix In-Reply-To: References: Message-ID: <1315292614612-3792737.post@n4.nabble.com> Please provide the data as self-contained code, as requested in the posting guide, so that helpers can directly paste it into R. Alternatively, you can provide the dilomaaethiops.txt. Best, Daniel Watching19 wrote: > > hello... I am trying to get this code to work, but as I get to the predict > command, it displays an error due to the length of the data sets from the > removal of the NA's. Here is the data, and the code that I am using so > far, if you run it, you'll see the error pop up.... please help me to get > around this problem. Thanks in advance. > > Location Dist. Size > low1 .5 10.5 > low2 .5 23 > low3 .5 NA > low4 .5 NA > mid1 3 15 > mid2 3 11.5 > mid3 3 15 > mid4 3 NA > high1 6 12.5 > high2 6 20 > high3 6 21 > high4 6 22 > wall1 10 13 > wall2 10 12.5 > wall3 10 13 > wall4 10 12 > > d<-read.table("dilomaaethiops.txt",header=T) > d > str(d) > range(d$Dist.) > s<-d$Size > range(s) > s[s<.1]<-NA > range(s,na.rm=T) > summary(d) > length(s) > length(d$Dist.) > plot(s~d$Dist.,xlab="Distance from Shoreline (m)",ylab="D. aethiops Size > (mm)", pch=7) > a1<-glm(s~d$Dist.,family="quasipoisson",data=d) > scatter.smooth(d$Dist.,s,ylab="D. aethiops Length (mm)",xlab="Distance > from > Shoreline (m)",col="red") > summary(a1) > a2<-aov(s~d$Dist.,data=d) > summary(a2) > shapiro.test(a2$residuals) > qqnorm(a2$residuals) > qqline(a2$residuals) > fligner.test(s~d$Dist.,data=d) > kruskal.test(s~d$Dist.,data=d) > df<-df.residual(a1) > t<-qt(.975,df) > k<-data.frame(s=seq(10.5,max(s,na.rm=T),1)) > p<-predict(a1,k,se=T) > est<-exp(p$fit) > low<-exp(p$fit-t*p$se.fit) > hight<-exp(p$fit-t*p$se.fit) > plot(d$Dist.,s,ylim=c(min(s),max(s)),xlab="Distance from Shorline (m)", > ylab="Length of D. 
aethiops (mm)", pch=7) > lines(k$s,est,lty=1,col="red") > lines(k$s,low,lty=2,col="blue") > lines(k$s,high,lty=2,col="blue") > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- View this message in context: http://r.789695.n4.nabble.com/Dealing-with-NA-s-in-a-data-matrix-tp3791462p3792737.html Sent from the R help mailing list archive at Nabble.com. From anna.dunietz at gmail.com Tue Sep 6 08:28:27 2011 From: anna.dunietz at gmail.com (Duny) Date: Mon, 5 Sep 2011 23:28:27 -0700 (PDT) Subject: [R] r-help volcano plot In-Reply-To: <146393c.e200.1323d534767.Coremail.knifeboot@163.com> References: <146393c.e200.1323d534767.Coremail.knifeboot@163.com> Message-ID: <1315290507382-3792696.post@n4.nabble.com> Use the ggplot2 package in order to make a volcano plot! Check out the following book for more information about the package: ggplot2: Elegant Graphics for Data Analysis (Use R) by Hadley Wickham. ggplot2 is great for creating professional graphics in no time. If you look up stat_density in R, you will find the following example at the bottom of the page: # Make a volcano plot ggplot(diamonds, aes(x = price)) + stat_density(aes(ymax = ..density.., ymin = -..density..), fill = "grey50", colour = "grey50", geom = "ribbon", position = "identity") + facet_grid(. ~ cut) + coord_flip() Good luck! Anna -- View this message in context: http://r.789695.n4.nabble.com/r-help-volcano-plot-tp3792651p3792696.html Sent from the R help mailing list archive at Nabble.com. 
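The ggplot2 example above draws mirrored density ribbons. If the original question instead meant a volcano plot in the differential-expression sense (an assumption, suggested only by the mention of limma in the thread), a base-R sketch on simulated data might look like the following. The data, column names, and thresholds are all hypothetical.

```r
set.seed(1)
n <- 500

# Simulated, hypothetical results: log2 fold changes and p-values
res <- data.frame(logFC = rnorm(n, sd = 1.5), p = runif(n))

# Shrink p-values for larger effects so the cloud takes a volcano shape
res$p <- res$p^(1 + abs(res$logFC))
res$negLogP <- -log10(res$p)

# Flag points that pass illustrative (arbitrary) thresholds
res$hit <- abs(res$logFC) > 1 & res$p < 0.05

# Draw to a PDF file in the session's temporary directory
out_file <- file.path(tempdir(), "volcano.pdf")
pdf(out_file)
plot(res$logFC, res$negLogP, pch = 20,
     col = ifelse(res$hit, "red", "grey50"),
     xlab = "log2 fold change", ylab = "-log10(p-value)")
abline(v = c(-1, 1), h = -log10(0.05), lty = 2)
dev.off()
```

This needs no extra packages, so it also works when maDB or limma cannot be installed; with real data, the logFC and p columns would come from whatever analysis produced them.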
From igors.lahanciks at gmail.com Tue Sep 6 07:51:12 2011 From: igors.lahanciks at gmail.com (Igors) Date: Mon, 5 Sep 2011 22:51:12 -0700 (PDT) Subject: [R] function censReg in panel data setting In-Reply-To: References: <1315259907768-3792227.post@n4.nabble.com> Message-ID: <1315288271999-3792639.post@n4.nabble.com> >You chose a suitable solution for the first problem (NA in initial >gradient). Unfortunately, the documentation of "censReg" is not very >clear regarding starting values of panel tobit models. Please note >that you have to specify 4 starting values: intercept, slope >parameter, variance of the individual effects ("mu"), and variance of >the general error term ("nu"). I have already tried to use 4 parameters, here is what I get: > UpC <- censReg(Power ~ Windspeed, left = -Inf, right = > 2000,data=PData_In,method="BHHH",nGHQ = 4,start=c(-691,189,5,5)) Error in censReg(Power ~ Windspeed, left = -Inf, right = 2000, data = PData_In, : argument 'start' must have length 3 Any ideas how to overcome this one? Could this be an error in package or censReg function? All the best, Igors Best wishes from Copenhagen, Arne -- Arne Henningsen http://www.arne-henningsen.name ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://r.789695.n4.nabble.com/function-censReg-in-panel-data-setting-tp3792227p3792639.html Sent from the R help mailing list archive at Nabble.com. From Mark.Ebbert at hci.utah.edu Tue Sep 6 08:24:08 2011 From: Mark.Ebbert at hci.utah.edu (Mark Ebbert) Date: Tue, 6 Sep 2011 00:24:08 -0600 Subject: [R] write.matrix row names vs sink vs capture.output Message-ID: <53A88CC9-FD22-4CFC-B107-1AFE20FBF93E@hci.utah.edu> Dear R gurus, I am trying to write several large matrices (~ 1GB) to separate files. 
I have learned that write.table is simply too slow for this task and was attempting to use write.matrix, but write.matrix does not have the ability to include row names in the output. Anyone know why that's the case? I've seen a thread stating that write.matrix is the way to go for large prints to files, but it doesn't do what I need it to. Since write.matrix wasn't working I tried both sink and capture.output, but then the output is printed to the file using the same 'width' restrictions as the general "options(width=)" limit. Any ideas on how to print a large matrix with row names? I could write a perl script to modify the files after the fact, but I shouldn't have to do that. Thanks for your help! Mark T. W. Ebbert From pbarapatre at gmail.com Tue Sep 6 07:44:01 2011 From: pbarapatre at gmail.com (Pariksheet) Date: Mon, 5 Sep 2011 22:44:01 -0700 (PDT) Subject: [R] How to create R executable? In-Reply-To: References: <1315217045310-3790883.post@n4.nabble.com> Message-ID: <1315287841236-3792632.post@n4.nabble.com> Thanks guys. I didn't know that R is interpreted language. Could you suggest good reference? My work is here to do analysis on Data stored on teradata warehouse and plot corresponding graphs. Any idea how to capture output in different image formats? -- View this message in context: http://r.789695.n4.nabble.com/How-to-create-R-executable-tp3790883p3792632.html Sent from the R help mailing list archive at Nabble.com. From peter at engelbrecht.dk Tue Sep 6 09:44:10 2011 From: peter at engelbrecht.dk (Peter Engelbrecht) Date: Tue, 6 Sep 2011 00:44:10 -0700 (PDT) Subject: [R] read.xls (gdata) problem Message-ID: <1315295050681-3792792.post@n4.nabble.com> I've suddenly started seeing a consistent problem with read.xls. 
No matter what xls file I try, I always get an error message of this type: Error in xls2sep(xls, sheet, verbose = verbose, ..., method = method, : Intermediate file '/var/folders/cb/vvshkpm90lx_y2n69qlyw4z40000gn/T//RtmpK50r4g/file546a2722.csv' missing! In addition: Warning message: running command '"/usr/bin/perl" "/Library/Frameworks/R.framework/Versions/2.13/Resources/library/gdata/perl/xls2csv.pl" "/Users/peter/dev/R/telenor pricelists/ild_price_processor/ILD priser 2011.10.01.xls" "/var/folders/cb/vvshkpm90lx_y2n69qlyw4z40000gn/T//RtmpK50r4g/file546a2722.csv" "1"' had status 255 Error in file.exists(tfn) : invalid 'file' argument Error parsing file '/Users/peter/dev/R/telenor pricelists/ild_price_processor/ILD priser 2011.10.01.xls'. This happens, e.g., with this code: > fn = file.choose() > read.xls(fn) and with many different .xls files, including ones I have read with read.xls in the past. When I try running the xls2csv perl script (which read.xls depends on) directly from the command line I get the following error message: $ perl xls2csv.pl ~/Downloads/900numre.xls ~/Downloads/900numre.csv 1 Loading '/Users/peter/Downloads/900numre.xls'... Error parsing file '/Users/peter/Downloads/900numre.xls'. So it's pretty obvious the perl script part has broken down. The frustrating thing is that this worked perfectly fine until today, so I (or my system) must have unknowingly changed some part of the configuration (I'm on OS X). Any ideas? Rgds, Peter -- View this message in context: http://r.789695.n4.nabble.com/read-xls-gdata-problem-tp3792792p3792792.html Sent from the R help mailing list archive at Nabble.com. From remkoduursma at gmail.com Tue Sep 6 07:35:59 2011 From: remkoduursma at gmail.com (Remko Duursma) Date: Tue, 6 Sep 2011 15:35:59 +1000 Subject: [R] Sweave : some comments disappear Message-ID: Dear R-helpers, When I have an R code chunk in a Sweave file like this: <<>>= x <- 1:10 # this comment disappears x # this one does not!
print(x) #mean mean(x) @ The first comment does not appear in the sweaved document, the second one does. How can this be? I have tried print=TRUE and keep.source=TRUE, but neither seem to affect this behavior. thanks, Remko ------------------------------------------------- Remko Duursma Research Lecturer Hawkesbury Institute for the Environment University of Western Sydney Hawkesbury Campus, Richmond Mobile: +61 (0)422 096908 www.remkoduursma.com From bhh at xs4all.nl Tue Sep 6 10:00:05 2011 From: bhh at xs4all.nl (Berend Hasselman) Date: Tue, 6 Sep 2011 01:00:05 -0700 (PDT) Subject: [R] capturing a figure to PDF or Image In-Reply-To: References: Message-ID: <1315296005985-3792815.post@n4.nabble.com> eranid wrote: > > Thank you Sarah and David, > > I only used a simple plot, and remembered to dev.off(). > attached is the session: > >> plot(c(1,2,3),c(3,2,4))> jpeg("test.jpg")> dev.off()RStudioGD > 2 > sessionInfo()R version 2.13.1 (2011-07-08) > Put the statement jpeg("test.jpg") BEFORE plot(...). Berend -- View this message in context: http://r.789695.n4.nabble.com/capturing-a-figure-to-PDF-or-Image-tp3791480p3792815.html Sent from the R help mailing list archive at Nabble.com. From Achim.Zeileis at uibk.ac.at Tue Sep 6 10:20:45 2011 From: Achim.Zeileis at uibk.ac.at (Achim Zeileis) Date: Tue, 6 Sep 2011 10:20:45 +0200 (CEST) Subject: [R] Sweave : some comments disappear In-Reply-To: References: Message-ID: On Tue, 6 Sep 2011, Remko Duursma wrote: > Dear R-helpers, > > > when I have an R code chunk in a sweave file like this: > > <<>>= > x <- 1:10 > > # this comment disapears > x > > # this one does not! > print(x) > > #mean > mean(x) > @ > > The first comment does not appear in the sweaved document, the second > one does. How can this be? > > I have tried print=TRUE and keep.source=TRUE, but neither seem to > affect this behavior. I cannot replicate this. I included this in a simple document with \documentclass[a4paper]{article} \begin{document} <<>>= ... 
@ \end{document} When I use the version above, _no_ comment shows up (as expected). When I use <>= then _all_ comments show up (as expected). This is R 2.13.1 on Debian/GNU Linux. Z > > thanks, > Remko > > > > > > ------------------------------------------------- > Remko Duursma > Research Lecturer > > Hawkesbury Institute for the Environment > University of Western Sydney > Hawkesbury Campus, Richmond > > Mobile: +61 (0)422 096908 > www.remkoduursma.com > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From azamjaafari at yahoo.com Tue Sep 6 10:44:18 2011 From: azamjaafari at yahoo.com (azam jaafari) Date: Tue, 6 Sep 2011 01:44:18 -0700 (PDT) Subject: [R] How do I define moving window fequency Message-ID: <1315298658.56235.YahooMailNeo@web37107.mail.mud.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From arne.henningsen at googlemail.com Tue Sep 6 11:14:56 2011 From: arne.henningsen at googlemail.com (Arne Henningsen) Date: Tue, 6 Sep 2011 11:14:56 +0200 Subject: [R] function censReg in panel data setting In-Reply-To: <1315288271999-3792639.post@n4.nabble.com> References: <1315259907768-3792227.post@n4.nabble.com> <1315288271999-3792639.post@n4.nabble.com> Message-ID: On 6 September 2011 07:51, Igors wrote: >>You chose a suitable solution for the first problem (NA in initial >>gradient). Unfortunately, the documentation of "censReg" is not very >>clear regarding starting values of panel tobit models. Please note >>that you have to specify 4 starting values: intercept, slope >>parameter, variance of the individual effects ("mu"), and variance of >>the general error term ("nu"). 
> > > I have already tried to use 4 parameters, here is what I get: > >> UpC <- censReg(Power ~ Windspeed, left = -Inf, right = >> 2000,data=PData_In,method="BHHH",nGHQ = 4,start=c(-691,189,5,5)) > > Error in censReg(Power ~ Windspeed, left = -Inf, right = 2000, data = > PData_In, : > argument 'start' must have length 3 > > > Any ideas how to overcome this one? Could this be an error in package or > censReg function? Yes, you are right. This is (was) a bug in the censReg function/package. I have fixed the package. The source code of the new version (0.5-6) is available on R-Forge [1]; R packages will be available on CRAN [2] and R-Forge [3] probably within one or two days. [1] https://r-forge.r-project.org/scm/?group_id=256 [2] http://cran.r-project.org/package=censReg [3] https://r-forge.r-project.org/R/?group_id=256 /Arne -- Arne Henningsen http://www.arne-henningsen.name From rat.cage at gmail.com Tue Sep 6 13:01:51 2011 From: rat.cage at gmail.com (eldor ado) Date: Tue, 6 Sep 2011 13:01:51 +0200 Subject: [R] xtable with conditional formatting using \textcolor In-Reply-To: <8C7A8850-599F-4C5E-83E7-2AFBF80808FB@me.com> References: <8C7A8850-599F-4C5E-83E7-2AFBF80808FB@me.com> Message-ID: I have a related question: the data frame df contains values like >df .. "\\textbf{ 0.644 }" .. and the line > print( xtable(df , sanitize.text.function = function(x){x})) converts them to .. & $\backslash$textbf\{ 0.644 \} & .. escaping both the double backslashes and the braces. Maybe somebody here knows how to prevent xtable from escaping the code? Best regards, lukas kohl On Wed, Jun 1, 2011 at 8:47 PM, Marc Schwartz wrote: > On Jun 1, 2011, at 1:33 PM, Walmes Zeviani wrote: > >> Hello list, >> >> I'm doing a table with scores and I want to include colors to represent the status >> of an individual. I'm using sweave <>= and xtable but I can't >> get the result I want.
My attempts are >> >> #----------------------------------------------------------------------------- >> # code R >> >> da <- data.frame(id=letters[1:5], score=1:5*2) >> >> col <- function(x){ >> ifelse(x>7, >> paste("\textcolor{blue}{", formatC(x, dig=2, format="f"), "}"), >> paste("\textcolor{red}{", formatC(x, dig=2, format="f"), "}")) >> } >> >> da$score.string <- col(da$score) >> >> require(xtable) >> xtable(da[,c("id","score.string")]) >> >> #----------------------------------------------------------------------------- >> >> actual result >> #----------------------------------------------------------------------------- >> \begin{tabular}{rll} >> \hline >> & id & score.string \\ >> \hline >> 1 & a & extcolor\{red\}\{ 2.00 \} \\ >> 2 & b & extcolor\{red\}\{ 4.00 \} \\ >> 3 & c & extcolor\{red\}\{ 6.00 \} \\ >> 4 & d & extcolor\{blue\}\{ 8.00 \} \\ >> 5 & e & extcolor\{blue\}\{ 10.00 \} \\ >> \hline >> \end{tabular} >> #----------------------------------------------------------------------------- >> >> desired result (lines omitted to save space) >> #----------------------------------------------------------------------------- >> 1 & a & \textcolor{red}{ 2.00 } \\ >> 2 & b & \textcolor{red}{ 4.00} \\ >> #----------------------------------------------------------------------------- >> >> Any contribution will be useful. Thanks. >> Walmes. > > > When the '\t' is being cat()'d to the TeX file (or console) by print.xtable(), it is being interpreted as a tab character. You need to escape it with additional backslashes and then adjust the sanitize.text.function in print.xtable() so that it does not touch the backslashes: > > > da <- data.frame(id=letters[1:5], score=1:5*2) > > col <- function(x){ > ifelse(x>7, > paste("\\textcolor{blue}{", formatC(x, dig=2, format="f"), "}"), > paste("\\textcolor{red}{", formatC(x, dig=2, format="f"), "}")) > } > > da$score.string <- col(da$score) > > >> da > id score
score.string > 1 a 2 \\textcolor{red}{ 2.00 } > 2 b 4 \\textcolor{red}{ 4.00 } > 3 c 6 \\textcolor{red}{ 6.00 } > 4 d 8 \\textcolor{blue}{ 8.00 } > 5 e 10 \\textcolor{blue}{ 10.00 } > > > require(xtable) > > print(xtable(da[,c("id","score.string")]), sanitize.text.function = function(x){x}) > > > That will give you: > > % latex table generated in R 2.13.0 by xtable 1.5-6 package > % Wed Jun 1 13:44:54 2011 > \begin{table}[ht] > \begin{center} > \begin{tabular}{rll} > \hline > & id & score.string \\ > \hline > 1 & a & \textcolor{red}{ 2.00 } \\ > 2 & b & \textcolor{red}{ 4.00 } \\ > 3 & c & \textcolor{red}{ 6.00 } \\ > 4 & d & \textcolor{blue}{ 8.00 } \\ > 5 & e & \textcolor{blue}{ 10.00 } \\ > \hline > \end{tabular} > \end{center} > \end{table} > > > HTH, > > Marc Schwartz > From remkoduursma at gmail.com Tue Sep 6 11:04:47 2011 From: remkoduursma at gmail.com (Remko Duursma) Date: Tue, 6 Sep 2011 19:04:47 +1000 Subject: [R] Sweave : some comments disappear In-Reply-To: References: Message-ID: Sorry about that - I did give an example that I thought was reproducible (it is here), but apparently wasn't. I suspect my problem might have to do with the 'highlight' package I am using - which I should have mentioned. I am using the latest version of R.
remko ------------------------------------------------- Remko Duursma Research Lecturer Hawkesbury Institute for the Environment University of Western Sydney Hawkesbury Campus, Richmond Mobile: +61 (0)422 096908 www.remkoduursma.com On Tue, Sep 6, 2011 at 6:55 PM, Prof Brian Ripley wrote: > On Tue, 6 Sep 2011, Achim Zeileis wrote: > >> On Tue, 6 Sep 2011, Remko Duursma wrote: >> >>> Dear R-helpers, >>> >>> >>> when I have an R code chunk in a sweave file like this: >>> >>> <<>>= >>> x <- 1:10 >>> >>> # this comment disapears >>> x >>> >>> # this one does not! >>> print(x) >>> >>> #mean >>> mean(x) >>> @ >>> >>> The first comment does not appear in the sweaved document, the second >>> one does. How can this be? >>> >>> I have tried print=TRUE and ?keep.source=TRUE, but neither seem to >>> affect this behavior. >> >> I cannot replicate this. I included this in a simple document with >> >> \documentclass[a4paper]{article} >> \begin{document} >> <<>>= >> ... >> @ >> \end{document} >> >> When I use the version above, _no_ comment shows up (as expected). When I >> use <>= then _all_ comments show up (as expected). >> >> This is R 2.13.1 on Debian/GNU Linux. >> Z > > I suspected he used an obsolete version of R, where such things happened. > ?Which is why we ask in the posting guide to update before posting, give 'at > a minimum' information and a reproducible example. > >> >>> >>> thanks, >>> Remko >>> >>> >>> >>> >>> >>> ------------------------------------------------- >>> Remko Duursma >>> Research Lecturer >>> >>> Hawkesbury Institute for the Environment >>> University of Western Sydney >>> Hawkesbury Campus, Richmond >>> >>> Mobile: +61 (0)422 096908 >>> www.remkoduursma.com >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. 
>> > > -- > Brian D. Ripley, ? ? ? ? ? ? ? ? ?ripley at stats.ox.ac.uk > Professor of Applied Statistics, ?http://www.stats.ox.ac.uk/~ripley/ > University of Oxford, ? ? ? ? ? ? Tel: ?+44 1865 272861 (self) > 1 South Parks Road, ? ? ? ? ? ? ? ? ? ? +44 1865 272866 (PA) > Oxford OX1 3TG, UK ? ? ? ? ? ? ? ?Fax: ?+44 1865 272595 > From ggrothendieck at gmail.com Tue Sep 6 13:08:28 2011 From: ggrothendieck at gmail.com (Gabor Grothendieck) Date: Tue, 6 Sep 2011 07:08:28 -0400 Subject: [R] read.xls (gdata) problem In-Reply-To: <1315295050681-3792792.post@n4.nabble.com> References: <1315295050681-3792792.post@n4.nabble.com> Message-ID: On Tue, Sep 6, 2011 at 3:44 AM, Peter Engelbrecht wrote: > I've suddenly started seeing a consistent problem with read.xls. > > No matter what xls file I try I always get an error message of this type: > > Error in xls2sep(xls, sheet, verbose = verbose, ..., method = method, ?: > ?Intermediate file > '/var/folders/cb/vvshkpm90lx_y2n69qlyw4z40000gn/T//RtmpK50r4g/file546a2722.csv' > missing! > In addition: Warning message: > running command '"/usr/bin/perl" > "/Library/Frameworks/R.framework/Versions/2.13/Resources/library/gdata/perl/xls2csv.pl" > "/Users/peter/dev/R/telenor pricelists/ild_price_processor/ILD priser > 2011.10.01.xls" > "/var/folders/cb/vvshkpm90lx_y2n69qlyw4z40000gn/T//RtmpK50r4g/file546a2722.csv" > "1"' had status 255 > Error in file.exists(tfn) : invalid 'file' argument > Error parsing file '/Users/peter/dev/R/telenor > pricelists/ild_price_processor/ILD priser 2011.10.01.xls'. > > E.g. using this code: >> fn = file.choose() >> read.xls(fn) > With many different .xls files, including ones I have read with read.xls in > the past. > > When I try running the xls2csv perl script (which read.xls depends on) > directly from the command line I get the following error message: > > $ perl xls2csv.pl ~/Downloads/900numre.xls ~/Downloads/900numre.csv 1 > Loading '/Users/peter/Downloads/900numre.xls'... 
> Error parsing file '/Users/peter/Downloads/900numre.xls'. > > So it's pretty obvious the perl script part has broken down. The frustrating > thing is that this worked perfectly fine until today, so I (or my system) > must have unknowingly changed some part of the configuration (I'm on OS > X). > That message is given when the script tries to open the file and either is not able to, or the file is not a valid spreadsheet file. -- Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com From felix at nfrac.org Tue Sep 6 13:18:16 2011 From: felix at nfrac.org (Felix Andrews) Date: Tue, 6 Sep 2011 21:18:16 +1000 Subject: [R] Coloring Dirichlet Tiles In-Reply-To: References: <1315160783676-3789746.post@n4.nabble.com> Message-ID: On 5 September 2011 06:59, baptiste auguie wrote: > Hi, > > Try this, > > d <- data.frame(x=runif(1e3, 0, 30), y=runif(1e3, 0, 30)) > d$z = (d$x - 15)^2 + (d$y - 15)^2 > > library(spatstat) > library(maptools) > > W <- ripras(d$x, d$y, shape="rectangle") > W <- owin(c(0, 30), c(0, 30)) > X <- as.ppp(d, W=W) > Y <- dirichlet(X) > Z <- as(Y, "SpatialPolygons") > plot(Z, col=grey(d$z/max(d$z))) > > and also panel.voronoi in latticeExtra. > > Unfortunately I do not know of a solution that uses more efficient > algorithms for computing the Dirichlet tessellation and extracting > tiles than those relying on deldir. The Triangle package (r-forge) > looks promising. panel.voronoi() can use either deldir or tripack. tripack is non-free but is much faster.
xyz <- data.frame(x = rnorm(1000), y = rnorm(1000), z = rnorm(1000)) library(latticeExtra) system.time(print(tileplot(z ~ x * y, xyz))) ## deldir by default # user system elapsed # 3.636 0.004 3.651 system.time(print(tileplot(z ~ x * y, xyz, use.tripack = TRUE))) # user system elapsed # 1.176 0.000 1.359 > > HTH, > > baptiste > > > On 5 September 2011 06:26, awesolow wrote: >> Hi, >> >> I have a set of x, y points (longitude/latitude) along with a z value >> representing an attribute at each point. I want to create a >> Voronoi/Dirichlet tessellation of these points coloring each tile by the z >> value. I tried searching for a way to solve this and it was suggested to >> use the dirichlet() command to get the correct coloring. However, my >> coloring is not correct. >> >> Any thoughts? >> >> Thanks in advance. >> >> -- >> View this message in context: http://r.789695.n4.nabble.com/Coloring-Dirichlet-Tiles-tp3789746p3789746.html >> Sent from the R help mailing list archive at Nabble.com. >> > -- Felix Andrews
http://www.neurofractal.org/felix/ From paul.hiemstra at knmi.nl Tue Sep 6 13:23:32 2011 From: paul.hiemstra at knmi.nl (Paul Hiemstra) Date: Tue, 06 Sep 2011 11:23:32 +0000 Subject: [R] Hessian Matrix Issue In-Reply-To: <4E628C9E.1060200@otter-rsch.com> References: <4E628C9E.1060200@otter-rsch.com> Message-ID: <4E6602B4.6040301@knmi.nl> On 09/03/2011 08:22 PM, dave fournier wrote: > > I wonder if your code is correct? > > I ran your script until an error was reported. the data set > of 30 obs was > > > [1] 0 0 1 3 3 3 4 4 4 4 5 5 5 5 5 7 7 7 7 7 7 8 > 9 10 11 > [26] 12 12 12 15 16 > > I created a tiny AD Model Builder program to do MLE on it. > > DATA_SECTION > init_int nobs > init_vector y(1,nobs) > PARAMETER_SECTION > init_number log_mu > init_number log_alpha > sdreport_number mu > sdreport_number tau > objective_function_value f > PROCEDURE_SECTION > mu=exp(log_mu); > tau=1.0+exp(log_alpha); > for (int i=1;i<=nobs;i++) > { > f-=log_negbinomial_density(y(i),mu,tau); > } > It converged quickly and > > The eigenvalues of the Hessian were > > 4.711089774 78.27632341 > > and the estimates and std devs of the parameters mu and tau were > > index name value std dev > > 3 mu 6.6000e+00 7.7318e-01 > 4 tau 2.7173e+00 7.8944e-01 > > where tau is the variance divided by the mean. > > This was all so simple that I suspect your (rather difficult to read) > R code is wrong, otherwise R must really suck at this kind of problem. I'd put my money on R! Paul > > Dave > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -- Paul Hiemstra, Ph.D. Global Climate Division Royal Netherlands Meteorological Institute (KNMI) Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39 P.O. 
Box 201 | 3730 AE | De Bilt tel: +31 30 2206 494 http://intamap.geo.uu.nl/~paul http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770 From paul.hiemstra at knmi.nl Tue Sep 6 13:26:13 2011 From: paul.hiemstra at knmi.nl (Paul Hiemstra) Date: Tue, 06 Sep 2011 11:26:13 +0000 Subject: [R] Receive "unable to load shared object RNetCDF.o" during R INSTALL of RNetCDF In-Reply-To: References: Message-ID: <4E660355.20306@knmi.nl> Hi, You could try installing the ncdf package. This installed for me without any problems when RNetCDF failed. I suspect ncdf has very similar functionality, but I've only worked with ncdf. install.packages("ncdf") should do the trick... cheers, Paul On 09/05/2011 08:56 PM, David Brown wrote: > On a Red Hat Linux cluster I am seeing the following after multiple > other packages were successfully installed. The error seems to suggest > that RNetCDF.o was not copied to the appropriate lib folder. The admin > user performing the install has the required privileges to perform the > install. Words of wisdom are greatly appreciated: > > R CMD INSTALL --configure-args="--with-netcdf-include='/soft/local/netcdf/netcdf-3.6.2/include/' > --with-netcdf-lib='/soft/local/netcdf/netcdf-3.6.2/libso' > --with-udunits-include='/soft/local/udunits-2.1.23/include' > --with-udunits-lib='/soft/local/udunits-2.1.23/lib'" > RNetCDF_1.5.2-2.tar.gz > > * installing to library '/soft/local/r/R-2.13.1/lib64/R/library' > * installing *source* package 'RNetCDF' ... > checking for gcc... gcc -std=gnu99 > checking for C compiler default output file name... a.out > checking whether the C compiler works... yes > checking whether we are cross compiling... no > checking for suffix of executables... > checking for suffix of object files... o > checking whether we are using the GNU C compiler... yes > checking whether gcc -std=gnu99 accepts -g... yes > checking for gcc -std=gnu99 option to accept ISO C89... none needed > checking for nc_open in -lnetcdf...
yes > checking for utInit in -ludunits2... no > checking for utScan in -ludunits2... yes > checking how to run the C preprocessor... gcc -std=gnu99 -E > checking for grep that handles long lines and -e... /bin/grep > checking for egrep... /bin/grep -E > checking for ANSI C header files... no > checking for sys/types.h... yes > checking for sys/stat.h... yes > checking for stdlib.h... yes > checking for string.h... yes > checking for memory.h... yes > checking for strings.h... yes > checking for inttypes.h... yes > checking for stdint.h... yes > checking for unistd.h... yes > checking netcdf.h usability... yes > checking netcdf.h presence... yes > checking for netcdf.h... yes > checking udunits.h usability... yes > checking udunits.h presence... yes > checking for udunits.h... yes > configure: creating ./config.status > config.status: creating R/load.R > config.status: creating src/Makevars > ** libs > gcc -std=gnu99 -I/soft/local/r/R-2.13.1/lib64/R/include > -I/soft/local/udunits-2.1.23/include > -I/soft/local/netcdf/netcdf-3.6.2/include/ -I/usr/local/include > -fpic -g -O2 -c RNetCDF.c -o RNetCDF.o > gcc -std=gnu99 -shared -L/usr/local/lib64 -o RNetCDF.so RNetCDF.o > -ludunits2 -lnetcdf -L/soft/local/udunits-2.1.23/lib > -L/soft/local/netcdf/netcdf-3.6.2/libso -lexpat > installing to /soft/local/r/R-2.13.1/lib64/R/library/RNetCDF/libs > ** R > ** preparing package for lazy loading > ** help > *** installing help indices > ** building package indices ... > ** testing if installed package can be loaded > Error in dyn.load(file, DLLpath = DLLpath, ...) : > unable to load shared object > '/soft/local/r/R-2.13.1/lib64/R/library/RNetCDF/libs/RNetCDF.so': > libudunits2.so.0: cannot open shared object file: No such file or directory > Error: loading failed > Execution halted > ERROR: loading failed > * removing ?/soft/local/r/R-2.13.1/lib64/R/library/RNetCDF? 
> [swadm at boar01]/soft/local/temp% -- Paul Hiemstra, Ph.D. Global Climate Division Royal Netherlands Meteorological Institute (KNMI) Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39 P.O. Box 201 | 3730 AE | De Bilt tel: +31 30 2206 494 http://intamap.geo.uu.nl/~paul http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770 From mailinglist.honeypot at gmail.com Tue Sep 6 14:25:43 2011 From: mailinglist.honeypot at gmail.com (Steve Lianoglou) Date: Tue, 6 Sep 2011 08:25:43 -0400 Subject: [R] Re : Need more information about VGLM In-Reply-To: <1315294493.31686.YahooMailNeo@web26003.mail.ukl.yahoo.com> References: <1315219578.34961.YahooMailNeo@web26007.mail.ukl.yahoo.com> <1315294493.31686.YahooMailNeo@web26003.mail.ukl.yahoo.com> Message-ID: Hi, On Tue, Sep 6, 2011 at 3:34 AM, privat NDOUTOUME wrote: > Hi Steven > > Thank you for taking the time to answer my question. > I want to reproduce the same function VGLM on C++. That's why I wanted to > know where to find the algorithm in code VGLM R. > Opening the package GAMT CRAN, I found a file called "scr", I found code in > C++. Do you believe that these codes can help me? Well, somewhere in that package's R or src directory is the code that runs the function you are looking for, so yes, I believe (know) that it will help you. Your job is to step through the code of the R function you are initially calling in order to find the part that you want to re-implement. It may or may not be written in C/C++ already; I do not know, as I've never used the package and don't have the time myself to do this exercise.
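A minimal sketch of what "stepping through the code" can look like, using base R only. It assumes stats::lm as a stand-in for the function being traced; it is not the VGAM code discussed in this thread, and the same steps would have to be repeated on the function actually of interest.

```r
# Hedged illustration: stats::lm is only a stand-in for the function under study.

# Typing a function's name without parentheses prints its R source.
# Here the body is searched for the lower-level routine it delegates to.
lm_src <- deparse(body(stats::lm))
calls_lm_fit <- any(grepl("lm.fit", lm_src, fixed = TRUE))

# Repeat one level down and look for the hand-off to compiled code.
fit_src <- deparse(body(stats::lm.fit))
compiled_calls <- grep("\\.(Call|C|Fortran|External)\\(", fit_src, value = TRUE)

print(calls_lm_fit)
print(trimws(compiled_calls))
```

Once a line calling compiled code turns up, the routine it names is the thing to look for in the package's src directory.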
-steve > > Sincerely > ________________________________ > From: Steve Lianoglou > To: privat NDOUTOUME > Cc: "R-help at r-project.org" > Sent: Monday 5 September 2011 20:48 > Subject: Re: [R] Need more information about VGLM > > Hi, > > On Mon, Sep 5, 2011 at 6:46 AM, privat NDOUTOUME wrote: >> Hi, >> >> I'm working on multiple logistic regression. I used the function vglm >> (Package VGAM) in R. Now, I'd like to implement this function (vglm) in C++. >> Could someone help me or send me the algorithm. > > Download and extract the source version of the package from its page on > CRAN: > > http://cran.r-project.org/web/packages/VGAM/index.html > > Look for the "Package source" link. > > It's not clear what part you want to re-implement, but the code for > the entire package is in there. I reckon you'll be able to fish out > the parts of whatever you are looking for yourself. > > -steve > > -- > Steve Lianoglou > Graduate Student: Computational Systems Biology > | Memorial Sloan-Kettering Cancer Center > | Weill Medical College of Cornell University > Contact Info: http://cbio.mskcc.org/~lianos/contact > > > -- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact From paul.hiemstra at knmi.nl Tue Sep 6 14:24:54 2011 From: paul.hiemstra at knmi.nl (Paul Hiemstra) Date: Tue, 06 Sep 2011 12:24:54 +0000 Subject: [R] How do I define moving window fequency In-Reply-To: <1315298658.56235.YahooMailNeo@web37107.mail.mud.yahoo.com> References: <1315298658.56235.YahooMailNeo@web37107.mail.mud.yahoo.com> Message-ID: <4E661116.7080303@knmi.nl> An embedded and charset-unspecified text was scrubbed...
Name: not available URL: From paul.hiemstra at knmi.nl Tue Sep 6 14:28:48 2011 From: paul.hiemstra at knmi.nl (Paul Hiemstra) Date: Tue, 06 Sep 2011 12:28:48 +0000 Subject: [R] ggplot2-grid/viewport and PNG In-Reply-To: <1315216773768-3790866.post@n4.nabble.com> References: <1315216773768-3790866.post@n4.nabble.com> Message-ID: <4E661200.1080704@knmi.nl> On 09/05/2011 09:59 AM, ashz wrote: > Dear All, > > The following code saves my graphs as pdf: > pdf("j:/mix.pdf", width = 18, height = 16) > grid.newpage() > pushViewport(viewport(layout = grid.layout(3,1))) > vplayout <- function(x, y) > viewport(layout.pos.row = x, layout.pos.col = y) > print(Aplot, vp = vplayout(1, 1)) > print(Bplot, vp = vplayout(2, 1)) > print(Cplot, vp = vplayout(3, 1)) > dev.off() > > How can I save it in PNG and maintain the same graph structure? > > Thanks > > > -- > View this message in context: http://r.789695.n4.nabble.com/ggplot2-grid-viewport-and-PNG-tp3790866p3790866.html > Sent from the R help mailing list archive at Nabble.com. See ?ggsave. Btw, have you considered using faceting instead of plotting several plots manually? Paul -- Paul Hiemstra, Ph.D. Global Climate Division Royal Netherlands Meteorological Institute (KNMI) Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39 P.O.
Box 201 | 3730 AE | De Bilt tel: +31 30 2206 494 http://intamap.geo.uu.nl/~paul http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770 From bbolker at gmail.com Tue Sep 6 14:28:36 2011 From: bbolker at gmail.com (Ben Bolker) Date: Tue, 6 Sep 2011 12:28:36 +0000 Subject: [R] r-help volcano plot References: <146393c.e200.1323d534767.Coremail.knifeboot@163.com> <1315290507382-3792696.post@n4.nabble.com> Message-ID: Duny gmail.com> writes: > > Use the ggplot2 package in order to make a volcano plot! Check out the > following book for more information about the package: ggplot2: Elegant > Graphics for Data Analysis (Use R) by Hadley Wickham. ggplot2 is great for > creating professional graphics in no time. > > If you look up stat_density in R, you will find the following example at the > bottom of the page: > > # Make a volcano plot > ggplot(diamonds, aes(x = price)) [snip snip snip] ggplot is indeed great, but that looks like a violin plot and not a volcano plot. Unless I'm just confused about the terminology this looks like a typo in the documentation. library("sos"); findFn("{volcano plot}") finds a few other options, but not many. To the original poster: can you show us the actual error message? Ben Bolker From jvadams at usgs.gov Tue Sep 6 14:40:21 2011 From: jvadams at usgs.gov (Jean V Adams) Date: Tue, 6 Sep 2011 07:40:21 -0500 Subject: [R] confusion matrix In-Reply-To: <1315014402783-3787363.post@n4.nabble.com> References: <1315014402783-3787363.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From fomcl at yahoo.com Tue Sep 6 14:50:51 2011 From: fomcl at yahoo.com (Albert-Jan Roskam) Date: Tue, 6 Sep 2011 05:50:51 -0700 (PDT) Subject: [R] list of all methods winthin an S4 class Message-ID: <1315313451.14128.YahooMailNeo@web110703.mail.gq1.yahoo.com> An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From xkziloj at gmail.com Tue Sep 6 14:55:58 2011 From: xkziloj at gmail.com (. .) Date: Tue, 6 Sep 2011 09:55:58 -0300 Subject: [R] Generalizing call to function Message-ID: Hello guys, I would like to ask for help to understand what is going on in "func2". My plan is to generalize "func1", so that are expected same results in "func2" as in "func1". Executing "func1" returns... 0.25 with absolute error < 8.4e-05 But for "func2" I get... Error in dpois(1, 0.1, 23.3065168689948, 0.000429064542600244, 3.82988398013855, : unused argument(s) (0.000429064542600244, 3.82988398013855, 0.00261104515224461, 1.37999516465199, 0.0072464022020844, 0.673787740945863, 0.0148414691931943, 0.383193602946711, 0.0260964690514175, 0.236612585866545, 0.0422631787036055, 0.152456705113438, 0.0655923922306948) Thanks in advance. func1 <- function(y, a, rate) { f1 <- function(n, y, a, rate) { lambda <- a * n dexp(n, rate) * dpois(y, lambda) } integrate(f1, 0, Inf, y, a, rate) } func1(1, 0.1, 0.1) func2 <- function(y, a, rate, samp) { f1 <- function(n, y, a, rate, samp) { SampDist <- function(y, a, n, samp) { lambda <- a * n dcom <- paste("d", samp, sep="") dots <- as.list(c(y, lambda)) do.call(dcom, dots) } dexp(n, rate) * SampDist(y, a, n, samp) } integrate(f1, 0, Inf, y, a, rate, samp) } func2(1, 0.1, 0.1, "pois") From therneau at mayo.edu Tue Sep 6 14:58:20 2011 From: therneau at mayo.edu (Terry Therneau) Date: Tue, 06 Sep 2011 07:58:20 -0500 Subject: [R] Parameters in Gamma Frailty model Message-ID: <1315313900.27828.3.camel@nemo> I don't know how you got the output you did -- I cannot get a printout that says "gamma:1". 
Please provide the information specified in the posting guide, e.g., version of R, platform, loaded packages, the statements you typed to get this output, etc, Terry Therneau (author of survival) From therneau at mayo.edu Tue Sep 6 15:01:19 2011 From: therneau at mayo.edu (Terry Therneau) Date: Tue, 06 Sep 2011 08:01:19 -0500 Subject: [R] Weights using Survreg Message-ID: <1315314079.27828.7.camel@nemo> Survreg produces MLE estimates. For your second question, don't know what you are asking. Can you be more specific and detailed? ---begin included message -- Do you know if the parameters estimators are MLE estimators? One more question: In my case study I have failures that occured on different objects that have different age and length, could I use weight to find the estimates of a weibull law and so to find the probabilty of failure per unit of length for example? From therneau at mayo.edu Tue Sep 6 15:11:53 2011 From: therneau at mayo.edu (Terry Therneau) Date: Tue, 06 Sep 2011 08:11:53 -0500 Subject: [R] How to understand the plotting of the cox.zph function Message-ID: <1315314713.27828.14.camel@nemo> Under the assumption of PH, the graph would be a horizontal line at y= log(2.9) = coefficient of the model. That is, a constant hazard ratio. Your plot shows that there is no effect early (HR of 1) but there is an impact later. 
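The reading described above can be reproduced on any fitted Cox model. The following is a minimal sketch, assuming the lung data set that ships with the survival package and a single covariate (sex); it is not the data from the original question, where the hazard ratio was 2.9.

```r
library(survival)  # recommended package shipped with standard R installs

# Assumed example data: lung, with sex as the only covariate.
fit <- coxph(Surv(time, status) ~ sex, data = lung)

# Scaled Schoenfeld residuals against (transformed) time.
zp <- cox.zph(fit)

# The smooth curve estimates beta(t). Under proportional hazards it should
# stay close to the horizontal line drawn at the fitted coefficient.
plot(zp[1])
abline(h = coef(fit)[1], lty = 2)

# The constant hazard ratio implied by the model is exp(coefficient).
print(exp(coef(fit)))
```

A curve that starts near zero and rises later corresponds to the pattern described above: a hazard ratio near 1 early on and an effect that appears only later.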
Terry T From kamil.barton at uni-wuerzburg.de Tue Sep 6 15:14:38 2011 From: kamil.barton at uni-wuerzburg.de (=?UTF-8?B?S2FtaWwgQmFydG/FhA==?=) Date: Tue, 06 Sep 2011 15:14:38 +0200 Subject: [R] excluding models during dredge and model averaging in MuMIn In-Reply-To: References: Message-ID: <4E661CBE.6000804@uni-wuerzburg.de> > dredge(x, subset = !(X1 & (X2 | X3)) & !(X2 & X3) & !(X1 & X3)) see > help("Logic", "base") Dnia 2011-08-26 12:00, r-help-request at r-project.org pisze: > ------------------------------ > > Message: 157 > Date: Fri, 26 Aug 2011 14:53:00 +0900 > From: Andrew MacIntosh > To:R-help at r-project.org > Subject: [R] excluding models during dredge and model averaging in > MuMIn > Message-ID: > > Content-Type: text/plain; charset=ISO-8859-1 > > Dear R Users, > > I am using the package "MuMIn" to sort through models, with the goal > of estimating parameters through model averaging of the candidate set. > I have been using the dredge function to build all possible models > based on my starting point (hypothetical global model). I see that > there is a simple way to control which models are excluded from this > set... > > # exclude models containing both X1 and X2 >> >dredge(x, subset = !(X1& X2)) > However, I would like to add to this and exclude 4 different > combinations of predictors (i.e. X1&X2, X1&X3, X2&X3, X1&X2&X3). > > I would very much appreciate it if somebody had any ideas about how to > tackle this. Also, > > Cheers, > Andrew From mtmorgan at fhcrc.org Tue Sep 6 15:19:05 2011 From: mtmorgan at fhcrc.org (Martin Morgan) Date: Tue, 06 Sep 2011 06:19:05 -0700 Subject: [R] list of all methods winthin an S4 class In-Reply-To: <1315313451.14128.YahooMailNeo@web110703.mail.gq1.yahoo.com> References: <1315313451.14128.YahooMailNeo@web110703.mail.gq1.yahoo.com> Message-ID: <4E661DC9.1000002@fhcrc.org> On 09/06/2011 05:50 AM, Albert-Jan Roskam wrote: > Hello, > > How can I generate an overview/vector of all the methods winthin an S4 class? 
Similar to dir() in this Python code: > >>> class SomeClass(): > def some_method_1(self): > pass > def some_method_2(self): > pass > > >>> dir(SomeClass) > ['__doc__', '__module__', 'some_method_1', 'some_method_2'] > >>> Similar to methods(); for a class Foo with methods in Bar use showMethods(class="Foo", where=getNamespace("Bar")) Martin > > Thanks in advance! > > Cheers!! > Albert-Jan > > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us? > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -- Computational Biology Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109 Location: M1-B861 Telephone: 206 667-2793 From therneau at mayo.edu Tue Sep 6 15:39:32 2011 From: therneau at mayo.edu (Terry Therneau) Date: Tue, 06 Sep 2011 08:39:32 -0500 Subject: [R] SAS code in R Message-ID: <1315316372.27828.34.camel@nemo> -- Begin included message /* Combinations of Risk Factors */ data test2; input sex treat; DATALINES; 0 0 1 0 0 1 1 1 ; run; /* Survival estimates for the above combinations */ proc phreg data = pudat2; model withtime*wcens(0) = sex treat /ties = efron; baseline out = surv2 survival = survival lower = slower upper = supper covariates = test2 /method = ch nomean cltype=loglog; run; /* Survival estimates at 1 year */ proc print data = surv2 noobs; where withtime = 364; fit1 <- coxph(Surv(withtime,wcens)~ sex+treat,data=pudat2) my.fit <- survfit(fit1) summary(my.fit) Unfortunately, the estimates shown do not match those of SAS. -- End inclusion You need to add a newdata=test2 argument to the survfit call. Like the SAS code, test2 is a data set containing the list of subjects for which you want an estimate. You can also add a times= argument to the summary function. Last, if you look at the documentation you will see that the default handling for ties in the baseline hazard is the Fleming-Harrington method in R, whenever the "Efron" approx was used in computing the coefficients. (Same approximation, 2 different names when used in the two contexts; for more details see the chapter in my book.) The "CH" or Aalen approx for a baseline hazard is the same math as the Breslow approx for the coefficients. So to match the chimeric mixture you used in the SAS code you will need to add type="aalen" to the survfit call. The numeric effect of this is usually negligible, but when I see "do not match" in a query I assume you are looking to have all the digits the same. As to competing risks, I have some code for R but documentation is yet to be done -- on my to do list.
Walter Kremers (kremers.walter at mayo.edu) has a nice SAS macro for this case, however. You might want to drop him a line. Terry Therneau From kamil.barton at uni-wuerzburg.de Tue Sep 6 15:44:44 2011 From: kamil.barton at uni-wuerzburg.de (=?UTF-8?B?S2FtaWwgQmFydG/FhA==?=) Date: Tue, 06 Sep 2011 15:44:44 +0200 Subject: [R] MuMIn Problem getting adjusted Confidence intervals In-Reply-To: References: Message-ID: <4E6623CC.4030508@uni-wuerzburg.de> Hi Marcos, The 'adjusted CI' (based on the 'adjusted se estimator' as in section 4.3.3 of Burnham & Anderson 2002) cannot be calculated for 'lmer' model because it does not give df's for the coefficients. kamil Dnia 2011-08-30 12:00, r-help-request at r-project.org pisze: > Message: 42 > Date: Mon, 29 Aug 2011 08:28:22 -0700 (PDT) > From: Marcos Lima > To:r-help at r-project.org > Subject: [R] MuMIn Problem getting adjusted Confidence intervals > Message-ID:<1314631702645-3776500.post at n4.nabble.com> > Content-Type: text/plain; charset=UTF-8 > > Hello R users > > I'm using MuMIn but for some reason I'm not getting the adjusted confidence > interval and uncoditional SE whe I use model.avg(). > > I took into consideration the steps provided by Grueber et al (2011) > Multimodel inference in ecology and evolution: challenges and solutions in > JEB. 
> > I created a global model to see if malaria prevalence (binomial > distribution) is related to any life history traits of 14 different birds > species, while controling for Family and genus in a GLMM: > > global.model.Para<-lmer(cbind(Parahaemoproteus,FailPh)~factor(SS)+factor(NT)+NH+W+IT+factor(MS)+(1|Family/Genus),family=binomial,data=malaria) > > I than standardize the input variables using the function standardize form > the arm package: > > stdz.model.Para<-standardize(global.model.Para,standardize.y=FALSE) > > But I get this message: > Warning messages lost: > In is.na(thedata): > is.na() aplied to an object different from list or vector of type "Null" > > I then proceed to use the dredge fucntion: > model.set.Para<-dredge(stdz.model.Para) > <...> > top.models.Para<-get.models(model.set.Para,subset=delta<=7) > top.models > > But when I do the model average I do not seem to be getting the variance or > Uncoditional SE and I'm guessing that the Confidence interval are no > conditional either: > > model.avg(top.models.Para,method="NA") > > <...> > > Averaged model parameters: > Coefficient SE Lower CI Upper CI > (Intercept) -4.75 1.410 -7.510 -1.9900 > factor(MS)1 -1.54 0.809 -3.120 0.0471 > factor(NT)1 2.28 1.310 -0.286 4.8500 > factor(SS)1 3.30 0.968 1.400 5.2000 > z.IT -2.79 2.230 -7.160 1.5800 > z.NH 2.28 1.660 -0.968 5.5300 > z.W -1.74 1.490 -4.650 1.1800 > Confidence intervals are unadjusted > > Relative variable importance: > factor(SS) factor(MS) z.NH z.IT z.W factor(NT) > 0.82 0.33 0.32 0.20 0.07 0.01 > > Does anyone know what I might be doing wrong? 
> > thanks for the help > > Marcos From sarah.goslee at gmail.com Tue Sep 6 15:45:12 2011 From: sarah.goslee at gmail.com (Sarah Goslee) Date: Tue, 6 Sep 2011 09:45:12 -0400 Subject: [R] capturing a figure to PDF or Image In-Reply-To: References: Message-ID: Eran, > On Tue, Sep 6, 2011 at 8:53 AM, Eran Eidinger wrote: >> >> Thank you Sarah and David, >> I only used a simple plot, and remembered to dev.off(). >> attached is the session: >> >> > plot(c(1,2,3),c(3,2,4)) >> > jpeg("test.jpg") >> > dev.off() So you plotted something to your default device, THEN opened a JPG device, and closed it without plotting anything to it. How about instead you try: jpeg("test.jpg") plot(c(1,2,3),c(3,2,4)) dev.off() Open the device, plot something, close it. Reading ?Devices might also be of use. Sarah -- Sarah Goslee http://www.functionaldiversity.org From borisberanger at gmail.com Tue Sep 6 15:50:47 2011 From: borisberanger at gmail.com (Boris Beranger) Date: Tue, 6 Sep 2011 06:50:47 -0700 (PDT) Subject: [R] Weights using Survreg In-Reply-To: <1315314079.27828.7.camel@nemo> References: <1315314079.27828.7.camel@nemo> Message-ID: <1315317047290-3793462.post@n4.nabble.com> Sorry, when we talk about MLE estimates, does that mean WLE? I am trying to understand whether the survreg function allows a weight for each density function when calculating the likelihood. In my second question I was trying to explain that my problem is that I have pipes of different lengths and I want to know their probability of breaking per metre. My idea was to weight each of my observations to get estimated probabilities per metre. Does that sound realistic? Thank you very much, Boris -- View this message in context: http://r.789695.n4.nabble.com/Weights-using-Survreg-tp3781803p3793462.html Sent from the R help mailing list archive at Nabble.com.
From lehmannk at informatik.uni-tuebingen.de Tue Sep 6 16:10:03 2011 From: lehmannk at informatik.uni-tuebingen.de (netzwerkerin) Date: Tue, 6 Sep 2011 07:10:03 -0700 (PDT) Subject: [R] subsetting tables Message-ID: <1315318203870-3793509.post@n4.nabble.com> Hi guys, one of the questions where you need a real human instead of a search engine, so it would be great if you could help. I have a matrix of z-scores which I would like to filter, sometimes columnwise, sometimes rowwise. Data looks like this: Allstar hsa.let.7a hsa.let.7a.1 hsa.let.7a.2 2 0.87 0.79 -0.57 1.07 3 0.67 -1.14 -0.78 -0.95 4 -0.46 -0.30 -0.36 1.14 Now I want to find all elements which are below/above some threshold. Subset works fine with the columns: > subset(red[,4], red[,4] > 0.5) [1] 1.07 1.14 But not with the rows: > subset(red[2,], red[2,] > 0.5) Allstar hsa.let.7a hsa.let.7a.1 hsa.let.7a.2 3 0.67 -1.14 -0.78 -0.95 If I try to find all values above 0.5 (any row, any column, I just need the number of entries), this is what I try (and get): > subset(red[,], red[,] > 0.5) Allstar hsa.let.7a hsa.let.7a.1 hsa.let.7a.2 2 0.87 0.79 -0.57 1.07 3 0.67 -1.14 -0.78 -0.95 NA NA NA NA NA NA.1 NA NA NA NA NA.2 NA NA NA NA Obviously I'm doing something wrong, but what? Help very much appreciated. Netzwerkerin -- View this message in context: http://r.789695.n4.nabble.com/subsetting-tables-tp3793509p3793509.html Sent from the R help mailing list archive at Nabble.com. From gpetris at uark.edu Tue Sep 6 16:21:44 2011 From: gpetris at uark.edu (Giovanni Petris) Date: Tue, 06 Sep 2011 09:21:44 -0500 Subject: [R] Hessian matrix issue In-Reply-To: <4E62665E.2020602@uottawa.ca> References: <4E62665E.2020602@uottawa.ca> Message-ID: <1315318904.1653.445.camel@definetti> About the numerical calculation of the Hessian matrix, I have found numDeriv:::hessian to be often more accurate than the Hessian returned by optim. 
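A minimal sketch of that comparison, under assumptions that are not from the thread: simulated negative binomial data, a log-scale parametrisation so the optimiser is unconstrained, and the BFGS method. The numDeriv step runs only if that add-on package is installed.

```r
set.seed(1)
y <- rnbinom(200, mu = 5, size = 2)

# Negative log-likelihood, parametrised as (log mu, log size).
nll <- function(th) {
  -sum(dnbinom(y, mu = exp(th[1]), size = exp(th[2]), log = TRUE))
}

# Hessian as returned by optim (finite differences at the optimum).
fit <- optim(c(log(mean(y)), 0), nll, method = "BFGS", hessian = TRUE)
H_optim <- fit$hessian

# Hessian from numDeriv (Richardson extrapolation), if the package is available.
if (requireNamespace("numDeriv", quietly = TRUE)) {
  H_nd <- numDeriv::hessian(nll, fit$par)
  print(max(abs(H_optim - H_nd)))
}

# Positive eigenvalues are what inverting the Hessian for standard errors relies on.
print(eigen(H_optim, symmetric = TRUE)$values)
```

On a well-behaved problem like this one the two matrices should agree closely; differences tend to matter near boundaries or when the optimiser stopped short of the optimum.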
Best, Giovanni Petris On Sat, 2011-09-03 at 13:39 -0400, John C Nash wrote: > Unless you are supplying analytic hessian code, you are almost certainly getting an > approximation. Worse, if you do not provide gradients, these are the result of two levels > of differencing, so you should expect some loss of precision in the approximate Hessian. > > Moreover, if your estimate of the optimum is a little bit off, or the optimizer has > terminated (algorithms converge, programs terminate) to a point that is not an optimum, > there is no reason the Hessian should be positive definite. > > Package optimx() uses the Jacobian of the gradient if the analytic gradient is available. > This drops the differencing to 1 level. Even better is to code the Hessian, but that is > messy and tedious in most cases. > > Best, JN > > > On 09/03/2011 06:00 AM, r-help-request at r-project.org wrote: > > Message: 59 > > Date: Fri, 2 Sep 2011 15:33:13 -0400 > > From: tzaihra at alcor.concordia.ca > > To: r-help at r-project.org > > Subject: [R] Hessian Matrix Issue > > Message-ID: > > > > Content-Type: text/plain;charset=iso-8859-1 > > > > Dear All, > > > > I am running a simulation to obtain coverage probability of Wald type > > confidence intervals for my parameter d in a function of two parameters > > (mu,d). > > > > I am optimizing it using "optim" method "L-BFGS-B" to obtain MLE. As, I > > want to invert the Hessian matrix to get Standard errors of the two > > parameter estimates. 
However, my Hessian matrix at times becomes > > non-invertible that is it is no more positive definite and I get the > > following error msg: > > > > "Error in solve.default(ac$hessian) : system is computationally singular: > > reciprocal condition number = 6.89585e-21" > > Thank you > > > > Following is the code I am running I would really appreciate your comments > > and suggestions: > > > > #Start Code > > #option to trace /recover error > > #options(error = recover) > > > > #Sample Size > > n<-30 > > mu<-5 > > size<- 2 > > > > #true values of parameter d > > d.true<-1+mu/size > > d.true > > > > #true value of zero inflation index phi= 1+log(d)/(1-d) > > z.true<-1+(log(d.true)/(1-d.true)) > > z.true > > > > # Allocating space for simulation vectors and setting counters for simulation > > counter<-0 > > iter<-10000 > > lower.d<-numeric(iter) > > upper.d<-numeric(iter) > > > > #set.seed(987654321) > > > > #begining of simulation loop######## > > > > for (i in 1:iter){ > > r.NB<-rnbinom(n, mu = mu, size = size) > > y<-sort(r.NB) > > iter.num<-i > > print(y) > > print(iter.num) > > #empirical estimates or sample moments > > xbar<-mean(y) > > variance<-(sum((y-xbar)2))/length(y) > > dbar<-variance/xbar > > #sample estimate of proportion of zeros and zero inflation index > > pbar<-length(y[y==0])/length(y) > > > > ### Simplified function ############################################# > > > > NegBin<-function(th){ > > mu<-th[1] > > d<-th[2] > > n<-length(y) > > > > arg1<-n*mean(y)*ifelse(mu >= 0, log(mu),0) > > #arg1<-n*mean(y)*log(mu) > > > > #arg2<-n*log(d)*((mean(y))+mu/(d-1)) > > arg2<-n*ifelse(d>=0, log(d), 0)*((mean(y))+mu/ifelse((d-1)>= 0, (d-1), > > 0.0000001)) > > > > aa<-numeric(length(max(y))) > > a<-numeric(length(y)) > > for (i in 1:n) > > { > > for (j in 1:y[i]){ > > aa[j]<-ifelse(((j-1)*(d-1))/mu >0,log(1+((j-1)*(d-1))/mu),0) > > #aa[j]<-log(1+((j-1)*(d-1))/mu) > > #print(aa[j]) > > } > > > > a[i]<-sum(aa) > > #print(a[i]) > > } > > a > > arg3<-sum(a) > > 
llh<-arg1+arg2+arg3 > > if(! is.finite(llh)) > > llh<-1e+20 > > -llh > > } > > ac<-optim(NegBin,par=c(xbar,dbar),method="L-BFGS-B",hessian=TRUE,lower= > > c(0,1) ) > > ac > > print(ac$hessian) > > muhat<-ac$par[1] > > dhat<-ac$par[2] > > zhat<- 1+(log(dhat)/(1-dhat)) > > infor<-solve(ac$hessian) > > var.dhat<-infor[2,2] > > se.dhat<-sqrt(var.dhat) > > var.muhat<-infor[1,1] > > se.muhat<-sqrt(var.muhat) > > var.func<-dhat*muhat > > var.func > > d.prime<-cbind(dhat,muhat) > > > > se.var.func<-d.prime%*%infor%*%t(d.prime) > > se.var.func > > lower.d[i]<-dhat-1.96*se.dhat > > upper.d[i]<-dhat+1.96*se.dhat > > > > if(lower.d[i] <= d.true & d.true<= upper.d[i]) > > counter <-counter+1 > > } > > counter > > covg.prob<-counter/iter > > covg.prob > > > > > > > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From paul.hiemstra at knmi.nl Tue Sep 6 16:22:39 2011 From: paul.hiemstra at knmi.nl (Paul Hiemstra) Date: Tue, 06 Sep 2011 14:22:39 +0000 Subject: [R] write.matrix row names vs sink vs capture.output In-Reply-To: <53A88CC9-FD22-4CFC-B107-1AFE20FBF93E@hci.utah.edu> References: <53A88CC9-FD22-4CFC-B107-1AFE20FBF93E@hci.utah.edu> Message-ID: <4E662CAF.9030501@knmi.nl> On 09/06/2011 06:24 AM, Mark Ebbert wrote: > Dear R gurus, > > I am trying to write several large matrices (~ 1GB) to separate files. I have learned that write.table is simply too slow for this task and was attempting to use write.matrix, but write.matrix does not have the ability to include row names in the output. Anyone know why that's the case? I've seen a thread stating that write.matrix is the way to go for large prints to files, but it doesn't do what I need it to. 
Since write.matrix wasn't working I tried both sink and capture.output, but then the output is printed to the file using the same 'width' restrictions as the general "options(width=)" limit. > > Any ideas on how to print a large matrix with row names? I could write a perl script to modify the files after the fact, but I shouldn't have to do that. > > Thanks for your help! > > Mark T. W. Ebbert Hi, What do you want to do with the data? If you want to store an R matrix on disk for later use in R, take a look at ?save. If it is for use in another programming language, I would write the matrix in binary format (?writeBin). This saves a lot of space and prevents any (significant) rounding errors. It is probably also quite a bit faster. If you really need some more metadata (such as rownames), I would add a second text file which stores this information. Sort of a binary file plus a header, which is a quite common format for storing data. Maybe you can even find a standard binary format which you can use. But it is impossible to comment on this because you did not provide information as to what you want to do with the saved data. Good luck! Paul -- Paul Hiemstra, Ph.D. Global Climate Division Royal Netherlands Meteorological Institute (KNMI) Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39 P.O.
Box 201 | 3730 AE | De Bilt tel: +31 30 2206 494 http://intamap.geo.uu.nl/~paul http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770 From bt_jannis at yahoo.de Tue Sep 6 16:26:20 2011 From: bt_jannis at yahoo.de (Jannis) Date: Tue, 6 Sep 2011 15:26:20 +0100 (BST) Subject: [R] several functions in one *.R file in a R package Message-ID: <1315319180.35832.YahooMailClassic@web28207.mail.ukl.yahoo.com> Dear list members, I have built a package which contains a collection of my frequently used functions. To keep the code organized I have broken down some rather extensive and long functions into individual steps and bundled these steps in sub-functions that are called inside the main function. To keep track of which sub-function belongs to which main function I saved all the respective sub-functions to the same *.R file as their main-function and gave them names beginning with . to somehow hide the sub-functions. The result would be one *.R file in /R for each 'main-function' containing something like: mainfunction <- function() { .subfunction1() .subfunction2() #... } .subfunction1 <- function() { #do some stuff } .subfunction2 <- function() { #do some more stuff } According to the way I understood the "Writing R Extensions" Manual I expected this to work. When I load the package, however, I get the error message that the sub-functions could not be found. Manually sourcing all files in the /R directory however yields the expected functionality. In what way am I mistaken here? Any ideas?
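The layout described above can be sketched as a runnable fragment. The function names follow the post; the return values are invented for illustration, and nothing here reproduces or diagnoses the package-loading error, whose cause the post does not show.

```r
# Hypothetical contents of one file in the package's R/ directory.
mainfunction <- function() {
  c(.subfunction1(), .subfunction2())
}

# Helpers are assigned by bare name, without parentheses on the left-hand side.
.subfunction1 <- function() "step 1"
.subfunction2 <- function() "step 2"

print(mainfunction())

# Dot-prefixed names are hidden from a plain ls() but are otherwise ordinary objects.
print(ls())
print(ls(all.names = TRUE))
```

Sourced as a plain script this works because all three functions end up in the same environment, which matches the behaviour reported for manual sourcing.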
Cheers Jannis From bt_jannis at yahoo.de Tue Sep 6 16:35:33 2011 From: bt_jannis at yahoo.de (Jannis) Date: Tue, 6 Sep 2011 15:35:33 +0100 (BST) Subject: [R] subsetting tables In-Reply-To: <1315318203870-3793509.post@n4.nabble.com> Message-ID: <1315319733.43266.YahooMailClassic@web28209.mail.ukl.yahoo.com> Sorry if I miss the point you want to achieve, but why not just: which(red > 0.5) or red[which(red > 0.5)] if you want to use this col/row wise you could do: apply(red, 1, function(x)x[x>0.5]) and apply(red, 2, function(x)x[x>0.5]) ? --- netzwerkerin wrote on Tue, 6.9.2011: > From: netzwerkerin > Subject: [R] subsetting tables > To: r-help at r-project.org > Date: Tuesday, 6 September 2011, 14:10 > Hi guys, > > one of the questions where you need a real human instead of a search engine, > so it would be great if you could help. > > I have a matrix of z-scores which I would like to filter, sometimes > columnwise, sometimes rowwise. Data looks like this: > > Allstar hsa.let.7a hsa.let.7a.1 hsa.let.7a.2 > 2 0.87 0.79 -0.57 1.07 > 3 0.67 -1.14 -0.78 -0.95 > 4 -0.46 -0.30 -0.36 1.14 > > Now I want to find all elements which are below/above some threshold. Subset > works fine with the columns: > >> subset(red[,4], red[,4] > 0.5) > [1] 1.07 1.14 > > But not with the rows: > >> subset(red[2,], red[2,] > 0.5) > Allstar hsa.let.7a hsa.let.7a.1 hsa.let.7a.2 > 3 0.67 -1.14 -0.78 -0.95 > > If I try to find all values above 0.5 (any row, any column, I just need the > number of entries), this is what I try (and get): > >> subset(red[,], red[,] > 0.5) > Allstar hsa.let.7a hsa.let.7a.1 hsa.let.7a.2 > 2 0.87 0.79 -0.57 1.07 > 3 0.67 -1.14 -0.78 -0.95 > NA NA NA NA NA > NA.1 NA NA NA NA > NA.2 NA NA NA NA > > Obviously I'm doing something wrong, but what? > Help very much appreciated. > Netzwerkerin > > -- > View this message in context: http://r.789695.n4.nabble.com/subsetting-tables-tp3793509p3793509.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From geof at stat.colostate.edu Tue Sep 6 16:24:07 2011 From: geof at stat.colostate.edu (Geof Givens) Date: Tue, 06 Sep 2011 08:24:07 -0600 Subject: [R] object.size() not recognized within .First() Message-ID: <4E662D07.8000908@stat.colostate.edu> I have a function called within .First(), as in .First=function() { ...blah... BIG(partofblah) #BIG is my function, n=partofblah when called ...foo... } The partofblah component of blah is a number obtained from readline(), which is then an argument to BIG() BIG=function(n=10,removeask=T) { z <- sapply(ls(pos=1), function(x)object.size(get(x))) ...stuff... } When .First() executes upon startup and readline() determines partofblah, the call to BIG() creates an error as follows: Hello. How many huge items to consider deleting? (Enter=none) 3 Error in FUN(c("a", "a1", "a2", "aa", "aaa", "abline", "ad", "adj11", : could not find function "object.size" The "Hello" line is a prompt from .First(). "3" is my input. The error occurs when BIG() is executed because object.size() is called immediately upon entry to BIG(). After this error, execution of .First() from the command line works fine. I can't figure out what is going wrong. Any ideas? Thanks, Geof BELOW HERE ARE THE COMPLETE FUNCTIONS: .First=function() { cat("Hello. How many huge items to consider deleting? (Enter=none)\n") g=eval(parse(text=readline())) if (is.numeric(g)) { BIG(n=g)} cat("\nOk.
What .Rdata file and directory do you want to use?\n\n") cat("0 = Default (C:/Users/geof/My Documents)\n") cat("1 = ~geof/teach/stat540/2011/R\n") cat("c = choose by browsing\n") thechosen=readline() if (thechosen=="1") { load("c:\\users\\geof\\csu\\teach\\stat540\\2011\\R\\.RData",.GlobalEnv) setwd("c:\\users\\geof\\csu\\teach\\stat540\\2011\\R") } if (thechosen=="c") { cat("Hello. Choose .RData file to work with...\n") where=file.choose() rwhere=sub("\\.RData","",where) load(where,.GlobalEnv) setwd(rwhere) } cat("\n") cat(paste("Okay, I've got you working/saving in ", getwd()[1],"\n",sep="")) } BIG=function(n=10,removeask=T) { z <- sapply(ls(pos=1), function(x)object.size(get(x))) zlab=names(z) z=as.matrix(rev(sort(z))) zlab=as.matrix(rev(sort(zlab))) myzmat=data.frame(id=1:n,name=zlab[1:n],size=z[1:n]) row.names(myzmat)=NULL print(myzmat) cat("\nGive c() vector of id's to delete, or return to exit\n") thechosen=readline() if (thechosen=="") { invisible(return()) } else { byebye=eval(parse(text=thechosen)) goodbye=as.character(myzmat$name)[byebye] goodtogo=T if(removeask) { cat("Are you sure to delete (y/n):\n") print(goodbye) a=readline() if(a!="y") goodtogo=F } if(goodtogo) { rm(list=goodbye,pos=1) } } } From murdoch.duncan at gmail.com Tue Sep 6 16:44:00 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Tue, 06 Sep 2011 10:44:00 -0400 Subject: [R] object.size() not recognized within .First() In-Reply-To: <4E662D07.8000908@stat.colostate.edu> References: <4E662D07.8000908@stat.colostate.edu> Message-ID: <4E6631B0.60704@gmail.com> On 11-09-06 10:24 AM, Geof Givens wrote: > I have a function called within .First(), as in > > .First=function() { > ...blah... > BIG(partofblah) #BIG is my function, n=partofblah when called > ...foo... > } > > The partofblah component of blah is a number obtained from readline(), > which is then an argument to BIG() > > BIG=function(n=10,removeask=T) { > z<- sapply(ls(pos=1), function(x)object.size(get(x))) > ...stuff... 
} > > When .First() executes upon startup and readline() determines > partofblah, the call to BIG() creates an error as follows: > > Hello. How many huge items to consider deleting? (Enter=none) > 3 > Error in FUN(c("a", "a1", "a2", "aa", "aaa", "abline", "ad", "adj11", : > could not find function "object.size" > > The "Hello" line is a prompt from .First(). "3" is my input. The error > occurs when BIG() is executed because object.size() is called > immediately upon entry to BIG(). > > After this error, execution of .First() from the command line works fine. > > I can't figure out what is going wrong. Any ideas? See the help page ?.First: It is run before the standard packages are attached. You need to say where to find that function if you want to use it, i.e. use utils::object.size. Duncan Murdoch > > > Thanks, > > Geof > > > > BELOW HERE ARE THE COMPLETE FUNCTIONS: > > .First=function() { > cat("Hello. How many huge items to consider deleting? (Enter=none)\n") > g=eval(parse(text=readline())) > if (is.numeric(g)) { BIG(n=g)} > cat("\nOk. What .Rdata file and directory do you want to use?\n\n") > cat("0 = Default (C:/Users/geof/My Documents)\n") > cat("1 = ~geof/teach/stat540/2011/R\n") > cat("c = choose by browsing\n") > thechosen=readline() > if (thechosen=="1") { > > load("c:\\users\\geof\\csu\\teach\\stat540\\2011\\R\\.RData",.GlobalEnv) > setwd("c:\\users\\geof\\csu\\teach\\stat540\\2011\\R") } > if (thechosen=="c") { > cat("Hello. 
Choose .RData file to work with...\n") > where=file.choose() > rwhere=sub("\\.RData","",where) > load(where,.GlobalEnv) > setwd(rwhere) } > cat("\n") > cat(paste("Okay, I've got you working/saving in ", > getwd()[1],"\n",sep="")) } > > BIG=function(n=10,removeask=T) { > z<- sapply(ls(pos=1), function(x)object.size(get(x))) > zlab=names(z) > z=as.matrix(rev(sort(z))) > zlab=as.matrix(rev(sort(zlab))) > myzmat=data.frame(id=1:n,name=zlab[1:n],size=z[1:n]) > row.names(myzmat)=NULL > print(myzmat) > cat("\nGive c() vector of id's to delete, or return to exit\n") > thechosen=readline() > if (thechosen=="") { > invisible(return()) } else { > byebye=eval(parse(text=thechosen)) > goodbye=as.character(myzmat$name)[byebye] > goodtogo=T > if(removeask) { > cat("Are you sure to delete (y/n):\n") > print(goodbye) > a=readline() > if(a!="y") goodtogo=F } > if(goodtogo) { > rm(list=goodbye,pos=1) } } } > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From murdoch.duncan at gmail.com Tue Sep 6 16:46:48 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Tue, 06 Sep 2011 10:46:48 -0400 Subject: [R] several functions in one *.R file in a R package In-Reply-To: <1315319180.35832.YahooMailClassic@web28207.mail.ukl.yahoo.com> References: <1315319180.35832.YahooMailClassic@web28207.mail.ukl.yahoo.com> Message-ID: <4E663258.8030705@gmail.com> On 11-09-06 10:26 AM, Jannis wrote: > Dear list members, > > > i have build a package which contains a collection of my frequently used functions. To keep the code organized I have broken down some rather extensive and long functions into individual steps and bundled these steps in sub-functions that are called inside the main function. 
> > To keep an overview over which sub-function belongs to which main function I saved all the respective sub-functions to the same *.R file as their main-function and gave them names beginning with . to somehow hide the sub-functions. The result would be one *.R file in /R for each 'main-function' containing something like: > > > mainfunction<- function() { > .subfunction1() > .subfunction2() > #... > } > > .subfunction1()<- function() { > #do some stuff > } > .subfunction2()<- function() { > #do some more stuff > } > > > According to the way I understood the "Writing R Extensions" Manual I expected this to work. When I load the package, however, I get the error message that the sub-functions could not be found. Manually sourcing all files in the /R directory however yields the expected functionality. > > In what way am I mistaken here? Any ideas? Those definitions of .subfunction1 and .subfunction2 are not syntactically correct: extra parens. If that's just a typo in the message, then you'll have to show us real code. What you describe should work. Duncan Murdoch From lcipriano at iol.pt Tue Sep 6 17:08:38 2011 From: lcipriano at iol.pt (Lívio Cipriano) Date: Tue, 6 Sep 2011 16:08:38 +0100 Subject: [R] Q and R mode Message-ID: <201109061608.38439.lcipriano@iol.pt> Hi, Can anyone explain to me the differences in Q and R mode in Principal Component Analysis, as performed by prcomp and princomp respectively? Regards Lívio Cipriano From lcipriano at iol.pt Tue Sep 6 17:10:45 2011 From: lcipriano at iol.pt (Lívio Cipriano) Date: Tue, 6 Sep 2011 16:10:45 +0100 Subject: [R] Q and R mode in Principal Component Analysis Message-ID: <201109061610.45238.lcipriano@iol.pt> Hi, Can anyone explain to me the differences in Q and R mode in Principal Component Analysis, as performed by prcomp and princomp respectively?
Regards L?vio Cipriano From Mark.Ebbert at hci.utah.edu Tue Sep 6 16:58:10 2011 From: Mark.Ebbert at hci.utah.edu (Mark Ebbert) Date: Tue, 6 Sep 2011 08:58:10 -0600 Subject: [R] write.matrix row names vs sink vs capture.output In-Reply-To: <4E662CAF.9030501@knmi.nl> References: <53A88CC9-FD22-4CFC-B107-1AFE20FBF93E@hci.utah.edu> <4E662CAF.9030501@knmi.nl> Message-ID: Thank you for your help. The data is meant to be processed by a separate program that expects a simple matrix with row and column names in ascii format. "write.matrix" does exactly what I want except for the row names. It baffles me that this is not an option? On Sep 6, 2011, at 8:22 AM, Paul Hiemstra wrote: > On 09/06/2011 06:24 AM, Mark Ebbert wrote: >> Dear R gurus, >> >> I am trying to write several large matrices (~ 1GB) to separate files. I have learned that write.table is simply too slow for this task and was attempting to use write.matrix, but write.matrix does not have the ability to include row names in the output. Anyone know why that's the case? I've seen a thread stating that write.matrix is the way to go for large prints to files, but it doesn't do what I need it to. Since write.matrix wasn't working I tried both sink and capture.output, but then the output is printed to the file using the same 'width' restrictions as the general "options(width=)" limit. >> >> Any ideas on how to print a large matrix with row names? I could write a perl script to modify the files after the fact, but I shouldn't have to do that. >> >> Thanks for your help! >> >> Mark T. W. Ebbert >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > Hi, > > What do you want with the data? If you want to store an R matrix on disk > for later use in R, take a look at ?save. 
If it is for use in another > programming language, I would write the matrix in binary format > (?writebin). This saves a lot of space and prevents any (significant) > rounding errors. It is probably also quite a bit faster. If you really > need some more metadata (such as rownames), I would add a second text > file which stores this information. Sort of a binary file plus a header, > which is a quite common format for storing data. Maybe you can even find > a standard binary format which you can use. But it is impossible to > comment on this because you did not provide information as to what you > want to do with the saved data. > > good luck! > Paul > > -- > Paul Hiemstra, Ph.D. > Global Climate Division > Royal Netherlands Meteorological Institute (KNMI) > Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39 > P.O. Box 201 | 3730 AE | De Bilt > tel: +31 30 2206 494 > > http://intamap.geo.uu.nl/~paul > http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770 > From E.Vettorazzi at uke.de Tue Sep 6 16:48:39 2011 From: E.Vettorazzi at uke.de (Eik Vettorazzi) Date: Tue, 6 Sep 2011 16:48:39 +0200 Subject: [R] subsetting tables In-Reply-To: <1315318203870-3793509.post@n4.nabble.com> References: <1315318203870-3793509.post@n4.nabble.com> Message-ID: <4E6632C7.5040200@uke.de> Hi Netzwerkerin, subset is a generic function and behaves different for different object classes. txt<-" Allstar hsa.let.7a hsa.let.7a.1 hsa.let.7a.2 2 0.87 0.79 -0.57 1.07 3 0.67 -1.14 -0.78 -0.95 4 -0.46 -0.30 -0.36 1.14" red<-read.table(textConnection(txt)) #compare str(red[,2]) str(red[2,]) but if it comes to just counting values meeting some conditions, using "subset" is not needed at all. sum(red>.5) length(which(red>.5)) and the arr.ind option of which may be useful as well. hth Am 06.09.2011 16:10, schrieb netzwerkerin: > Hi guys, > > one of the questions where you need a real human instead of a search engine, > so it would be great if you could help. 
> > I have a matrix of z-scores which I would like to filter, sometimes > columnwise, sometimes rowwise. Data looks like this: > > Allstar hsa.let.7a hsa.let.7a.1 hsa.let.7a.2 > 2 0.87 0.79 -0.57 1.07 > 3 0.67 -1.14 -0.78 -0.95 > 4 -0.46 -0.30 -0.36 1.14 > > Now I want to find all elements which are below/above some threshold. Subset > works fine with the columns: > >> subset(red[,4], red[,4] > 0.5) > [1] 1.07 1.14 > > But not with the rows: > >> subset(red[2,], red[2,] > 0.5) > Allstar hsa.let.7a hsa.let.7a.1 hsa.let.7a.2 > 3 0.67 -1.14 -0.78 -0.95 > > If I try to find all values above 0.5 (any row, any column, I just need the > number of entries), this is what I try (and get): > >> subset(red[,], red[,] > 0.5) > Allstar hsa.let.7a hsa.let.7a.1 hsa.let.7a.2 > 2 0.87 0.79 -0.57 1.07 > 3 0.67 -1.14 -0.78 -0.95 > NA NA NA NA NA > NA.1 NA NA NA NA > NA.2 NA NA NA NA > > Obviously I'm doing something wrong, but what? > Help very much appreciated. > Netzwerkerin > > -- > View this message in context: http://r.789695.n4.nabble.com/subsetting-tables-tp3793509p3793509.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -- Eik Vettorazzi Department of Medical Biometry and Epidemiology University Medical Center Hamburg-Eppendorf Martinistr. 52 20246 Hamburg T ++49/40/7410-58243 F ++49/40/7410-57790 -- Pflichtangaben gem?? Gesetz ?ber elektronische Handelsregister und Genossenschaftsregister sowie das Unternehmensregister (EHUG): Universit?tsklinikum Hamburg-Eppendorf; K?rperschaft des ?ffentlichen Rechts; Gerichtsstand: Hamburg Vorstandsmitglieder: Prof. Dr. J?rg F. Debatin (Vorsitzender), Dr. Alexander Kirstein, Joachim Pr?l?, Prof. Dr. Dr. 
Uwe Koch-Gromus From jim.trabas at googlemail.com Tue Sep 6 16:59:01 2011 From: jim.trabas at googlemail.com (Jim Trabas) Date: Tue, 6 Sep 2011 07:59:01 -0700 (PDT) Subject: [R] How to understand the plotting of the cox.zph function In-Reply-To: <1315314713.27828.14.camel@nemo> References: <1315122754949-3788886.post@n4.nabble.com> <1315314713.27828.14.camel@nemo> Message-ID: <1315321141433-3793651.post@n4.nabble.com> Thank you very much for your answer. I would like to construct for presentation purposes the HR(t), not the beta(t). How can I perform this? I obtained the x-y values of the cox.zph plot time= as.numeric(as.character(rownames(cox.zph.object$y))) HR=exp(cox.zph.object$y[,1]) However when I plot the plot(time, HR) and fit a lowess line (to time, HR), the HR seems to jump over proportionally high (higher that what the cox.zph plot suggests) Am I making any error in my thinking? Many thanks JT -- View this message in context: http://r.789695.n4.nabble.com/How-to-understand-the-plotting-of-the-cox-zph-function-tp3788886p3793651.html Sent from the R help mailing list archive at Nabble.com. From descostes at ciml.univ-mrs.fr Tue Sep 6 17:16:55 2011 From: descostes at ciml.univ-mrs.fr (Nico902) Date: Tue, 6 Sep 2011 08:16:55 -0700 (PDT) Subject: [R] mclust: modelName="E" vs modelName="V" In-Reply-To: References: <1315138638856-3789167.post@n4.nabble.com> Message-ID: <1315322215007-3793697.post@n4.nabble.com> Hi, Thanks a lot for your answer. I effectively was able to get rid of this message by doing: > resClust <- > Mclust(data,G=3,modelName="V",prior=priorControl(scale=c(1.44,0.81,0.49))); However, I would like to be able to retrieve the variances I defined in the result. 
I found: > resClust$parameters $Vinv NULL $pro [1] 0.5502496 0.1986852 0.2510652 $mean 1 2 3 -2.8390006980 -0.0003267873 3.1072574619 $variance $variance$modelName [1] "V" $variance$d [1] 1 $variance$G [1] 3 $variance$sigmasq [1] 0.840267666 0.009466821 1.510263146 $variance$scale [1] 0.840267666 0.009466821 1.510263146 I do not manage to get where the sigmasq is coming from. I tried to sqrt or square the sigmasq but it does not correspond to what I defined. I found nothing in the manual. If I am missing something obvious or if somebody has the solution it will help me a lot. I want to retrieve those values automatically to plot the different curves of the fitting and to be sure this is doing what I want. Thank you very much again. -- View this message in context: http://r.789695.n4.nabble.com/mclust-modelName-E-vs-modelName-V-tp3789167p3793697.html Sent from the R help mailing list archive at Nabble.com. From emmanuelle.comets at inserm.fr Mon Sep 5 13:02:02 2011 From: emmanuelle.comets at inserm.fr (Emmanuelle Comets) Date: Mon, 5 Sep 2011 13:02:02 +0200 Subject: [R] [R-pkgs] saemix: SAEM algorithm for parameter estimation in non-linear mixed-effect models (version 0.96) Message-ID: <4E64AC2A.8090901@inserm.fr> saemix implements the SAEM (stochastic approximation EM) algorithm for parameter estimation in non-linear mixed effect models, used to model longitudinal data. Longitudinal data are particularly prominent in pharmacokinetics (study of drug concentrations versus time) and pharmacodynamics (study of drug effect versus time), but the SAEM algorithm has also been successfully applied in many other areas and we would like to encourage you to try saemix. More details can be found in the user guide included in the package, which encloses a section showing different examples using SAEMIX. As always, I would be very grateful for comments and suggestions, and would welcome any feed-back. 
Emmanuelle Comets Authors: Emmanuelle Comets, Audrey Lavenu and Marc Lavielle -- Statistiquement, tout s'explique. Personnellement, tout se complique. (Daniel Pennac) _______________________________________________ R-packages mailing list R-packages at r-project.org https://stat.ethz.ch/mailman/listinfo/r-packages From ripley at stats.ox.ac.uk Tue Sep 6 17:47:31 2011 From: ripley at stats.ox.ac.uk (Prof Brian Ripley) Date: Tue, 6 Sep 2011 16:47:31 +0100 (BST) Subject: [R] Receive "unable to load shared object RNetCDF.o" during R INSTALL of RNetCDF In-Reply-To: <4E660355.20306@knmi.nl> References: <4E660355.20306@knmi.nl> Message-ID: The problem appears to be udunits2, which RNetCDF uses and ncdf does not. Specifically >> --with-udunits-lib='/soft/local/udunits-2.1.23/lib'" is that path in the ld.so cache or in LD_LIBRARY_PATH? If not, it will not be found by dlload. On Tue, 6 Sep 2011, Paul Hiemstra wrote: > Hi, > > You could try installing the ncdf package. This installed for my without > any problems when RNetCDF failed. I suspect ncdf has very similar > functionality, but i've only worked with ncdf. > > install.packages("ncdf") > > should do the trick... > > cheers, > Paul > > On 09/05/2011 08:56 PM, David Brown wrote: >> On a Red Hat Linux cluster I am seeing the following after multiple >> other packages were successfully installed. The error seems to suggest >> that RNetCDF.o was not copied to the appropriate lib folder. The admin >> user performing the install has the required privileges to perform the >> install. Words of wisdom are greatly appreciated: >> >> R CMD INSTALL --configure-args="--with-netcdf-include='/soft/local/netcdf/netcdf-3.6.2/include/' >> --with-netcdf-lib='/soft/local/netcdf/netcdf-3.6.2/libso' >> --with-udunits-include='/soft/local/udunits-2.1.23/include' >> --with-udunits-lib='/soft/local/udunits-2.1.23/lib'" >> RNetCDF_1.5.2-2.tar.gz >> >> * installing to library ?/soft/local/r/R-2.13.1/lib64/R/library? 
>> * installing *source* package ?RNetCDF? ... >> checking for gcc... gcc -std=gnu99 >> checking for C compiler default output file name... a.out >> checking whether the C compiler works... yes >> checking whether we are cross compiling... no >> checking for suffix of executables... >> checking for suffix of object files... o >> checking whether we are using the GNU C compiler... yes >> checking whether gcc -std=gnu99 accepts -g... yes >> checking for gcc -std=gnu99 option to accept ISO C89... none needed >> checking for nc_open in -lnetcdf... yes >> checking for utInit in -ludunits2... no >> checking for utScan in -ludunits2... yes >> checking how to run the C preprocessor... gcc -std=gnu99 -E >> checking for grep that handles long lines and -e... /bin/grep >> checking for egrep... /bin/grep -E >> checking for ANSI C header files... no >> checking for sys/types.h... yes >> checking for sys/stat.h... yes >> checking for stdlib.h... yes >> checking for string.h... yes >> checking for memory.h... yes >> checking for strings.h... yes >> checking for inttypes.h... yes >> checking for stdint.h... yes >> checking for unistd.h... yes >> checking netcdf.h usability... yes >> checking netcdf.h presence... yes >> checking for netcdf.h... yes >> checking udunits.h usability... yes >> checking udunits.h presence... yes >> checking for udunits.h... 
yes >> configure: creating ./config.status >> config.status: creating R/load.R >> config.status: creating src/Makevars >> ** libs >> gcc -std=gnu99 -I/soft/local/r/R-2.13.1/lib64/R/include >> -I/soft/local/udunits-2.1.23/include >> -I/soft/local/netcdf/netcdf-3.6.2/include/ -I/usr/local/include >> -fpic -g -O2 -c RNetCDF.c -o RNetCDF.o >> gcc -std=gnu99 -shared -L/usr/local/lib64 -o RNetCDF.so RNetCDF.o >> -ludunits2 -lnetcdf -L/soft/local/udunits-2.1.23/lib >> -L/soft/local/netcdf/netcdf-3.6.2/libso -lexpat >> installing to /soft/local/r/R-2.13.1/lib64/R/library/RNetCDF/libs >> ** R >> ** preparing package for lazy loading >> ** help >> *** installing help indices >> ** building package indices ... >> ** testing if installed package can be loaded >> Error in dyn.load(file, DLLpath = DLLpath, ...) : >> unable to load shared object >> '/soft/local/r/R-2.13.1/lib64/R/library/RNetCDF/libs/RNetCDF.so': >> libudunits2.so.0: cannot open shared object file: No such file or directory >> Error: loading failed >> Execution halted >> ERROR: loading failed >> * removing ?/soft/local/r/R-2.13.1/lib64/R/library/RNetCDF? >> [swadm at boar01]/soft/local/temp% >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > > -- > Paul Hiemstra, Ph.D. > Global Climate Division > Royal Netherlands Meteorological Institute (KNMI) > Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39 > P.O. 
Box 201 | 3730 AE | De Bilt > tel: +31 30 2206 494 > > http://intamap.geo.uu.nl/~paul > http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770 > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 From mailinglist.honeypot at gmail.com Tue Sep 6 17:55:49 2011 From: mailinglist.honeypot at gmail.com (Steve Lianoglou) Date: Tue, 6 Sep 2011 11:55:49 -0400 Subject: [R] Q and R mode in Principal Component Analysis In-Reply-To: <201109061610.45238.lcipriano@iol.pt> References: <201109061610.45238.lcipriano@iol.pt> Message-ID: Hi, 2011/9/6 L?vio Cipriano : > Hi, > > Can anyone explain me the differences in Q and R mode in Principal Component > Analysis, as performed by prcomp and princom respectively. Perhaps this tutorial will help? http://www.uga.edu/strata/software/pdf/pcaTutorial.pdf Search that pdf for "q-mode" if you don't need to go through the intro. -steve -- Steve Lianoglou Graduate Student: Computational Systems Biology ?| Memorial Sloan-Kettering Cancer Center ?| Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact From eran at taykey.com Tue Sep 6 17:56:43 2011 From: eran at taykey.com (Eran Eidinger) Date: Tue, 6 Sep 2011 18:56:43 +0300 Subject: [R] capturing a figure to PDF or Image In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From chrish at stats.ucl.ac.uk Tue Sep 6 18:19:17 2011 From: chrish at stats.ucl.ac.uk (Christian Hennig) Date: Tue, 6 Sep 2011 17:19:17 +0100 (BST) Subject: [R] mclust: modelName="E" vs modelName="V" In-Reply-To: <1315322215007-3793697.post@n4.nabble.com> References: <1315138638856-3789167.post@n4.nabble.com> <1315322215007-3793697.post@n4.nabble.com> Message-ID: I probably don't understand the problem. I'd assume that variance$sigmasq are the three estimated component variances (probably estimated by maximum a posteriori, but consult the mclust documentation). What's wrong with that? (The values you submit as scale in "prior" are not fixed variances, but parameters of the prior distribution - your problem may be that you believe that they are meant to be variances fixed by you!?) Christian On Tue, 6 Sep 2011, Nico902 wrote: > Hi, > > Thanks a lot for your answer. I effectively was able to get rid of this > message by doing: > >> resClust <- >> Mclust(data,G=3,modelName="V",prior=priorControl(scale=c(1.44,0.81,0.49))); > > > However, I would like to be able to retrieve the variances I defined in the > result. I found: > >> resClust$parameters > $Vinv > NULL > > $pro > [1] 0.5502496 0.1986852 0.2510652 > > $mean > 1 2 3 > -2.8390006980 -0.0003267873 3.1072574619 > > $variance > $variance$modelName > [1] "V" > > $variance$d > [1] 1 > > $variance$G > [1] 3 > > $variance$sigmasq > [1] 0.840267666 0.009466821 1.510263146 > > $variance$scale > [1] 0.840267666 0.009466821 1.510263146 > > > I do not manage to get where the sigmasq is coming from. I tried to sqrt or > square the sigmasq but it does not correspond to what I defined. I found > nothing in the manual. If I am missing something obvious or if somebody has > the solution it will help me a lot. I want to retrieve those values > automatically to plot the different curves of the fitting and to be sure > this is doing what I want. > > Thank you very much again.
> > -- > View this message in context: http://r.789695.n4.nabble.com/mclust-modelName-E-vs-modelName-V-tp3789167p3793697.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > *** --- *** Christian Hennig University College London, Department of Statistical Science Gower St., London WC1E 6BT, phone +44 207 679 1698 chrish at stats.ucl.ac.uk, www.homepages.ucl.ac.uk/~ucakche From dwinsemius at comcast.net Tue Sep 6 18:19:56 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Tue, 6 Sep 2011 12:19:56 -0400 Subject: [R] Weights using Survreg In-Reply-To: <1315317047290-3793462.post@n4.nabble.com> References: <1315314079.27828.7.camel@nemo> <1315317047290-3793462.post@n4.nabble.com> Message-ID: <515B021D-7A15-4E80-8D1E-FE258DB75BEA@comcast.net> I think you are replying to Dr Therneau without including this context: >> --- begin---- >> Survreg produces MLE estimates. >> >> For your second question, don't know what you are asking. Can you be >> more specific and detailed? >> >> ---begin included message -- >> Do you know if the parameters estimators are MLE estimators? >> >> One more question: >> In my case study I have failures that occured on different objects >> that >> have different age and length, could I use weight to find the >> estimates of a >> weibull law and so to find the probabilty of failure per unit of >> length >> for example? -----end--------- On Sep 6, 2011, at 9:50 AM, Boris Beranger wrote: > Sorry when we talk about about MLE estimates does that mean WLE?I am > trying > to understand if the survreg function is allowing a weight for each > density > function when calculating the likelihood. 
> > In my second question I was trying to explain that my problem is > that I have > pipes of different length and I want to know their probability to > break per > metre. My idea was to weight each of my observations to get estimate > probabilities per metre. Does that sound realistic? I have generally used Poisson regression [ glm(..., family="poisson") ] in that situation. It lets you do two things: a) apply weighting by using offset=log(length_of_pipe) and b) model multiple breaks in a pipe if such an occurrence is possible. (It also produces an MLE estimate if that feature is of some special importance.) I respectfully defer to anything Dr Therneau says on this matter and am only really posting in hopes that he will clarify whether there is any value in thinking about the use of offset terms in either parametric or Cox survival models. There is an offset argument in glm but I do not see one (any longer?) in survreg or coxph. I have what must be an extremely vague memory of seeing an offset term in coxph formulas, but I do not see such a possibility described in the current help pages. Therneau and Grambsch indicate that CPH models with certain forms of frailty are similar to models with offsets but the help page for `Surv` specifically warns against the use of "gamma/ml or gaussian/reml [frailty terms] with survreg". -- David Winsemius, MD West Hartford, CT From bogaso.christofer at gmail.com Tue Sep 6 20:29:42 2011 From: bogaso.christofer at gmail.com (Bogaso Christofer) Date: Tue, 6 Sep 2011 23:59:42 +0530 Subject: [R] p values greater than 1 from lme4 In-Reply-To: References: <1315236643683-3791526.post@n4.nabble.com> Message-ID: <019e01cc6cc2$f7d44280$e77cc780$@gmail.com> Hello Bolker, Hope you will make available at least the problem and reasoning to this list.
I am also very much interested to see the problem -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Ben Bolker Sent: 06 September 2011 02:58 To: r-help at stat.math.ethz.ch Subject: Re: [R] p values greater than 1 from lme4 RTSlider gmail.com> writes: > > Hello, > I'm running linear regressions using the following script where I have > separated out species using the "IDtotsInLn" identifier > > x<-read.csv('tbl02TOTSInLn_ENV.csv', header=T) > x > attach (x) > library(lme4) > > rInLn<-lmList(InLn~pMoist | IDtotsInLn, x, pool=F) > write.table(summary(rInLn)$coefficients, "rInLnPlots.csv") > write.table(summary(rInLn)$r.squared, append=T, "rInLnPlots.csv") > write.table(summary(rInLn)$df, append=T, "rInLnPlots.csv") > > The script seems to be working for most of the species, but for some > it is returning a p value of greater than 1 (e.g. 20). I thought this > might be for the few cases where the independent variable remained > constant, but found other species where this was not the case and the > p value was still much greater than 1. > Any help would be appreciated > -RTS This is very interesting but practically impossible to solve because it's not reproducible; is there any chance that you can make the data available? You can send it directly to me (Ben Bolker -- my e-mail is pretty easy to find on the web) if you like. Ben Bolker ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
From bbolker at gmail.com Tue Sep 6 20:30:54 2011 From: bbolker at gmail.com (Ben Bolker) Date: Tue, 6 Sep 2011 14:30:54 -0400 Subject: [R] p values greater than 1 from lme4 In-Reply-To: <019e01cc6cc2$f7d44280$e77cc780$@gmail.com> References: <1315236643683-3791526.post@n4.nabble.com> <019e01cc6cc2$f7d44280$e77cc780$@gmail.com> Message-ID: <4E6666DE.2020502@gmail.com> On 09/06/2011 02:29 PM, Bogaso Christofer wrote: > Hello Bolker, Hope you will make available at least the problem and > reasoning to this list. I am also very much interested to see the > problem From an earlier off-list e-mail: The problem (which was not at all trivial) is that one of the species in the example has only one level of pMoist, so the slope parameter is not identifiable (aliased), so the coefficient matrix produced by summary.lm() has only a single row rather than two, so summary.lmList gets confused when it tries to boil down the coefficient tables from all of the different fits into a single array. One normally doesn't notice this in the output of summary() from a single lm fit because print.summary.lm does a little bit of magic to replace the missing rows (i.e. slope estimate, std. err, t statistic, p value) with NAs in the printed summary. Solutions: (1) e-mail me for the code to the hacked version of summary.lmList; (2) remove units from your data that have this kind of unidentifiability/aliasing problem; (3) wait for Doug Bates to implement my fix in the next patched version of nlme (the summary.lmList function lives in nlme, not lme4). 
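The aliasing described above can be reproduced with plain lm(). This is a minimal sketch with made-up data (the values are invented and only the column names echo the thread): when the predictor takes a single value, coef() keeps an NA slot for the slope, but the coefficient matrix returned by summary() drops that row entirely. Code that expects two rows per fit will then misalign the columns.

```r
# Made-up data: the predictor is constant, so the slope cannot be estimated.
set.seed(1)
d <- data.frame(pMoist = rep(0.3, 6), InLn = rnorm(6))

fit <- lm(InLn ~ pMoist, data = d)

coef(fit)                       # slope is reported as NA
coef_tab <- coef(summary(fit))  # the aliased slope row is dropped here
nrow(coef_tab)                  # 1 row instead of the usual 2
summary(fit)$aliased            # flags which coefficients were aliased
```

Checking summary(fit)$aliased for each group before collecting the tables is one way to find the units that solution (2) would remove.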
Ben Bolker From bbolker at gmail.com Tue Sep 6 20:34:38 2011 From: bbolker at gmail.com (Ben Bolker) Date: Tue, 6 Sep 2011 18:34:38 +0000 Subject: [R] MuMIn Problem getting adjusted Confidence intervals References: <4E6623CC.4030508@uni-wuerzburg.de> Message-ID: Kamil Bartoń
uni-wuerzburg.de> writes: > > Hi Marcos, > > The 'adjusted CI' (based on the 'adjusted se estimator' > as in section 4.3.3 of Burnham & Anderson > 2002) cannot be calculated for 'lmer' model because it > does not give df's for the coefficients. > > kamil If you're willing to use MASS::glmmPQL instead (which gives df's, because it is based on lme which does give df's) it might work. From jinruixu at umich.edu Tue Sep 6 19:20:45 2011 From: jinruixu at umich.edu (Jinrui Xu) Date: Tue, 06 Sep 2011 13:20:45 -0400 Subject: [R] heatmap In-Reply-To: <515B021D-7A15-4E80-8D1E-FE258DB75BEA@comcast.net> References: <1315314079.27828.7.camel@nemo> <1315317047290-3793462.post@n4.nabble.com> <515B021D-7A15-4E80-8D1E-FE258DB75BEA@comcast.net> Message-ID: <20110906132045.10895wn43671i4ws@web.mail.umich.edu> Hi everyone, I have three numerica vectors: x, y, z. I want to plot a heatmap or surface plot of z against x and y. Is there any package for this? If possible, please drop me several lines of example code. Thanks! jinrui, From chenshu at umich.edu Tue Sep 6 19:31:56 2011 From: chenshu at umich.edu (shu chen) Date: Tue, 6 Sep 2011 13:31:56 -0400 Subject: [R] bootstrap function Message-ID: Hi, all, When I ran the following code, library (Design) reri <- function(datsam) { fitlr <- glm(PTSDpy ~ toxo*PTE, family=binomial, data=datsam) reri <- exp(fitlr$coef[2]+fitlr$coef[3]+fitlr$coef[4])-exp(fitlr$coef[2]) - exp(fitlr$coef[3]) + 1 } summary.bootstrap(bootstrap(PTE.work, reri(PTE.work),B=10000, group=PTE.work$PTSDpy), probs=c(0.025,0.5,0.975)) Error: could not find function "summary.bootstrap" Does anyone know which package Functions "summary.bootstrap" and "bootstrap" are located? Thanks a lot. 
Sue From mathijsdevaan at gmail.com Tue Sep 6 18:39:27 2011 From: mathijsdevaan at gmail.com (mdvaan) Date: Tue, 6 Sep 2011 09:39:27 -0700 (PDT) Subject: [R] Selecting and multiplying In-Reply-To: <1314915643870-3784901.post@n4.nabble.com> References: <1314915643870-3784901.post@n4.nabble.com> Message-ID: <1315327167498-3793908.post@n4.nabble.com> Does anyone have an idea how to tackle this problem? Thanks a lot! -- View this message in context: http://r.789695.n4.nabble.com/Selecting-and-multiplying-tp3784901p3793908.html Sent from the R help mailing list archive at Nabble.com. From Rachida.Elmehdi at uclouvain.be Tue Sep 6 19:23:20 2011 From: Rachida.Elmehdi at uclouvain.be (Rachida El Mehdi) Date: Tue, 6 Sep 2011 19:23:20 +0200 Subject: [R] Help on the multivariate interpolation with R Message-ID: <027663dce938fb434cad04ec123e8a9a.squirrel@mmp.sipr-dc.ucl.ac.be> Hello, I work on Stochastic Frontier Analysis (SFA) and I am looking for a function in an R package that does multivariate interpolation. My problem is: for all i = 1, ..., n, I have values of (xi1, xi2, ..., xi7) in R^7 and f(xi1, xi2, ..., xi7) in R, and I also have values of (x'i1, x'i2, ..., x'i7) in R^7 and I need f(x'i1, x'i2, ..., x'i7) in R by interpolation. So x is an (n,7) or a (7,n) matrix and x' is also a matrix with the same format as x. Can someone help me? Thank you in advance. Rachida El Mehdi From eeadie at unm.edu Tue Sep 6 20:41:24 2011 From: eeadie at unm.edu (eeadie) Date: Tue, 6 Sep 2011 11:41:24 -0700 (PDT) Subject: [R] help with glmm.admb In-Reply-To: References: Message-ID: <1315334484305-3794210.post@n4.nabble.com> Hi Ben, Thanks, you were right. I was using R-studio and needed to close and reopen it after downloading the latest glmmADMB version in R. Then the new version was available. Now I have a new problem with the same model that I've been working on.
Here is the model and the error message: > modelnbbb<-glmmadmb(total_bites_rounded~age_class_back+(1|focal_individual)+(1|food.dif.id)+offset(log(forage_time)),data=data,family="nbinom") Error in parse(text = x) : :1:41: unexpected ')' 1: total_bites_rounded ~ age_class_back ++ ) ^ It seems like this message is trying to tell me that I have an extra ) somewhere, but I don't think this is true because the exact same model works fine with lmer. Is there something special about glmmadmb syntax that is giving me problems? Thank you for any advice you or anyone else can provide!!! Lizzy -- View this message in context: http://r.789695.n4.nabble.com/help-with-glmm-admb-tp3788082p3794210.html Sent from the R help mailing list archive at Nabble.com. From sarah.goslee at gmail.com Tue Sep 6 20:56:14 2011 From: sarah.goslee at gmail.com (Sarah Goslee) Date: Tue, 6 Sep 2011 14:56:14 -0400 Subject: [R] heatmap In-Reply-To: <20110906132045.10895wn43671i4ws@web.mail.umich.edu> References: <1315314079.27828.7.camel@nemo> <1315317047290-3793462.post@n4.nabble.com> <515B021D-7A15-4E80-8D1E-FE258DB75BEA@comcast.net> <20110906132045.10895wn43671i4ws@web.mail.umich.edu> Message-ID: You mean like the examples in help("heatmap") ? On Tue, Sep 6, 2011 at 1:20 PM, Jinrui Xu wrote: > Hi everyone, > > I have three numerica vectors: x, y, z. I want to plot a heatmap or surface > plot of z against x and y. Is there any package for this? If possible, > please drop me several lines of example code. Thanks! 
> > jinrui, > -- Sarah Goslee http://www.functionaldiversity.org From jinruixu at umich.edu Tue Sep 6 21:14:13 2011 From: jinruixu at umich.edu (Jinrui Xu) Date: Tue, 06 Sep 2011 15:14:13 -0400 Subject: [R] heatmap In-Reply-To: References: <1315314079.27828.7.camel@nemo> <1315317047290-3793462.post@n4.nabble.com> <515B021D-7A15-4E80-8D1E-FE258DB75BEA@comcast.net> <20110906132045.10895wn43671i4ws@web.mail.umich.edu> Message-ID: <20110906151413.63355oojxokep7ok@web.mail.umich.edu> Hi Sarah, As I understand it, the heatmap function calculates a "density value" for each grid cell of the heatmap automatically from the input matrix. In my case, I already have the "density values" as a vector, say z. I want to plot a heat map with x and y as its axes and the z values as the "density" of each grid cell. I am not familiar with R code, so I am writing to ask how to do this. Thanks! jinrui, Quoting Sarah Goslee : > You mean like the examples in help("heatmap") ? > > On Tue, Sep 6, 2011 at 1:20 PM, Jinrui Xu wrote: >> Hi everyone, >> >> I have three numerica vectors: x, y, z. I want to plot a heatmap or surface >> plot of z against x and y. Is there any package for this? If possible, >> please drop me several lines of example code. Thanks!
>> >> jinrui, >> > > -- > Sarah Goslee > http://www.functionaldiversity.org > > > -- Ph.D Student, Bioinformatics Program Center for Computational Medicine and Bioinformatics (CCMB) The University of Michigan 100 Washtenaw Avenue Ann Arbor, MI 48109-2218 1075 Natural Science Building 830 North University Avenue Ann Arbor, MI 48109-1048 Tel (lab): 734-763-0514 http://www-personal.umich.edu/~jinruixu/ From diggsb at ohsu.edu Tue Sep 6 21:18:06 2011 From: diggsb at ohsu.edu (Brian Diggs) Date: Tue, 6 Sep 2011 12:18:06 -0700 Subject: [R] xtable with conditional formatting using \textcolor In-Reply-To: References: <8C7A8850-599F-4C5E-83E7-2AFBF80808FB@me.com> Message-ID: <4E6671EE.8000601@ohsu.edu> On 9/6/2011 4:01 AM, eldor ado wrote: > I have a related question: > > dataframe df contains values like > >> df > .. "\\textbf{ 0.644 }" .. > > and the line > >> print( xtable(df , sanitize.text.function = function(x){x})) sanitize.text.function is an argument of print.xtable, not xtable. Try print( xtable(df) , sanitize.text.function = function(x){x}) or even shorter print( xtable(df) , sanitize.text.function = identity) > > converts them to > > ..& $\backslash$textbf\{ 0.644 \}& .. > escaping both double backslashes and brackes. > > maybe somebody here knows how to prevent xtable from escaping the code? > > best regards, > lukas kohl > > On Wed, Jun 1, 2011 at 8:47 PM, Marc Schwartz wrote: >> On Jun 1, 2011, at 1:33 PM, Walmes Zeviani wrote: >> >>> Hello list, >>> >>> I'm doing a table with scores and I want include colors to represent status >>> of an individual. I'm using sweave<>= and xtable but I can't >>> get a result I want. 
My attemps are >>> >>> #----------------------------------------------------------------------------- >>> # code R >>> >>> da<- data.frame(id=letters[1:5], score=1:5*2) >>> >>> col<- function(x){ >>> ifelse(x>7, >>> paste("\textcolor{blue}{", formatC(x, dig=2, format="f"), "}"), >>> paste("\textcolor{red}{", formatC(x, dig=2, format="f"), "}")) >>> } >>> >>> da$score.string<- col(da$score) >>> >>> require(xtable) >>> xtable(da[,c("id","score.string")]) >>> >>> #----------------------------------------------------------------------------- >>> >>> actual result >>> #----------------------------------------------------------------------------- >>> \begin{tabular}{rll} >>> \hline >>> & id& score.string \\ >>> \hline >>> 1& a& extcolor\{red\}\{ 2.00 \} \\ >>> 2& b& extcolor\{red\}\{ 4.00 \} \\ >>> 3& c& extcolor\{red\}\{ 6.00 \} \\ >>> 4& d& extcolor\{blue\}\{ 8.00 \} \\ >>> 5& e& extcolor\{blue\}\{ 10.00 \} \\ >>> \hline >>> \end{tabular} >>> #----------------------------------------------------------------------------- >>> >>> desired result (lines omited to save space) >>> #----------------------------------------------------------------------------- >>> 1& a& \textcolor{red}{ 2.00 } \\ >>> 2& b& \textcolor{red}{ 4.00} \\ >>> #----------------------------------------------------------------------------- >>> >>> Any contribution will be useful. Thanks. >>> Walmes. >> >> >> When the '\t' is being cat()'d to the TeX file (or console) by print.xtable(), it is being interpreted as a tab character. 
You need to escape it with additional backslashes and then adjust the sanitize.text.function in print.xtable() so that it does not touch the backslashes: >> >> >> da<- data.frame(id=letters[1:5], score=1:5*2) >> >> col<- function(x){ >> ifelse(x>7, >> paste("\\textcolor{blue}{", formatC(x, dig=2, format="f"), "}"), >> paste("\\textcolor{red}{", formatC(x, dig=2, format="f"), "}")) >> } >> >> da$score.string<- col(da$score) >> >> >>> da >> id score score.string >> 1 a 2 \\textcolor{red}{ 2.00 } >> 2 b 4 \\textcolor{red}{ 4.00 } >> 3 c 6 \\textcolor{red}{ 6.00 } >> 4 d 8 \\textcolor{blue}{ 8.00 } >> 5 e 10 \\textcolor{blue}{ 10.00 } >> >> >> require(xtable) >> >> print(xtable(da[,c("id","score.string")]), sanitize.text.function = function(x){x}) >> >> >> That will give you: >> >> % latex table generated in R 2.13.0 by xtable 1.5-6 package >> % Wed Jun 1 13:44:54 2011 >> \begin{table}[ht] >> \begin{center} >> \begin{tabular}{rll} >> \hline >> & id& score.string \\ >> \hline >> 1& a& \textcolor{red}{ 2.00 } \\ >> 2& b& \textcolor{red}{ 4.00 } \\ >> 3& c& \textcolor{red}{ 6.00 } \\ >> 4& d& \textcolor{blue}{ 8.00 } \\ >> 5& e& \textcolor{blue}{ 10.00 } \\ >> \hline >> \end{tabular} >> \end{center} >> \end{table} >> >> >> HTH, >> >> Marc Schwartz >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > -- Brian S. 
Diggs, PhD Senior Research Associate, Department of Surgery Oregon Health & Science University From therneau at mayo.edu Tue Sep 6 21:34:25 2011 From: therneau at mayo.edu (Terry Therneau) Date: Tue, 06 Sep 2011 14:34:25 -0500 Subject: [R] Weights using Survreg In-Reply-To: <515B021D-7A15-4E80-8D1E-FE258DB75BEA@comcast.net> References: <1315314079.27828.7.camel@nemo> <1315317047290-3793462.post@n4.nabble.com> <515B021D-7A15-4E80-8D1E-FE258DB75BEA@comcast.net> Message-ID: <1315337665.27828.60.camel@nemo> I agree with David that Poisson regression would be the simplest thing. It's a consequence of the Poisson formulation and an exponential "trick":
  E(#breaks) = breaks per meter * length in meters
             = exp(Xb) * exp(log(length))
             = exp(Xb + log(length))
where X = covariates that affect "breaks per meter" and b = coefficients. log(length) appears as an offset, i.e., a covariate that has a known coefficient of 1. You could also use log(length) as an offset in a Cox model, by the same logic:
  relative risk that a given pipe breaks = length * risk per meter = exp(Xb + log(length))
You need to decide if such a model is scientifically defensible, e.g., if this involved flexing I would expect breakage to go up faster than linear. Notes: (1) offset has always been a part of coxph and survreg; time to improve the documentation, I guess. (2) I forgot to include the context in my first reply. Terry T. On Tue, 2011-09-06 at 12:19 -0400, David Winsemius wrote: > I think you are replying to Dr Therneau without including this context: > >> --- begin---- > >> Survreg produces MLE estimates. > >> > >> For your second question, don't know what you are asking. Can you be > >> more specific and detailed? > >> > >> ---begin included message -- > >> Do you know if the parameters estimators are MLE estimators?
> >> > >> One more question: > >> In my case study I have failures that occured on different objects > >> that > >> have different age and length, could I use weight to find the > >> estimates of a > >> weibull law and so to find the probabilty of failure per unit of > >> length > >> for example? > -----end--------- > > On Sep 6, 2011, at 9:50 AM, Boris Beranger wrote: > > > Sorry when we talk about about MLE estimates does that mean WLE?I am > > trying > > to understand if the survreg function is allowing a weight for each > > density > > function when calculating the likelihood. > > > > In my second question I was trying to explain that my problem is > > that I have > > pipes of different length and I want to know their probability to > > break per > > metre. My idea was to weight each of my observations to get estimate > > probabilities per metre.Does that sound realistic? > > I have generally used Poisson regression [ glm(..., > family="poisson") ] in that situation. It lets you do two things: a) > apply weighting by using offset=log(length_of_pipe) and b) model > multiple breaks in a pipe if such an occurrence is possible. (It also > produces an MLE estimate if that feature is of some special importance.) > > I respectfully defer to anything Dr Therneau says on this matter and > am only really posting in hopes that he will clarify whether there is > any value in thinking about the use of offset terms in either > parametric or Cox survival models. > > There is an offset argument in glm but I do not see one (any longer?) > in survreg or coxph. I have what must be an extremely vague memory of > seeing an offset term in coxph formulas, but I do not see such a > possibility described in the current help pages. Therenau and Grambsch > indicates that CPH models with certain forms of frailty are similar to > models with offsets but the help apge for `Surv` specifically warns > against the use of "gamma/ml or gaussian/reml [frailty terms] with > survreg". 
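The offset formulation discussed in this thread can be sketched with simulated data. Everything below is invented for illustration: the pipe lengths, the covariate `age`, and the "true" coefficients are assumptions, not values from the original problem. The only point is that `offset(log(pipe_len))` fixes the coefficient of log(length) at 1, so that exp(Xb) can be read as expected breaks per metre.

```r
set.seed(42)
n        <- 500
pipe_len <- runif(n, 10, 200)          # hypothetical pipe lengths in metres
age      <- runif(n, 0, 50)            # hypothetical covariate
rate     <- exp(-5 + 0.03 * age)       # assumed true breaks per metre
breaks   <- rpois(n, rate * pipe_len)  # E(breaks) = exp(Xb + log(length))

# log(pipe_len) enters with a fixed coefficient of 1 (no parameter estimated).
fit <- glm(breaks ~ age + offset(log(pipe_len)), family = poisson)
coef(fit)                              # per-metre intercept and age slope

# Expected breaks per metre for a 20-year-old pipe: set pipe_len = 1 so the
# offset log(1) = 0 drops out of the linear predictor.
per_metre <- predict(fit, newdata = data.frame(age = 20, pipe_len = 1),
                     type = "response")
per_metre
```

Whether breaks really scale linearly with length is the modelling question raised above; the sketch simply assumes that they do.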
> From roger.bos at rothschild.com Tue Sep 6 21:47:02 2011 From: roger.bos at rothschild.com (Bos, Roger) Date: Tue, 6 Sep 2011 15:47:02 -0400 Subject: [R] How to speed up regressions (related to data.frame) Message-ID: All, I have a function that runs a set of regressions (using the rlm function) and I notice that it runs much slower on my 64-bit R than it does on my 32-bit R. I guess the bigger bit size slows it down. Anyway, I looked into Rprof to see how I can speed it up. I saw that 78% of the total time is spent in [.data.frame, so I tried converting my data to a matrix using data.matrix, but then rlm complained that data needs to be in the form of a data.frame. Am I stuck or is there still maybe a way to speed up my function? Thanks, Roger *************************************************************** This message is for the named person's use only. It may\...{{dropped:14}} From jcbouette at gmail.com Tue Sep 6 22:02:56 2011 From: jcbouette at gmail.com (Jean-Christophe BOUËTTÉ) Date: Tue, 6 Sep 2011 16:02:56 -0400 Subject: [R] How to speed up regressions (related to data.frame) In-Reply-To: References: Message-ID: I just tried library(MASS) rlm(1:12+rnorm(12),1:12) and it seems to work on vectors too, but maybe I am missing something. 2011/9/6 Bos, Roger : > All, > > I have a function that runs a set of regressions (using the rlm > function) and I notice that it run much slower on my 64-bit R than it > does on my 32-bit R. I guess the bigger bit size slows it down. > Anyway, I looked into Rprof to see how I can speed it up. I saw that > 78% of the total time is spent in [.data.frame, so I tried converting my > data to a matrix using data.matrix, but then rlm complained that data > needs to be in the form of a data.frame. Am I stuck or is there still > maybe a way to speed up my function? > > Thanks, > > Roger > *************************************************************** > > This message is for the named person's use only.
It may\...{{dropped:14}} > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From lehmannk at informatik.uni-tuebingen.de Tue Sep 6 22:24:28 2011 From: lehmannk at informatik.uni-tuebingen.de (netzwerkerin) Date: Tue, 6 Sep 2011 13:24:28 -0700 (PDT) Subject: [R] subsetting tables In-Reply-To: <1315319733.43266.YahooMailClassic@web28209.mail.ukl.yahoo.com> References: <1315318203870-3793509.post@n4.nabble.com> <1315319733.43266.YahooMailClassic@web28209.mail.ukl.yahoo.com> Message-ID: <1315340668391-3794485.post@n4.nabble.com> Hi Jannis, and thanks for the quick answer jannis-2 wrote: > > > which(red > 0.5) > > this works but what are the actual numbers that are spit out? Because the next step: jannis-2 wrote: > > > red[which(red > 0.5)] > does not work. It gives an Error in `[.data.frame`(tableReduced, which(tableReduced[, -1] > 0.5)) : undefined columns selected and if I change it in: red[which(red > 0.5), ] to select all columns (which is of course not really what I would want here), it spits out a lot of NA values although there is not a single one in the original table. Any idea what happens here? Regarding your question: "why don't you just ... " can almost always be answered with: because I did not know ... ;-) No, really, as a newbie to R and even if one has extensive knowledge of another programming language (or maybe because of that) it is difficult in the beginning to get your head around the very fascinating but different R philosophy. So, I'm grateful for this great help this list provides so fast. -- View this message in context: http://r.789695.n4.nabble.com/subsetting-tables-tp3793509p3794485.html Sent from the R help mailing list archive at Nabble.com. 
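A small sketch of what goes wrong in the attempts above, using a made-up two-column data frame in place of `red`. `which()` on a logical matrix returns positions counted down the columns of the whole table. A data frame interprets a single index as column numbers (hence "undefined columns selected"), and `[idx, ]` as row numbers, so positions beyond the last row come back as rows of NA.

```r
# Made-up stand-in for the table in the question.
red <- data.frame(a = c(0.1, 0.9, 0.2), b = c(0.7, 0.3, 0.8))

idx <- which(red > 0.5)     # positions in column-major order
idx                         # 2 4 6

rows <- red[idx, ]          # read as row numbers; rows 4 and 6 do not exist
rows                        # one real row followed by rows of NA

# Convert to a matrix to pick out the individual values ...
m <- as.matrix(red)
m[m > 0.5]                  # 0.9 0.7 0.8

# ... or ask which() for row/column pairs instead of single positions.
which(m > 0.5, arr.ind = TRUE)
```

The row names "NA" and "NA.1" in the output are R making the names of the non-existent rows unique.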
From lehmannk at informatik.uni-tuebingen.de Tue Sep 6 22:42:03 2011 From: lehmannk at informatik.uni-tuebingen.de (netzwerkerin) Date: Tue, 6 Sep 2011 13:42:03 -0700 (PDT) Subject: [R] subsetting tables In-Reply-To: <4E6632C7.5040200@uke.de> References: <1315318203870-3793509.post@n4.nabble.com> <4E6632C7.5040200@uke.de> Message-ID: <1315341723660-3794527.post@n4.nabble.com> Hi Eik, greetings to Hamburg! :-) Thanks for the fast and helpful answer Eik Vettorazzi-2 wrote: > > #compare > str(red[,2]) > str(red[2,]) > I understand that the first is a real vector of nums in R and the second is a ?? matrix/list/data.frame ?? of single ? entries? Can I transpose/transform it into one vector? Tried 'as.vector' but did not help. Eik Vettorazzi-2 wrote: > > sum(red>.5) > length(which(red>.5)) > Sorry for being unprecise. Yes, in this case it was mainly the sum (thanks! helpful function!), but in general I'd like to understand what happened with subset here... Eik Vettorazzi-2 wrote: > > > and the arr.ind option of which may be useful as well. > Thanks a lot, very helpful. For other newbies, here is the line: tableReduced[,-1][which(tableReduced[,-1]>0.5, arr.ind=TRUE)] I needed to exclude the first column (-1) since these were titles (factors) of my rows. In the first trial I forgot to add this information to the first notion of the table as well, i.e., I tried: tableReduced[which(tableReduced[,-1]>0.5, arr.ind=TRUE)] This will (of course, I have to admit) result in subsetting fields that are in one column to the left of the intended column. So, if there are any subsetting indices in the which-function, they also need to be put in front of it to make the indices match. Just for my understanding, do you know what R did with here? Where do the NA values come from, what is the row-title NA.1, why does it print the first two rows unchanged and then goes crazy? 
> subset(red[,], red[,] > 0.5) > Allstar hsa.let.7a hsa.let.7a.1 hsa.let.7a.2 > 2 0.87 0.79 -0.57 1.07 > 3 0.67 -1.14 -0.78 -0.95 > NA NA NA NA NA > NA.1 NA NA NA NA > NA.2 NA NA NA NA Thanks for this community with fast and reliable help. Amazing to see! -- View this message in context: http://r.789695.n4.nabble.com/subsetting-tables-tp3793509p3794527.html Sent from the R help mailing list archive at Nabble.com. From axel.urbiz at gmail.com Tue Sep 6 23:08:16 2011 From: axel.urbiz at gmail.com (Axel Urbiz) Date: Tue, 6 Sep 2011 17:08:16 -0400 Subject: [R] Question about Natural Splines (ns function) Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From deelman at hotmail.com Tue Sep 6 23:10:44 2011 From: deelman at hotmail.com (B Jessop) Date: Tue, 6 Sep 2011 18:10:44 -0300 Subject: [R] Update packages problem Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From salvo_mac at yahoo.com Tue Sep 6 23:16:40 2011 From: salvo_mac at yahoo.com (Salvo Mac) Date: Tue, 6 Sep 2011 14:16:40 -0700 (PDT) Subject: [R] calibrate.cph plots Message-ID: <1315343800.89793.YahooMailNeo@web121501.mail.ne1.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From antonioparedes14 at gmail.com Tue Sep 6 23:47:12 2011 From: antonioparedes14 at gmail.com (Antonio Paredes) Date: Tue, 6 Sep 2011 17:47:12 -0400 Subject: [R] Ubuntu Upgrading to R 1.13.0 Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From mailinglist.honeypot at gmail.com Tue Sep 6 23:51:06 2011 From: mailinglist.honeypot at gmail.com (Steve Lianoglou) Date: Tue, 6 Sep 2011 17:51:06 -0400 Subject: [R] Ubuntu Upgrading to R 1.13.0 In-Reply-To: References: Message-ID: Hi, On Tue, Sep 6, 2011 at 5:47 PM, Antonio Paredes wrote: > Hello everyone, > > I'll like to know if there is an easy way to upgrade R in Ubuntu. 
I'd use > google to search for information, but nothing seems to work. Googling for "install r ubuntu" gives you a good first hit: http://cran.r-project.org/bin/linux/ubuntu/README I guess that will do? -steve -- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact From gleynes+r at gmail.com Tue Sep 6 23:56:44 2011 From: gleynes+r at gmail.com (Gene Leynes) Date: Tue, 6 Sep 2011 16:56:44 -0500 Subject: [R] Possible to access a USB volume by name in windows Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From lists at revelle.net Wed Sep 7 00:01:05 2011 From: lists at revelle.net (William Revelle) Date: Tue, 6 Sep 2011 17:01:05 -0500 Subject: [R] Q and R mode in Principal Component Analysis In-Reply-To: <201109061610.45238.lcipriano@iol.pt> References: <201109061610.45238.lcipriano@iol.pt> Message-ID: At 4:10 PM +0100 9/6/11, Lívio Cipriano wrote: >Hi, > >Can anyone explain me the differences in Q and R mode in Principal Component >Analysis, as performed by prcomp and princomp respectively. Dear Livio, The help file of prcomp says it pretty well: "The calculation is done by a singular value decomposition of the (centered and possibly scaled) data matrix, not by using eigen on the covariance matrix. This is generally the preferred method for numerical accuracy." Compare this with the help file from princomp: "princomp only handles so-called R-mode PCA, that is feature extraction of variables. If a data matrix is supplied (possibly via a formula) it is required that there are at least as many units as variables. For Q-mode PCA use prcomp." This R and Q (as well as S and T) terminology was introduced (at least in psychology) by Ray Cattell in his discussion of the "Data Box". It is the idea that you can consider three dimensions of data (across subjects, variables, and time).
Then there are six different ways to cut up the data. A typical data matrix has rows for observations and columns for variables. Typically the number of rows >> columns. If you are trying to find a structure that reduces the complexity of the variables, you do the normal analysis (R) of the variables. An alternative is to do the analysis on the transpose of the data matrix (Q analysis). That is, to try to reduce the complexity of the rows. This is not a problem if you do a singular value decomposition (which is what prcomp does). It can be if you do a princomp analysis, which is based upon the covariance of the data. Let nXv represent your original matrix (n observations on v variables). For an R analysis, using princomp, you are finding the principal components of the covariance matrix C which is of size v x v with rank = the lesser of n and v. But for a Q analysis, if you are using princomp, you are still trying to find the principal components of a covariance matrix C* which has dimensions n x n but has a rank of the lesser of n and v. That is, if the number of rows > number of columns the rank of the covariance matrix of the transposed matrix will still be the number of columns although the size of the correlation matrix will be n x n. Q analysis is looking for patterns of similarity in the subjects over variables, R analysis is looking for similarity in the variables over subjects. This then gets generalized to the case of subjects over time, variables over time, .... "The data box emphasized that we are not limited to correlating tests over people at one time. In its 1946 formulation, there were six 'designs of covariation using literal measurement' and 12 'designs of covariation using differential or ratio measurement' (Cattell, 1946c, p 94-95).
Considering Persons, Tests, and Occasions as the fundamental dimensions, it was possible to generalize the normal correlation of Tests over Persons design (R analysis) to consider how Persons correlated over Tests (Q analysis), or Tests over Occasions (P analysis), etc. Cattell (1966) extended the data box's original three dimensions to five by adding Background or preceding conditions as well as Observers (see also Cattell (1977)). Applications of the data box concept have been seen throughout psychology, but the primary influence has probably been on those who study personality development and change over the life span (McArdle & Bell, 2000, Mroczek, 2007, Nesselroade, 1984). Unfortunately, even for the original three dimensions, Cattell (1978) used a different notation than he did in Cattell (1966, 1977) or Cattell (1946b)." British Journal of Psychology (2009), 100, 253-257 q 2009 The British Psychological Society [1] R. B. Cattell. The data box: Its ordering of total resources in terms of possible relational systems. In R. B. Cattell, editor, Handbook of multivariate experimental psychology, pages 67-128. Rand-McNally, Chicago, 1966. I suspect this is more than you wanted to know. Bill > >Regards > >L?vio Cipriano > >______________________________________________ >R-help at r-project.org mailing list >https://stat.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >and provide commented, minimal, self-contained, reproducible code. From erinm.hodgess at gmail.com Wed Sep 7 00:07:34 2011 From: erinm.hodgess at gmail.com (Erin Hodgess) Date: Tue, 6 Sep 2011 17:07:34 -0500 Subject: [R] Rweb and setting up R on a server Message-ID: Dear R People: At one time, Rweb existed, which had R on a server. I looked for it, but can't find it. Has anyone used that recently, or is there a new equivalent, please? 
Thanks, Erin -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodgess at gmail.com From dwinsemius at comcast.net Wed Sep 7 00:11:41 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Tue, 6 Sep 2011 18:11:41 -0400 Subject: [R] calibrate.cph plots In-Reply-To: <1315343800.89793.YahooMailNeo@web121501.mail.ne1.yahoo.com> References: <1315343800.89793.YahooMailNeo@web121501.mail.ne1.yahoo.com> Message-ID: <5D0A651E-1BC1-41F7-BC3C-7F1BE1AEA9B7@comcast.net> On Sep 6, 2011, at 5:16 PM, Salvo Mac wrote: > Hi! > > How can I exclude the legends from calibration plots > generated by calibrate.cph > You may be confused. `calibrate.cph` does not do plotting. There is a `plot.calibrate` function and a `plot.calibrate.default` function. Looking at the code you can see that the legend= argument is checked before the legend is created by `plot.calibrate.default`. plot(cal, legend=FALSE) # which works for calibrate on an lrm model. # But ... there are no legends in plot.calibrate done on cph fits. So do you mean the subtitles? If so, then this works: plot(cal, subtitles=FALSE) > regards, > > Salvo > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
David Winsemius, MD West Hartford, CT From skhanvil at qualcomm.com Tue Sep 6 22:47:09 2011 From: skhanvil at qualcomm.com (Khanvilkar, Shashank) Date: Tue, 6 Sep 2011 20:47:09 +0000 Subject: [R] Add elements to a global list In-Reply-To: <1315340668391-3794485.post@n4.nabble.com> References: <1315318203870-3793509.post@n4.nabble.com> <1315319733.43266.YahooMailClassic@web28209.mail.ukl.yahoo.com> <1315340668391-3794485.post@n4.nabble.com> Message-ID: Hello All, Thanks in advance for all help. In my prog, I have a global list object that is used as a container for storing some data frames. Here is an example code. --SNIP-- temp1 <- function(X){ statName = c("Mean", "stdDev", "NumSamples") statVal = c("0.5", "0.51", "5") r = data.frame(statName=statName, statVal = statVal) X[["temp1"]] = r } reportList = list() temp1(reportList) print(reportList) --SNIP-- I was expecting that the new data frame in temp1 would get added to the list (reportList), but it doesn't, probably because, arguments to functions are passed "Call-by-value" and loose all locally made changes. Any ideas how I can do this. Shank From gleynes+r at gmail.com Wed Sep 7 00:25:50 2011 From: gleynes+r at gmail.com (Gene Leynes) Date: Tue, 6 Sep 2011 17:25:50 -0500 Subject: [R] heatmap In-Reply-To: <20110906151413.63355oojxokep7ok@web.mail.umich.edu> References: <1315314079.27828.7.camel@nemo> <1315317047290-3793462.post@n4.nabble.com> <515B021D-7A15-4E80-8D1E-FE258DB75BEA@comcast.net> <20110906132045.10895wn43671i4ws@web.mail.umich.edu> <20110906151413.63355oojxokep7ok@web.mail.umich.edu> Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From rolf.turner at xtra.co.nz Wed Sep 7 01:41:33 2011 From: rolf.turner at xtra.co.nz (Rolf Turner) Date: Wed, 07 Sep 2011 11:41:33 +1200 Subject: [R] Add elements to a global list In-Reply-To: References: <1315318203870-3793509.post@n4.nabble.com> <1315319733.43266.YahooMailClassic@web28209.mail.ukl.yahoo.com> <1315340668391-3794485.post@n4.nabble.com> Message-ID: <4E66AFAD.7060406@xtra.co.nz> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From berryboessenkool at hotmail.com Wed Sep 7 00:29:34 2011 From: berryboessenkool at hotmail.com (Berry Boessenkool) Date: Wed, 7 Sep 2011 00:29:34 +0200 Subject: [R] Histogram messed up Message-ID: Hey all, I encountered a problem drawing a histogram. You can view the picture here: http://dl.dropbox.com/u/4836866/Bad_Histogramm.png What happens: the bars are drawn with different starting points, thus no straight zero-line is there. And bars are overlapping. (or sometimes apart from each other.) How it happens: hist(volcano, breaks=10) # and any other data This also happens with barplot(rnorm(10,10,1), space=0). resizing the graphics window shows the double line in differing places. What I thought may cause it: Just installed "xlsReadWrite", but I don't think that should be a problem, even though it's kind of irregular, with the xls.getshlib(). I restarted R when I noticed this, but even without lybrarying the package, it still happens. I only recently upgraded to R 2.13.1, so I'm not sure it didn't happen before the package. It did not happen with older R versions, that I do know. But that isn't necessarily causal. What I want to know: Any idea what may be causing this? Or better yet, what may be solving this? Am I the only one with the problem? Could it be a bug? Is it my computer? Any help would be highly appreciated! What you may need to know: Im using R on a Windows XP machine. 
here's my > sessionInfo() R version 2.13.1 (2011-07-08) Platform: i386-pc-mingw32/i386 (32-bit) locale: [1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252 LC_MONETARY=German_Germany.1252 [4] LC_NUMERIC=C LC_TIME=German_Germany.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base ------------------------------------- Berry Boessenkool D-14476 Potsdam ------------------------------------- From statfar at gmail.com Wed Sep 7 02:07:18 2011 From: statfar at gmail.com (Farhad Shokoohi) Date: Tue, 6 Sep 2011 19:07:18 -0500 Subject: [R] check availability of a file in R Message-ID: I need to use a loop and each time go to folder i and check availability of .RData file. If it exist load it and if not submit a command in linux. Something like this for (i in 1:10){ setwd(~/i/) if .Rdata (?????) load (.RData else } Any idea how to do that in R? Farhad From jacksie at eden.rutgers.edu Wed Sep 7 02:13:30 2011 From: jacksie at eden.rutgers.edu (Jack Siegrist) Date: Tue, 6 Sep 2011 17:13:30 -0700 (PDT) Subject: [R] sample within groups-slight problem Message-ID: <1315354410551-3794912.post@n4.nabble.com> I want to sample within groups, and when a group has only one associated number to just return that number. If I use this code: groups <- c(1, 2, 2, 2, 3) numbers <- 1:5 tapply(numbers, groups, FUN = sample) I get the following output: > groups <- c(1, 2, 2, 2, 3) > numbers <- 1:5 > tapply(numbers, groups, FUN = sample) $`1` [1] 1 $`2` [1] 3 2 4 $`3` [1] 2 3 5 1 4 Can someone tell me why the $'3' result samples all of the numbers and how to prevent it from doing so? I want the output for the $'3' part to just be 5 in this example. Thanks for your help. > sessionInfo() R version 2.11.1 (2010-05-31) i386-pc-mingw32 -- View this message in context: http://r.789695.n4.nabble.com/sample-within-groups-slight-problem-tp3794912p3794912.html Sent from the R help mailing list archive at Nabble.com.
From dwinsemius at comcast.net Wed Sep 7 02:38:18 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Tue, 6 Sep 2011 20:38:18 -0400 Subject: [R] sample within groups-slight problem In-Reply-To: <1315354410551-3794912.post@n4.nabble.com> References: <1315354410551-3794912.post@n4.nabble.com> Message-ID: On Sep 6, 2011, at 8:13 PM, Jack Siegrist wrote: > I want to sample within groups, and when a group has only one > associated > number to just return that number. And what we supposed to do when it has more than one value????? > > If I use this code: > > groups <- c(1, 2, 2, 2, 3) > numbers <- 1:5 > tapply(numbers, groups, FUN = sample) > > I get the following output: > >> groups <- c(1, 2, 2, 2, 3) >> numbers <- 1:5 >> tapply(numbers, groups, FUN = sample) > $`1` > [1] 1 > > $`2` > [1] 3 2 4 > > $`3` > [1] 2 3 5 1 4 > > Can someone tell me why the $'3' result samples all of the numbers > and how > to prevent it from doing so? I want the output for the $'3' part to > just be > 5 in this example. Type: sample(5) If typing it once, does not lead you to enlightenment, then continue typing it until enlightenment is achieved. -- David "Buddha" Winsemius From murdoch.duncan at gmail.com Wed Sep 7 02:38:52 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Tue, 06 Sep 2011 20:38:52 -0400 Subject: [R] Histogram messed up In-Reply-To: References: Message-ID: <4E66BD1C.4030206@gmail.com> On 11-09-06 6:29 PM, Berry Boessenkool wrote: > > > Hey all, > > I encountered a problem drawing a histogram. > You can view the picture here: > http://dl.dropbox.com/u/4836866/Bad_Histogramm.png > This has been fixed in R-patched: see https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=14628. Duncan Murdoch > > What happens: > the bars are drawn with different starting points, thus no straight zero-line is there. > And bars are overlapping. (or sometimes apart from each other.) 
> > > How it happens: > hist(volcano, breaks=10) # and any other data > This also happens with barplot(rnorm(10,10,1), space=0). > resizing the graphics window shows the double line in differing places. > > > What I thought may cause it: > Just installed "xlsReadWrite", but I don't think that should be a problem, even though it's kind of irregular, with the xls.getshlib(). > I restarted R when I noticed this, but even without lybrarying the package, it still happens. > I only recently upgraded to R 2.13.1, so I'm not sure it didn't happen before the package. > It did not happen with older R versions, that I do know. But that isn't necessarily causal. > > > What I want to know: > Any idea what may be causing this? Or better yet, what may be solving this? > Am I the only one with the problem? Could it be a bug? Is it my computer? > > Any help would be highly appreciated! > > > What you may need to know: > Im using R on a Windows XP machine. > here's my >> sessionInfo() > R version 2.13.1 (2011-07-08) > Platform: i386-pc-mingw32/i386 (32-bit) > > locale: > [1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252 LC_MONETARY=German_Germany.1252 > [4] LC_NUMERIC=C LC_TIME=German_Germany.1252 > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > > ------------------------------------- > Berry Boessenkool > D-14476 Potsdam > ------------------------------------- > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
From murdoch.duncan at gmail.com Wed Sep 7 02:42:52 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Tue, 06 Sep 2011 20:42:52 -0400 Subject: [R] sample within groups-slight problem In-Reply-To: <1315354410551-3794912.post@n4.nabble.com> References: <1315354410551-3794912.post@n4.nabble.com> Message-ID: <4E66BE0C.8030109@gmail.com> On 11-09-06 8:13 PM, Jack Siegrist wrote: > I want to sample within groups, and when a group has only one associated > number to just return that number. > > If I use this code: > > groups<- c(1, 2, 2, 2, 3) > numbers<- 1:5 > tapply(numbers, groups, FUN = sample) > > I get the following output: > >> groups<- c(1, 2, 2, 2, 3) >> numbers<- 1:5 >> tapply(numbers, groups, FUN = sample) > $`1` > [1] 1 > > $`2` > [1] 3 2 4 > > $`3` > [1] 2 3 5 1 4 > > Can someone tell me why the $'3' result samples all of the numbers and how > to prevent it from doing so? I want the output for the $'3' part to just be > 5 in this example. Because sample(5) gives a permutation of the numbers 1:5. See the last example in ?sample for a way to avoid this "feature". Duncan Murdoch > > Thanks for your help. > > >> sessionInfo() > R version 2.11.1 (2010-05-31) > i386-pc-mingw32 > > -- > View this message in context: http://r.789695.n4.nabble.com/sample-within-groups-slight-problem-tp3794912p3794912.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
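For readers of the archive, a minimal sketch of the workaround Duncan points to. The wrapper name resample is adapted from the examples in ?sample; it indexes into x rather than handing x itself to sample(), so a group of length one comes back unchanged instead of being expanded to 1:x. The variable res is only illustrative.

```r
# Treat the argument as the set to sample from, even when it has length 1
# (adapted from the examples in ?sample).
resample <- function(x, ...) x[sample.int(length(x), ...)]

groups  <- c(1, 2, 2, 2, 3)
numbers <- 1:5

# Permute the numbers within each group.
res <- tapply(numbers, groups, FUN = resample)

res[["2"]]  # some ordering of 2, 3, 4
res[["3"]]  # the single value in group 3, not a permutation of 1:5
```

The same effect can be had with an explicit length check inside the function passed to tapply(); the wrapper just keeps the call site short.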
From duncan at wald.ucdavis.edu Wed Sep 7 03:01:30 2011 From: duncan at wald.ucdavis.edu (Duncan Temple Lang) Date: Tue, 06 Sep 2011 18:01:30 -0700 Subject: [R] htmlParse hangs or crashes In-Reply-To: References: Message-ID: <4E66C26A.6020106@wald.ucdavis.edu> Hi Simon Unfortunately, it works for me on my OS X machine. So I can't reproduce the problem. I'd be curious to know which version of libxml2 you are using. That might be the cause of the problem. You can find this with library(XML) libxmlVersion() You might install a more recent version (e.g. libxml >= 2.07.0) You can send the info to me off list and we can try to resolve the problem. htmlParse() returns a reference to the internal C-level XML tree/document. When you print the value of the variable .x, we then serialize that C-level data structure to a string. htmlTreeParse(), by default, converts that C-level XML tree/document into regular R objects. So it traverses the tree and creates those R list()s before it returns and then throws the C-level tree away. D. On 9/5/11 2:48 PM, Simon Kiss wrote: > Dear colleagues, > each time I use htmlParse, R crashes or hangs. The url I'd like to parse is included below as is the results of a series of basic commands that describe what I'm experiencing. The results of sessionInfo() are attached at the bottom of the message. > The thing is, htmlTreeParse appears to work just fine, although it doesn't appear to contain the information I need (the URLs of the articles linked to on this search page). Regardless, I'd still like to understand why htmlParse doesn't work. > Thank you for any insight. 
> Yours, > Simon Kiss > > > myurl<-c("http://timesofindia.indiatimes.com/searchresult.cms?sortorder=score&searchtype=2&maxrow=10&startdate=2001-01-01&enddate=2011-08-25&article=2&pagenumber=1&isphrase=no&query=IIM&searchfield=§ion=&kdaterange=30&date1mm=01&date1dd=01&date1yyyy=2001&date2mm=08&date2dd=25&date2yyyy=2011") > > .x<-htmlParse(myurl) > > class(.x) > #returns "HTMLInternalDocument" "XMLInternalDocument" > > .x > #returns > *** caught segfault *** > address 0x1398754, cause 'memory not mapped' > > Traceback: > 1: .Call("RS_XML_dumpHTMLDoc", doc, as.integer(indent), as.character(encoding), as.logical(indent), PACKAGE = "XML") > 2: saveXML(from) > 3: saveXML(from) > 4: asMethod(object) > 5: as(x, "character") > 6: cat(as(x, "character"), "\n") > 7: print.XMLInternalDocument() > 8: print() > > Possible actions: > 1: abort (with core dump, if enabled) > 2: normal R exit > 3: exit R without saving workspace > 4: exit R saving workspace > > sessionInfo() > R version 2.13.0 (2011-04-13) > Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit) > > locale: > [1] en_CA.UTF-8/en_CA.UTF-8/C/C/en_CA.UTF-8/en_CA.UTF-8 > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] XML_3.4-0 RCurl_1.5-0 bitops_1.0-4.1 > ********************************* > Simon J. Kiss, PhD > Assistant Professor, Wilfrid Laurier University > 73 George Street > Brantford, Ontario, Canada > N3T 2C9 > Cell: +1 905 746 7606 > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
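To illustrate the distinction described in the reply above, here is a small sketch that works on an inline HTML string rather than the URL from the original post, so it does not reproduce the reported crash. The HTML snippet and its link targets are made up for illustration; htmlParse(), htmlTreeParse(), xpathSApply(), xmlGetAttr() and libxmlVersion() come from the XML package, which is assumed to be installed.

```r
library(XML)

# Hypothetical page content; not taken from the original search results.
txt <- '<html><body>
  <a href="/articles/one.cms">One</a>
  <a href="/articles/two.cms">Two</a>
</body></html>'

# Internal (C-level) document: the form that XPath queries operate on.
doc <- htmlParse(txt, asText = TRUE)
class(doc)

# Pull the href attribute of every <a> node.
links <- unname(xpathSApply(doc, "//a", xmlGetAttr, "href"))
links

# R-level tree: the C-level document is converted to nested R objects.
tree <- htmlTreeParse(txt, asText = TRUE)
class(tree)

# Report the libxml2 version in use, as requested in the reply.
libxmlVersion()
```

If the internal document can be built but printing it crashes, querying it with XPath as above may still work, though that is an assumption and not something confirmed in this thread.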
From jdnewmil at dcn.davis.ca.us Wed Sep 7 03:36:12 2011 From: jdnewmil at dcn.davis.ca.us (Jeff Newmiller) Date: Tue, 06 Sep 2011 18:36:12 -0700 Subject: [R] Possible to access a USB volume by name in windows In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dwinsemius at comcast.net Wed Sep 7 03:37:44 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Tue, 6 Sep 2011 21:37:44 -0400 Subject: [R] check availability of a file in R In-Reply-To: References: Message-ID: On Sep 6, 2011, at 8:07 PM, Farhad Shokoohi wrote: > I need to use a loop and each time go to folder i and check > availability of .RData file. If it exist load it and if not submit a > command in linux. Something like this > > for (i in 1:10){ > setwd(~/i/) # Perhaps: if (target %in% list.files(pattern = "\\.RData$")) { > .Rdata (?????) > #load (.RData load( target) > else # perhaps you could explain your goals for this next operation? > > } > > Any idea how to do that in R? > > Farhad -- David Winsemius, MD Heritage Laboratories West Hartford, CT From dwinsemius at comcast.net Wed Sep 7 03:47:08 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Tue, 6 Sep 2011 21:47:08 -0400 Subject: [R] Update packages problem In-Reply-To: References: Message-ID: On Sep 6, 2011, at 5:10 PM, B Jessop wrote: > > > > R-help, I recently updated R from 2.13.0 to 2.13.1 (32-bit Windows > version) and now have a problem updating packages. I am running > Windows 7 as operating system. As the FAQ suggests, I uninstalled > 2.13.0 before installing 2.13.1. On first opening R, I ran > "update.packages(checkBuilt=TRUE, ask=FALSE)" as also recommended Perhaps not necessary for an update from 2.13.0 to 2.13.1 > and selected the NS Cran mirror.
On doing this the following error > messages occurs: update.packages(checkBuilt=TRUE, ask=FALSE) > --- Please select a CRAN mirror for use in this session --- > Warning in install.packages(update[instlib == l, "Package"], l, > contriburl = contriburl, : > 'lib = "C:/Program Files/R/R-2.13.1/library"' is not writable Windows is protecting you. You need to convince it you do not need protection. Search for ways to increase your Win-fu. > Error in install.packages(update[instlib == l, "Package"], l, > contriburl = contriburl, : > unable to install packages Any suggestions as to how to correct > this problem would be appreciated. Thanks. Regards,Deelman > -- David Winsemius, MD Heritage Laboratories West Hartford, CT From michael.weylandt at gmail.com Wed Sep 7 05:14:04 2011 From: michael.weylandt at gmail.com (R. Michael Weylandt) Date: Tue, 6 Sep 2011 22:14:04 -0500 Subject: [R] Alternatives to integrate? In-Reply-To: References: <1314885193156-3783645.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dwinsemius at comcast.net Wed Sep 7 05:25:13 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Tue, 6 Sep 2011 23:25:13 -0400 Subject: [R] Question about Natural Splines (ns function) In-Reply-To: References: Message-ID: <0ECF35CF-631C-4D5A-BEE7-8801514D1E1D@comcast.net> On Sep 6, 2011, at 5:08 PM, Axel Urbiz wrote: > Hi - How can I 'manually' reproduce the results in 'pred1' below? Well, not by constructing the prediction function from a different data basis that the glm function gets. > My attempt > is pred_manual, but is not correct. Any help is much appreciated. 
> > library(splines) > set.seed(12345) > y <- rgamma(1000, shape =0.5) > age <- rnorm(1000, 45, 10) > glm1 <- glm(y ~ ns(age, 4), family=Gamma(link=log)) > dd <- data.frame(age = 16:80) > mm <- model.matrix( ~ ns(dd$age, 4)) > pred1 <- predict(glm1, dd, type = "response") > pred_manual <- exp(coefficients(glm1)[1] * mm[,1] + > coefficients(glm1)[2] * mm[,2] + > coefficients(glm1)[3] * mm[,3] + > coefficients(glm1)[4] * mm[,4] + > coefficients(glm1)[5] * mm[,5]) > attr(glm1$terms, "predvars") list(y, ns(age, knots = c(38.7407342480734, 44.9093960482465, 51.5913373894399), Boundary.knots = c(14.7723335249845, 76.5536098692015), intercept = FALSE)) > pred_man <- exp(coefficients(glm1) %*% t(mm)) > str(pred_man) num [1, 1:65] 0.327 0.335 0.343 0.351 0.36 ... - attr(*, "dimnames")=List of 2 ..$ : NULL ..$ : chr [1:65] "1" "2" "3" "4" ... > str(pred1) Named num [1:65] 0.336 0.343 0.351 0.359 0.366 ... - attr(*, "names")= chr [1:65] "1" "2" "3" "4" ... > plot(pred_man, pred1) So a 4 knot natural spline derived from a random rnorm against a random gamma series is somewhat similar to one derived from from a series of integers over the same range. > dd2 <- data.frame(age = age) > mm2 <- model.matrix( ~ ns(dd2$age, 4)) > pred_man2 <- exp(coefficients(glm1) %*% t(mm2)) > plot(age,pred_man2) > points(dd$age, pred_man) Different bases, different predictions. 
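A sketch of one way to make the manual calculation agree with predict(), using the same simulated data as the quoted code. Instead of calling ns() afresh on the new ages (which chooses new knots), it rebuilds the design matrix from the fitted model's terms, which carry the original knots in the "predvars" attribute shown above. The names Terms, mf, X and pred_manual2 are illustrative and not from the original post.

```r
library(splines)

# Same setup as in the quoted question.
set.seed(12345)
y   <- rgamma(1000, shape = 0.5)
age <- rnorm(1000, 45, 10)
glm1  <- glm(y ~ ns(age, 4), family = Gamma(link = log))
dd    <- data.frame(age = 16:80)
pred1 <- predict(glm1, dd, type = "response")

# Rebuild the design matrix with the knots stored in the fitted model.
Terms <- delete.response(terms(glm1))
mf    <- model.frame(Terms, dd)
X     <- model.matrix(Terms, mf)

# Linear predictor, then the inverse of the log link.
pred_manual2 <- exp(drop(X %*% coef(glm1)))

# Compare with predict(); names are dropped so only the values are compared.
all.equal(unname(pred_manual2), unname(pred1))
```

The design choice here is to reuse the model's own terms object rather than copying the knot values by hand, so the basis stays in step with whatever the fit actually used.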
-- David Winsemius, MD Heritage Laboratories West Hartford, CT From jwiley.psych at gmail.com Wed Sep 7 05:41:58 2011 From: jwiley.psych at gmail.com (Joshua Wiley) Date: Tue, 6 Sep 2011 20:41:58 -0700 Subject: [R] Possible to access a USB volume by name in windows In-Reply-To: References: Message-ID: Hi, This comes with absolutely no guarantees (and a good recommendation to be cautious), but you could try it:

myset <- function(name = "", path = "") {
  ## one slot per drive letter, named so that res[i] fills it in place
  res <- setNames(character(length(LETTERS)), LETTERS)
  for (i in LETTERS) {
    res[i] <- shell(shQuote(paste("VOL ", i, ":", sep = '')),
                    intern = TRUE, ignore.stderr = TRUE)[1L]
  }
  ## keep the VOL output lines that mention the requested label
  tmp <- gsub("[[:space:]]", "", grep(name, res, ignore.case = TRUE, value = TRUE))
  ## grep() returns a zero-length vector when nothing matches
  if (length(tmp) == 0L || !nzchar(tmp[1L]))
    stop("No volume with ", name, " label name could be found")
  vol <- strsplit(tmp, "drive|is")[[1L]][2L]
  fullpath <- paste(vol, ":/", path, sep = '')
  cat("Setting WD to ", fullpath, fill = TRUE)
  setwd(fullpath)
}

#### Example usage ####
## set WD to root of volume with label "FLASH_NAME"
myset("FLASH_NAME")
## set WD to some subdirectory
myset("FLASH_NAME", "path/to/something")

At least on my system, this takes awhile to run. It iterates through all the volumes [A-Z], and there will probably be quite a few warnings unless all volumes are mounted, but the warnings (at least about not finding drives/bad exit status) should be ignorable. Cheers, Josh On Tue, Sep 6, 2011 at 2:56 PM, Gene Leynes wrote: > On the Mac it's pretty easy to get to a USB drive by name. For example the > following command works if you have a USB drive named "MYUSB" > setwd('/Volumes/MYUSB') > > Is there a way to do the same thing in Windows (without knowing the drive > letter)? > > > Thanks! > >
[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Joshua Wiley Ph.D. Student, Health Psychology Programmer Analyst II, ATS Statistical Consulting Group University of California, Los Angeles https://joshuawiley.com/ From jcbouette at gmail.com Wed Sep 7 02:31:56 2011 From: jcbouette at gmail.com (Jean-Christophe BOUËTTÉ) Date: Tue, 6 Sep 2011 20:31:56 -0400 Subject: [R] sample within groups-slight problem In-Reply-To: <1315354410551-3794912.post@n4.nabble.com> References: <1315354410551-3794912.post@n4.nabble.com> Message-ID: Hi there, in the third case you get sample(5) which is exactly what you asked for. from ?sample: "If x has length 1, is numeric (in the sense of is.numeric) and x >= 1, sampling via sample takes place from 1:x. Note that this convenience feature may lead to undesired behaviour when x is of varying length in calls such as sample(x). See the examples. " 2011/9/6 Jack Siegrist : > I want to sample within groups, and when a group has only one associated > number to just return that number. > > If I use this code: > > groups <- c(1, 2, 2, 2, 3) > numbers <- 1:5 > tapply(numbers, groups, FUN = sample) > > I get the following output: > >> groups <- c(1, 2, 2, 2, 3) >> numbers <- 1:5 >> tapply(numbers, groups, FUN = sample) > $`1` > [1] 1 > > $`2` > [1] 3 2 4 > > $`3` > [1] 2 3 5 1 4 > > Can someone tell me why the $'3' result samples all of the numbers and how > to prevent it from doing so? I want the output for the $'3' part to just be > 5 in this example. > > Thanks for your help.
> > >> sessionInfo() > R version 2.11.1 (2010-05-31) > i386-pc-mingw32 > > -- > View this message in context: http://r.789695.n4.nabble.com/sample-within-groups-slight-problem-tp3794912p3794912.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From jcbouette at gmail.com Wed Sep 7 02:40:16 2011 From: jcbouette at gmail.com (Jean-Christophe BOUËTTÉ) Date: Tue, 6 Sep 2011 20:40:16 -0400 Subject: [R] sample within groups-slight problem In-Reply-To: References: <1315354410551-3794912.post@n4.nabble.com> Message-ID: 2011/9/6 Jean-Christophe BOUËTTÉ : you could tapply function(x) if(length(x)==1) x else sample(x) or something like this > > 2011/9/6 Jean-Christophe BOUËTTÉ : >> Hi there, >> in the third case you get sample(5) which is exactly what you asked for. >> >> from ?sample: >> "If x has length 1, is numeric (in the sense of is.numeric) and x >= >> 1, sampling via sample takes place from 1:x. Note that this >> convenience feature may lead to undesired behaviour when x is of >> varying length in calls such as sample(x). See the examples. " >> >> 2011/9/6 Jack Siegrist : >>> I want to sample within groups, and when a group has only one associated >>> number to just return that number. >>> >>> If I use this code: >>> >>> groups <- c(1, 2, 2, 2, 3) >>> numbers <- 1:5 >>> tapply(numbers, groups, FUN = sample) >>> >>> I get the following output: >>> >>>> groups <- c(1, 2, 2, 2, 3) >>>> numbers <- 1:5 >>>> tapply(numbers, groups, FUN = sample) >>> $`1` >>> [1] 1 >>> >>> $`2` >>> [1] 3 2 4 >>> >>> $`3` >>> [1] 2 3 5 1 4 >>> >>> Can someone tell me why the $'3' result samples all of the numbers and how >>> to prevent it from doing so?
I want the output for the $'3' part to just be >>> 5 in this example. >>> >>> Thanks for your help. >>> >>> >>>> sessionInfo() >>> R version 2.11.1 (2010-05-31) >>> i386-pc-mingw32 >>> >>> -- >>> View this message in context: http://r.789695.n4.nabble.com/sample-within-groups-slight-problem-tp3794912p3794912.html >>> Sent from the R help mailing list archive at Nabble.com. >>> >>> ______________________________________________ >>> R-help at r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >>> >> > From Suphajak at phatrasecurities.com Wed Sep 7 06:28:34 2011 From: Suphajak at phatrasecurities.com (Suphajak Ngamlak) Date: Wed, 7 Sep 2011 04:28:34 +0000 Subject: [R] Weight in Function RM Message-ID: <429C286A082D0D4A8D2EAE084350DB25221A4B3D@ptsecmsmbx02.phatrasec.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From daniel at umd.edu Wed Sep 7 07:12:01 2011 From: daniel at umd.edu (Daniel Malter) Date: Tue, 6 Sep 2011 22:12:01 -0700 (PDT) Subject: [R] bootstrap function In-Reply-To: References: Message-ID: <1315372321382-3795264.post@n4.nabble.com> install.packages("bootstrap") library(bootstrap) ? -- View this message in context: http://r.789695.n4.nabble.com/bootstrap-function-tp3794224p3795264.html Sent from the R help mailing list archive at Nabble.com. From petr.pikal at precheza.cz Wed Sep 7 09:00:19 2011 From: petr.pikal at precheza.cz (Petr PIKAL) Date: Wed, 7 Sep 2011 09:00:19 +0200 Subject: [R] subsetting tables In-Reply-To: <1315341723660-3794527.post@n4.nabble.com> References: <1315318203870-3793509.post@n4.nabble.com> <4E6632C7.5040200@uke.de> <1315341723660-3794527.post@n4.nabble.com> Message-ID: Hi > > Hi Eik, > > greetings to Hamburg! 
:-) Thanks for the fast and helpful answer > > > Eik Vettorazzi-2 wrote: > > > > #compare > > str(red[,2]) > > str(red[2,]) > > > > I understand that the first is a real vector of nums in R and the second is > a ?? matrix/list/data.frame ?? of single ? entries? Can I > transpose/transform it into one vector? Tried 'as.vector' but did not help. See ?"[" and its section about data.frame method, drop parameter drop logical. If TRUE the result is coerced to the lowest possible dimension. The default is to drop if only one column is left, but not to drop if only one row is left. iris[1,] Sepal.Length Sepal.Width Petal.Length Petal.Width Species 1 5.1 3.5 1.4 0.2 setosa as.vector(unlist(iris[1,])) [1] 5.1 3.5 1.4 0.2 1.0 But if your data are not all numeric they are coerced to numeric - see last column values > > > Eik Vettorazzi-2 wrote: > > > > sum(red>.5) > > length(which(red>.5)) > > > > Sorry for being unprecise. Yes, in this case it was mainly the sum (thanks! > helpful function!), but in general I'd like to understand what happened with > subset here... > > > Eik Vettorazzi-2 wrote: > > > > > > and the arr.ind option of which may be useful as well. > > > > Thanks a lot, very helpful. For other newbies, here is the line: > > tableReduced[,-1][which(tableReduced[,-1]>0.5, arr.ind=TRUE)] > > I needed to exclude the first column (-1) since these were titles (factors) > of my rows. In the first trial I forgot to add this information to the first > notion of the table as well, i.e., I tried: > > tableReduced[which(tableReduced[,-1]>0.5, arr.ind=TRUE)] > > This will (of course, I have to admit) result in subsetting fields that are > in one column to the left of the intended column. So, if there are any > subsetting indices in the which-function, they also need to be put in front > of it to make the indices match. > > Just for my understanding, do you know what R did with here? 
Where do the NA > values come from, what is the row-title NA.1, why does it print the first > two rows unchanged and then goes crazy? > > > subset(red[,], red[,] > 0.5) > > Allstar hsa.let.7a hsa.let.7a.1 hsa.let.7a.2 > > 2 0.87 0.79 -0.57 1.07 > > 3 0.67 -1.14 -0.78 -0.95 > > NA NA NA NA NA > > NA.1 NA NA NA NA > > NA.2 NA NA NA NA > it is rather unusual use of "[". I did not follow whole thread but with subsetting you need to consider what you want to get from it. > str(iris>6) logi [1:150, 1:5] FALSE FALSE FALSE FALSE FALSE FALSE ... Using comparison operator on data frame results in logical matrix which is basically logical vector with dimensions. > which(iris>6) [1] 51 52 53 55 57 59 64 66 69 72 73 74 75 76 77 78 87 88 92 [20] 98 101 103 104 105 106 108 109 110 111 112 113 116 117 118 119 121 123 124 [39] 125 126 127 128 129 130 131 132 133 134 135 136 137 138 140 141 142 144 145 [58] 146 147 148 149 406 408 410 418 419 423 431 432 436 Iris has only 150 rows and you get correct indexing value from first column but not from the others. As you can see from > tail(iris[which(iris>6),], 10) Sepal.Length Sepal.Width Petal.Length Petal.Width Species 149 6.2 3.4 5.4 2.3 virginica NA NA NA NA NA NA.1 NA NA NA NA NA.2 NA NA NA NA NA.3 NA NA NA NA NA.4 NA NA NA NA NA.5 NA NA NA NA NA.6 NA NA NA NA NA.7 NA NA NA NA NA.8 NA NA NA NA you get NA values for those indices which are over 150 (no of iris rows). If you want let say all items bigger than some threshold from data frame you need some small hack iris1 <- iris[,-5] iris[ rowSums(iris1 > 6) > 0, ] or iris[ rowSums(iris > 6, na.rm=T) > 0, ] Regards Petr > Thanks for this community with fast and reliable help. Amazing to see! > > > -- > View this message in context: http://r.789695.n4.nabble.com/subsetting- > tables-tp3793509p3794527.html > Sent from the R help mailing list archive at Nabble.com. 
> > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From bhh at xs4all.nl Wed Sep 7 09:47:53 2011 From: bhh at xs4all.nl (Berend Hasselman) Date: Wed, 7 Sep 2011 00:47:53 -0700 (PDT) Subject: [R] Alternatives to integrate? In-Reply-To: References: <1314885193156-3783645.post@n4.nabble.com> Message-ID: <1315381673809-3795491.post@n4.nabble.com> . wrote: > > Hi, continuing the improvements... > > I've prepared a new code: > > ddae <- function(individuals, frac, sad, samp="pois", trunc=0, ...) { > dots <- list(...) > Compound <- function(individuals, frac, n.species, sad, samp, dots) { > print(c("Size:", length(individuals), "Compound individuals:", > individuals, "End.")) > RegDist <- function(n.species, sad, dots) { # "RegDist" may be > Exponential, Gamma, etc. > dcom <- paste("d", as.name(sad), sep="") > dots <- as.list(c(n.species, dots)) > ans <- do.call(dcom, dots) > return(ans) > } > SampDist <- function(individuals, frac, n.species, samp) { # > "SampDist" may be Poisson or Negative Binomial > dcom <- paste("d", samp, sep="") > lambda <- frac * n.species > dots <- as.list(c(individuals, lambda)) > ans <- do.call(dcom, dots) > return(ans) > } > ans <- RegDist(n.species, sad, dots) * SampDist(individuals, frac, > n.species, samp) > return(ans) > } > IntegrateScheme <- function(Compound, individuals, frac, sad, samp, > dots) { > print(c("Size:", length(individuals), "Integrate individuals:", > individuals)) > ans <- integrate(Compound, 0, 2000, individuals, frac, sad, samp, > dots)$value > return(ans) > } > ans <- IntegrateScheme(Compound, individuals, frac, sad, samp, dots) > return(ans) > } > > ddae(2, 0.05, "exp") > > Now I can't understand what happen to "individuals", why is it > changing in value and size? 
I've tried to "traceback()" and "debug()", > but I was not smart enough to understand what is going on. > > Could you, please, give some more help? > >From the help for integrate argument f : an R function taking a numeric first argument and returning a numeric vector of the same length. ..... Function Compound is passed to integrate. First argument is "individuals" and integrate is integrating over individuals. That is why it is changing in value and size: integrate is only doing what you asked it do. The code is too uncommented and convoluted to supply further comments. You really should simplify this Berend -- View this message in context: http://r.789695.n4.nabble.com/Alternatives-to-integrate-tp3783624p3795491.html Sent from the R help mailing list archive at Nabble.com. From arrayprofile at yahoo.com Wed Sep 7 10:11:12 2011 From: arrayprofile at yahoo.com (array chip) Date: Wed, 7 Sep 2011 01:11:12 -0700 (PDT) Subject: [R] suggestion for proportions Message-ID: <1315383072.86607.YahooMailNeo@web125805.mail.ne1.yahoo.com> Hi, I am wondering if anyone can suggest how to test the equality of 2 proportions. The caveat here is that the 2 proportions were calculated from the same number of samples using 2 different tests. So essentially we are comparing 2 accuracy rates from same, say 100, samples. I think this is like a paired test, but don't know if really we need to consider the "paired" nature of the data, and if yes then how? Or just use prop.test() to compare 2 proportions? Any suggestion would be greatly appreciated. Thanks John From Samuel.Le at srlglobal.com Wed Sep 7 10:21:58 2011 From: Samuel.Le at srlglobal.com (Samuel Le) Date: Wed, 7 Sep 2011 08:21:58 +0000 Subject: [R] check availability of a file in R In-Reply-To: References: Message-ID: Does file.exists answer to your question? 
file.exists(".RData")

If you are not sure of the exact name of the file but know it contains ".RData", you can try:

list.files(directory, pattern = "\\.RData")

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Farhad Shokoohi
Sent: 07 September 2011 01:07
To: R-help at r-project.org
Subject: [R] check availability of a file in R

I need to use a loop and each time go to folder i and check availability of .RData file. If it exist load it and if not submit a command in linux. Something like this

for (i in 1:10){
setwd(~/i/)
if .Rdata (?????)
load (.RData
else
}

Any idea how to do that in R?
Farhad

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__________ Information from ESET NOD32 Antivirus, version of virus signature database 6275 (20110707) __________

The message was checked by ESET NOD32 Antivirus.

http://www.eset.com

From dbjrmn at gmail.com Wed Sep 7 09:22:19 2011
From: dbjrmn at gmail.com (D Brown)
Date: Wed, 7 Sep 2011 02:22:19 -0500
Subject: [R] Error: in routine alloca() there is a stack overflow: thread 0, max 535822282KB, used 0KB, request 24B
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available URL: From igors.lahanciks at gmail.com Wed Sep 7 10:39:00 2011 From: igors.lahanciks at gmail.com (Igors) Date: Wed, 7 Sep 2011 01:39:00 -0700 (PDT) Subject: [R] function censReg in panel data setting In-Reply-To: References: <1315259907768-3792227.post@n4.nabble.com> <1315288271999-3792639.post@n4.nabble.com> Message-ID: <1315384740517-3795575.post@n4.nabble.com> Dear Arne, Thank you for fixing the package. However I am still struggling to obtain model estmates. The same code: > UpC <- censReg(Power ~ Windspeed, left = -Inf, right = > 2000,data=PData_In,method="BHHH",nGHQ = 4,start=c(-691.18,186.79,3.9,3.9)) Error in maxNRCompute(fn = logLikAttr, fnOrig = fn, gradOrig = grad, hessOrig = hess, : NA in the initial gradient I have tried to change starting values and regressors in the model several times, but I always get the same mentioned error message. How can I make it work? Is this function maxNRCompute() on the last step of calculation (maximization of the ML)? I had an idea that the error could appear since I have huge sample, but I tried to cut it and it still doesn't work. Best wishes, Igors -- View this message in context: http://r.789695.n4.nabble.com/function-censReg-in-panel-data-setting-tp3792227p3795575.html Sent from the R help mailing list archive at Nabble.com. From M.Rosario.Garcia at slu.se Wed Sep 7 10:33:27 2011 From: M.Rosario.Garcia at slu.se (Rosario Garcia Gil) Date: Wed, 7 Sep 2011 10:33:27 +0200 Subject: [R] greyPalette() in legend plot Message-ID: Hello I have a barplot where I get a grey gradient colors for each bar, I want to append a legend with the names of each bar and next to each name I need a box with the corresponding grey color. 
I managed to get the legend without boxes, but never got the wanted grey gradient in the legend: > y <- matrix(c(46.5102,42.7561,45.7857,45.6047,45.7027,81.4565,69.8824,69.1333,67.56,57.8929,43.3019,42.9184,51.7143,40.2727,51.7692,22.3871,24.2222,24.3226,23.6342,21.5833), ncol=4) se <- matrix(c(13.0707,12.0287,15.3949,16.5344,10.2520,28.6169,31.2398,24.5816,25.9745,22.9981,13.8433,17.3071,21.7355,18.2007,26.2546,6.8199,7.2977,7.1245,7.2345,8.2616), ncol=4) > plot<-barplot(y, beside=TRUE, ylim=c(0,90), axis.lty=0.5,xlab="Light treatment", ylab="root length (mm)", main="Root", names.arg=c("White","Blue","Red","Far-red")) > segments(plot,y-se,plot,y+se) > legend("topleft", c("Sedingehult", "Mobranna", "Aunasvare", "Fallan", "Ylinen"), col=(greyPalette(n=5))) Kind regards Rosario From wardamo.zero at gmail.com Wed Sep 7 10:10:37 2011 From: wardamo.zero at gmail.com (Damian Abalo) Date: Wed, 7 Sep 2011 10:10:37 +0200 Subject: [R] Werid things when write.table Message-ID: Hello, I am having some issues with the order write.table. The fact is that I need to use "," as the decimal character and not "." as default, and when I use: write.table(Sales,file="Sales.xls",quote=FALSE, sep = "\t", dec = ",", row.names = FALSE, col.names = TRUE) It does it perfectly, but then, in a different part of the code: write.table(Error,file="Error report.xls",quote=FALSE, sep = "\t", dec = ",", row.names = FALSE, col.names = TRUE) It does not do it, it keeps the . as the decimal character. The main difference is that the second table is treated as a character matrix, because it has characters on the first line, which work as titles. This table is builded by cbinding several quantile orders, and I don't know how to make the program treat the table as numbers while mantaining the titles (Since it is not a table really). Any ideas? 
Thanks for your help From miss_unpredictable001 at yahoo.com.my Wed Sep 7 09:02:29 2011 From: miss_unpredictable001 at yahoo.com.my (goong001) Date: Wed, 7 Sep 2011 00:02:29 -0700 (PDT) Subject: [R] Signify level of significance Message-ID: <1315378949909-3795424.post@n4.nabble.com> Im doing test on my sample using 1-sample Kolmogorov-smirnov test(poisson distribution) to specify my sample obtain avalanche effect. before i can compare my D value with the critical value(Dn, alpha), i have to specify which level of significance to use (eg. alpha = 0.01, 0.05, etc) and i don't understand, why is it related to the probability that the hypothesis will be rejected when it is in fact true. please anyone help me.... -- View this message in context: http://r.789695.n4.nabble.com/Signify-level-of-significance-tp3795424p3795424.html Sent from the R help mailing list archive at Nabble.com. From k.brand at erasmusmc.nl Wed Sep 7 10:54:09 2011 From: k.brand at erasmusmc.nl (Karl Brand) Date: Wed, 07 Sep 2011 10:54:09 +0200 Subject: [R] How does one start R within Emacs/ESS with root privileges? Message-ID: <4E673131.9000201@erasmusmc.nl> Esteemed UseRs and DevelopeRs, Apologies if this question belongs else where, but it does concern R's package installation/maintenance. How does one start R within Emacs/ESS with root privileges? I tried without success: > M-x sudo R Why i'm motivated to do so: It seems logical to me, as the only user of the PC, to keep my R library consolidated in the universal library rather than splitting into universal and user libraries. Hence the desire to run R as root. In addition, it's nice to be able to install packages 'on the fly' when and as needed and not need to launch a separate R session (as root) in the terminal just to install a package. Migrating from windows, i'm completey new to linux (ubuntu) and am seeing for myself if Emacs/ESS is as good as its purported to be. So maybe my motivation is nonsensical to expereinced ESS/R users. 
If so i'd really appreciate tips on efficient package installation/maintenance using Emacs/ESS. TIA, karl -- Karl Brand Department of Genetics Erasmus MC Dr Molewaterplein 50 3015 GE Rotterdam P +31 (0)10 704 3455 | F +31 (0)10 704 4743 | M +31 (0)642 777 268 From paul.hiemstra at knmi.nl Wed Sep 7 11:02:23 2011 From: paul.hiemstra at knmi.nl (Paul Hiemstra) Date: Wed, 07 Sep 2011 09:02:23 +0000 Subject: [R] How does one start R within Emacs/ESS with root privileges? In-Reply-To: <4E673131.9000201@erasmusmc.nl> References: <4E673131.9000201@erasmusmc.nl> Message-ID: <4E67331F.80005@knmi.nl> On 09/07/2011 08:54 AM, Karl Brand wrote: > Esteemed UseRs and DevelopeRs, > > Apologies if this question belongs else where, but it does concern R's > package installation/maintenance. > > How does one start R within Emacs/ESS with root privileges? > > I tried without success: > > > M-x sudo R > > Why i'm motivated to do so: > > It seems logical to me, as the only user of the PC, to keep my R > library consolidated in the universal library rather than splitting > into universal and user libraries. Hence the desire to run R as root. Hi Karl, Why the need to install packages in root? As you are the only user there is not reason to install them system wide (to make them available to all users, which is just you). Installing the packages in your homedir solves your problem much easier, without the need to run R as root continuously. I think you should not run anything as root if it is not absolutely needed, which could potentially damage your system (accidentally overwriting something). hope this helps, Paul > > In addition, it's nice to be able to install packages 'on the fly' > when and as needed and not need to launch a separate R session (as > root) in the terminal just to install a package. > > Migrating from windows, i'm completey new to linux (ubuntu) and am > seeing for myself if Emacs/ESS is as good as its purported to be. 
So
> maybe my motivation is nonsensical to expereinced ESS/R users. If so
> i'd really appreciate tips on efficient package
> installation/maintenance using Emacs/ESS.
>
> TIA,
>
> karl
>

--
Paul Hiemstra, Ph.D.
Global Climate Division
Royal Netherlands Meteorological Institute (KNMI)
Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39
P.O. Box 201 | 3730 AE | De Bilt
tel: +31 30 2206 494
http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770

From tlumley at uw.edu Wed Sep 7 11:05:40 2011
From: tlumley at uw.edu (Thomas Lumley)
Date: Wed, 7 Sep 2011 21:05:40 +1200
Subject: [R] Weight in Function RM
In-Reply-To: <429C286A082D0D4A8D2EAE084350DB25221A4B3D@ptsecmsmbx02.phatrasec.com>
References: <429C286A082D0D4A8D2EAE084350DB25221A4B3D@ptsecmsmbx02.phatrasec.com>
Message-ID:

On Wed, Sep 7, 2011 at 4:28 PM, Suphajak Ngamlak wrote:
> Dear all,
>
> I am trying to do weighted regression using lm function in R. However, I have a question why the results from
>
>
> 1) lm(formula = Y~aX, weight = w)
>
> 2) lm(formula = wY~waX)
>
> are different. Aren't they supposed to have the exactly same result?

No. lm(formula = wY~waX) is the same as lm(formula = Y~aX, weight = w^2)

--
Thomas Lumley
Professor of Biostatistics
University of Auckland

From descostes at ciml.univ-mrs.fr Wed Sep 7 11:20:01 2011
From: descostes at ciml.univ-mrs.fr (Nico902)
Date: Wed, 7 Sep 2011 02:20:01 -0700 (PDT)
Subject: [R] mclust: modelName="E" vs modelName="V"
In-Reply-To:
References: <1315138638856-3789167.post@n4.nabble.com> <1315322215007-3793697.post@n4.nabble.com>
Message-ID: <1315387201747-3795661.post@n4.nabble.com>

"What's wrong with that? (The values you submit as scale in "prior" are not fixed variances, but parameters of the prior distribtion - your problem may be that you believe that they are meant to be variances fixed by you!?)"

Yes I did, so I think it is not possible to fix the variance.
Anyway, thanks a lot for your help, I think I will find a way to do it as I want. -- View this message in context: http://r.789695.n4.nabble.com/mclust-modelName-E-vs-modelName-V-tp3789167p3795661.html Sent from the R help mailing list archive at Nabble.com. From azamjaafari at yahoo.com Wed Sep 7 11:54:37 2011 From: azamjaafari at yahoo.com (azam jaafari) Date: Wed, 7 Sep 2011 02:54:37 -0700 (PDT) Subject: [R] diversity map in r Message-ID: <1315389277.85039.YahooMailNeo@web37108.mail.mud.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From jholtman at gmail.com Wed Sep 7 12:05:26 2011 From: jholtman at gmail.com (jim holtman) Date: Wed, 7 Sep 2011 06:05:26 -0400 Subject: [R] Werid things when write.table In-Reply-To: References: Message-ID: If your matrix is being converted to characters, then the conversion (and the use of '.' for decimal) is happening before the the write.table call. Do the cbinding without putting titles (characters) in the first line and then once the table is build, use 'colnames' to add the titles. It sounds like in the call you are using 'col.names=TRUE' so are you also getting column names in addition to the titles you have in the first line. It is hard to tell without reproducible data, or at least an 'str' of both of the objects you reference. On Wed, Sep 7, 2011 at 4:10 AM, Damian Abalo wrote: > Hello, I am having some issues with the order write.table. > > The fact is that I need to use "," as the decimal character and not > "." as default, and when I use: > > write.table(Sales,file="Sales.xls",quote=FALSE, sep = "\t", dec = ",", > row.names = FALSE, col.names = TRUE) > > It does it perfectly, but then, in a different part of the code: > > write.table(Error,file="Error report.xls",quote=FALSE, sep = "\t", dec > = ",", row.names = FALSE, col.names = TRUE) > > It does not do it, it keeps the . as the decimal character. 
The main > difference is that the second table is treated as a character matrix, > because it has characters on the first line, which work as titles. > This table is builded by cbinding several quantile orders, and I don't > know how to make the program treat the table as numbers while > mantaining the titles (Since it is not a table really). Any ideas? > Thanks for your help > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? From JRadinger at gmx.at Wed Sep 7 12:17:58 2011 From: JRadinger at gmx.at (Johannes Radinger) Date: Wed, 07 Sep 2011 12:17:58 +0200 Subject: [R] linear regression, log-transformation and plotting In-Reply-To: <20110907090713.215100@gmx.net> References: <20110906100604.246830@gmx.net> <20110907090713.215100@gmx.net> Message-ID: <20110907101758.284360@gmx.net> Hello, I've some questions concerning log-transformations and plotting of the regression lines. So far as I know is it a problem to log-transform values smaller than 1 (0-1). In my statistics lecture I was told to do a log(x+1) transformation in such cases. So I provide here a small example to explain my questions: # Some example data for testing a1 <-c(0.2,1.9,0.1,0.2,0.8,22,111.3,19.9,23.9,138,42.3,54.2,0.9) b1 <-c(1.8,28.2,0.3,12.4,3.2,81.1,122.1,2.9,37.2,98.9,21,28.7,1.8) data1 <- data.frame(a1,b1) model <- lm(log(a1+1)~log(b1+1)) because of values less then one I did the log(x+1) transformation for running the lm. Is that correct so far? (Just to mention: These are example data so I haven't checked if the need a transformation at all) Then some questions arise when it comes to plot the data. 
As usual I'd like to plot the original data (not log transformed) but in a log-scale. I tried two approaches the standard plot function and ggplot. # Plot with ggplot ggplot()+ geom_point(aes(b1,a1,data=data1))+ geom_abline(aes(intercept=coef(model)[1],slope=coef(model)[2]))+ scale_y_log()+ scale_x_log() # Plot with standard plot plot(b1,a1,log="xy") abline(model,untf=T) abline(model,untf=F) 1) The regression lines are different for plot vs. ggplot(transformed or untransformed). So what is actually the correct line? 2) The regression line was calculated on basis of log(x+1), but the log scale on my axis is just simple log (without +1). So how are such cases usually treated? I thought about subtracting the value 1 from the intercept? So my simple question: What is the best way to display such data with a regression line? Thank you /Johannes -- From arne.henningsen at googlemail.com Wed Sep 7 13:44:27 2011 From: arne.henningsen at googlemail.com (Arne Henningsen) Date: Wed, 7 Sep 2011 13:44:27 +0200 Subject: [R] function censReg in panel data setting In-Reply-To: <1315384740517-3795575.post@n4.nabble.com> References: <1315259907768-3792227.post@n4.nabble.com> <1315288271999-3792639.post@n4.nabble.com> <1315384740517-3795575.post@n4.nabble.com> Message-ID: Dear Igors On 7 September 2011 10:39, Igors wrote: > However I am still struggling to obtain model estmates. > > The same code: > >> UpC <- censReg(Power ~ Windspeed, left = -Inf, right = >> 2000,data=PData_In,method="BHHH",nGHQ = 4,start=c(-691.18,186.79,3.9,3.9)) > > Error in maxNRCompute(fn = logLikAttr, fnOrig = fn, gradOrig = grad, > hessOrig = hess, ?: > ?NA in the initial gradient > > > I have tried to change starting values and regressors in the model several > times, but I always get the same mentioned error message. How can I make it > work? ?Is this function maxNRCompute() on the last step of calculation > (maximization of the ML)? 
>
> I had an idea that the error could appear since I have huge sample, but I
> tried to cut it and it still doesn't work.

It is hard to figure out the cause of this error without a reproducible example. Is it possible that you send a reproducible example to me? Could it be that there are NAs in the data or something in the panel data specification is not as censReg() expects it?

/Arne

--
Arne Henningsen
http://www.arne-henningsen.name

From paul.hiemstra at knmi.nl Wed Sep 7 13:45:45 2011
From: paul.hiemstra at knmi.nl (Paul Hiemstra)
Date: Wed, 07 Sep 2011 11:45:45 +0000
Subject: [R] diversity map in r
In-Reply-To: <1315389277.85039.YahooMailNeo@web37108.mail.mud.yahoo.com>
References: <1315389277.85039.YahooMailNeo@web37108.mail.mud.yahoo.com>
Message-ID: <4E675969.4020001@knmi.nl>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From JSorkin at grecc.umaryland.edu Wed Sep 7 13:21:58 2011
From: JSorkin at grecc.umaryland.edu (John Sorkin)
Date: Wed, 07 Sep 2011 07:21:58 -0400
Subject: [R] suggestion for proportions
In-Reply-To: <1315383072.86607.YahooMailNeo@web125805.mail.ne1.yahoo.com>
References: <1315383072.86607.YahooMailNeo@web125805.mail.ne1.yahoo.com>
Message-ID: <4E671B96020000CB00095795@med-webappgwia1.medicine.umaryland.edu>

From your description, you should not use a paired Student's t-test. One uses a paired test when pairs of observations come from the same experimental unit (and thus are correlated). You describe a study where each experimental unit is tested once and where there are two independent groups of experimental units. Look at t.test (i.e. enter ?t.test).
John

John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) >>> array chip 9/7/2011 4:11 AM >>> Hi, I am wondering if anyone can suggest how to test the equality of 2 proportions. The caveat here is that the 2 proportions were calculated from the same number of samples using 2 different tests. So essentially we are comparing 2 accuracy rates from same, say 100, samples. I think this is like a paired test, but don't know if really we need to consider the "paired" nature of the data, and if yes then how? Or just use prop.test() to compare 2 proportions? Any suggestion would be greatly appreciated. Thanks John ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Confidentiality Statement: This email message, including any attachments, is for th...{{dropped:6}} From k.brand at erasmusmc.nl Wed Sep 7 14:11:30 2011 From: k.brand at erasmusmc.nl (Karl Brand) Date: Wed, 07 Sep 2011 14:11:30 +0200 Subject: [R] How does one start R within Emacs/ESS with root privileges? In-Reply-To: <4E67331F.80005@knmi.nl> References: <4E673131.9000201@erasmusmc.nl> <4E67331F.80005@knmi.nl> Message-ID: <4E675F72.7050605@erasmusmc.nl> Cheers Paul. Its a very good point. Although i am curious how badly i can damage my R install by running as root. I always ran R in windows with admin. privileges without problems (touch wood). Probably best to never find out by sticking with user privileges. However, even for taking care of R install/maint. i'd prefer to do this interactively within Emacs rather than the terminal. 
Motivated by this, i'd still like to find out how to invoke R with root privileges. I've also reposted the original email on perhaps a more appropriate forum at: ESS-help at stat.math.ethz.ch Karl On 2011-09-07 11:02, Paul Hiemstra wrote: > On 09/07/2011 08:54 AM, Karl Brand wrote: >> Esteemed UseRs and DevelopeRs, >> >> Apologies if this question belongs else where, but it does concern R's >> package installation/maintenance. >> >> How does one start R within Emacs/ESS with root privileges? >> >> I tried without success: >> >>> M-x sudo R >> >> Why i'm motivated to do so: >> >> It seems logical to me, as the only user of the PC, to keep my R >> library consolidated in the universal library rather than splitting >> into universal and user libraries. Hence the desire to run R as root. > > Hi Karl, > > Why the need to install packages in root? As you are the only user there > is not reason to install them system wide (to make them available to all > users, which is just you). Installing the packages in your homedir > solves your problem much easier, without the need to run R as root > continuously. I think you should not run anything as root if it is not > absolutely needed, which could potentially damage your system > (accidentally overwriting something). > > hope this helps, > Paul > >> >> In addition, it's nice to be able to install packages 'on the fly' >> when and as needed and not need to launch a separate R session (as >> root) in the terminal just to install a package. >> >> Migrating from windows, i'm completey new to linux (ubuntu) and am >> seeing for myself if Emacs/ESS is as good as its purported to be. So >> maybe my motivation is nonsensical to expereinced ESS/R users. If so >> i'd really appreciate tips on efficient package >> installation/maintenance using Emacs/ESS. 
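Installing into a per-user library is one way to add packages from a running session, including one started from ESS, without root privileges. The following is a minimal sketch, assuming the R_LIBS_USER environment variable is set (R normally sets it at startup); the object name user_lib and the package name in the comment are placeholders.

```r
# Per-user library location; "~" is expanded to the home directory
user_lib <- path.expand(Sys.getenv("R_LIBS_USER"))

# Create it on first use (assumption: the variable is non-empty)
if (nzchar(user_lib) && !dir.exists(user_lib)) {
  dir.create(user_lib, recursive = TRUE)
}

# Put the user library first on the library search path for this session
if (nzchar(user_lib)) .libPaths(c(user_lib, .libPaths()))

# Packages can then be installed without root, for example:
# install.packages("somePackage", lib = user_lib)
.libPaths()
```

The system-wide library stays on the search path, so packages installed there by the administrator remain available alongside those in the user library.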
>> >> TIA, >> >> karl >> > > -- Karl Brand Department of Genetics Erasmus MC Dr Molewaterplein 50 3015 GE Rotterdam P +31 (0)10 704 3455 | F +31 (0)10 704 4743 | M +31 (0)642 777 268 From batholdy at googlemail.com Wed Sep 7 14:15:30 2011 From: batholdy at googlemail.com (Martin Batholdy) Date: Wed, 7 Sep 2011 14:15:30 +0200 Subject: [R] show equal entries in data.frame Message-ID: Hi, I have the following data-frame: x <- data.frame(first = c('a','c','k','b'), second = c('b','k','a','j'), third = c('f','a','h','b')) first second third 1 a b f 2 c k a 3 k a h 4 b j b Now I would like to see wether there are entries that exists in all three columns. In the example data-frame this would be true for 'a' and 'b' (so the row-number of the element is not important). can someone help me on this? thanks! From dwinsemius at comcast.net Wed Sep 7 14:31:42 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Wed, 7 Sep 2011 08:31:42 -0400 Subject: [R] Signify level of significance In-Reply-To: <1315378949909-3795424.post@n4.nabble.com> References: <1315378949909-3795424.post@n4.nabble.com> Message-ID: <4D3A3BDB-6C7D-4D97-9F0F-4147F3EAA915@comcast.net> On Sep 7, 2011, at 3:02 AM, goong001 wrote: > Im doing test on my sample using 1-sample Kolmogorov-smirnov > test(poisson > distribution) to specify my sample obtain avalanche effect. before i > can > compare my D value with the critical value(Dn, alpha), i have to > specify > which level of significance to use (eg. alpha = 0.01, 0.05, etc) and > i don't > understand, why is it related to the probability that the hypothesis > will be > rejected when it is in fact true. please anyone help me.... This is not a statistics tutoring mailing list. (And this has no intrinsic connection with R programming.) You should try other sites, perhaps stats.stackexchange.com , to address your basic statistics question. 
--
David Winsemius, MD
West Hartford, CT

From dwinsemius at comcast.net Wed Sep 7 14:59:08 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Wed, 7 Sep 2011 08:59:08 -0400
Subject: [R] show equal entries in data.frame
In-Reply-To:
References:
Message-ID: <1BA727BB-D615-49C2-81FA-AFBEB6525FC3@comcast.net>

On Sep 7, 2011, at 8:15 AM, Martin Batholdy wrote:

> Hi,
>
> I have the following data-frame:
>
>
> x <- data.frame(first = c('a','c','k','b'), second =
> c('b','k','a','j'), third = c('f','a','h','b'))
>
> first second third
> 1 a b f
> 2 c k a
> 3 k a h
> 4 b j b
>
>
> Now I would like to see wether there are entries that exists in all
> three columns.
>
> In the example data-frame this would be true for 'a' and 'b'
> (so the row-number of the element is not important).

Because of the way you constructed this data.frame, you have factors.

a.in.b <- with( x, first[first %in% second])
a.in.b.in.c <- a.in.b[a.in.b %in% x$third]
a.in.b.in.c
#[1] a b
#Levels: a b c k

--
David Winsemius, MD
West Hartford, CT

From frainj at gmail.com Wed Sep 7 15:08:53 2011
From: frainj at gmail.com (John C Frain)
Date: Wed, 7 Sep 2011 14:08:53 +0100
Subject: [R] linear regression, log-transformation and plotting
In-Reply-To: <20110907101758.284360@gmx.net>
References: <20110906100604.246830@gmx.net> <20110907090713.215100@gmx.net> <20110907101758.284360@gmx.net>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From gunter.berton at gene.com Wed Sep 7 15:34:11 2011
From: gunter.berton at gene.com (Bert Gunter)
Date: Wed, 7 Sep 2011 06:34:11 -0700
Subject: [R] suggestion for proportions
In-Reply-To: <4E671B96020000CB00095795@med-webappgwia1.medicine.umaryland.edu>
References: <1315383072.86607.YahooMailNeo@web125805.mail.ne1.yahoo.com> <4E671B96020000CB00095795@med-webappgwia1.medicine.umaryland.edu>
Message-ID:

Please! ... ?prop.test

not t tests.
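For two accuracy rates measured on the same samples, as described in the question, prop.test() treats the two rates as coming from independent groups, whereas McNemar's test on the 2x2 table of joint outcomes is one way to respect the pairing. A minimal sketch with invented counts for 100 samples follows; the numbers and the object names tab, accA and accB are made up purely for illustration.

```r
# Hypothetical joint results of test A and test B on the same 100 samples
tab <- matrix(c(70,  8,
                15,  7),
              nrow = 2, byrow = TRUE,
              dimnames = list(testA = c("correct", "wrong"),
                              testB = c("correct", "wrong")))

# Marginal accuracy rates: A = 78/100, B = 85/100
accA <- sum(tab["correct", ]) / sum(tab)
accB <- sum(tab[, "correct"]) / sum(tab)

# Paired comparison: driven by the discordant cells (8 and 15)
mcnemar.test(tab)

# Unpaired comparison, which ignores that the same samples were used
prop.test(x = c(78, 85), n = c(100, 100))
```

Which of the two is appropriate depends on whether the per-sample results of both tests are available; the paired table cannot be rebuilt from the two accuracy rates alone.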
-- Bert -- On Wed, Sep 7, 2011 at 4:21 AM, John Sorkin wrote: > >From you description, you should not used a paired Student's t-test. One uses a paired test when pairs of observations come from the same experimental unit (and thus are correlated). You describe a study where each experimental unit is tested once and where there are two independent groups of experimental units. Look at t.test (i.e. enter ?t.test). > John > > John David Sorkin M.D., Ph.D. > Chief, Biostatistics and Informatics > University of Maryland School of Medicine Division of Gerontology > Baltimore VA Medical Center > 10 North Greene Street > GRECC (BT/18/GR) > Baltimore, MD 21201-1524 > (Phone) 410-605-7119 > (Fax) 410-605-7913 (Please call phone number above prior to faxing) > >>>> array chip 9/7/2011 4:11 AM >>> > Hi, I am wondering if anyone can suggest how to test the equality of 2 proportions. The caveat here is that the 2 proportions were calculated from the same number of samples using 2 different tests. So essentially we are comparing 2 accuracy rates from same, say 100, samples. I think this is like a paired test, but don't know if really we need to consider the "paired" nature of the data, and if yes then how? Or just use prop.test() to compare 2 proportions? > > Any suggestion would be greatly appreciated. > > Thanks > > John > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
> > Confidentiality Statement: > This email message, including any attachments, is for th...{{dropped:6}} > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- "Men by nature long to get on to the ultimate truths, and will often be impatient with elementary studies or fight shy of them. If it were possible to reach the ultimate truths without the elementary studies usually prefixed to them, these would not be preparatory studies but superfluous diversions." -- Maimonides (1135-1204) Bert Gunter Genentech Nonclinical Biostatistics From wolfgang.viechtbauer at maastrichtuniversity.nl Wed Sep 7 16:28:24 2011 From: wolfgang.viechtbauer at maastrichtuniversity.nl (Viechtbauer Wolfgang (STAT)) Date: Wed, 7 Sep 2011 16:28:24 +0200 Subject: [R] suggestion for proportions In-Reply-To: References: <1315383072.86607.YahooMailNeo@web125805.mail.ne1.yahoo.com> <4E671B96020000CB00095795@med-webappgwia1.medicine.umaryland.edu> Message-ID: <077E31A57DA26E46AB0D493C9966AC730C359B463C@UM-MAIL4112.unimaas.nl> Acutally, ?mcnemar.test since it is paired data. Best, Wolfgang > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] > On Behalf Of Bert Gunter > Sent: Wednesday, September 07, 2011 15:34 > To: John Sorkin > Cc: r-help at r-project.org > Subject: Re: [R] suggestion for proportions > > Please! ... ?prop.test > > not t tests. > > -- Bert > > -- > > On Wed, Sep 7, 2011 at 4:21 AM, John Sorkin > wrote: > > >From you description, you should not used a paired Student's t-test. > One uses a paired test when pairs of observations come from the same > experimental unit (and thus are correlated). 
You describe a study where > each experimental unit is tested once and where there are two independent > groups of experimental units. Look at t.test (i.e. enter ?t.test). > > John > > > >>>> array chip 9/7/2011 4:11 AM >>> > > Hi, I am wondering if anyone can suggest how to test the equality of 2 > proportions. The caveat here is that the 2 proportions were calculated > from the same number of samples using 2 different tests. So essentially we > are comparing 2 accuracy rates from same, say 100, samples. I think this > is like a paired test, but don't know if really we need to consider the > "paired" nature of the data, and if yes then how? Or just use prop.test() > to compare 2 proportions? > > > > Any suggestion would be greatly appreciated. > > > > Thanks > > > > John From eran at taykey.com Wed Sep 7 16:31:08 2011 From: eran at taykey.com (Eran Eidinger) Date: Wed, 7 Sep 2011 17:31:08 +0300 Subject: [R] Poly-phase Filters in R Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From bps0002 at auburn.edu Wed Sep 7 16:35:31 2011 From: bps0002 at auburn.edu (B77S) Date: Wed, 7 Sep 2011 07:35:31 -0700 (PDT) Subject: [R] reshaping data Message-ID: <1315406131128-3796246.post@n4.nabble.com> I have the following data (see RawData using dput below) How do I get it in the following 3 column format (CO2 measurements are the elements of the original data frame). I'm sure the package reshape is where I should look, but I haven't figured out how. 
Thanks ahead of time

Month  Year  CO2
J      1958
F      1958
M      1958  315.71
A      1958  317.45
M.1    1958  317.5
J.1    1958
J.2    1958  315.86
A.1    1958  314.93
S      1958  313.19
O      1958
N      1958  313.34
D      1958  314.67
J      1959  315.58
F      1959  316.47


# here is the data

RawData <- structure(list(Year = c(1958, 1959, 1960, 1961, 1962, 1963, 1964,
1965, 1966, 1967, 1968, 1969, 1970, 1971, 1972, 1973, 1974, 1975,
1976, 1977, 1978, 1979, 1980, 1981, 1982, 1983, 1984, 1985, 1986,
1987, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997,
1998, 1999, 2000, 2001, 2002, 2003, 2004), J = c(NA, 315.58,
316.43, 316.89, 317.94, 318.74, 319.57, 319.44, 320.62, 322.33,
322.57, 324, 325.06, 326.17, 326.77, 328.54, 329.35, 330.4, 331.74,
332.92, 334.97, 336.23, 338.01, 339.23, 340.75, 341.37, 343.7,
344.97, 346.29, 348.02, 350.43, 352.76, 353.66, 354.72, 355.98,
356.7, 358.36, 359.96, 362.05, 363.18, 365.32, 368.15, 369.14,
370.28, 372.43, 374.68, 376.79), F = c(NA, 316.47, 316.97, 317.7,
318.56, 319.08, NA, 320.44, 321.59, 322.5, 323.15, 324.42, 325.98,
326.68, 327.63, 329.56, 330.71, 331.41, 332.56, 333.42, 335.39,
336.76, 338.36, 340.47, 341.61, 342.52, 344.51, 346, 346.96, 348.47,
351.72, 353.07, 354.7, 355.75, 356.72, 357.16, 358.91, 361, 363.25,
364, 366.15, 368.87, 369.46, 371.5, 373.09, 375.63, 377.37), M = c(315.71,
316.65, 317.58, 318.54, 319.69, 319.86, NA, 320.89, 322.39, 323.04,
323.89, 325.64, 326.93, 327.18, 327.75, 330.3, 331.48, 332.04, 333.5,
334.7, 336.64, 337.96, 340.08, 341.38, 342.7, 343.1, 345.28, 347.43,
347.86, 349.42, 352.22, 353.68, 355.39, 357.16, 357.81, 358.38, 359.97,
361.64, 364.03, 364.57, 367.31, 369.59, 370.52, 372.12, 373.52, 376.11,
378.41), A = c(317.45, 317.71, 319.03, 319.48, 320.58, 321.39, NA,
322.13, 323.7, 324.42, 325.02, 326.66, 328.13, 327.78, 329.72, 331.5,
332.65, 333.31, 334.58, 336.07, 337.76, 338.89, 340.77, 342.51, 343.56,
344.94, 347.08, 348.35, 349.55, 350.99, 353.59, 355.42, 356.2, 358.6,
359.15, 359.46, 361.26, 363.45, 364.72, 366.35, 368.61, 371.14, 371.66,
372.87, 374.86, 377.65, 380.52 ), M.1 = c(317.5, 318.29, 320.03, 320.58, 321.01, 322.24, 322.23, 322.16, 324.07, 325, 325.57, 327.38, 328.07, 328.92, 330.07, 332.48, 333.09, 333.96, 334.87, 336.74, 338.01, 339.47, 341.46, 342.91, 344.13, 345.75, 347.43, 348.93, 350.21, 351.84, 354.22, 355.67, 357.16, 359.34, 359.66, 360.28, 361.68, 363.79, 365.41, 366.79, 369.29, 371, 371.82, 374.02, 375.55, 378.35, 380.63), J.1 = c(NA, 318.16, 319.59, 319.78, 320.61, 321.47, 321.89, 321.87, 323.75, 324.09, 325.36, 326.7, 327.66, 328.57, 329.09, 332.07, 332.25, 333.59, 334.34, 336.27, 337.89, 339.29, 341.17, 342.25, 343.35, 345.32, 346.79, 348.25, 349.54, 351.25, 353.79, 355.13, 356.22, 358.24, 359.25, 359.6, 360.95, 363.26, 364.97, 365.62, 368.87, 370.35, 371.7, 373.3, 375.4, 378.13, 379.57 ), J.2 = c(315.86, 316.55, 318.18, 318.58, 319.61, 319.74, 320.44, 321.21, 322.4, 322.55, 324.14, 325.89, 326.35, 327.37, 328.05, 330.87, 331.18, 331.91, 333.05, 334.93, 336.54, 337.73, 339.56, 340.49, 342.06, 343.99, 345.4, 346.56, 347.94, 349.52, 352.39, 353.9, 354.82, 356.17, 357.03, 357.57, 359.55, 361.9, 363.65, 364.47, 367.64, 369.27, 370.12, 371.62, 374.02, 376.62, 377.79), A.1 = c(314.93, 314.8, 315.91, 316.79, 317.4, 317.77, 318.7, 318.87, 320.37, 320.92, 322.11, 323.67, 324.69, 325.43, 326.32, 329.31, 329.4, 330.06, 330.94, 332.75, 334.68, 336.09, 337.6, 338.43, 339.82, 342.39, 343.28, 344.69, 345.91, 348.1, 350.44, 351.67, 352.91, 354.03, 355, 355.52, 357.49, 359.46, 361.49, 362.51, 365.77, 366.94, 368.12, 369.55, 371.49, 374.5, 375.86), S = c(313.19, 313.84, 314.16, 314.99, 316.26, 316.21, 316.7, 317.81, 318.64, 319.26, 320.33, 322.38, 323.1, 323.36, 324.84, 327.51, 327.44, 328.56, 329.3, 331.58, 332.76, 333.91, 335.88, 336.69, 337.97, 339.86, 341.07, 343.09, 344.86, 346.44, 348.72, 349.8, 350.96, 352.16, 353.01, 353.7, 355.84, 358.06, 359.46, 360.19, 363.9, 364.63, 366.62, 367.96, 370.71, 372.99, 374.06), O = c(NA, 313.34, 313.83, 315.31, 315.42, 315.99, 316.87, 317.3, 318.1, 
319.39, 320.25, 321.78, 323.07, 323.56, 325.2, 327.18, 327.37, 328.34, 328.94, 331.16, 332.54, 333.86, 336.01, 336.85, 337.86, 339.99, 341.35, 342.8, 344.17, 346.36, 348.88, 349.99, 351.18, 352.21, 353.31, 353.98, 355.99, 357.75, 359.6, 360.77, 364.23, 365.12, 366.73, 368.09, 370.24, 373, 374.24), N = c(313.34, 314.81, 315, 316.1, 316.69, 317.07, 317.68, 318.87, 319.79, 320.72, 321.32, 322.85, 324.01, 324.8, 326.5, 328.16, 328.46, 329.49, 330.31, 332.4, 333.92, 335.29, 337.1, 338.36, 339.26, 341.16, 342.98, 344.24, 345.66, 347.81, 350.07, 351.3, 352.83, 353.75, 354.16, 355.33, 357.58, 359.56, 360.76, 362.43, 365.46, 366.67, 368.29, 369.68, 372.08, 374.35, 375.86), D = c(314.67, 315.59, 316.19, 317.01, 317.69, 318.36, 318.71, 319.42, 321.03, 321.96, 322.9, 324.12, 325.13, 326.01, 327.55, 328.64, 329.58, 330.76, 331.68, 333.85, 334.95, 336.73, 338.21, 339.61, 340.49, 342.99, 344.22, 345.56, 346.9, 348.96, 351.34, 352.53, 354.21, 354.99, 355.4, 356.8, 359.04, 360.7, 362.33, 364.28, 366.97, 368.01, 369.53, 371.24, 373.78, 375.7, 377.48)), .Names = c("Year", "J", "F", "M", "A", "M.1", "J.1", "J.2", "A.1", "S", "O", "N", "D"), row.names = c(NA, -47L), class = "data.frame") -- View this message in context: http://r.789695.n4.nabble.com/reshaping-data-tp3796246p3796246.html Sent from the R help mailing list archive at Nabble.com. From gunter.berton at gene.com Wed Sep 7 16:47:00 2011 From: gunter.berton at gene.com (Bert Gunter) Date: Wed, 7 Sep 2011 07:47:00 -0700 Subject: [R] suggestion for proportions In-Reply-To: <077E31A57DA26E46AB0D493C9966AC730C359B463C@UM-MAIL4112.unimaas.nl> References: <1315383072.86607.YahooMailNeo@web125805.mail.ne1.yahoo.com> <4E671B96020000CB00095795@med-webappgwia1.medicine.umaryland.edu> <077E31A57DA26E46AB0D493C9966AC730C359B463C@UM-MAIL4112.unimaas.nl> Message-ID: Wolfgang: On Wed, Sep 7, 2011 at 7:28 AM, Viechtbauer Wolfgang (STAT) wrote: > Acutally, > > ?mcnemar.test > > since it is paired data. 
Actually, it is unclear to me from the OP's message whether this is the case.

In one sentence the OP says that the _number_ of samples is the same,
and in the next he says that "essentially" the samples are the same.
So, as usual, imprecision in the problem description leads to
imprecision in the solution.

But your point is well taken, of course.

-- Bert

>
> Best,
>
> Wolfgang
>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
>> On Behalf Of Bert Gunter
>> Sent: Wednesday, September 07, 2011 15:34
>> To: John Sorkin
>> Cc: r-help at r-project.org
>> Subject: Re: [R] suggestion for proportions
>>
>> Please! ... ?prop.test
>>
>> not t tests.
>>
>> -- Bert
>>
>> --
>>
>> On Wed, Sep 7, 2011 at 4:21 AM, John Sorkin wrote:
>> > From your description, you should not use a paired Student's t-test.
>> > One uses a paired test when pairs of observations come from the same
>> > experimental unit (and thus are correlated). You describe a study where
>> > each experimental unit is tested once and where there are two
>> > independent groups of experimental units. Look at t.test (i.e. enter
>> > ?t.test).
>> > John
>> >
>> > >>> array chip 9/7/2011 4:11 AM >>>
>> > Hi, I am wondering if anyone can suggest how to test the equality of 2
>> > proportions. The caveat here is that the 2 proportions were calculated
>> > from the same number of samples using 2 different tests. So essentially
>> > we are comparing 2 accuracy rates from same, say 100, samples. I think
>> > this is like a paired test, but don't know if really we need to consider
>> > the "paired" nature of the data, and if yes then how? Or just use
>> > prop.test() to compare 2 proportions?
>> >
>> > Any suggestion would be greatly appreciated.
>> >
>> > Thanks
>> >
>> > John
>

From gunter.berton at gene.com Wed Sep 7 16:50:40 2011
From: gunter.berton at gene.com (Bert Gunter)
Date: Wed, 7 Sep 2011 07:50:40 -0700
Subject: [R] suggestion for proportions
In-Reply-To: <077E31A57DA26E46AB0D493C9966AC730C359B463C@UM-MAIL4112.unimaas.nl>
References: <1315383072.86607.YahooMailNeo@web125805.mail.ne1.yahoo.com>
 <4E671B96020000CB00095795@med-webappgwia1.medicine.umaryland.edu>
 <077E31A57DA26E46AB0D493C9966AC730C359B463C@UM-MAIL4112.unimaas.nl>
Message-ID:

Oh ... I should have added that either option could be handled by
glm(), of course (provided that you're willing to accept the
approximate tests). But this is getting OT.

-- Bert

On Wed, Sep 7, 2011 at 7:28 AM, Viechtbauer Wolfgang (STAT) wrote:
> Actually,
>
> ?mcnemar.test
>
> since it is paired data.
>
> Best,
>
> Wolfgang
>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
>> On Behalf Of Bert Gunter
>> Sent: Wednesday, September 07, 2011 15:34
>> To: John Sorkin
>> Cc: r-help at r-project.org
>> Subject: Re: [R] suggestion for proportions
>>
>> Please! ... ?prop.test
>>
>> not t tests.
>>
>> -- Bert
>>
>> --
>>
>> On Wed, Sep 7, 2011 at 4:21 AM, John Sorkin wrote:
>> > From your description, you should not use a paired Student's t-test.
>> > One uses a paired test when pairs of observations come from the same
>> > experimental unit (and thus are correlated). You describe a study where
>> > each experimental unit is tested once and where there are two
>> > independent groups of experimental units. Look at t.test (i.e. enter
>> > ?t.test).
>> > John
>> >
>> > >>> array chip 9/7/2011 4:11 AM >>>
>> > Hi, I am wondering if anyone can suggest how to test the equality of 2
>> > proportions. The caveat here is that the 2 proportions were calculated
>> > from the same number of samples using 2 different tests. So essentially
>> > we are comparing 2 accuracy rates from same, say 100, samples.
I think this >> is like a paired test, but don't know if really we need to consider the >> "paired" nature of the data, and if yes then how? Or just use prop.test() >> to compare 2 proportions? >> > >> > Any suggestion would be greatly appreciated. >> > >> > Thanks >> > >> > John > > From divyamurali13 at gmail.com Wed Sep 7 10:51:10 2011 From: divyamurali13 at gmail.com (Divyam) Date: Wed, 7 Sep 2011 01:51:10 -0700 (PDT) Subject: [R] Handling ever growing data in svm predictions Message-ID: <1315385470719-3795602.post@n4.nabble.com> Hi, I am new to R and here is what I am doing in it now. I am using machine learning technique (svm) to do predictions. The data that I am using is one that is bound to grow perpetually. what I want to know is, say, I fed in a data set with 5000 data points to svm initially. The algorithm derives a certain intelligence (i.e.,output) based on these 5000 data points. I have an additional 10000 data points today. Now if i remove the first fed 5000 data and then feed in this new additional 10000 data, I want the algorithm to make use of the intelligence derived from the initial data(5000 data points) too while evaluating the new delta data points(10000) and the end result to be an aggregated measure of the total 15000 data. This is important to me from an efficiency point of view. If there are any other packages in r that does the same (i.e., enable statistical models to learn from the past experience continuously while deleting the prior data used from which the intelligence is derived) kindly post about them. This will be of immense help to me. Thanks in advance. divya -- View this message in context: http://r.789695.n4.nabble.com/Handling-ever-growing-data-in-svm-predictions-tp3795602p3795602.html Sent from the R help mailing list archive at Nabble.com. 
From divyamurali13 at gmail.com Wed Sep 7 11:25:54 2011 From: divyamurali13 at gmail.com (Divyam) Date: Wed, 7 Sep 2011 02:25:54 -0700 (PDT) Subject: [R] predictive modeling and extremely large data Message-ID: <1315387554818-3795674.post@n4.nabble.com> Hi, I am new to R and here is what I am doing in it now. I am using machine learning technique (svm) to do predictive modeling. The data that I am using is one that is bound to grow perpetually. what I want to know is, say, I fed in a data set with 5000 data points to svm initially. The algorithm derives a certain intelligence (i.e.,output) based on these 5000 data points. I have an additional 10000 data points today. Now if i remove the first fed 5000 data and then feed in this new additional 10000 data, I want the algorithm to make use of the intelligence derived from the initial data(5000 data points) too while evaluating the new delta data points(10000) and the end result to be an aggregated measure of the total 15000 data. This is important to me from an efficiency point of view. If there are any other packages in r that does the same (i.e., enable statistical models to learn from the past experience continuously while deleting the prior data used from which the intelligence is derived) kindly post about them. This will be of immense help to me. Thanks in advance. divya -- View this message in context: http://r.789695.n4.nabble.com/predictive-modeling-and-extremely-large-data-tp3795674p3795674.html Sent from the R help mailing list archive at Nabble.com. From chandratr at gmail.com Wed Sep 7 11:50:36 2011 From: chandratr at gmail.com (Chandrasekhar Rudrappa) Date: Wed, 7 Sep 2011 15:20:36 +0530 Subject: [R] Save Workspace Image? problem at start up Message-ID: Dear Sir, I am using R 2.13.1. Since yesterday when I start the Rgui the message "Save Workspace Image?" shows up immediately on opening. Even if I cancel, it does not get cancelled. If I select either no or yes, it quits. 
I do not know what happened all of a sudden. Since yesterday I have been
trying my best to run R but have not been able to do so.

Please help.

Thanks

--
Dr. TR Chandrasekhar, M.Sc., M. Tech., Ph. D.,
Sr. Scientist
Rubber Research Institute of India
Hevea Breeding Sub Station
Kadaba - 574 221
DK Dt., Karnataka
Phone-Land: 08251-214336
Mobile: 9448780118

From varsha.nipfp at gmail.com Wed Sep 7 12:04:17 2011
From: varsha.nipfp at gmail.com (Varsha Agrawal)
Date: Wed, 7 Sep 2011 15:34:17 +0530
Subject: [R] (no subject)
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From varsha.nipfp at gmail.com Wed Sep 7 12:18:10 2011
From: varsha.nipfp at gmail.com (Varsha Agrawal)
Date: Wed, 7 Sep 2011 15:48:10 +0530
Subject: [R] What happens if we give a factor as an index at a list?
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From wardamo.zero at gmail.com Wed Sep 7 12:41:17 2011
From: wardamo.zero at gmail.com (Damian Abalo)
Date: Wed, 7 Sep 2011 12:41:17 +0200
Subject: [R] Werid things when write.table
In-Reply-To:
References:
Message-ID:

It works now. It is much more appropriate to add the titles than to add
a column with strings, and, in fact, they were row names, not column
names (that's why col.names was set to TRUE). Anyway, using rownames
instead of adding an additional column, and setting row.names to TRUE in
the write.table call, works perfectly.

Thanks for the help, and sorry for asking such a basic thing; I am still
not very good at R :P

2011/9/7 jim holtman :
> If your matrix is being converted to characters, then the conversion
> (and the use of '.' for decimal) is happening before the
> write.table call. Do the cbinding without putting titles (characters)
> in the first line and then once the table is built, use 'colnames' to
> add the titles.
> It sounds like in the call you are using 'col.names=TRUE', so are you
> also getting column names in addition to the titles you have in the
> first line? It is hard to tell without reproducible data, or at least
> an 'str' of both of the objects you reference.
>
> On Wed, Sep 7, 2011 at 4:10 AM, Damian Abalo wrote:
>> Hello, I am having some issues with the order write.table.
>>
>> The fact is that I need to use "," as the decimal character and not
>> "." as default, and when I use:
>>
>> write.table(Sales,file="Sales.xls",quote=FALSE, sep = "\t", dec = ",",
>> row.names = FALSE, col.names = TRUE)
>>
>> It does it perfectly, but then, in a different part of the code:
>>
>> write.table(Error,file="Error report.xls",quote=FALSE, sep = "\t", dec
>> = ",", row.names = FALSE, col.names = TRUE)
>>
>> It does not do it, it keeps the . as the decimal character. The main
>> difference is that the second table is treated as a character matrix,
>> because it has characters on the first line, which work as titles.
>> This table is built by cbinding several quantile orders, and I don't
>> know how to make the program treat the table as numbers while
>> maintaining the titles (since it is not a table really). Any ideas?
>> Thanks for your help
>>
>
> --
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
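
The advice in this thread can be sketched as follows. This is only an illustration under stated assumptions: the numbers, the column titles and the temporary file are made up and are not the poster's data. The idea is to keep the matrix numeric, attach the titles with colnames(), and let write.table() apply dec = ",".

```r
# Hypothetical stand-in for the poster's table of cbind-ed quantiles.
Error <- cbind(c(1.5, 2.25, 3.75), c(0.5, 1.25, 2.5))
colnames(Error) <- c("Model A", "Model B")   # titles live in the dimnames

# Putting the titles in as a first row coerces everything to character;
# 'dec' is only applied to numeric (or complex) columns.
bad <- rbind(c("Model A", "Model B"), Error)
is.character(bad)   # TRUE

# Numeric matrix plus column names: 'dec' is honoured.
out <- tempfile(fileext = ".txt")
write.table(Error, file = out, quote = FALSE, sep = "\t", dec = ",",
            row.names = FALSE, col.names = TRUE)
written <- readLines(out)
print(written)
```

If the row labels are needed as well, setting rownames(Error) and row.names = TRUE should work the same way, as reported above.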
>

From anna.dunietz at gmail.com Wed Sep 7 13:27:38 2011
From: anna.dunietz at gmail.com (Duny)
Date: Wed, 7 Sep 2011 04:27:38 -0700 (PDT)
Subject: [R] r-help volcano plot
In-Reply-To:
References: <146393c.e200.1323d534767.Coremail.knifeboot@163.com>
 <1315290507382-3792696.post@n4.nabble.com>
Message-ID: <1315394858719-3795847.post@n4.nabble.com>

Ben - I'm sorry, you're right! The following website shows how to make a
volcano plot using ggplot2.

http://bioinformatics.knowledgeblog.org/2011/06/21/volcano-plots-of-microarray-data/

Enjoy!
Anna

--
View this message in context: http://r.789695.n4.nabble.com/r-help-volcano-plot-tp3792651p3795847.html
Sent from the R help mailing list archive at Nabble.com.

From Rachida.Elmehdi at uclouvain.be Wed Sep 7 13:44:13 2011
From: Rachida.Elmehdi at uclouvain.be (Rachida El Mehdi)
Date: Wed, 7 Sep 2011 13:44:13 +0200
Subject: [R] Help on the multivariate interpolation with R
Message-ID:

Hello,

I work on Stochastic Frontier Analysis (SFA) and I am looking for a
function in an R package which does multivariate interpolation.

My problem is: for all i = 1, ..., n, I have values of
(xi1, xi2, ..., xi7) in IR^7 and f(xi1, xi2, ..., xi7) in IR, and I also
have values of (x'i1, x'i2, ..., x'i7) in IR^7, and I need
f(x'i1, x'i2, ..., x'i7) in IR by interpolation. So x is an (n,7) or a
(7,n) matrix and x' is also a matrix with the same format as x.

Can someone help me?
Thank you in advance.

Rachida El Mehdi

From igors.lahanciks at gmail.com Wed Sep 7 14:36:48 2011
From: igors.lahanciks at gmail.com (Igors)
Date: Wed, 7 Sep 2011 05:36:48 -0700 (PDT)
Subject: [R] function censReg in panel data setting
In-Reply-To:
References: <1315259907768-3792227.post@n4.nabble.com>
 <1315288271999-3792639.post@n4.nabble.com>
 <1315384740517-3795575.post@n4.nabble.com>
Message-ID: <1315399008306-3795987.post@n4.nabble.com>

Dear Arne,

I have sent you an e-mail to your e-mail at gmail.com.
There I have attached my data set and the code for this particular
problem.

I hope to hear from you soon.

Many thanks in advance!

Igors

--
View this message in context: http://r.789695.n4.nabble.com/function-censReg-in-panel-data-setting-tp3792227p3795987.html
Sent from the R help mailing list archive at Nabble.com.

From igors.lahanciks at gmail.com Wed Sep 7 14:47:34 2011
From: igors.lahanciks at gmail.com (Igors)
Date: Wed, 7 Sep 2011 05:47:34 -0700 (PDT)
Subject: [R] Overall SSR in plm package
Message-ID: <1315399654425-3796004.post@n4.nabble.com>

Dear all,

I estimate a fixed-effects model. Since it uses the "within"
transformation, it also reports the "within" SSR in the summary report.
The problem is that I need the overall SSR. How can I quickly compute it?

Thanks in advance!

Best wishes,
Igors

--
View this message in context: http://r.789695.n4.nabble.com/Overall-SSR-in-plm-package-tp3796004p3796004.html
Sent from the R help mailing list archive at Nabble.com.

From knifeboot at 163.com Wed Sep 7 15:38:38 2011
From: knifeboot at 163.com (KnifeBoot)
Date: Wed, 7 Sep 2011 21:38:38 +0800 (CST)
Subject: [R] r-help volcano plot
In-Reply-To:
References: <146393c.e200.1323d534767.Coremail.knifeboot@163.com>
Message-ID: <215a22d8.d23b.132441c9ed2.Coremail.knifeboot@163.com>

Using R version 2.13.1
Windows 7 Ultimate 64-bit

At 2011-09-06 14:51:09,"Hasan Diwan" wrote:
>On 6 September 2011 08:01, KnifeBoot wrote:
>> Can't install package maDB or limma.
>
>Which R version, and what platform are you using?
>--
>Sent from my mobile device
>Envoyait de mon telephone mobil

From dadrivr at gmail.com Wed Sep 7 15:45:32 2011
From: dadrivr at gmail.com (dadrivr)
Date: Wed, 7 Sep 2011 06:45:32 -0700 (PDT)
Subject: [R] Problem with by statement for spaghetti plots
In-Reply-To: <1315088680053-3788536.post@n4.nabble.com>
References: <1315088680053-3788536.post@n4.nabble.com>
Message-ID: <1315403132956-3796131.post@n4.nabble.com>

Bump, please help!

--
View this message in context: http://r.789695.n4.nabble.com/Problem-with-by-statement-for-spaghetti-plots-tp3788536p3796131.html
Sent from the R help mailing list archive at Nabble.com.

From dadrivr at gmail.com Wed Sep 7 15:46:10 2011
From: dadrivr at gmail.com (dadrivr)
Date: Wed, 7 Sep 2011 06:46:10 -0700 (PDT)
Subject: [R] Change properties of line summary in interaction.plot
In-Reply-To: <1315094013620-3788614.post@n4.nabble.com>
References: <1315094013620-3788614.post@n4.nabble.com>
Message-ID: <1315403170104-3796133.post@n4.nabble.com>

Can anybody help me with this?

--
View this message in context: http://r.789695.n4.nabble.com/Change-properties-of-line-summary-in-interaction-plot-tp3788614p3796133.html
Sent from the R help mailing list archive at Nabble.com.

From dadrivr at gmail.com Wed Sep 7 16:02:32 2011
From: dadrivr at gmail.com (dadrivr)
Date: Wed, 7 Sep 2011 07:02:32 -0700 (PDT)
Subject: [R] Reshaping data from wide to tall format for multilevel modeling
Message-ID: <1315404152635-3796168.post@n4.nabble.com>

Hi,

I'm trying to reshape my data set from wide to tall format for multilevel
modeling. Unfortunately, the function I typically use (make.univ from the
multilevel package) does not appear to work with unbalanced data frames,
which is what I'm dealing with. Below is an example of the columns of a
data frame similar to what I'm working with:

ID a1 a2 a4 b2 b3 b4 b5 b6

Below is what I want the columns to be after reshaping the data to long
format:

ID a b time

Here is an example data frame that I want to reshape:

ID <- c(1,2,3)
a1 <- c(NA, rnorm(2))
a2 <- c(NA, rnorm(1), NA)
a4 <- c(NA, rnorm(2))
b2 <- c(rnorm(2), NA)
b3 <- rnorm(3)
b4 <- NA
b5 <- rnorm(3)
b6 <- rnorm(3)
mydata <- as.data.frame(cbind(ID,a1,a2,a4,b2,b3,b4,b5,b6))

What is the best way to do this efficiently with MANY variables with
widely differing time ranges?
Note that I will have to manually enter the time for a given measurement
because in the wide format, the time is in the variable names. By the
way, I have a fairly large data set, with some variables occurring at 2
time points and other variables occurring at 20 time points.

Thanks for your help!

--
View this message in context: http://r.789695.n4.nabble.com/Reshaping-data-from-wide-to-tall-format-for-multilevel-modeling-tp3796168p3796168.html
Sent from the R help mailing list archive at Nabble.com.

From jacksie at eden.rutgers.edu Wed Sep 7 16:52:50 2011
From: jacksie at eden.rutgers.edu (Jack Siegrist)
Date: Wed, 7 Sep 2011 07:52:50 -0700 (PDT)
Subject: [R] greyPalette() in legend plot
In-Reply-To:
References:
Message-ID: <1315407170857-3796277.post@n4.nabble.com>

Hi,

Using legend as an argument within the call to barplot will automatically
add the color key to the legend. I don't know how to add the colors using
legend separately.

Try this revised code:

y <- matrix(c(46.5102,42.7561,45.7857,45.6047,45.7027,
              81.4565,69.8824,69.1333,67.56,57.8929,
              43.3019,42.9184,51.7143,40.2727,51.7692,
              22.3871,24.2222,24.3226,23.6342,21.5833), ncol=4)

se <- matrix(c(13.0707,12.0287,15.3949,16.5344,10.2520,
               28.6169,31.2398,24.5816,25.9745,22.9981,
               13.8433,17.3071,21.7355,18.2007,26.2546,
               6.8199,7.2977,7.1245,7.2345,8.2616), ncol=4)

plot <- barplot(
  y,
  beside = TRUE,
  ylim = c(0, 90),
  axis.lty = 0.5,
  xlab = "Light treatment",
  ylab = "root length (mm)",
  main = "Root",
  names.arg = c("White","Blue","Red","Far-red"),
  legend = c("Sedingehult", "Mobranna", "Aunasvare", "Fallan", "Ylinen")
)

segments(plot, y - se, plot, y + se)

--
View this message in context: http://r.789695.n4.nabble.com/greyPalette-in-legend-plot-tp3795594p3796277.html
Sent from the R help mailing list archive at Nabble.com.
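
On the open point above, adding the colours with a separate legend() call: a minimal sketch, assuming the bars may be coloured explicitly rather than left to barplot()'s default palette. The matrix below is made up for illustration, and the names cols, pops and mids are hypothetical. Passing the same colour vector to 'col' in barplot() and to 'fill' in legend() keeps the key and the bars in step.

```r
# Made-up 5 x 4 matrix standing in for the root-length data.
y <- matrix(seq(20, 77, length.out = 20), ncol = 4)
pops <- c("Sedingehult", "Mobranna", "Aunasvare", "Fallan", "Ylinen")

# One grey per population (row); used for both the bars and the key.
cols <- grey.colors(nrow(y))

# With beside = TRUE, barplot() returns the bar midpoints as a matrix.
mids <- barplot(y, beside = TRUE, ylim = c(0, 90), col = cols,
                names.arg = c("White", "Blue", "Red", "Far-red"),
                xlab = "Light treatment", ylab = "root length (mm)")

# 'fill' draws a coloured box next to each label.
legend("topright", legend = pops, fill = cols, bty = "n")
```

The returned midpoints (mids here) can then be passed to segments() for the error bars, as in the code above.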

From jtor14 at gmail.com Wed Sep 7 17:01:26 2011
From: jtor14 at gmail.com (Justin Haynes)
Date: Wed, 7 Sep 2011 08:01:26 -0700
Subject: [R] reshaping data
In-Reply-To: <1315406131128-3796246.post@n4.nabble.com>
References: <1315406131128-3796246.post@n4.nabble.com>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From mi2kelgrum at yahoo.com Wed Sep 7 17:04:01 2011
From: mi2kelgrum at yahoo.com (Mikkel Grum)
Date: Wed, 7 Sep 2011 08:04:01 -0700 (PDT)
Subject: [R] process id of an R script
Message-ID: <1315407841.19310.YahooMailNeo@web65702.mail.ac4.yahoo.com>

I have a script that runs as a cron job every minute (on Ubuntu 10.10 and
R 2.11.1), querying a database for new data. Most of the time it takes a
few seconds to run, but once in a while it takes more than a minute and
the next run starts (on the same data) before the previous one has
finished. In extreme cases this will fill up memory with a large number
of runs of the same script on the same data.

My 'solution' has been to create a process id file with the currently
running script, first checking whether there is another process id file
and whether that process is still running. I use the following code:

pid <- max(system("pgrep -x R", intern = TRUE))
if (file.exists("/var/run/myscript.pid")) {
    rm(pid)
    pid <- read.table("/var/run/myscript.pid")[[1]]
    # "ps -p" prints a header line plus one line if the process exists
    if (length(system(paste("ps -p", pid), intern = TRUE)) == 2) {
        stop("Myscript is already running in another process.")
    } else {
        pid <- max(system("pgrep -x R", intern = TRUE))
        write(pid, "/var/run/myscript.pid")
    }
} else {
    write(pid, "/var/run/myscript.pid")
}

....my script .....

file.remove("/var/run/myscript.pid")
#The End

The trouble here is that I also have other R scripts running on the same
system, so while max(system("pgrep -x R", intern = TRUE)) will almost
always give me the right pid, it is not guaranteed to work.
There are two situations where it could fail: when the process id numbers
reach 32000 and start over again, and if another process starts up at the
same time, the process ids could get swapped.

Is there a way to query for the process id of the specific R script,
rather than all R processes?

Mikkel

From michael.weylandt at gmail.com Wed Sep 7 17:04:20 2011
From: michael.weylandt at gmail.com (R. Michael Weylandt)
Date: Wed, 7 Sep 2011 10:04:20 -0500
Subject: [R] (no subject)
In-Reply-To:
References:
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From jvadams at usgs.gov Wed Sep 7 17:10:38 2011
From: jvadams at usgs.gov (Jean V Adams)
Date: Wed, 7 Sep 2011 10:10:38 -0500
Subject: [R] Generalizing call to function
In-Reply-To:
References:
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From wolfgang.viechtbauer at maastrichtuniversity.nl Wed Sep 7 17:14:18 2011
From: wolfgang.viechtbauer at maastrichtuniversity.nl (Viechtbauer Wolfgang (STAT))
Date: Wed, 7 Sep 2011 17:14:18 +0200
Subject: [R] suggestion for proportions
In-Reply-To:
References: <1315383072.86607.YahooMailNeo@web125805.mail.ne1.yahoo.com>
 <4E671B96020000CB00095795@med-webappgwia1.medicine.umaryland.edu>
 <077E31A57DA26E46AB0D493C9966AC730C359B463C@UM-MAIL4112.unimaas.nl>
Message-ID: <077E31A57DA26E46AB0D493C9966AC730C359B468C@UM-MAIL4112.unimaas.nl>

Indeed, the original post leaves some room for interpretation. In any
case, I hope the OP has enough information now to figure out what
approach is best for his data.

Best,

Wolfgang

> -----Original Message-----
> From: Bert Gunter [mailto:gunter.berton at gene.com]
> Sent: Wednesday, September 07, 2011 16:47
> To: Viechtbauer Wolfgang (STAT)
> Cc: r-help at r-project.org; John Sorkin
> Subject: Re: [R] suggestion for proportions
>
> Wolfgang:
>
> On Wed, Sep 7, 2011 at 7:28 AM, Viechtbauer Wolfgang (STAT) wrote:
> > Actually,
> >
> > ?mcnemar.test
> >
> > since it is paired data.
>
> Actually, it is unclear to me from the OP's message whether this is the
> case.
>
> In one sentence the OP says that the _number_ of samples is the same,
> and in the next he says that "essentially" the samples are the same.
> So, as usual, imprecision in the problem description leads to
> imprecision in the solution.
>
> But your point is well taken, of course.
>
> -- Bert

From mailinglist.honeypot at gmail.com Wed Sep 7 17:19:13 2011
From: mailinglist.honeypot at gmail.com (Steve Lianoglou)
Date: Wed, 7 Sep 2011 11:19:13 -0400
Subject: [R] predictive modeling and extremely large data
In-Reply-To: <1315387554818-3795674.post@n4.nabble.com>
References: <1315387554818-3795674.post@n4.nabble.com>
Message-ID:

Hi,

On Wed, Sep 7, 2011 at 5:25 AM, Divyam wrote:
> Hi,
>
> I am new to R and here is what I am doing in it now. I am using machine
> learning technique (svm) to do predictive modeling. The data that I am using
> is one that is bound to grow perpetually. what I want to know is, say, I fed
> in a data set with 5000 data points to svm initially. The algorithm derives
> a certain intelligence (i.e.,output) based on these 5000 data points. I
> have an additional 10000 data points today. Now if i remove the first fed
> 5000 data and then feed in this new additional 10000 data, I want the
> algorithm to make use of the intelligence derived from the initial data(5000
> data points) too while evaluating the new delta data points(10000) and the
> end result to be an aggregated measure of the total 15000 data.
> This is
> important to me from an efficiency point of view. If there are any other
> packages in r that does the same (i.e., enable statistical models to learn
> from the past experience continuously while deleting the prior data used
> from which the intelligence is derived) kindly post about them. This will be
> of immense help to me.

I'm not sure that I understand what you mean ... maybe because some of
the terminology you are using is a bit nonstandard.

If you want the predictive model you build to be "immediately effective"
and learn from new data later, you can:

(1) Train an SVM on the data you have now (i.e. do it "offline"). Use
this for future/new data. At some point in the future, retrain your SVM
on all of the data you have available to you (or some subset of it) --
again, offline. You can see if your new SVM outperforms your old one on
your new data to see where your point of diminishing returns is: when it
stops making sense to try to learn a new model after you have x many
data points already.

(2) You can look into "online learning" methods -- search google for
online svms and other online methods that might interest you (if you're
not married to the SVM).

For what it's worth, you mention "extremely large data," but I'm not
sure what you mean (certainly 10k datapoints isn't that). If you *really*
mean "big data," and you want to explore online learning, take a look at
vowpal wabbit:

http://hunch.net/~vw/code.html
https://github.com/JohnLangford/vowpal_wabbit

That's not R though.

The recent 1.0 release of the shogun-toolbox includes support for online
learning, too (with vw I believe):

http://www.shogun-toolbox.org/

It has an R interface of different flavors, but might be a bit painful
to use through it (I'm working on making a better one in my spare time,
but not too much of that lately). If the features in shogun strike your
fancy, from what I understand the best supported way to use it is
through its "python_modular" interface.

Hope that helps,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

From dwinsemius at comcast.net Wed Sep 7 17:19:22 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Wed, 7 Sep 2011 11:19:22 -0400
Subject: [R] (no subject)
In-Reply-To:
References:
Message-ID: <1E3B7D39-892C-469F-BDAB-C78318F28CEE@comcast.net>

On Sep 7, 2011, at 6:04 AM, Varsha Agrawal wrote:

> The code looks like this:
> L1=list(a=1,b=2,c=3)
> f1=as.factor(c)
> L1[[f1]] returns 1
>
> What happens if we give a factor as an index at a list?

In a virgin session of R there would be no 'c' for the second line to
work with. And since 'c' is a function you would probably get an error.

> L1=list(a=1,b=2,c=3)
> f1=as.factor(L1$c)
> f1
[1] 3
Levels: 3
> L1[[f1]]  #returns 1
[1] 1

It is ill-mannered to attach an object and not tell us that was done.

So why is "3" as an index returning 1? Because the internal
representation of a factor was used by the parser. Furthermore even the
usual dodge of wrapping as.character around f1 will not succeed since f1
didn't record where it got the value of "3", and there is no L1["3"] or
L1[["3"]] only an L1[[3]] and an L1[3]

> L1[[as.character(f1)]]
NULL

So you need the "full-court [factor] press":

> L1[[as.numeric(as.character(f1))]]
[1] 3

David Winsemius, MD
West Hartford, CT

From jvadams at usgs.gov Wed Sep 7 17:19:28 2011
From: jvadams at usgs.gov (Jean V Adams)
Date: Wed, 7 Sep 2011 10:19:28 -0500
Subject: [R] What happens if we give a factor as an index at a list?
In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dwinsemius at comcast.net Wed Sep 7 17:22:16 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Wed, 7 Sep 2011 11:22:16 -0400 Subject: [R] Change properties of line summary in interaction.plot In-Reply-To: <1315403170104-3796133.post@n4.nabble.com> References: <1315094013620-3788614.post@n4.nabble.com> <1315403170104-3796133.post@n4.nabble.com> Message-ID: On Sep 7, 2011, at 9:46 AM, dadrivr wrote: > Can anybody help me with this? > Your chances of a meaningful reply would increase considerably if you first read the Posting Guide and also followed the other guidance seen at the bottom of every rhelp posting. > -- > View this message in context: http://r.789695.n4.nabble.com/Change-properties-of-line-summary-in-interaction-plot-tp3788614p3796133.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT From bps0002 at auburn.edu Wed Sep 7 17:24:01 2011 From: bps0002 at auburn.edu (B77S) Date: Wed, 7 Sep 2011 08:24:01 -0700 (PDT) Subject: [R] reshaping data In-Reply-To: References: <1315406131128-3796246.post@n4.nabble.com> Message-ID: <1315409041182-3796399.post@n4.nabble.com> The terminology (melt, cast, recast) just isn't intuitive to me; but I understand how to use melt now. Thanks! Justin Haynes wrote: > > look at the melt function in reshape, specifically ?melt.data.frame > > require(reshape) > Raw.melt<-melt(RawData,id.vars='Year',variable_name='Month') > > there is an additional feature in the melt function for handling na > values. 
> names(Raw.melt)[3]<-'CO2' > >> head(Raw.melt) > Year Month CO2 > 1 1958 J NA > 2 1959 J 315.58 > 3 1960 J 316.43 > 4 1961 J 316.89 > 5 1962 J 317.94 > 6 1963 J 318.74 >> > > you can order your data.frame if you'd like > > Raw.melt<-Raw.melt[order(Raw.melt$Year,Raw.melt$Month),] > >> head(Raw.melt) > Year Month CO2 > 1 1958 J NA > 48 1958 F NA > 95 1958 M 315.71 > 142 1958 A 317.45 > 189 1958 M.1 317.50 > 236 1958 J.1 NA > > > On Wed, Sep 7, 2011 at 7:35 AM, B77S <bps0002 at auburn.edu> wrote: > >> I have the following data (see RawData using dput below) >> >> How do I get it in the following 3 column format (CO2 measurements are >> the >> elements of the original data frame). I'm sure the package reshape is >> where >> I should look, but I haven't figured out how. >> >> Thanks ahead of time >> >> Month Year CO2 >> J 1958 >> F 1958 >> M 1958 315.71 >> A 1958 317.45 >> M.1 1958 317.5 >> J.1 1958 >> J.2 1958 315.86 >> A.1 1958 314.93 >> S 1958 313.19 >> O 1958 >> N 1958 313.34 >> D 1958 314.67 >> J 1959 315.58 >> F 1959 316.47 >> >> >> # here is the data >> >> RawData <- structure(list(Year = c(1958, 1959, 1960, 1961, 1962, 1963, >> 1964, >> 1965, 1966, 1967, 1968, 1969, 1970, 1971, 1972, 1973, 1974, 1975, >> 1976, 1977, 1978, 1979, 1980, 1981, 1982, 1983, 1984, 1985, 1986, >> 1987, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, >> 1998, 1999, 2000, 2001, 2002, 2003, 2004), J = c(NA, 315.58, >> 316.43, 316.89, 317.94, 318.74, 319.57, 319.44, 320.62, 322.33, >> 322.57, 324, 325.06, 326.17, 326.77, 328.54, 329.35, 330.4, 331.74, >> 332.92, 334.97, 336.23, 338.01, 339.23, 340.75, 341.37, 343.7, >> 344.97, 346.29, 348.02, 350.43, 352.76, 353.66, 354.72, 355.98, >> 356.7, 358.36, 359.96, 362.05, 363.18, 365.32, 368.15, 369.14, >> 370.28, 372.43, 374.68, 376.79), F = c(NA, 316.47, 316.97, 317.7, >> 318.56, 319.08, NA, 320.44, 321.59, 322.5, 323.15, 324.42, 325.98, >> 326.68, 327.63, 329.56, 330.71, 331.41, 332.56, 333.42, 335.39, >> 336.76, 338.36, 340.47, 
341.61, 342.52, 344.51, 346, 346.96, >> 348.47, 351.72, 353.07, 354.7, 355.75, 356.72, 357.16, 358.91, >> 361, 363.25, 364, 366.15, 368.87, 369.46, 371.5, 373.09, 375.63, >> 377.37), M = c(315.71, 316.65, 317.58, 318.54, 319.69, 319.86, >> NA, 320.89, 322.39, 323.04, 323.89, 325.64, 326.93, 327.18, 327.75, >> 330.3, 331.48, 332.04, 333.5, 334.7, 336.64, 337.96, 340.08, >> 341.38, 342.7, 343.1, 345.28, 347.43, 347.86, 349.42, 352.22, >> 353.68, 355.39, 357.16, 357.81, 358.38, 359.97, 361.64, 364.03, >> 364.57, 367.31, 369.59, 370.52, 372.12, 373.52, 376.11, 378.41 >> ), A = c(317.45, 317.71, 319.03, 319.48, 320.58, 321.39, NA, >> 322.13, 323.7, 324.42, 325.02, 326.66, 328.13, 327.78, 329.72, >> 331.5, 332.65, 333.31, 334.58, 336.07, 337.76, 338.89, 340.77, >> 342.51, 343.56, 344.94, 347.08, 348.35, 349.55, 350.99, 353.59, >> 355.42, 356.2, 358.6, 359.15, 359.46, 361.26, 363.45, 364.72, >> 366.35, 368.61, 371.14, 371.66, 372.87, 374.86, 377.65, 380.52 >> ), M.1 = c(317.5, 318.29, 320.03, 320.58, 321.01, 322.24, 322.23, >> 322.16, 324.07, 325, 325.57, 327.38, 328.07, 328.92, 330.07, >> 332.48, 333.09, 333.96, 334.87, 336.74, 338.01, 339.47, 341.46, >> 342.91, 344.13, 345.75, 347.43, 348.93, 350.21, 351.84, 354.22, >> 355.67, 357.16, 359.34, 359.66, 360.28, 361.68, 363.79, 365.41, >> 366.79, 369.29, 371, 371.82, 374.02, 375.55, 378.35, 380.63), >> J.1 = c(NA, 318.16, 319.59, 319.78, 320.61, 321.47, 321.89, >> 321.87, 323.75, 324.09, 325.36, 326.7, 327.66, 328.57, 329.09, >> 332.07, 332.25, 333.59, 334.34, 336.27, 337.89, 339.29, 341.17, >> 342.25, 343.35, 345.32, 346.79, 348.25, 349.54, 351.25, 353.79, >> 355.13, 356.22, 358.24, 359.25, 359.6, 360.95, 363.26, 364.97, >> 365.62, 368.87, 370.35, 371.7, 373.3, 375.4, 378.13, 379.57 >> ), J.2 = c(315.86, 316.55, 318.18, 318.58, 319.61, 319.74, >> 320.44, 321.21, 322.4, 322.55, 324.14, 325.89, 326.35, 327.37, >> 328.05, 330.87, 331.18, 331.91, 333.05, 334.93, 336.54, 337.73, >> 339.56, 340.49, 342.06, 343.99, 345.4, 346.56, 
347.94, 349.52, >> 352.39, 353.9, 354.82, 356.17, 357.03, 357.57, 359.55, 361.9, >> 363.65, 364.47, 367.64, 369.27, 370.12, 371.62, 374.02, 376.62, >> 377.79), A.1 = c(314.93, 314.8, 315.91, 316.79, 317.4, 317.77, >> 318.7, 318.87, 320.37, 320.92, 322.11, 323.67, 324.69, 325.43, >> 326.32, 329.31, 329.4, 330.06, 330.94, 332.75, 334.68, 336.09, >> 337.6, 338.43, 339.82, 342.39, 343.28, 344.69, 345.91, 348.1, >> 350.44, 351.67, 352.91, 354.03, 355, 355.52, 357.49, 359.46, >> 361.49, 362.51, 365.77, 366.94, 368.12, 369.55, 371.49, 374.5, >> 375.86), S = c(313.19, 313.84, 314.16, 314.99, 316.26, 316.21, >> 316.7, 317.81, 318.64, 319.26, 320.33, 322.38, 323.1, 323.36, >> 324.84, 327.51, 327.44, 328.56, 329.3, 331.58, 332.76, 333.91, >> 335.88, 336.69, 337.97, 339.86, 341.07, 343.09, 344.86, 346.44, >> 348.72, 349.8, 350.96, 352.16, 353.01, 353.7, 355.84, 358.06, >> 359.46, 360.19, 363.9, 364.63, 366.62, 367.96, 370.71, 372.99, >> 374.06), O = c(NA, 313.34, 313.83, 315.31, 315.42, 315.99, >> 316.87, 317.3, 318.1, 319.39, 320.25, 321.78, 323.07, 323.56, >> 325.2, 327.18, 327.37, 328.34, 328.94, 331.16, 332.54, 333.86, >> 336.01, 336.85, 337.86, 339.99, 341.35, 342.8, 344.17, 346.36, >> 348.88, 349.99, 351.18, 352.21, 353.31, 353.98, 355.99, 357.75, >> 359.6, 360.77, 364.23, 365.12, 366.73, 368.09, 370.24, 373, >> 374.24), N = c(313.34, 314.81, 315, 316.1, 316.69, 317.07, >> 317.68, 318.87, 319.79, 320.72, 321.32, 322.85, 324.01, 324.8, >> 326.5, 328.16, 328.46, 329.49, 330.31, 332.4, 333.92, 335.29, >> 337.1, 338.36, 339.26, 341.16, 342.98, 344.24, 345.66, 347.81, >> 350.07, 351.3, 352.83, 353.75, 354.16, 355.33, 357.58, 359.56, >> 360.76, 362.43, 365.46, 366.67, 368.29, 369.68, 372.08, 374.35, >> 375.86), D = c(314.67, 315.59, 316.19, 317.01, 317.69, 318.36, >> 318.71, 319.42, 321.03, 321.96, 322.9, 324.12, 325.13, 326.01, >> 327.55, 328.64, 329.58, 330.76, 331.68, 333.85, 334.95, 336.73, >> 338.21, 339.61, 340.49, 342.99, 344.22, 345.56, 346.9, 348.96, >> 351.34, 
352.53, 354.21, 354.99, 355.4, 356.8, 359.04, 360.7, >> 362.33, 364.28, 366.97, 368.01, 369.53, 371.24, 373.78, 375.7, >> 377.48)), .Names = c("Year", "J", "F", "M", "A", "M.1", "J.1", >> "J.2", "A.1", "S", "O", "N", "D"), row.names = c(NA, -47L), class = >> "data.frame") >> >> >> -- >> View this message in context: >> http://r.789695.n4.nabble.com/reshaping-data-tp3796246p3796246.html >> Sent from the R help mailing list archive at Nabble.com. >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- View this message in context: http://r.789695.n4.nabble.com/reshaping-data-tp3796246p3796399.html Sent from the R help mailing list archive at Nabble.com. From dwinsemius at comcast.net Wed Sep 7 17:33:24 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Wed, 7 Sep 2011 11:33:24 -0400 Subject: [R] Problem with by statement for spaghetti plots In-Reply-To: <1315403132956-3796131.post@n4.nabble.com> References: <1315088680053-3788536.post@n4.nabble.com> <1315403132956-3796131.post@n4.nabble.com> Message-ID: <5163A5D6-2024-4920-91FD-D837E61C593A@comcast.net> On Sep 7, 2011, at 9:45 AM, dadrivr wrote: > Bump, please help! When I encounter one of these Nabble "bump" messages doing my moderation duties I simply discard it. This is not a website. This is a mailing list. If you didn't get an answer it suggests that your example was not sufficiently clear. 
Looking back on my mail-client it appears you offered a link to a commercial website with no clear data target. I (foolishly) followed that link and got only (hopefully) ... a blank page. Now read the Posting Guide and produce a workable example before "bumping" your request. > > -- > View this message in context: http://r.789695.n4.nabble.com/Problem-with-by-statement-for-spaghetti-plots-tp3788536p3796131.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT From ccberry at ucsd.edu Wed Sep 7 17:35:21 2011 From: ccberry at ucsd.edu (Charles Berry) Date: Wed, 7 Sep 2011 15:35:21 +0000 Subject: [R] process id of an R script References: <1315407841.19310.YahooMailNeo@web65702.mail.ac4.yahoo.com> Message-ID: Mikkel Grum yahoo.com> writes: > > I have a script that runs as a cron job every minute (on Ubuntu 10.10 and R 2.11.1), querying a database for new > data. Most of the time it takes a few seconds to run, but once in while it takes more than a minute and the next > run starts (on the same data) before the previous one has finished. In extreme cases this will fill up > memory with a large number of runs of the same script on the same data. My 'solution' has been to create a > process id file with the currently running script, first checking whether there is another process id > file and whether that process is still running. 
I use the following code: > > pid <- max(system("pgrep -x R", intern = TRUE)) > if (file.exists("/var/run/myscript.pid")) { > rm(pid) > pid <- read.table("/var/run/myscript.pid")[[1]] > if (length(system(paste("ps -p", pid), intern = TRUE)) != 2) { > stop("Myscript is already running in another process.") > } else { > pid <- max(system("pgrep -x R", intern = TRUE)) > write(pid, "/var/run/myscript.pid") > } > } else { > write(pid, "/var/run/myscript.pid") > } > > ....my script ..... > > file.remove("/var/run/myscript.pid") > #The End > > The trouble here is that I also have other R scripts running on the same system, so > while max(system("pgrep -x R", intern = TRUE)) will almost always give me the right pid, it is not > guaranteed to work. There are two situations where it could fail: when the process id numbers round 32000 > and start over again, and if another process starts up at the same time, the process ids could get swapped. > > Is there a way to query for the process id of the specific R script, rather than all R processes? Yes. Following the posting guide, you try something like ?process which will tell you to try ??process which will list base::Rdconv Utilities for Processing Rd Files base::Sys.getpid Get the Process ID of the R Session [rest deleted] Sys.getpid seems to be the relevant function. HTH, Chuck From spencer.graves at structuremonitoring.com Wed Sep 7 18:49:04 2011 From: spencer.graves at structuremonitoring.com (Spencer Graves) Date: Wed, 07 Sep 2011 09:49:04 -0700 Subject: [R] How does one start R within Emacs/ESS with root privileges? In-Reply-To: <4E675F72.7050605@erasmusmc.nl> References: <4E673131.9000201@erasmusmc.nl> <4E67331F.80005@knmi.nl> <4E675F72.7050605@erasmusmc.nl> Message-ID: <4E67A080.1040900@structuremonitoring.com> Under Vista and Windows 7, I install R in a local directory "pgms" so I never have to worry about permissions. Under Linux, I use "su" not "sudo". Then I call R and install.packages.
Then I quit R and "su" and continue with what I want. hope this helps. Spencer On 9/7/2011 5:11 AM, Karl Brand wrote: > Cheers Paul. > > Its a very good point. Although i am curious how badly i can damage my > R install by running as root. I always ran R in windows with admin. > privileges without problems (touch wood). Probably best to never find > out by sticking with user privileges. > > However, even for taking care of R install/maint. i'd prefer to do > this interactively within Emacs rather than the terminal. Motivated by > this, i'd still like to find out how to invoke R with root privileges. > > I've also reposted the original email on perhaps a more appropriate > forum at: ESS-help at stat.math.ethz.ch > > Karl > > > On 2011-09-07 11:02, Paul Hiemstra wrote: >> On 09/07/2011 08:54 AM, Karl Brand wrote: >>> Esteemed UseRs and DevelopeRs, >>> >>> Apologies if this question belongs else where, but it does concern R's >>> package installation/maintenance. >>> >>> How does one start R within Emacs/ESS with root privileges? >>> >>> I tried without success: >>> >>>> M-x sudo R >>> >>> Why i'm motivated to do so: >>> >>> It seems logical to me, as the only user of the PC, to keep my R >>> library consolidated in the universal library rather than splitting >>> into universal and user libraries. Hence the desire to run R as root. >> >> Hi Karl, >> >> Why the need to install packages in root? As you are the only user there >> is not reason to install them system wide (to make them available to all >> users, which is just you). Installing the packages in your homedir >> solves your problem much easier, without the need to run R as root >> continuously. I think you should not run anything as root if it is not >> absolutely needed, which could potentially damage your system >> (accidentally overwriting something). 
>> >> hope this helps, >> Paul >> >>> >>> In addition, it's nice to be able to install packages 'on the fly' >>> when and as needed and not need to launch a separate R session (as >>> root) in the terminal just to install a package. >>> >>> Migrating from windows, i'm completely new to linux (ubuntu) and am >>> seeing for myself if Emacs/ESS is as good as it's purported to be. So >>> maybe my motivation is nonsensical to experienced ESS/R users. If so >>> i'd really appreciate tips on efficient package >>> installation/maintenance using Emacs/ESS. >>> >>> TIA, >>> >>> karl >>> > -- Spencer Graves, PE, PhD President and Chief Technology Officer Structure Inspection and Monitoring, Inc. 751 Emerson Ct. San José, CA 95126 ph: 408-655-4567 web: www.structuremonitoring.com From bhh at xs4all.nl Wed Sep 7 18:54:21 2011 From: bhh at xs4all.nl (Berend Hasselman) Date: Wed, 7 Sep 2011 09:54:21 -0700 (PDT) Subject: [R] Generalizing call to function In-Reply-To: References: Message-ID: <1315414461910-3796637.post@n4.nabble.com> . wrote: > > Hello guys, > > I would like to ask for help to understand what is going on in > "func2". My plan is to generalize "func1", so that the same > results are expected in "func2" as in "func1". Executing "func1" returns... > > 0.25 with absolute error < 8.4e-05 > > But for "func2" I get... > > Error in dpois(1, 0.1, 23.3065168689948, 0.000429064542600244, > 3.82988398013855, : > unused argument(s) (0.000429064542600244, 3.82988398013855, > 0.00261104515224461, 1.37999516465199, 0.0072464022020844, > 0.673787740945863, 0.0148414691931943, 0.383193602946711, > 0.0260964690514175, 0.236612585866545, 0.0422631787036055, > 0.152456705113438, 0.0655923922306948) > > Thanks in advance.
> > func1 <- function(y, a, rate) { > f1 <- function(n, y, a, rate) { > lambda <- a * n > dexp(n, rate) * dpois(y, lambda) > } > integrate(f1, 0, Inf, y, a, rate) > } > > func1(1, 0.1, 0.1) > > > func2 <- function(y, a, rate, samp) { > f1 <- function(n, y, a, rate, samp) { > > SampDist <- function(y, a, n, samp) { > lambda <- a * n > dcom <- paste("d", samp, sep="") > dots <- as.list(c(y, lambda)) > do.call(dcom, dots) > } > > dexp(n, rate) * SampDist(y, a, n, samp) > } > integrate(f1, 0, Inf, y, a, rate, samp) > } > > func2(1, 0.1, 0.1, "pois") > You need to replace the line with "dots <- as.list(c(y, lambda))" with this dots <- list(y, lambda) The reason is that integrate() calls f1 with a vector of n values, so lambda is a vector; c(y, lambda) flattens everything into one long vector, and as.list() then turns each element into a separate argument to dpois, which is what the "unused argument(s)" error is complaining about. You could also replace the two lines dots <- as.list(c(y, lambda)) do.call(dcom, dots) with the single line do.call(dcom, list(y, lambda)) See the description of the argument "args" of do.call. You should now get the same answer as for func1. Deciding if this is correct is up to you. Berend -- View this message in context: http://r.789695.n4.nabble.com/Generalizing-call-to-function-tp3793329p3796637.html Sent from the R help mailing list archive at Nabble.com. From juliet.hannah at gmail.com Wed Sep 7 19:02:24 2011 From: juliet.hannah at gmail.com (Juliet Hannah) Date: Wed, 7 Sep 2011 13:02:24 -0400 Subject: [R] error building package: packaging into .tar.gz failed In-Reply-To: References: Message-ID: To follow up (because I received a few emails off-list), things work now. I'm not sure what I did differently. In case it is helpful: I reinstalled R tools, and edited my path so that C:\Rtools\bin;C:\Rtools\perl\bin;C:\Rtools\MinGW\bin;C:\Program files\R\R-2.13.1\bin;C:\Program Files\HTML Help Workshop was at the beginning. With this, my attempts at package creation worked. On Thu, Jun 30, 2011 at 12:51 PM, Juliet Hannah wrote: > I am trying to build a package using windows xp. Here is the error I am getting: > > R CMD build myfunctions > > * checking for file 'myfunctions/DESCRIPTION' ...
OK > * preparing 'myfunctions': > * checking DESCRIPTION meta-information ... OK > * checking for LF line-endings in source and make files > * checking for empty or unneeded directories > * building 'myfunctions_1.0.tar.gz' > ERROR > packaging into .tar.gz failed > > Could anyone suggest possible things to check? Thanks. > >> sessionInfo() > R version 2.13.0 (2011-04-13) > Platform: i386-pc-mingw32/i386 (32-bit) > > locale: > [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United > States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C > [5] LC_TIME=English_United States.1252 > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > From RachelL at kff.org Wed Sep 7 19:16:17 2011 From: RachelL at kff.org (Rachel Licata) Date: Wed, 7 Sep 2011 17:16:17 +0000 Subject: [R] 3-Way Crosstab using survey package Message-ID: <25C82EB10823D745A6EA0BA3153F72A80200C4@SVR-DC-VSEXC03.kff.org> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From rvaradhan at jhmi.edu Wed Sep 7 19:37:03 2011 From: rvaradhan at jhmi.edu (Ravi Varadhan) Date: Wed, 7 Sep 2011 17:37:03 +0000 Subject: [R] Hessian matrix issue Message-ID: <2F9EA67EF9AE1C48A147CB41BE2E15C3069E4D@DOM-EB-MAIL2.win.ad.jhu.edu> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From david at revolutionanalytics.com Wed Sep 7 19:54:11 2011 From: david at revolutionanalytics.com (David Smith) Date: Wed, 7 Sep 2011 10:54:11 -0700 Subject: [R] Revolutions Blog: August Roundup Message-ID: I write about R every weekday at the Revolutions blog: http://blog.revolutionanalytics.com and every month I post a summary of articles from the previous month of particular interest to readers of r-help.
In case you missed them, here are some articles related to R from the month of August: A contest to showcase applications of R for businesses is offering $20,000 in prizes from Revolution Analytics: http://bit.ly/qufEjy Three new open-source packages integrating R and Hadoop will be introduced by Revolution Analytics' CTO David Champagne in a webinar on September 21: http://bit.ly/n9V1mw Dirk Eddelbuettel will present live one-day master classes on programming with Rcpp in New York (Sep 24) and San Francisco (Oct 8): http://bit.ly/pXyjgY Luke Tierney announced at JSM that R 2.14 will be faster, with the byte compiler used for base and recommended packages: http://bit.ly/puVhJS Three Google employees talk at JSM about how they use R: http://bit.ly/qfXNrE Survey respondents at JSM consider themselves data scientists, expect usage of R and Revolution R to grow: http://bit.ly/obEbZ1 An open-source analyst profiles Revolution Analytics and remarks on big-data applications of R: http://bit.ly/pkB9gC An R user at ANZ bank in Australia talks about how he uses R for credit risk analysis: http://bit.ly/mZTRZc Two grad students at University of Michigan use R to determine what factors most influence the selection committee for the Hockey Hall of Fame: http://bit.ly/pnvbtZ FastCompany published an article on "telling stories with data", featuring two websites that often use R, FlowingData and the OkTrends blog: http://bit.ly/nLWIVA News from the Revolution Analytics August newsletter: http://bit.ly/nexaW6 You can install Emacs with the ESS interface to R on Windows and Macs in less than 2 minutes: http://bit.ly/ribZjy I gave a talk at useR! 
on the R Ecosystem: the R project, the R community, and companies using and working with R: http://bit.ly/nLatKo Brian Ripley gave some insights into R's development process, and the future of R, in his talk at useR!: http://bit.ly/qIhT0T A profile of R-core member Martyn Plummer: http://bit.ly/oN1Fuk Joseph Rickert uses the RevoScaleR package to look at the residuals from a large linear model: http://bit.ly/oNSMFU Roundups of various talks given at the useR! 2011 conference from me: http://bit.ly/oFMuXF and several other attendees: http://bit.ly/qqmLj7 In a tongue-in-cheek post, Business Intelligence analyst Steve Miller "complains" that there's too much new stuff in R: http://bit.ly/oQfuhE The rdatamarket package makes it easy to download more than 100M time series for use in R: http://bit.ly/ovCTHb , and there are many other packages to bring data into R as well: http://bit.ly/qsBq6c The slides and replay from the recent Revolution Analytics webinar, 100% R and More, are available for download: http://bit.ly/pGZYcD Jeroen Ooms' new project, OpenCPU, lets you embed live R graphics in web pages: http://bit.ly/nbj5Xc An analysis of the R source tree reveals that about 50% of R is written in C, while R packages on CRAN are about 50% R: http://bit.ly/pngg3S A new white paper by Norman Nie looks at the impact of statistical analysis methodology on working with Big Data: http://bit.ly/psdaTZ Other non-R-related stories in the past month included: a really bad infographic (http://bit.ly/mUDZm0 ), exploring abandoned metro tunnels (http://bit.ly/nkgDgH ), a stunning 360-degree view of space (http://bit.ly/p3cUk5 ) and parkour videos (http://bit.ly/njR0Ch ). There is a new R user group (http://bit.ly/eC5YQe ) at the University of Utah (http://bit.ly/oTo2xc ). 
Meeting times for these groups can be found on the updated R Community Calendar at: http://bit.ly/bb3naW If you're looking for more articles about R, you can find summaries from previous months at http://blog.revolutionanalytics.com/roundups/. Join the Revolution mailing list at http://revolutionanalytics.com/newsletter to be alerted to new articles on a monthly basis. As always, thanks for the comments and please keep sending suggestions to me at david at revolutionanalytics.com . Don't forget you can also follow the blog using an RSS reader like Google Reader, or by following me on Twitter (I'm @revodavid). Cheers, # David -- David M Smith VP of Marketing, Revolution Analytics http://blog.revolutionanalytics.com Tel: +1 (650) 646-9523 (Palo Alto, CA, USA) From rvaradhan at jhmi.edu Wed Sep 7 20:30:25 2011 From: rvaradhan at jhmi.edu (Ravi Varadhan) Date: Wed, 7 Sep 2011 18:30:25 +0000 Subject: [R] Hessian matrix issue In-Reply-To: <2F9EA67EF9AE1C48A147CB41BE2E15C3069E4D@DOM-EB-MAIL2.win.ad.jhu.edu> References: <2F9EA67EF9AE1C48A147CB41BE2E15C3069E4D@DOM-EB-MAIL2.win.ad.jhu.edu> Message-ID: <2F9EA67EF9AE1C48A147CB41BE2E15C3069EC0@DOM-EB-MAIL2.win.ad.jhu.edu> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From zugi.young at gmail.com Wed Sep 7 20:57:11 2011 From: zugi.young at gmail.com (zugi young) Date: Wed, 7 Sep 2011 14:57:11 -0400 Subject: [R] reporting ANOVA for nested models Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From jwiley.psych at gmail.com Wed Sep 7 21:35:38 2011 From: jwiley.psych at gmail.com (Joshua Wiley) Date: Wed, 7 Sep 2011 12:35:38 -0700 Subject: [R] Possible to access a USB volume by name in windows In-Reply-To: References: Message-ID: <2CC57576-D574-4A08-ACA6-AB0C33A73E42@gmail.com> An embedded and charset-unspecified text was scrubbed...
Name: not available URL: From bjensvold at hawesfinancial.com Wed Sep 7 17:38:33 2011 From: bjensvold at hawesfinancial.com (Brian Jensvold) Date: Wed, 7 Sep 2011 08:38:33 -0700 Subject: [R] rpart/tree issue Message-ID: I am trying to create a classification tree using either tree or rpart but when it comes to plotting the results the formatting I get is different than what I see in all the tutorials. What I would like to see is the XX/XX format but all I get is a weird decimal value. I was also wondering how you know which is yes and which is no in each leaf of the tree? Is yes always on the left? ******************************** CONFIDENTIALITY NOTICE ******************************** This message (including any attachments) is intended only for the use of the individual or entity to which it is addressed and may contain information that is non-public, proprietary, privileged, confidential, and exempt from disclosure under applicable law or may constitute as attorney work product. If you are not the intended recipient, you are hereby notified that any use, dissemination, distribution, or copying of this communication is strictly prohibited. If you have received this communication in error, notify us immediately by telephone at (541)343-5641 and (i) destroy this message if a facsimile or (ii) delete this message immediately if this is an electronic communication. Thank you. ******************************** From berryboessenkool at hotmail.com Wed Sep 7 19:53:51 2011 From: berryboessenkool at hotmail.com (Berry Boessenkool) Date: Wed, 7 Sep 2011 19:53:51 +0200 Subject: [R] access objects Message-ID: hi, say I have consecutively numbered objects obj1, obj2, ... in my R workspace. I want to acces one of them inside a function, with the number given as an argument. Where can I find help on how to do that? Somebody must have been trying to do this before... Some keywords to start a search are appreciated as well. 
Here's an example, I hope it clarifies what I'm trying to do: obj1 <- 7:9 obj2 <- 6:2 testf <- function(k) plot( noquote(paste("obj", k, sep="")) ) testf(1) # should plot obj1 ------------------------------------- Berry Boessenkool D-14476 Potsdam (OT Golm) ------------------------------------- From sbearer at TNC.ORG Wed Sep 7 19:59:20 2011 From: sbearer at TNC.ORG (Scott Bearer) Date: Wed, 7 Sep 2011 13:59:20 -0400 Subject: [R] Subsetting does not remove unwanted data in table Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dadrivr at gmail.com Wed Sep 7 21:34:42 2011 From: dadrivr at gmail.com (dadrivr) Date: Wed, 7 Sep 2011 12:34:42 -0700 (PDT) Subject: [R] Problem with by statement for spaghetti plots In-Reply-To: <5163A5D6-2024-4920-91FD-D837E61C593A@comcast.net> References: <1315088680053-3788536.post@n4.nabble.com> <1315403132956-3796131.post@n4.nabble.com> <5163A5D6-2024-4920-91FD-D837E61C593A@comcast.net> Message-ID: <1315424082175-3797025.post@n4.nabble.com> Sorry, I thought the link would work for people because it is a public link and it works for me when I run it in R. Anyways, here is an example set of data that I am having trouble with: id <- c(230017,230017,230017,230018,230018,230018,230019,230019,230019,230020,230020, 230020,230021,230021,230021,230022,230022,230022,230023,230023,230023,230024, 230024,230024,230025,230025,230025,230026,230026,230026) age <- rep(c(30,36,42),10) outcome <- c(12,17,10,5,5,2,NA,NA,NA,8,6,5,11,13,10,15,11,15,13,NA,9,0,0,0,20, 14,16,1,2,2) mydata <- as.data.frame(cbind(id,age,outcome)) fit <- by(mydata, mydata$id, function(x) fitted.values(lm(outcome ~ age, data=x))) fit1 <- unlist(fit) names(fit1) <- NULL interaction.plot(mydata$age, mydata$id, fit1,legend=F) Note that the following works fine: lm(outcome ~ age, data=mydata) Any help would be greatly appreciated. Thanks!
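A minimal sketch of one way around the problem, assuming the trouble comes from the missing outcomes: id 230019 has no non-missing outcome at all, so lm() has nothing to fit, and id 230023 has one NA, so its fitted values are shorter than its rows and the combined vector no longer lines up with mydata. fit_one() is a hypothetical helper, not part of any package, that pads the fitted values with NA.

```r
id <- rep(230017:230026, each = 3)
age <- rep(c(30, 36, 42), 10)
outcome <- c(12, 17, 10, 5, 5, 2, NA, NA, NA, 8, 6, 5, 11, 13, 10, 15, 11, 15,
             13, NA, 9, 0, 0, 0, 20, 14, 16, 1, 2, 2)
mydata <- data.frame(id, age, outcome)

# Hypothetical helper: per-id fitted values, NA where nothing can be fitted
fit_one <- function(d) {
  out <- rep(NA_real_, nrow(d))
  ok <- !is.na(d$outcome) & !is.na(d$age)
  if (sum(ok) >= 2) {                      # a line needs at least two points
    out[ok] <- fitted(lm(outcome ~ age, data = d[ok, ]))
  }
  out
}

# split() / unsplit() keep the fitted values aligned with the rows of mydata
mydata$fit <- unsplit(lapply(split(mydata, mydata$id), fit_one), mydata$id)

# One fitted line per id; ids without usable data simply draw nothing
interaction.plot(mydata$age, mydata$id, mydata$fit, legend = FALSE)
```

Because the result is stored as a column of mydata, it stays the same length as age and id, which is what interaction.plot() needs.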
-- View this message in context: http://r.789695.n4.nabble.com/Problem-with-by-statement-for-spaghetti-plots-tp3788536p3797025.html Sent from the R help mailing list archive at Nabble.com. From dadrivr at gmail.com Wed Sep 7 21:46:23 2011 From: dadrivr at gmail.com (dadrivr) Date: Wed, 7 Sep 2011 12:46:23 -0700 (PDT) Subject: [R] Change properties of line summary in interaction.plot In-Reply-To: References: <1315094013620-3788614.post@n4.nabble.com> <1315403170104-3796133.post@n4.nabble.com> Message-ID: <1315424783131-3797044.post@n4.nabble.com> Here's an example: id <- c(17,17,17,18,18,18,19,19,19,20,20,20,21,21,21,22,22,22,23,23,23,24, 24,24,25,25,25,26,26,26) age <- rep(c(30,36,42),10) outcome <- c(12,17,10,5,5,2,NA,NA,NA,8,6,5,11,13,10,15,11,15,13,NA,9,0,0,0,20, 14,16,1,2,2) mydata <- as.data.frame(cbind(id,age,outcome)) interaction.plot(mydata$age, mydata$id, mydata$outcome, fun = mean, legend = FALSE, lty = 1, xtick = TRUE, type = "l") How can I make the 'mean' summary line red and thicker? Thanks! -- View this message in context: http://r.789695.n4.nabble.com/Change-properties-of-line-summary-in-interaction-plot-tp3788614p3797044.html Sent from the R help mailing list archive at Nabble.com. From michael.grant at Colorado.EDU Wed Sep 7 21:46:28 2011 From: michael.grant at Colorado.EDU (Michael Grant) Date: Wed, 7 Sep 2011 13:46:28 -0600 Subject: [R] Testing non-exhaustive Null and Alternative Hypothesis Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From gleynes+r at gmail.com Wed Sep 7 17:30:11 2011 From: gleynes+r at gmail.com (Gene Leynes) Date: Wed, 7 Sep 2011 10:30:11 -0500 Subject: [R] Possible to access a USB volume by name in windows In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed...
Name: not available URL: From john.4man at gmail.com Wed Sep 7 20:45:59 2011 From: john.4man at gmail.com (John Foreman) Date: Wed, 7 Sep 2011 14:45:59 -0400 Subject: [R] randomForest memory footprint Message-ID: Hello, I am attempting to train a random forest model using the randomForest package on 500,000 rows and 8 columns (7 predictors, 1 response). The data set is the first block of data from the UCI Machine Learning Repo dataset "Record Linkage Comparison Patterns" with the slight modification that I dropped two columns with lots of NA's and I used knn imputation to fill in other gaps. When I load in my dataset, R uses no more than 100 megs of RAM. I'm running a 64-bit R with ~4 gigs of RAM available. When I execute the randomForest() function, however I get memory complaints. Example:
> summary(mydata1.clean[,3:10])
  cmp_fname_c1     cmp_lname_c1     cmp_sex          cmp_bd           cmp_bm           cmp_by           cmp_plz           is_match
 Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.00000   FALSE:572820
 1st Qu.:0.2857   1st Qu.:0.1000   1st Qu.:1.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.00000   TRUE :  2093
 Median :1.0000   Median :0.1818   Median :1.0000   Median :0.0000   Median :0.0000   Median :0.0000   Median :0.00000
 Mean   :0.7127   Mean   :0.3156   Mean   :0.9551   Mean   :0.2247   Mean   :0.4886   Mean   :0.2226   Mean   :0.00549
 3rd Qu.:1.0000   3rd Qu.:0.4286   3rd Qu.:1.0000   3rd Qu.:0.0000   3rd Qu.:1.0000   3rd Qu.:0.0000   3rd Qu.:0.00000
 Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.00000
> mydata1.rf.model2 <- randomForest(x = mydata1.clean[,3:9],y=mydata1.clean[,10],ntree=100)
Error: cannot allocate vector of size 877.2 Mb
In addition: Warning messages:
1: In dim(data) <- dim : Reached total allocation of 3992Mb: see help(memory.size)
2: In dim(data) <- dim : Reached total allocation of 3992Mb: see help(memory.size)
3: In dim(data) <- dim : Reached total allocation of 3992Mb: see help(memory.size)
4: In dim(data) <- dim : Reached total allocation of 3992Mb: see help(memory.size)
Other techniques such as boosted trees handle the data size just fine. Are there any parameters I can adjust such that I can use a value of 100 or more for ntree? Thanks, John From kristian.langgaard.lind at gmail.com Wed Sep 7 21:50:12 2011 From: kristian.langgaard.lind at gmail.com (Kristian Lind) Date: Wed, 7 Sep 2011 21:50:12 +0200 Subject: [R] Imposing Feller condition using project constraint in spg Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From bbolker at gmail.com Wed Sep 7 21:52:43 2011 From: bbolker at gmail.com (Ben Bolker) Date: Wed, 7 Sep 2011 19:52:43 +0000 Subject: [R] Subsetting does not remove unwanted data in table References: Message-ID: Scott Bearer TNC.ORG> writes: > > Dear all, > > This relatively routine analysis has left me frustrated and in a rut. I > have a dataset (data1), which I subset in order to remove rows where > HabitatDensity="Med". This dataset looks correct when I call it up, > however, when I create a table out of the new subset (data2), my table > continues to show the "Med" information as 0. assuming you're using a recent version of R (2.13.+), ?droplevels otherwise library(gdata) ?drop.levels (you probably want reorder=FALSE) From michael.weylandt at gmail.com Wed Sep 7 21:54:48 2011 From: michael.weylandt at gmail.com (R.
Michael Weylandt) Date: Wed, 7 Sep 2011 14:54:48 -0500 Subject: [R] access objects In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dwinsemius at comcast.net Wed Sep 7 22:16:33 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Wed, 7 Sep 2011 16:16:33 -0400 Subject: [R] Change properties of line summary in interaction.plot In-Reply-To: <1315424783131-3797044.post@n4.nabble.com> References: <1315094013620-3788614.post@n4.nabble.com> <1315403170104-3796133.post@n4.nabble.com> <1315424783131-3797044.post@n4.nabble.com> Message-ID: On Sep 7, 2011, at 3:46 PM, dadrivr wrote: > Here's an example: > /id <- > c(17,17,17,18,18,18,19,19,19,20,20,20,21,21,21,22,22,22,23,23,23,24, > 24,24,25,25,25,26,26,26) > age <- rep(c(30,36,42),10) > outcome <- > c(12,17,10,5,5,2,NA,NA,NA,8,6,5,11,13,10,15,11,15,13,NA,9,0,0,0,20, > 14,16,1,2,2) > mydata <- as.data.frame(cbind(id,age,outcome)) > interaction.plot(mydata$age, mydata$id, mydata$outcome, fun = mean, > legend = > FALSE, lty = 1, xtick = TRUE, type = "l")/ > > How can I make the 'mean' summary line red and thicker? What "mean summary line"? I count 8 lines and that matches the number if id's with complete data. > > Thanks! > > -- > View this message in context: http://r.789695.n4.nabble.com/Change-properties-of-line-summary-in-interaction-plot-tp3788614p3797044.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
David Winsemius, MD West Hartford, CT From murdoch.duncan at gmail.com Wed Sep 7 22:25:57 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Wed, 07 Sep 2011 16:25:57 -0400 Subject: [R] Testing non-exhaustive Null and Alternative Hypothesis In-Reply-To: References: Message-ID: <4E67D355.50504@gmail.com> On 07/09/2011 3:46 PM, Michael Grant wrote: > I wish to test the hypothesis of mu equal to or less than 5 against the specific alternative mu equal to or greater than 7. I am unable to find how to persuade R to do this with any function (e.g. t.test). Suggestions? According to the standard hypothesis testing theory, that test is identical to the alternative of mu > 5, so t.test would be fine (if it's okay at all, you don't say distributional assumptions). If you want to use a different basis for the test, you need to say what it is. Duncan Murdoch From djmuser at gmail.com Wed Sep 7 23:07:21 2011 From: djmuser at gmail.com (Dennis Murphy) Date: Wed, 7 Sep 2011 14:07:21 -0700 Subject: [R] Problem with by statement for spaghetti plots In-Reply-To: <1315424082175-3797025.post@n4.nabble.com> References: <1315088680053-3788536.post@n4.nabble.com> <1315403132956-3796131.post@n4.nabble.com> <5163A5D6-2024-4920-91FD-D837E61C593A@comcast.net> <1315424082175-3797025.post@n4.nabble.com> Message-ID: Hi: Your code doesn't work, but if you want to do spaghetti plots, they're rather easy to do in ggplot2 or lattice. The first problem you have is that one group in your test data has all NA responses, so by() or any other groupwise summarization function is going to choke unless you give it a way to bypass that group. In other words, you need to write a function that checks for bad cases and then 'skips over' them if found. I needed some practice in using try(), so I gave this a shot. 
id <- c(230017,230017,230017,230018,230018,230018,230019,230019,230019,230020,230020, 230020,230021,230021,230021,230022,230022,230022,230023,230023,230023,230024, 230024,230024,230025,230025,230025,230026,230026,230026) age <- rep(c(30,36,42), 10) outcome <- c(12,17,10,5,5,2,NA,NA,NA,8,6,5,11,13,10,15,11,15,13,NA,9,0,0,0,20, 14,16,1,2,2) mydata <- data.frame(id,age,outcome) # Checks to make sure that each group has at least two non-NA observations, # otherwise it can't fit a line. The code assumes that the x-variable has no NAs; # if this is not the case in your real data, then you need to check that at least # two observations (i.e., (x, y) pairs) have no NAs in each subgroup. checklm <- function(d) { if(sum(is.na(d$outcome)) > 1) stop('not enough values to fit a line') lm(outcome ~ age, data = d) } # Now use the plyr package to run the models; the function to be applied # checks for fealty before fitting the model; if the error message is thrown, # it returns an error message with class 'try-error' as the output for that list # component. library('plyr') # takes a data frame as input, a list of model objects as output # id is the grouping variable. In the function, d represents the # sub-data frame associated with a particular id u <- dlply(mydata, .(id), function(d) try(checklm(d), TRUE)) # remove the 'bad' components v <- u[!sapply(u, function(x) class(x) == 'try-error')] # return the age, outcome and fitted values for each group w <- ldply(v, function(d) data.frame(d$model, yhat = fitted(d))) head(w) # Use ggplot2 and lattice to do the spaghetti plots: # the group variable allows individual line plots by id library('ggplot2') ggplot(w, aes(x = age, y = outcome, group = .id)) + geom_line() library('lattice') xyplot(outcome ~ age, data = w, group = .id, type = 'l', col = 1) You can substitute yhat for outcome if you wish. 
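For comparison, here is a minimal base-R sketch of the same per-id fitting step, not part of the original post. It assumes the same test data as above; the names fit_one, pieces and w2 are made up for illustration. Instead of counting NAs it counts complete (age, outcome) pairs, so it also covers groups with more than three rows or with NAs in age.

```r
# Same test data as in the post above
id      <- rep(230017:230026, each = 3)
age     <- rep(c(30, 36, 42), 10)
outcome <- c(12,17,10,5,5,2,NA,NA,NA,8,6,5,11,13,10,15,11,15,13,NA,9,0,0,0,20,
             14,16,1,2,2)
mydata  <- data.frame(id, age, outcome)

# Fit a line for one id; return NULL when fewer than two complete
# (age, outcome) pairs are available, since a line cannot be fitted then.
fit_one <- function(d) {
  ok <- complete.cases(d[, c("age", "outcome")])
  if (sum(ok) < 2) return(NULL)
  fit <- lm(outcome ~ age, data = d[ok, ])
  data.frame(id = d$id[ok], age = d$age[ok], outcome = d$outcome[ok],
             yhat = unname(fitted(fit)), row.names = NULL)
}

pieces <- lapply(split(mydata, mydata$id), fit_one)
pieces <- Filter(Negate(is.null), pieces)   # drop the ids that were skipped
w2     <- do.call(rbind, pieces)
head(w2)
```

The resulting w2 plays the role of w in the plotting calls, with the grouping column named id rather than .id.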
HTH, Dennis On Wed, Sep 7, 2011 at 12:34 PM, dadrivr wrote: > Sorry, I thought the link would work for people because it is a public link > and it works for me when I run it in R. ?Anyways, here is an example set of > data that I am having trouble with: > > /id <- > c(230017,230017,230017,230018,230018,230018,230019,230019,230019,230020,230020, > > 230020,230021,230021,230021,230022,230022,230022,230023,230023,230023,230024, > ? ? ? ?230024,230024,230025,230025,230025,230026,230026,230026) > age <- rep(c(30,36,42),10) > outcome <- > c(12,17,10,5,5,2,NA,NA,NA,8,6,5,11,13,10,15,11,15,13,NA,9,0,0,0,20, > ? ? ? ? ? ? 14,16,1,2,2) > mydata <- as.data.frame(cbind(id,age,outcome)) > > fit <- by(mydata, mydata$id, function(x) fitted.values(lm(outcome ~ age, > data=x))) > fit1 <- unlist(fit) > names(fit1) <- NULL > interaction.plot(mydata$age, mydata$id, fit1,legend=F)/ > > Note that the following works fine: > /lm(outcome ~ age, data=mydata)/ > > Any help would be greatly appreciated. ?Thanks! > > -- > View this message in context: http://r.789695.n4.nabble.com/Problem-with-by-statement-for-spaghetti-plots-tp3788536p3797025.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From daniel at umd.edu Wed Sep 7 23:19:11 2011 From: daniel at umd.edu (Daniel Malter) Date: Wed, 7 Sep 2011 14:19:11 -0700 (PDT) Subject: [R] reporting ANOVA for nested models In-Reply-To: References: Message-ID: <1315430351822-3797302.post@n4.nabble.com> fit1, the unrestricted model, includes 1 more regressor than fit2, the restricted model. 
Testing the models against each other means that fit2 is "equal to" fit1, assuming that the coefficient on the additional regressor that is included in fit1 is restricted to zero. Your F-test is thus whether this one extra regressor is significantly different from zero, and the F-test is therefore on 1 and n-k degrees of freedom, where 1 is the number of restrictions, n is the number of observations and k is the number of coefficients in the unrestricted model. So unless you do something a little more "spectacular" than just adding one regressor and checking its significance, this is understood I would say. That said, I am not a psychologist. HTH, Daniel zugi young wrote: > > I have the following results for an ANOVA comparing two nested models. I > wasn't sure how I am supposed to report this result in the area of > psychology. Specifically, am I supposed to report the DF's or just the F > ratio? I could manually calculate the degrees of freedoms, but there must > be > a reason why R does not give this information, i.e. those are not > conventionally used in the reporting? > > Any pointers would be greatly appreciated. > >> anova(fit1, fit2) > Analysis of Variance Table > > Model 1: fit1 > Model 2: fit2 > Res.Df RSS Df Sum of Sq F Pr(>F) > 1 373 19.908 > 2 374 30.717 -1 -10.809 202.53 < 2.2e-16 *** > --- > Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 > > [[alternative HTML version deleted]] > > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- View this message in context: http://r.789695.n4.nabble.com/reporting-ANOVA-for-nested-models-tp3796957p3797302.html Sent from the R help mailing list archive at Nabble.com. 
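The degrees-of-freedom argument in the reply above can be checked on simulated data. The sketch below is only an illustration: the data, the variable names (x1, x2, y) and the model names are invented and have nothing to do with the original poster's fit1 and fit2. It shows where the 1 and n-k degrees of freedom sit in the anova() output, and that with a single restriction the F statistic is the square of the t statistic on the extra regressor.

```r
set.seed(1)
n  <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 + 0.5 * x2 + rnorm(n)

fit_restricted   <- lm(y ~ x1)        # coefficient on x2 restricted to zero
fit_unrestricted <- lm(y ~ x1 + x2)   # one extra regressor

# Putting the smaller model first gives a positive Df in the table
tab <- anova(fit_restricted, fit_unrestricted)
print(tab)

num_df <- tab$Df[2]                       # number of restrictions
den_df <- df.residual(fit_unrestricted)   # n - k, here 100 - 3
F_stat <- tab$F[2]

# With one restriction, F equals the squared t statistic on x2
t_x2 <- summary(fit_unrestricted)$coefficients["x2", "t value"]
all.equal(F_stat, t_x2^2)
```

A report would then quote the statistic as F(num_df, den_df) together with its p-value.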
From saptarshi.guha at gmail.com Wed Sep 7 23:20:57 2011 From: saptarshi.guha at gmail.com (Saptarshi Guha) Date: Wed, 7 Sep 2011 14:20:57 -0700 Subject: [R] Using substitute on a function parameter Message-ID: Hello, I would like to write a function where substitute operates on the parameter, but ... > Expression = function(o,l) substitute(o, l) > Expression({x=.(FOO)}, list(FOO=2)) o How do i get substitute to work on the contents of o. Regards Saptarshi From wdunlap at tibco.com Wed Sep 7 23:29:48 2011 From: wdunlap at tibco.com (William Dunlap) Date: Wed, 7 Sep 2011 21:29:48 +0000 Subject: [R] Using substitute on a function parameter In-Reply-To: References: Message-ID: I have used do.call("substitute", ...) to work around the fact that substitute does not evaluate its first argument: R> z <- quote(func(arg)) R> do.call("substitute", list(z, list(func=quote(myFunction), arg=as.name("myArgument")))) myFunction(myArgument) S+'s substitute (following S version 4) has a third argument, evaluate, which controls whether the first argument is evaluated or not: S+> z <- quote(func(arg)) S+> substitute(z, list(func=quote(myFunction), arg=as.name("myArgument")), evaluate=TRUE) myFunction(myArgument) Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Saptarshi Guha > Sent: Wednesday, September 07, 2011 2:21 PM > To: R-help at r-project.org > Subject: [R] Using substitute on a function parameter > > Hello, > > I would like to write a function where substitute operates on the > parameter, but ... > > > > Expression = function(o,l) substitute(o, l) > > Expression({x=.(FOO)}, list(FOO=2)) > o > > How do i get substitute to work on the contents of o. 
> > Regards > Saptarshi > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From djmuser at gmail.com Wed Sep 7 23:32:15 2011 From: djmuser at gmail.com (Dennis Murphy) Date: Wed, 7 Sep 2011 14:32:15 -0700 Subject: [R] Problem with by statement for spaghetti plots In-Reply-To: References: <1315088680053-3788536.post@n4.nabble.com> <1315403132956-3796131.post@n4.nabble.com> <5163A5D6-2024-4920-91FD-D837E61C593A@comcast.net> <1315424082175-3797025.post@n4.nabble.com> Message-ID: Hi: Having just seen your follow-up question, here's how it could be done in ggplot2 or lattice: # ggplot2 version: ggplot(w, aes(x = age, y = outcome)) + geom_line(aes(group = .id)) + stat_summary(fun.y = 'mean', geom = 'line', color = 'blue', size = 2) # lattice version: xyplot(outcome ~ age, data = w, group = .id, panel = function(x, y, groups, ...) { panel.xyplot(x, y, groups = groups, type = 'l', col = 1, ...) panel.lmline(x, y, col = 'blue', lwd = 2) } ) Dennis On Wed, Sep 7, 2011 at 2:07 PM, Dennis Murphy wrote: > Hi: > > Your code doesn't work, but if you want to do spaghetti plots, they're > rather easy to do in ggplot2 or lattice. > > The first problem you have is that one group in your test data has all > NA responses, so by() or any other groupwise summarization function is > going to choke unless you give it a way to bypass that group. In other > words, you need to write a function that checks for bad cases and then > 'skips over' them if found. I needed some practice in using try(), so > I gave this a shot. > > id <- c(230017,230017,230017,230018,230018,230018,230019,230019,230019,230020,230020, > ? ? ? ?230020,230021,230021,230021,230022,230022,230022,230023,230023,230023,230024, > ? ? ? 
?230024,230024,230025,230025,230025,230026,230026,230026) > age <- rep(c(30,36,42), 10) > outcome <- c(12,17,10,5,5,2,NA,NA,NA,8,6,5,11,13,10,15,11,15,13,NA,9,0,0,0,20, > ? ? ? ? ? ? 14,16,1,2,2) > mydata <- data.frame(id,age,outcome) > > # Checks to make sure that each group has at least two non-NA observations, > # otherwise it can't fit a line. The code assumes that the x-variable > has no NAs; > # if this is not the case in your real data, then you need to check > that at least > # two observations (i.e., (x, y) pairs) have no NAs in each subgroup. > checklm <- function(d) { > ? ?if(sum(is.na(d$outcome)) > 1) stop('not enough values to fit a line') > ? ?lm(outcome ~ age, data = d) > ? } > > # Now use the plyr package to run the models; the function to be applied > # checks for fealty before fitting the model; if the error message is thrown, > # it returns an error message with class 'try-error' as the output for that list > # component. > > library('plyr') > # takes a data frame as input, a list of model objects as output > # id is the grouping variable. In the function, d represents the > # sub-data frame associated with a particular id > u <- dlply(mydata, .(id), function(d) try(checklm(d), TRUE)) > # remove the 'bad' components > v <- u[!sapply(u, function(x) class(x) == 'try-error')] > # return the age, outcome and fitted values for each group > w <- ldply(v, function(d) data.frame(d$model, yhat = fitted(d))) > head(w) > > # Use ggplot2 and lattice to do the spaghetti plots: > # the group variable allows individual line plots by id > > library('ggplot2') > ggplot(w, aes(x = age, y = outcome, group = .id)) + geom_line() > > library('lattice') > xyplot(outcome ~ age, data = w, group = .id, type = 'l', col = 1) > > You can substitute yhat for outcome if you wish. > > HTH, > Dennis > > On Wed, Sep 7, 2011 at 12:34 PM, dadrivr wrote: >> Sorry, I thought the link would work for people because it is a public link >> and it works for me when I run it in R. 
?Anyways, here is an example set of >> data that I am having trouble with: >> >> /id <- >> c(230017,230017,230017,230018,230018,230018,230019,230019,230019,230020,230020, >> >> 230020,230021,230021,230021,230022,230022,230022,230023,230023,230023,230024, >> ? ? ? ?230024,230024,230025,230025,230025,230026,230026,230026) >> age <- rep(c(30,36,42),10) >> outcome <- >> c(12,17,10,5,5,2,NA,NA,NA,8,6,5,11,13,10,15,11,15,13,NA,9,0,0,0,20, >> ? ? ? ? ? ? 14,16,1,2,2) >> mydata <- as.data.frame(cbind(id,age,outcome)) >> >> fit <- by(mydata, mydata$id, function(x) fitted.values(lm(outcome ~ age, >> data=x))) >> fit1 <- unlist(fit) >> names(fit1) <- NULL >> interaction.plot(mydata$age, mydata$id, fit1,legend=F)/ >> >> Note that the following works fine: >> /lm(outcome ~ age, data=mydata)/ >> >> Any help would be greatly appreciated. ?Thanks! >> >> -- >> View this message in context: http://r.789695.n4.nabble.com/Problem-with-by-statement-for-spaghetti-plots-tp3788536p3797025.html >> Sent from the R help mailing list archive at Nabble.com. >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > From dwinsemius at comcast.net Wed Sep 7 23:32:48 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Wed, 7 Sep 2011 17:32:48 -0400 Subject: [R] Using substitute on a function parameter In-Reply-To: References: Message-ID: <5854ABD2-966E-4828-B376-9550F49738B7@comcast.net> On Sep 7, 2011, at 5:20 PM, Saptarshi Guha wrote: > Hello, > > I would like to write a function where substitute operates on the > parameter, but ... > > >> Expression = function(o,l) substitute(o, l) >> Expression({x=.(FOO)}, list(FOO=2)) > o > > How do i get substitute to work on the contents of o. I suggest you look at the code of 'bquote'. 
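A short sketch of both suggestions in this thread, offered as an illustration rather than as anyone's definitive solution. The first wrapper, Expression, follows the do.call("substitute", ...) idea and so takes a plain symbol (FOO) in the expression; the second, Expression2 (a made-up name), relies on bquote() and keeps the .(FOO) notation from the question.

```r
# Approach 1: capture the unevaluated argument with substitute(o), then hand
# that captured expression to substitute() via do.call so it is substituted into.
Expression <- function(o, l) {
  do.call("substitute", list(substitute(o), l))
}
e1 <- Expression({x = FOO}, list(FOO = 2))
print(e1)

# Approach 2: bquote() replaces anything wrapped in .() with its value,
# looked up in the environment given as the second argument.
Expression2 <- function(o, l) {
  eval(call("bquote", substitute(o), list2env(l)))
}
e2 <- Expression2({x = .(FOO)}, list(FOO = 2))
print(e2)
```

In both cases the result is an unevaluated call in which FOO has been replaced by 2.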
You seem to be reinventing the wheel. > > Regards > Saptarshi > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT From pdalgd at gmail.com Wed Sep 7 23:35:51 2011 From: pdalgd at gmail.com (peter dalgaard) Date: Wed, 7 Sep 2011 23:35:51 +0200 Subject: [R] reporting ANOVA for nested models In-Reply-To: References: Message-ID: <60FFB9D5-CB1A-4F41-B5C9-4E34233769E0@gmail.com> On Sep 7, 2011, at 20:57 , zugi young wrote: > I have the following results for an ANOVA comparing two nested models. I > wasn't sure how I am supposed to report this result in the area of > psychology. Specifically, am I supposed to report the DF's or just the F > ratio? I could manually calculate the degrees of freedoms, but there must be > a reason why R does not give this information, i.e. those are not > conventionally used in the reporting? > > Any pointers would be greatly appreciated. What do you mean by "compute" degrees of freedom? It would seem rather clear from the output that the F test is on (1, 373) d.f. (the denominator d.f. is taken from the largest of the two models). As for what to reporting the d.f., some journals do like to see them, mostly to allow the reader to sanity-check the results. In models which contain variance components (or should have contained them), the denominator d.f. can be revealing. As can they if you committed everyone's favorite blunder, category codes used as quantitative variables. >> anova(fit1, fit2) > Analysis of Variance Table > > Model 1: fit1 > Model 2: fit2 > Res.Df RSS Df Sum of Sq F Pr(>F) > 1 373 19.908 > 2 374 30.717 -1 -10.809 202.53 < 2.2e-16 *** > --- > Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 
1 > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com "D?den skal tape!" --- Nordahl Grieg From arrayprofile at yahoo.com Wed Sep 7 21:53:22 2011 From: arrayprofile at yahoo.com (array chip) Date: Wed, 7 Sep 2011 12:53:22 -0700 (PDT) Subject: [R] suggestion for proportions In-Reply-To: <077E31A57DA26E46AB0D493C9966AC730C359B468C@UM-MAIL4112.unimaas.nl> References: <1315383072.86607.YahooMailNeo@web125805.mail.ne1.yahoo.com> <4E671B96020000CB00095795@med-webappgwia1.medicine.umaryland.edu> <077E31A57DA26E46AB0D493C9966AC730C359B463C@UM-MAIL4112.unimaas.nl> <077E31A57DA26E46AB0D493C9966AC730C359B468C@UM-MAIL4112.unimaas.nl> Message-ID: <1315425202.24963.YahooMailNeo@web125819.mail.ne1.yahoo.com> Hi all, thanks very much for sharing your thoughts. and sorry for my describing the problem not clearly, my fault. My data is paired, that is 2 different diagnostic tests were performed on the same individuals. Each individual will have a test results from each of the 2 tests. Then in the end, 2 accuracy rates were calculated for the 2 tests. And I want to test if there is a significant difference in the accuracy (proportion) between the 2 tests. My understanding is that prop.test() is appropriate for 2 independent proportions, ?whereas in my situation, the 2 proportions are not independent calculated from "paired" data, right? the data would look like: pid ? test1 ? ?test2 p1 ? ? ?1 ? ? ? ? 0 p2 ? ? ?1 ? ? ? ? 1 p3 ? ? ?0 ? ? ? ? 
1 : : 1=test is correct; 0=not correct from the data above, we can calculate accuracy for test1 and test2, then to compare.... So mcnemar.test() is good for that, right? Thanks John ----- Original Message ----- From: Viechtbauer Wolfgang (STAT) To: "r-help at r-project.org" Cc: Bert Gunter Sent: Wednesday, September 7, 2011 8:14 AM Subject: Re: [R] suggestion for proportions Indeed, the original post leaves some room for interpretation. In any case, I hope the OP has enough information now to figure out what approach is best for his data. Best, Wolfgang > -----Original Message----- > From: Bert Gunter [mailto:gunter.berton at gene.com] > Sent: Wednesday, September 07, 2011 16:47 > To: Viechtbauer Wolfgang (STAT) > Cc: r-help at r-project.org; John Sorkin > Subject: Re: [R] suggestion for proportions > > Wolfgang: > > On Wed, Sep 7, 2011 at 7:28 AM, Viechtbauer Wolfgang (STAT) > wrote: > > Acutally, > > > > ?mcnemar.test > > > > since it is paired data. > > Actually, it is unclear to me from the OP's message whether this is the > case. > > In one sentence the OP says that the _number_ of samples is the same, > and in the next he says that "essentially" the samples are the same. > So, as usual, imprecision in the problem description leads to > imprecision in the solution. > > But your point is well taken, of course. > > -- Bert ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. From donsmark at gmail.com Wed Sep 7 22:09:19 2011 From: donsmark at gmail.com (Jesper) Date: Wed, 7 Sep 2011 13:09:19 -0700 (PDT) Subject: [R] Application of results from smooth.spline outside R In-Reply-To: References: <1314259389525-3767610.post@n4.nabble.com> Message-ID: <1315426159713-3797118.post@n4.nabble.com> Hi Jean, Thanks for the reply.. 
Using your suggestion, I end up in the source code (Fortran 77 I believe). At first look, it seems a bit more tedious to implement than I expected. -- View this message in context: http://r.789695.n4.nabble.com/Application-of-results-from-smooth-spline-outside-R-tp3767610p3797118.html Sent from the R help mailing list archive at Nabble.com. From shuklvineet at gmail.com Wed Sep 7 23:07:36 2011 From: shuklvineet at gmail.com (Vineet Shukla) Date: Wed, 7 Sep 2011 16:07:36 -0500 Subject: [R] finding events in a time duration. Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From R.I.Maidment at pgr.reading.ac.uk Wed Sep 7 23:25:11 2011 From: R.I.Maidment at pgr.reading.ac.uk (Ross Maidment) Date: Wed, 7 Sep 2011 21:25:11 +0000 Subject: [R] Editing the variables attributes section in the netCDF header of netCDF files created using the package ncdf. Message-ID: <72EEC39730D50046AAFEB719280AAF6A03380320@DB3PRD0104MB105.eurprd01.prod.exchangelabs.com> Hi, I am using the package ncdf to create netCDF files and I want to mimic the header of an existing netCDF file created outside of R. 
Below is what the existing header looks like (part of it that is different): netcdf ccd1984_05_08 { dimensions: lat = 1974 ; lon = 1894 ; time = UNLIMITED ; // (1 currently) variables: int time(time) ; time:long_name = "time" ; time:units = "days since 1984-05-01 0:0:0" ; time:day_begins = "06:15" ; double lat(lat) ; lat:long_name = "latitude" ; lat:standard_name = "latitude" ; lat:units = "degrees_north" ; lat:axis = "Y" ; double lon(lon) ; lon:long_name = "longitude" ; lon:standard_name = "longitude" ; lon:units = "degrees_east" ; lon:axis = "X" ; byte ccd(time, lat, lon) ; ccd:Unit_duration = 7.5f ; ccd:units = "Unit_duration mins" ; ccd:long_name = "Cold Cloud Duration" ; ccd:short_name = "ccd" ; ccd:_FillValue = -1b ; And here is my attempt when using the ncdf package: netcdf ccd1983_04_08_r { dimensions: lon = 1894 ; lat = 1974 ; time = UNLIMITED ; // (1 currently) variables: double lon(lon) ; lon:units = "" ; double lat(lat) ; lat:units = "" ; double time(time) ; time:units = "days since 1900-01-01" ; byte ccd(time, lat, lon) ; ccd:units = "" ; ccd:missing_value = -99b ; ccd:long_name = "Cold Cloud Duration" ; I am unable to replicate the variables attributes that exist in the first example in the second example. Is there anyway to transfer across the header information or edit the ncdf package to do this? Thanks in advance, Ross (PhD Student) From yanwei.song at gmail.com Wed Sep 7 23:18:42 2011 From: yanwei.song at gmail.com (Yanwei Song) Date: Wed, 7 Sep 2011 17:18:42 -0400 Subject: [R] Fwd: FSelector and RWeka problem References: <334E4969-301B-4FC1-935B-5A97C9BF74C7@gmail.com> Message-ID: <1932C3DD-A455-4769-BABD-C9EBAB2BF38D@gmail.com> An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From berryboessenkool at hotmail.com Wed Sep 7 22:13:25 2011 From: berryboessenkool at hotmail.com (Berry Boessenkool) Date: Wed, 7 Sep 2011 22:13:25 +0200 Subject: [R] access objects In-Reply-To: References: , Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dadrivr at gmail.com Wed Sep 7 22:49:00 2011 From: dadrivr at gmail.com (dadrivr) Date: Wed, 7 Sep 2011 13:49:00 -0700 (PDT) Subject: [R] Change properties of line summary in interaction.plot In-Reply-To: References: <1315094013620-3788614.post@n4.nabble.com> <1315403170104-3796133.post@n4.nabble.com> <1315424783131-3797044.post@n4.nabble.com> Message-ID: <1315428540542-3797231.post@n4.nabble.com> David Winsemius wrote: > What "mean summary line"? I count 8 lines and that matches the number > if id's with complete data. Yea, I don't know why there is no mean summary line showing up. I requested it in the interaction.plot statement (fun = mean), and I don't get any errors. Any ideas? -- View this message in context: http://r.789695.n4.nabble.com/Change-properties-of-line-summary-in-interaction-plot-tp3788614p3797231.html Sent from the R help mailing list archive at Nabble.com. From natemiller77 at gmail.com Wed Sep 7 23:48:16 2011 From: natemiller77 at gmail.com (Nathan Miller) Date: Wed, 7 Sep 2011 14:48:16 -0700 Subject: [R] ggplot2-Issue placing error bars behind data points Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From jdnewmil at dcn.davis.ca.us Wed Sep 7 23:48:43 2011 From: jdnewmil at dcn.davis.ca.us (Jeff Newmiller) Date: Wed, 07 Sep 2011 14:48:43 -0700 Subject: [R] finding events in a time duration. In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From dwinsemius at comcast.net Wed Sep 7 23:49:59 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Wed, 7 Sep 2011 17:49:59 -0400 Subject: [R] Change properties of line summary in interaction.plot In-Reply-To: <1315428540542-3797231.post@n4.nabble.com> References: <1315094013620-3788614.post@n4.nabble.com> <1315403170104-3796133.post@n4.nabble.com> <1315424783131-3797044.post@n4.nabble.com> <1315428540542-3797231.post@n4.nabble.com> Message-ID: <422AE96F-0BE1-4548-9E60-A74A1BC572B1@comcast.net> On Sep 7, 2011, at 4:49 PM, dadrivr wrote: > > David Winsemius wrote: >> What "mean summary line"? I count 8 lines and that matches the number >> if id's with complete data. > > Yea, I don't know why there is no mean summary line showing up. I > requested > it in the interaction.plot statement (fun = mean), and I don't get any > errors. Any ideas? You didn't get any errors and you _did_ get the means. It's just that you only gave it a dataset that had only one value per category. The mean is defined for a single element vector although the sd and var are not. Look more closely at the example on the help page and you will see that it has more than one value per category defined by the two interaction variables > > -- > View this message in context: http://r.789695.n4.nabble.com/Change-properties-of-line-summary-in-interaction-plot-tp3788614p3797231.html You are still failing to include context. I've seen the Nabble interface. I KNOW that it is not that difficult to include context. > Sent from the R help mailing list archive at Nabble.com. > > ______________ David Winsemius, MD West Hartford, CT From jdnewmil at dcn.davis.ca.us Wed Sep 7 23:52:05 2011 From: jdnewmil at dcn.davis.ca.us (Jeff Newmiller) Date: Wed, 07 Sep 2011 14:52:05 -0700 Subject: [R] ggplot2-Issue placing error bars behind data points In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From michael.weylandt at gmail.com Wed Sep 7 23:52:42 2011 From: michael.weylandt at gmail.com (R. Michael Weylandt ) Date: Wed, 7 Sep 2011 16:52:42 -0500 Subject: [R] finding events in a time duration. In-Reply-To: References: Message-ID: I think (no promises) roll.apply() from the zoo package with sum can count occurrences in whatever time period. Then use a logical test to identify sufficiently active periods. Hope this helps, Michael Weylandt On Sep 7, 2011, at 4:07 PM, Vineet Shukla wrote: > Hi, > > Premises: I have a database which contain the list of events and their time > stamps (This is a Unix time stamps) > > What I want to do : I want know how much is the maximum occurrence of this > in any a time period of 7 days or does a event occur es more than "N" (say > 5) times in a period of 7 days. > This time period is not fixed with "week > boundary", its a period of 7 days occurring at any time. > > > Question : How it can be done in R. is there a package which can be helpful > ? if yes then how I can use it. > > > Rgds, > Vineet > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From dpierce at ucsd.edu Wed Sep 7 23:56:13 2011 From: dpierce at ucsd.edu (David William Pierce) Date: Wed, 7 Sep 2011 14:56:13 -0700 Subject: [R] Editing the variables attributes section in the netCDF header of netCDF files created using the package ncdf. In-Reply-To: <72EEC39730D50046AAFEB719280AAF6A03380320@DB3PRD0104MB105.eurprd01.prod.exchangelabs.com> References: <72EEC39730D50046AAFEB719280AAF6A03380320@DB3PRD0104MB105.eurprd01.prod.exchangelabs.com> Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From shuklvineet at gmail.com Wed Sep 7 23:59:49 2011 From: shuklvineet at gmail.com (Vineet Shukla) Date: Wed, 7 Sep 2011 16:59:49 -0500 Subject: [R] finding events in a time duration. In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From spencer.graves at structuremonitoring.com Thu Sep 8 00:00:53 2011 From: spencer.graves at structuremonitoring.com (Spencer Graves) Date: Wed, 07 Sep 2011 15:00:53 -0700 Subject: [R] Application of results from smooth.spline outside R In-Reply-To: <1315426159713-3797118.post@n4.nabble.com> References: <1314259389525-3767610.post@n4.nabble.com> <1315426159713-3797118.post@n4.nabble.com> Message-ID: <4E67E995.5020604@structuremonitoring.com> Hi, Jesper: Where "Outside R" do you want to use it? There are R interfaces to many other software packages, and many that cannot easily link to R can link to Fortran. You could use library(sos) to search for R packages to connect to whatever. Hope this helps. Spencer On 9/7/2011 1:09 PM, Jesper wrote: > Hi Jean, > > Thanks for the reply.. > > Using your suggestion, I end up in in the source code (Fortran 77 i > believe). At first look, it seems a bit more tedious to implement than I > expected. > > -- > View this message in context: http://r.789695.n4.nabble.com/Application-of-results-from-smooth-spline-outside-R-tp3767610p3797118.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
> From dale.coons at gmail.com Thu Sep 8 00:07:16 2011 From: dale.coons at gmail.com (Dale Coons) Date: Wed, 07 Sep 2011 18:07:16 -0400 Subject: [R] rgl 'how-to's Message-ID: <4E67EB14.6070109@gmail.com> Doing a visual graphic and trying to make it "pretty" Here's simple chart to play with: library(rgl) dotframe<-data.frame(x=c(0,7,0,0,-7,0),y=c(0,0,7,0,0,-7),z=c(7,0,0,-7,0,0)) dotframe plot3d(dotframe$x,dotframe$y,dotframe$z, radius=3, type='s',col=c('red','green','blue','purple','orange','gray'), axes=FALSE, box=FALSE, xlab='',ylab='',zlab='') text3d(x=7,y=0,z=0, text="hello, world",adj = 0.5, color="blue") #adds a label at one of the points My questions: 1) is there a way to label the points (spheres in this case) so that the label 'stays on top'? other than text3d(), which adds labels, but they are hidden when the graph is rotated? 2) can a bitmap, say, of a company or university be inserted into the title area? 3) can a bitmap be used as the marker for a point? Thanks in advance for help-I learn a lot from others questions and appreciate direction (even if it's RTFM!) Dale From mbmiller+l at gmail.com Thu Sep 8 00:25:20 2011 From: mbmiller+l at gmail.com (Mike Miller) Date: Wed, 7 Sep 2011 17:25:20 -0500 Subject: [R] storage and single-precision Message-ID: I'm getting the impression from on-line docs that R cannot work with single-precision floating-point numbers, but that it has a pseudo-mode for single precision for communication with external programs. I don't mind that R is using doubles internally, but what about storage? If all I need to store is single-precision (32-bit), can I do that? When it is read back into R it can be converted from single to double (back to 64-bit). Furthermore, the data are numbers from 0.000 to 2.000 with no missing values that could be stored just as accurately as unsigned 16-bit integers from 0 to 2000. That would be the best plan for me. 
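A minimal sketch of the storage idea just described, assuming base R's writeBin() and readBin() are acceptable and that the values really are multiples of 0.001 between 0 and 2. The example data, file names and variable names are made up for illustration.

```r
# Made-up example values between 0.000 and 2.000 with three decimals.
x <- c(0, 0.001, 0.5, 1.234, 2)

f <- tempfile(fileext = ".bin")   # hypothetical file name

# Store as 16-bit integers in 0..2000 (2 bytes per value).
writeBin(as.integer(round(x * 1000)), f, size = 2L)

# Read back as unsigned 16-bit integers and rescale to doubles.
n <- file.size(f) / 2
x16 <- readBin(f, what = "integer", n = n, size = 2L, signed = FALSE) / 1000

# Alternative: store as 4-byte single precision.
# R converts the values back to double when reading.
g <- tempfile(fileext = ".bin")
writeBin(x, g, size = 4L)
x32 <- readBin(g, what = "numeric", n = length(x), size = 4L)
```

The 16-bit version takes half the space of the single-precision version and round-trips these values, while the single-precision copy is only accurate to roughly seven significant digits. In both cases the numbers are ordinary doubles once they are back in R.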
It looks like the ff package allows for additional formats, so I might try to use ff, but I still would like to get a better understanding of R's native capabilities in regard to representations of numbers both in RAM and in stored data files.

Thanks in advance.

Best,
Mike

--
Michael B. Miller, Ph.D.
Minnesota Center for Twin and Family Research
Department of Psychology
University of Minnesota

From djmuser at gmail.com Thu Sep 8 00:33:01 2011
From: djmuser at gmail.com (Dennis Murphy)
Date: Wed, 7 Sep 2011 15:33:01 -0700
Subject: [R] ggplot2-Issue placing error bars behind data points
In-Reply-To:
References:
Message-ID:

Hi:

For your test data, try this:

library('ggplot2')

# Result of dput(NerveSurv)
NerveSurv <- structure(list(Time = c(0, 0.25, 0.5, 0.75, 1, 1.25, 1.5, 0,
0.25, 0.5, 0.75, 1, 1.25, 1.5, 0, 0.25, 0.5, 0.75, 1, 1.25, 1.5
), SAP = c(1, 1.04, 1.04, 1.06, 1.04, 1.22, 1.01, 1, 1.01, 1.01,
1.06, 1.01, 0.977, 0.959, 1, 1.01, 1.06, 0.921, 0.951, 0.904,
0.911), Temp = c(25L, 25L, 25L, 25L, 25L, 25L, 25L, 15L, 15L,
15L, 15L, 15L, 15L, 15L, 8L, 8L, 8L, 8L, 8L, 8L, 8L), SAPSE = c(0,
0.0412, 0.0935, 0.0818, 0.131, 0.144, 0.0712, 0, 0.0156, 0.0337,
0.0481, 0.0168, 0.00486, 0.0155, 0, 0.00462, 0.0491, 0.0329,
0.0304, 0.0471, 0.0722)), .Names = c("Time", "SAP", "Temp", "SAPSE"
), class = "data.frame", row.names = c(NA, -21L))

limits <- aes(ymax = NerveSurv$SAP + NerveSurv$SAPSE,
              ymin = NerveSurv$SAP - NerveSurv$SAPSE)

p <- ggplot(data=NerveSurv,aes(x=Time,y=SAP))
p + geom_errorbar(limits,width=0.2) +
    geom_point(aes(fill=factor(Temp)), colour = colors()[173],
               shape = 21, size=4) +
    xlab("Time (min)") +
    xlim(c(0, 2)) +
    ylab("Normalized Spontaneous Action Potential Rate") +
    ylim(c(0,1.6)) +
    # breaks and labels are listed in the same order as the Temp values in the test data
    scale_fill_manual("Acclimation\nTemperature", breaks=c(8,15,25),
                      labels=c("8 °C", "15 °C", "25 °C"),
                      values=c(colours()[c(173,253,218)]))

Notice the following in the above code:

(1) As Jeff Newmiller pointed out, put geom_errorbar() before geom_point().
(2) If you are going to be using the same value of a plot aesthetic, you *set* it outside aes() rather than *map* it inside aes(). See the revised code for geom_point(). The idea is that if a variable is assigned to a plot aesthetic (e.g., color, fill, shape), then it needs to be mapped inside aes(); if an aesthetic is set to a specific value, it is set outside aes().
(3) Your test data had an effective x-range of 0 - 1.5 rather than 0 - 50, so I shrunk xlim() for readability.
(4) You can use \n inside of a character string as a carriage return. See the legend title for scale_fill_manual().
(5) Style note: you want the layer addition operator + to be at the end of a line, not at the beginning. Copy and paste this code verbatim to see what I mean:

p + geom_errorbar(limits,width=0.2)
+ geom_point(aes(fill=factor(Temp)), colour = colors()[173], shape = 21, size=4)

This happens because the first line is syntactically complete.

HTH,
Dennis

On Wed, Sep 7, 2011 at 2:48 PM, Nathan Miller wrote:
> Hi all,
>
> This seems like a basic problem, but no amount of playing with the code has
> solved it. I have a time-series data set like that shown below (only longer)
> and am seeking to plot the data with filled, circular points and error bars.
> I would like the error bars to be behind the points otherwise they tend to
> obscure the points (especially when I have a lot of points in the actual
> data set). Despite providing a fill colour for the points and using shapes
> that utilize fill colours, the error bars are always placed on top of the
> points. Can anyone see the error I am making? I simply want to move the
> error bars so they are behind the data points.
>
> Thanks for your help. I assume its simple...or else its a bug.
>
> Nate
>
>
> Time      SAP  Temp     SAPSE
> 0.00 1.000000   25   0.000000
> 0.25 1.040000   25   0.041200
> 0.50 1.040000   25   0.093500
> 0.75 1.060000   25   0.081800
> 1.00 1.040000   25   0.131000
> 1.25 1.220000   25   0.144000
> 1.50 1.010000   25   0.071200
> 0.00 1.000000   15   0.000000
> 0.25 1.010000   15   0.015600
> 0.50 1.010000   15   0.033700
> 0.75 1.060000   15   0.048100
> 1.00 1.010000   15   0.016800
> 1.25 0.977000   15   0.004860
> 1.50 0.959000   15   0.015500
> 0.00 1.000000    8   0.000000
> 0.25 1.010000    8   0.004620
> 0.50 1.060000    8   0.049100
> 0.75 0.921000    8   0.032900
> 1.00 0.951000    8   0.030400
> 1.25 0.904000    8   0.047100
> 1.50 0.911000    8   0.072200
>
> limits<-aes(ymax=NerveSurv$SAP+NerveSurv$SAPSE,ymin=NerveSurv$SAP-NerveSurv$SAPSE)
>
> p<-ggplot(data=NerveSurv,aes(x=Time,y=SAP))
>
> p+geom_point(aes(colour=factor(Temp), shape=factor(Temp),
> fill=factor(Temp)), size=4)
> +geom_errorbar(limits,width=0.2)
> +xlab("Time (min)")
> +xlim(c(0,50))
> +ylab("Normalized Spontaneous Action Potential Rate")
> +ylim(c(0,1.6))
> +scale_shape_manual("Acclimation Temperature",breaks=c(8,18,25),
> labels=c("25 °C", "18 °C", "8 °C"),values=c(21,21,21))
> +scale_fill_manual("Acclimation Temperature",breaks=c(8,18,25), labels=c("25
> °C", "18 °C", "8 °C"),values=c(colours()[c(173,253,218)]))
> +scale_colour_manual("Acclimation Temperature",breaks=c(8,18,25),
> labels=c("25 °C", "18 °C", "8 °C"), values=c(colours()[c(173,173,173)]))
>
> [[alternative HTML version deleted]]
>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>

From andra_isan at yahoo.com Thu Sep 8 01:12:19 2011
From: andra_isan at yahoo.com (Andra Isan)
Date: Wed, 7 Sep 2011 16:12:19 -0700 (PDT)
Subject: [R] Question about model selection for glm -- how to select features based on BIC?
Message-ID: <1315437139.61126.YahooMailClassic@web120618.mail.ne1.yahoo.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available URL: From dwinsemius at comcast.net Thu Sep 8 01:12:16 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Wed, 7 Sep 2011 19:12:16 -0400 Subject: [R] rgl 'how-to's In-Reply-To: <4E67EB14.6070109@gmail.com> References: <4E67EB14.6070109@gmail.com> Message-ID: <45375FAB-C616-4F8B-8305-624FBB44E73B@comcast.net> On Sep 7, 2011, at 6:07 PM, Dale Coons wrote: > Doing a visual graphic and trying to make it "pretty" > > Here's simple chart to play with: > > library(rgl) > dotframe<- > data.frame(x=c(0,7,0,0,-7,0),y=c(0,0,7,0,0,-7),z=c(7,0,0,-7,0,0)) > dotframe plot3d(dotframe$x,dotframe$y,dotframe$z, radius=3, > type='s',col=c('red','green','blue','purple','orange','gray'), > axes=FALSE, box=FALSE, xlab='',ylab='',zlab='') > > text3d(x=7,y=0,z=0, text="hello, world",adj = 0.5, color="blue") > #adds a label at one of the points > > My questions: > > 1) is there a way to label the points (spheres in this case) so that > the label 'stays on top'? other than text3d(), which adds labels, > but they are hidden when the graph is rotated? > 2) can a bitmap, say, of a company or university be inserted into > the title area? > 3) can a bitmap be used as the marker for a point? > > Thanks in advance for help-I learn a lot from others questions and > appreciate direction (even if it's RTFM!) > No RTFM (or advice) from me. I'm afraid all I have to offer is the whimsical notion that the next thing people will request is how to program Second Life avatars within R. -- David Winsemius, MD West Hartford, CT From alrossi at icmc.usp.br Thu Sep 8 00:08:26 2011 From: alrossi at icmc.usp.br (=?ISO-8859-1?Q?Andr=E9_Rossi?=) Date: Wed, 7 Sep 2011 19:08:26 -0300 Subject: [R] Very slow assignments Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From dadrivr at gmail.com Thu Sep 8 00:31:00 2011 From: dadrivr at gmail.com (dadrivr) Date: Wed, 7 Sep 2011 15:31:00 -0700 (PDT) Subject: [R] Change properties of line summary in interaction.plot In-Reply-To: <422AE96F-0BE1-4548-9E60-A74A1BC572B1@comcast.net> References: <1315094013620-3788614.post@n4.nabble.com> <1315403170104-3796133.post@n4.nabble.com> <1315424783131-3797044.post@n4.nabble.com> <1315428540542-3797231.post@n4.nabble.com> <422AE96F-0BE1-4548-9E60-A74A1BC572B1@comcast.net> Message-ID: <1315434660910-3797431.post@n4.nabble.com> Here's an example: > id <- c(17,17,17,18,18,18,19,19,19,20,20,20,21,21,21,22,22,22,23,23,23,24, > 24,24,25,25,25,26,26,26) > age <- rep(c(30,36,42),10) > outcome <- > c(12,17,10,5,5,2,NA,NA,NA,8,6,5,11,13,10,15,11,15,13,NA,9,0,0,0,20, > 14,16,1,2,2) > mydata <- as.data.frame(cbind(id,age,outcome)) > interaction.plot(mydata$age, mydata$id, mydata$outcome, fun = mean, legend > = FALSE, lty = 1, xtick = TRUE, type = "l") > > How can I make the 'mean' summary line red and thicker? David Winsemius wrote: > > On Sep 7, 2011, at 4:49 PM, dadrivr wrote: > >> >> David Winsemius wrote: >>> What "mean summary line"? I count 8 lines and that matches the number >>> if id's with complete data. >> >> Yea, I don't know why there is no mean summary line showing up. I >> requested >> it in the interaction.plot statement (fun = mean), and I don't get any >> errors. > > You didn't get any errors and you _did_ get the means. It's just that > you only gave it a dataset that had only one value per category. The > mean is defined for a single element vector although the sd and var > are not. Look more closely at the example on the help page and you > will see that it has more than one value per category defined by the > two interaction variables Okay, I checked the example, and I see what you mean. 
Is there a way to compute an average trajectory (i.e., a line that represents the averages of all the other lines) and make that line red? Obviously, I'm fairly new to this in general, so any guidance would be helpful. -- View this message in context: http://r.789695.n4.nabble.com/Change-properties-of-line-summary-in-interaction-plot-tp3788614p3797431.html Sent from the R help mailing list archive at Nabble.com. From tyler_hicks at wsu.edu Wed Sep 7 23:51:13 2011 From: tyler_hicks at wsu.edu (Tyler Hicks) Date: Wed, 7 Sep 2011 16:51:13 -0500 Subject: [R] Generating data when mean and 95% CI are known Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dwinsemius at comcast.net Thu Sep 8 02:04:42 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Wed, 7 Sep 2011 20:04:42 -0400 Subject: [R] Change properties of line summary in interaction.plot In-Reply-To: <1315434660910-3797431.post@n4.nabble.com> References: <1315094013620-3788614.post@n4.nabble.com> <1315403170104-3796133.post@n4.nabble.com> <1315424783131-3797044.post@n4.nabble.com> <1315428540542-3797231.post@n4.nabble.com> <422AE96F-0BE1-4548-9E60-A74A1BC572B1@comcast.net> <1315434660910-3797431.post@n4.nabble.com> Message-ID: <6316E859-4816-4F81-851F-842FC9282FFD@comcast.net> On Sep 7, 2011, at 6:31 PM, dadrivr wrote: > > Here's an example: >> id <- >> c(17,17,17,18,18,18,19,19,19,20,20,20,21,21,21,22,22,22,23,23,23,24, >> 24,24,25,25,25,26,26,26) >> age <- rep(c(30,36,42),10) >> outcome <- >> c(12,17,10,5,5,2,NA,NA,NA,8,6,5,11,13,10,15,11,15,13,NA,9,0,0,0,20, >> 14,16,1,2,2) >> mydata <- as.data.frame(cbind(id,age,outcome)) >> interaction.plot(mydata$age, mydata$id, mydata$outcome, fun = mean, >> legend >> = FALSE, lty = 1, xtick = TRUE, type = "l") >> >> How can I make the 'mean' summary line red and thicker? > > > David Winsemius wrote: >> >> On Sep 7, 2011, at 4:49 PM, dadrivr wrote: >> >>> >>> David Winsemius wrote: >>>> What "mean summary line"? 
I count 8 lines and that matches the >>>> number >>>> if id's with complete data. >>> >>> Yea, I don't know why there is no mean summary line showing up. I >>> requested >>> it in the interaction.plot statement (fun = mean), and I don't get >>> any >>> errors. >> >> You didn't get any errors and you _did_ get the means. It's just that >> you only gave it a dataset that had only one value per category. The >> mean is defined for a single element vector although the sd and var >> are not. Look more closely at the example on the help page and you >> will see that it has more than one value per category defined by the >> two interaction variables > > Okay, I checked the example, and I see what you mean. Is there a > way to > compute an average trajectory (i.e., a line that represents the > averages of > all the other lines) and make that line red? Obviously, I'm fairly > new to > this in general, so any guidance would be helpful. > mydat.mdl <- lm(outcome ~ age, data=mydata) > with( mydata, plot(age, outcome)) > abline(coef(mydat.mdl), col="red") -- David Winsemius, MD Heritage Laboratories West Hartford, CT From wdunlap at tibco.com Thu Sep 8 02:05:29 2011 From: wdunlap at tibco.com (William Dunlap) Date: Thu, 8 Sep 2011 00:05:29 +0000 Subject: [R] Very slow assignments In-Reply-To: References: Message-ID: You did not show the code you used to populate your object, but consider the following ways to make as.list(1:50000) via repeated assignments: > system.time( { z0 <- list() ; for(i in 1:50000)z0[i] <- list(i) } ) user system elapsed 13.34 0.00 13.42 > system.time( { z1 <- list() ; for(i in 1:50000)z1[[i]] <- i } ) user system elapsed 12.98 0.00 13.36 > system.time( { z2 <- vector("list",50000) ; for(i in 1:50000)z2[i] <- list(i) } ) user system elapsed 0.28 0.00 0.26 > system.time( { z3 <- vector("list",50000) ; for(i in 1:50000)z3[[i]] <- i } ) user system elapsed 0.22 0.00 0.20 > identical(z0,z1) && identical(z0,z2) && identical(z0,z3) [1] TRUE Preallocating a 
vector to its ultimate size can be much faster than repeatedly expanding it.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of André Rossi
> Sent: Wednesday, September 07, 2011 3:08 PM
> To: r-help at r-project.org
> Subject: [R] Very slow assignments
>
> I'm creating an object of a S4 class that has two slots: ListExamples, which
> is a list, and idx, which is an integer (as the code below).
>
> Then, I read a data.frame file with 10000 (ten thousands) of lines and 10
> columns, do some pre-processing and, basically, I store each line as an
> element of a list in the slot ListExamples of the S4 object. However, any
> kind of assignment operation (<-) that I try to do after this took a
> considerable time.
>
> Can anyone explain me why dois it happen? Is it possible to speed up an
> script that deals with a big number of data (it might be data.frame or
> list)?
>
> Thank you,
>
> André
Rossi > > setClass("Buffer", > representation=representation( > Listexamples = "list", > idx = "integer" > ) > ) > > [[alternative HTML version deleted]] From rolf.turner at xtra.co.nz Thu Sep 8 02:22:05 2011 From: rolf.turner at xtra.co.nz (Rolf Turner) Date: Thu, 08 Sep 2011 12:22:05 +1200 Subject: [R] Very slow assignments In-Reply-To: References: Message-ID: <4E680AAD.9020405@xtra.co.nz> On 08/09/11 12:05, William Dunlap wrote: > You did not show the code you used to populate your > object, but consider the following ways to make as.list(1:50000) > via repeated assignments: > > > system.time( { z0<- list() ; for(i in 1:50000)z0[i]<- list(i) } ) > user system elapsed > 13.34 0.00 13.42 > > system.time( { z1<- list() ; for(i in 1:50000)z1[[i]]<- i } ) > user system elapsed > 12.98 0.00 13.36 > > system.time( { z2<- vector("list",50000) ; for(i in 1:50000)z2[i]<- list(i) } ) > user system elapsed > 0.28 0.00 0.26 > > system.time( { z3<- vector("list",50000) ; for(i in 1:50000)z3[[i]]<- i } ) > user system elapsed > 0.22 0.00 0.20 > > identical(z0,z1)&& identical(z0,z2)&& identical(z0,z3) > [1] TRUE > > Preallocating a vector to its ultimate size can be much faster than > repeatedly expanding it. This is *very* interesting to me. Years ago I learned to use the ``z1'' syntax, and thought I was being cool. It's stunning how much more efficient the ``z2'' and ``z3'' syntaxes are. I'll bet that there are many bunnies like me out there who are unaware of the advantages of preallocation via vector("list",...). Thanks for the enlightenment. 
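A minimal sketch of the same point, assuming only base R. The function names grow_list and prealloc_list and the size n are illustrative and not taken from the original poster's code. No timings are shown because they depend on the machine and the R version, and the gap may be smaller on recent R versions than in the timings quoted above.

```r
n <- 5000  # illustrative size

# Grow the list one element at a time (the pattern called z1 above).
grow_list <- function(n) {
  z <- list()
  for (i in seq_len(n)) z[[i]] <- i
  z
}

# Allocate the full list first, then fill it in place (the pattern called z3 above).
prealloc_list <- function(n) {
  z <- vector("list", n)
  for (i in seq_len(n)) z[[i]] <- i
  z
}

z_grow <- grow_list(n)
z_pre <- prealloc_list(n)

# When each element can be computed independently, lapply() builds
# the whole list in one call and no explicit allocation is needed.
z_apply <- lapply(seq_len(n), function(i) i)

# All three approaches produce the same list.
same_result <- identical(z_grow, z_pre) && identical(z_pre, z_apply)
```

Wrapping either function in system.time() gives a comparison on your own machine.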
cheers, Rolf Turner From yanwei.song at gmail.com Thu Sep 8 02:47:34 2011 From: yanwei.song at gmail.com (Yanwei Song) Date: Wed, 7 Sep 2011 20:47:34 -0400 Subject: [R] FSelector and RWeka problem In-Reply-To: <1932C3DD-A455-4769-BABD-C9EBAB2BF38D@gmail.com> References: <334E4969-301B-4FC1-935B-5A97C9BF74C7@gmail.com> <1932C3DD-A455-4769-BABD-C9EBAB2BF38D@gmail.com> Message-ID: <7ADABAEF-1638-4A0B-94D2-E0C04798B481@gmail.com> Hi all, Last post didn't give the plain format: I was trying to combine RWeka and FSelector, and do the forward feature selection with J48/C5.4 decision tree: Here is the code: #================== library(RWeka) library(FSelector) library(rpart) data(iris) evaluator <- function(subset) { p <- J48(as.simple.formula(subset, "Species"), data=iris) e <- evaluate_Weka_classifier(p) print(subset) print(e$details[1]) return(e$details[1]) } subset <- forward.search(names(iris)[-5], evaluator) ========================= I got this error, when I ran it: Error in paste(attributes, sep = "", collapse = " + ") : cannot coerce type 'closure' to vector of type 'character' I don't know how I could fix this problem. Thanks. Yanwei From zhenjiang.xu at gmail.com Thu Sep 8 03:15:32 2011 From: zhenjiang.xu at gmail.com (zhenjiang xu) Date: Wed, 7 Sep 2011 21:15:32 -0400 Subject: [R] counting the duplicates in an object of list In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From zhenjiang.xu at gmail.com Thu Sep 8 03:18:53 2011 From: zhenjiang.xu at gmail.com (zhenjiang xu) Date: Wed, 7 Sep 2011 21:18:53 -0400 Subject: [R] how to create data.frames from vectors with duplicates In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From zhenjiang.xu at gmail.com Thu Sep 8 03:20:21 2011 From: zhenjiang.xu at gmail.com (zhenjiang xu) Date: Wed, 7 Sep 2011 21:20:21 -0400 Subject: [R] read.table truncated data? 
In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From djmuser at gmail.com Thu Sep 8 03:55:18 2011 From: djmuser at gmail.com (Dennis Murphy) Date: Wed, 7 Sep 2011 18:55:18 -0700 Subject: [R] how to create data.frames from vectors with duplicates In-Reply-To: References: Message-ID: Hi: Here are a few informal timings on my machine with the following example. The data.table package is worth investigating, particularly in problems where its advantages can scale with size. library(data.table) dt <- data.table(x = sample(1:50, 1000000, replace = TRUE), y = sample(letters[1:26], 1000000, replace = TRUE), key = 'y') system.time(dt[, list(count = sum(x)), by = 'y']) user system elapsed 0.02 0.00 0.02 # Data tables are also data frames, so we can use them as such: system.time(with(dt, tapply(x, y, sum))) user system elapsed 0.39 0.00 0.39 system.time(with(dt, rowsum(x, y))) user system elapsed 0.04 0.00 0.03 system.time(aggregate(x ~ y, data = dt, FUN = sum)) user system elapsed 1.87 0.00 1.87 So rowsum() is good, but data.table is a little better for this task. Increasing the size of the problem is to the advantage of both data.table and rowsum(), but tapply() takes a fair bit longer, relatively speaking (appx. 10x rowsum() in the first example, 20x in the second example). The ratios of rowsum() to data.table are about the same (appx. 2x). # 10M observations, 1000 groups > dt <- data.table(x = sample(1:100, 10000000, replace = TRUE), + y = sample(1:1000, 10000000, replace = TRUE), + key = 'y') > system.time(dt[, list(count = sum(x)), by = 'y']) user system elapsed 0.16 0.03 0.18 > system.time(with(dt, rowsum(x, y))) user system elapsed 0.36 0.04 0.40 > system.time(with(dt, tapply(x, y, sum))) user system elapsed 8.77 0.33 9.11 HTH, Dennis On Wed, Sep 7, 2011 at 6:18 PM, zhenjiang xu wrote: > Thanks for all your replies. I am using rowsum() and it looks efficient. 
I > hope I could do some benchmark sometime in near future and let people know. > Or is there any benchmark result available? > > On Wed, Aug 31, 2011 at 12:58 PM, Bert Gunter wrote: > >> Inline below: >> >> On Wed, Aug 31, 2011 at 9:50 AM, Jorge I Velez >> wrote: >> > Hi Zhenjiang, >> > >> > Try >> > >> > table(unlist(mapply(function(x, y) rep(x, y), y, x))) >> >> Yikes! How about simply tapply(x,y,sum) ?? >> ?tapply >> >> -- Bert >> > >> > HTH, >> > Jorge >> > >> > >> > On Wed, Aug 31, 2011 at 12:45 PM, zhenjiang xu <> wrote: >> > >> >> Hi R users, >> >> >> >> suppose I have two vectors, >> >> ?> x=c(1,2,3,4,5) >> >> ?> y=c('a','b','c','a','c') >> >> How can I get a data.frame like this? >> >> > xy >> >> ? ? ?count >> >> a ? ? 5 >> >> b ? ? 2 >> >> c ? ? 8 >> >> >> >> I know a few ways to fulfill the task. However, I have a huge number >> >> of this kind calculations, so I'd like an efficient solution. Thanks >> >> >> >> -- >> >> Best, >> >> Zhenjiang >> >> >> >> ______________________________________________ >> >> R-help at r-project.org mailing list >> >> https://stat.ethz.ch/mailman/listinfo/r-help >> >> PLEASE do read the posting guide >> >> http://www.R-project.org/posting-guide.html >> >> and provide commented, minimal, self-contained, reproducible code. >> >> >> > >> > ? ? ? ?[[alternative HTML version deleted]] >> > >> > ______________________________________________ >> > R-help at r-project.org mailing list >> > https://stat.ethz.ch/mailman/listinfo/r-help >> > PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> > and provide commented, minimal, self-contained, reproducible code. >> > >> > > > > -- > Best, > Zhenjiang > > ? ? ? 
?[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From cesar.rabak at gmail.com Thu Sep 8 02:31:13 2011 From: cesar.rabak at gmail.com (csrabak) Date: Wed, 07 Sep 2011 21:31:13 -0300 Subject: [R] suggestion for proportions In-Reply-To: <1315425202.24963.YahooMailNeo@web125819.mail.ne1.yahoo.com> References: <1315383072.86607.YahooMailNeo@web125805.mail.ne1.yahoo.com> <4E671B96020000CB00095795@med-webappgwia1.medicine.umaryland.edu> <077E31A57DA26E46AB0D493C9966AC730C359B463C@UM-MAIL4112.unimaas.nl> <077E31A57DA26E46AB0D493C9966AC730C359B468C@UM-MAIL4112.unimaas.nl> <1315425202.24963.YahooMailNeo@web125819.mail.ne1.yahoo.com> Message-ID: <4E680CD1.80905@acm.org> Em 7/9/2011 16:53, array chip escreveu: > Hi all, thanks very much for sharing your thoughts. and sorry for my describing the problem not clearly, my fault. > > My data is paired, that is 2 different diagnostic tests were performed on the same individuals. Each individual will have a test results from each of the 2 tests. Then in the end, 2 accuracy rates were calculated for the 2 tests. And I want to test if there is a significant difference in the accuracy (proportion) between the 2 tests. My understanding is that prop.test() is appropriate for 2 independent proportions, whereas in my situation, the 2 proportions are not independent calculated from "paired" data, right? > > the data would look like: > > pid test1 test2 > p1 1 0 > p2 1 1 > p3 0 1 > : > : > > 1=test is correct; 0=not correct > > from the data above, we can calculate accuracy for test1 and test2, then to compare.... > > > So mcnemar.test() is good for that, right? > > Thanks > John, From above clarifying I suggest you consider the use of kappa test. 
For a list of possible ways of doing it in R try:

RSiteSearch("kappa",restrict="functions")

HTH

--
Cesar Rabak

From kebennett at alaska.edu Thu Sep 8 03:59:01 2011
From: kebennett at alaska.edu (Katrina Bennett)
Date: Wed, 7 Sep 2011 17:59:01 -0800
Subject: [R] Seasonal and 11-day subset for zoo object
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available URL:

From yanwei.song at gmail.com Thu Sep 8 04:12:40 2011
From: yanwei.song at gmail.com (Yanwei Song)
Date: Wed, 7 Sep 2011 22:12:40 -0400
Subject: [R] FSelector and RWeka problem
In-Reply-To: <7ADABAEF-1638-4A0B-94D2-E0C04798B481@gmail.com>
References: <334E4969-301B-4FC1-935B-5A97C9BF74C7@gmail.com> <1932C3DD-A455-4769-BABD-C9EBAB2BF38D@gmail.com> <7ADABAEF-1638-4A0B-94D2-E0C04798B481@gmail.com>
Message-ID: <9F7E7435-A02C-47E8-9BB7-CD957938E89F@gmail.com>

Hi,

Here is a follow-up: I resolved the problem in another way. I changed these lines

> e <- evaluate_Weka_classifier(p)
> print(subset)
> print(e$details[1])
> return(e$details[1])

to

e <- sum(iris$Species == predict(p))/nrow(iris)
print(subset)
print(e)
return(e)
================

It works well, but I still don't understand why the previous one didn't work. Maybe there is a compatibility problem...
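A minimal sketch of the evaluator pattern used in the workaround above: a function that takes a vector of feature names and returns a single accuracy value, driven by a greedy forward search. It assumes base R only. A nearest-centroid classifier stands in for J48 so that the example runs without Java, and centroid_accuracy and forward_select are hypothetical names, not functions from RWeka or FSelector. It is not an explanation of the original error.

```r
data(iris)

# Hypothetical evaluator: resubstitution accuracy of a nearest-centroid
# classifier on the given features (a stand-in for the J48 model).
centroid_accuracy <- function(features) {
  X <- as.matrix(iris[, features, drop = FALSE])
  # Per-class mean of each feature: class sums divided by class sizes.
  centers <- rowsum(X, iris$Species) / as.vector(table(iris$Species))
  # Squared distance from every row to every class centroid.
  d <- sapply(rownames(centers), function(cl) {
    rowSums(sweep(X, 2, centers[cl, ])^2)
  })
  pred <- colnames(d)[max.col(-d, ties.method = "first")]
  mean(pred == iris$Species)
}

# Greedy forward search: add the feature that improves the score most,
# and stop when no remaining feature gives a strict improvement.
forward_select <- function(features, evaluator) {
  chosen <- character(0)
  best <- -Inf
  repeat {
    candidates <- setdiff(features, chosen)
    if (length(candidates) == 0) break
    scores <- sapply(candidates, function(f) evaluator(c(chosen, f)))
    if (max(scores) <= best) break
    best <- max(scores)
    chosen <- c(chosen, candidates[which.max(scores)])
  }
  list(features = chosen, score = best)
}

result <- forward_select(names(iris)[-5], centroid_accuracy)
print(result)
```

As in the workaround, the score is computed on the training data itself, so it is optimistic; a cross-validated accuracy would be a more careful choice of evaluator.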
Yanwei On Sep 7, 2011, at 8:47 PM, Yanwei Song wrote: > Hi all, > > Last post didn't give the plain format: > > I was trying to combine RWeka and FSelector, and do the forward feature selection with J48/C5.4 decision tree: > > Here is the code: > #================== > library(RWeka) > library(FSelector) > library(rpart) > > > data(iris) > evaluator <- function(subset) { > p <- J48(as.simple.formula(subset, "Species"), data=iris) > e <- evaluate_Weka_classifier(p) > print(subset) > print(e$details[1]) > return(e$details[1]) > } > > > subset <- forward.search(names(iris)[-5], evaluator) > ========================= > I got this error, when I ran it: > > Error in paste(attributes, sep = "", collapse = " + ") : > cannot coerce type 'closure' to vector of type 'character' > > I don't know how I could fix this problem. > Thanks. > > > Yanwei > > > > From rolf.turner at xtra.co.nz Thu Sep 8 04:22:15 2011 From: rolf.turner at xtra.co.nz (Rolf Turner) Date: Thu, 08 Sep 2011 14:22:15 +1200 Subject: [R] Generating data when mean and 95% CI are known In-Reply-To: References: Message-ID: <4E6826D7.4040402@xtra.co.nz> On 08/09/11 09:51, Tyler Hicks wrote: > Is there a function in R that will generate data from a known mean and 95% CI? I do not know the distribution or sample size of the original data. No. R is wonderful, but it cannot work magic. cheers, Rolf Turner From zhenjiang.xu at gmail.com Thu Sep 8 04:25:22 2011 From: zhenjiang.xu at gmail.com (zhenjiang xu) Date: Wed, 7 Sep 2011 22:25:22 -0400 Subject: [R] counting the duplicates in an object of list In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From JSorkin at grecc.umaryland.edu Thu Sep 8 04:31:46 2011 From: JSorkin at grecc.umaryland.edu (John Sorkin) Date: Wed, 07 Sep 2011 22:31:46 -0400 Subject: [R] suggestion for proportions In-Reply-To: <4E680CD1.80905@acm.org> References: <1315383072.86607.YahooMailNeo@web125805.mail.ne1.yahoo.com> <4E671B96020000CB00095795@med-webappgwia1.medicine.umaryland.edu> <077E31A57DA26E46AB0D493C9966AC730C359B463C@UM-MAIL4112.unimaas.nl> <077E31A57DA26E46AB0D493C9966AC730C359B468C@UM-MAIL4112.unimaas.nl> <1315425202.24963.YahooMailNeo@web125819.mail.ne1.yahoo.com> <4E680CD1.80905@acm.org> Message-ID: <4E67F0D2020000CB000959BA@med-webappgwia1.medicine.umaryland.edu> Would the following strategy work? numtests <- 20 # Create a data frame: test1 results from trial 1 # test2 results from trial 2 # agree indicagtor if trial1= trial2 (value =1) or # trial1<>trial2 (value =0) data <- data.frame(test1 <-rbinom(numtests,1,0.5), test2<-rbinom(numtests,1,0.5),agree<-test1*test2) cat("Fraction of times test1=test2",sum(data$test2)/numtests,"\n") # Choose one of the following tests: prop.test(sum(data$agree),20) binom.test(sum(data$agree),20) John John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) >>> csrabak 9/7/2011 8:31 PM >>> Em 7/9/2011 16:53, array chip escreveu: > Hi all, thanks very much for sharing your thoughts. and sorry for my describing the problem not clearly, my fault. > > My data is paired, that is 2 different diagnostic tests were performed on the same individuals. Each individual will have a test results from each of the 2 tests. Then in the end, 2 accuracy rates were calculated for the 2 tests. 
And I want to test if there is a significant difference in the accuracy (proportion) between the 2 tests. My understanding is that prop.test() is appropriate for 2 independent proportions, whereas in my situation, the 2 proportions are not independent calculated from "paired" data, right? > > the data would look like: > > pid test1 test2 > p1 1 0 > p2 1 1 > p3 0 1 > : > : > > 1=test is correct; 0=not correct > > from the data above, we can calculate accuracy for test1 and test2, then to compare.... > > > So mcnemar.test() is good for that, right? > > Thanks > John, From above clarifying I suggest you consider the use of kappa test. For a list of possible ways of doing it in R try: RSiteSearch("kappa",restrict="functions") HTH -- Cesar Rabak ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Confidentiality Statement: This email message, including any attachments, is for th...{{dropped:6}} From markleeds2 at gmail.com Thu Sep 8 04:36:46 2011 From: markleeds2 at gmail.com (Mark Leeds) Date: Wed, 7 Sep 2011 22:36:46 -0400 Subject: [R] Generating data when mean and 95% CI are known In-Reply-To: <4E6826D7.4040402@xtra.co.nz> References: <4E6826D7.4040402@xtra.co.nz> Message-ID: Hi: I don't know if this is what you meant but here's a way to cheat and do it. 1) back out the [sigma over sqrt root of n] from the 95 % CI and call it X. 2) then generate data using rnorm(n*, known mean, sigma*) where sigma*/sqrt(n*) = X is satisfied. 3) there will be many solutions to 2) so you could make up a sigma* and then back out the n* that makes it hold. On Wed, Sep 7, 2011 at 10:22 PM, Rolf Turner wrote: > On 08/09/11 09:51, Tyler Hicks wrote: >> >> Is there a function in R that will generate data from a known mean and 95% >> CI? 
I do not know the distribution or sample size of the original data.
>
> No.  R is wonderful, but it cannot work magic.
>
>     cheers,
>
>         Rolf Turner
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

From zhenjiang.xu at gmail.com Thu Sep 8 04:39:05 2011
From: zhenjiang.xu at gmail.com (zhenjiang xu)
Date: Wed, 7 Sep 2011 22:39:05 -0400
Subject: [R] how to create data.frames from vectors with duplicates
In-Reply-To:
References:
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From JSorkin at grecc.umaryland.edu Thu Sep 8 04:41:46 2011
From: JSorkin at grecc.umaryland.edu (John Sorkin)
Date: Wed, 07 Sep 2011 22:41:46 -0400
Subject: [R] suggestion for proportions
In-Reply-To: <4E680CD1.80905@acm.org>
References: <1315383072.86607.YahooMailNeo@web125805.mail.ne1.yahoo.com>
 <4E671B96020000CB00095795@med-webappgwia1.medicine.umaryland.edu>
 <077E31A57DA26E46AB0D493C9966AC730C359B463C@UM-MAIL4112.unimaas.nl>
 <077E31A57DA26E46AB0D493C9966AC730C359B468C@UM-MAIL4112.unimaas.nl>
 <1315425202.24963.YahooMailNeo@web125819.mail.ne1.yahoo.com>
 <4E680CD1.80905@acm.org>
Message-ID: <4E67F32A020000CB000959C0@med-webappgwia1.medicine.umaryland.edu>

Let me try again, but this time with corrected R code: would the following strategy work:

numtests <- 2000
# Create a data frame: test1 results from trial 1
#                      test2 results from trial 2
#                      agree indicator if trial1= trial2 (value =1) or
#                      trial1<>trial2 (value =0)
data <- data.frame(test1 <-rbinom(numtests,1,0.5),
                   test2<-rbinom(numtests,1,0.5),agree<-test1*test2)
cat("Fraction of times test1=test2",sum(data$agree)/numtests,"\n")
# Choose one of the following tests:
prop.test(sum(data$agree),numtests)
binom.test(sum(data$agree),numtests)

John David Sorkin
M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) >>> csrabak 9/7/2011 8:31 PM >>> Em 7/9/2011 16:53, array chip escreveu: > Hi all, thanks very much for sharing your thoughts. and sorry for my describing the problem not clearly, my fault. > > My data is paired, that is 2 different diagnostic tests were performed on the same individuals. Each individual will have a test results from each of the 2 tests. Then in the end, 2 accuracy rates were calculated for the 2 tests. And I want to test if there is a significant difference in the accuracy (proportion) between the 2 tests. My understanding is that prop.test() is appropriate for 2 independent proportions, whereas in my situation, the 2 proportions are not independent calculated from "paired" data, right? > > the data would look like: > > pid test1 test2 > p1 1 0 > p2 1 1 > p3 0 1 > : > : > > 1=test is correct; 0=not correct > > from the data above, we can calculate accuracy for test1 and test2, then to compare.... > > > So mcnemar.test() is good for that, right? > > Thanks > John, From above clarifying I suggest you consider the use of kappa test. For a list of possible ways of doing it in R try: RSiteSearch("kappa",restrict="functions") HTH -- Cesar Rabak ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
Confidentiality Statement: This email message, including any attachments, is for th...{{dropped:6}} From wdunlap at tibco.com Thu Sep 8 04:42:15 2011 From: wdunlap at tibco.com (William Dunlap) Date: Thu, 8 Sep 2011 02:42:15 +0000 Subject: [R] counting the duplicates in an object of list In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From vicvoncastle at gmail.com Thu Sep 8 04:43:01 2011 From: vicvoncastle at gmail.com (Ken) Date: Wed, 7 Sep 2011 22:43:01 -0400 Subject: [R] Generating data when mean and 95% CI are known In-Reply-To: <4E6826D7.4040402@xtra.co.nz> References: <4E6826D7.4040402@xtra.co.nz> Message-ID: <1258A21B-EE89-4CA7-B038-AAE8AE88359F@gmail.com> R can tell you how many possible answers there are with those givens though: ?Inf Really though, you can get at some information if you are willing to set one of those definitively. I.e. If you set sample size you can 'find' data which match specs but isn't going to be applicable to anything because there is probably more to the story distributionally. If you can assume, say a normal distribution you are ?rnorm and ?quantile away from Monte Carlo-ing a good part of the story yourself for conclusions. Best of luck, and sorry for the bad R jokes. Ken Hutchison On Sep 7, 2554 BE, at 10:22 PM, Rolf Turner wrote: > On 08/09/11 09:51, Tyler Hicks wrote: >> Is there a function in R that will generate data from a known mean and 95% CI? I do not know the distribution or sample size of the original data. > > No. R is wonderful, but it cannot work magic. > > cheers, > > Rolf Turner > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
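
The back-calculation suggested earlier in this thread (recover sigma/sqrt(n) from the interval, pick an n, then call rnorm) can be sketched roughly as below. This is only an illustration under strong assumptions: the data are taken to be normal, the reported interval is taken to be mean +/- qnorm(0.975) * sigma / sqrt(n), and the mean, interval limits and n are invented values, not anything from the original question. The final rescaling step is an extra that nobody in the thread proposed; it simply forces the simulated sample to hit the target mean and standard deviation exactly.

```r
# Hypothetical inputs (invented for illustration only)
known_mean <- 10
ci_lower   <- 8
ci_upper   <- 12

# Step 1: back out the standard error, assuming a normal-theory 95% interval
se <- (ci_upper - ci_lower) / (2 * qnorm(0.975))

# Step 2: choose a sample size (arbitrary) and solve sigma / sqrt(n) = se
n     <- 50
sigma <- se * sqrt(n)

# Step 3: simulate from the implied normal distribution
set.seed(1)
x <- rnorm(n, mean = known_mean, sd = sigma)

# Optional extra (not from the thread): rescale so that the sample mean
# and sample SD match the targets exactly rather than approximately
x <- known_mean + sigma * (x - mean(x)) / sd(x)

# Normal-theory interval recomputed from the simulated sample
ci <- mean(x) + c(-1, 1) * qnorm(0.975) * sd(x) / sqrt(n)
ci
```

Any other choice of n gives a different but equally valid sigma, which is the point made above: the mean and interval alone do not pin down the data.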
From alrossi at icmc.usp.br Thu Sep 8 05:03:27 2011 From: alrossi at icmc.usp.br (=?ISO-8859-1?Q?Andr=E9_Rossi?=) Date: Thu, 8 Sep 2011 00:03:27 -0300 Subject: [R] Very slow assignments In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From zhenjiang.xu at gmail.com Thu Sep 8 05:03:57 2011 From: zhenjiang.xu at gmail.com (zhenjiang xu) Date: Wed, 7 Sep 2011 23:03:57 -0400 Subject: [R] counting the duplicates in an object of list In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From jsorkin at grecc.umaryland.edu Thu Sep 8 05:09:59 2011 From: jsorkin at grecc.umaryland.edu (John Sorkin) Date: Wed, 07 Sep 2011 23:09:59 -0400 Subject: [R] suggestion for proportions Message-ID: <4E67F9C7020000CB000959CA@med-webappgwia1.medicine.umaryland.edu> Correction. It won't work. Please ignore. >>> John Sorkin 9/7/2011 10:41:46 PM >>> Let my try again, but this time with corrected R code: would the following strategy work: numtests <- 2000 # Create a data frame: test1 results from trial 1 # test2 results from trial 2 # agree indicagtor if trial1= trial2 (value =1) or # trial1<>trial2 (value =0) data <- data.frame(test1 <-rbinom(numtests,1,0.5), test2<-rbinom(numtests,1,0.5),agree<-test1*test2) cat("Fraction of times test1=test2",sum(data$agree)/numtests,"\n") # Choose one of the following tests: prop.test(sum(data$agree),numtests) binom.test(sum(data$agree),numtests) John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) >>> csrabak 9/7/2011 8:31 PM >>> Em 7/9/2011 16:53, array chip escreveu: > Hi all, thanks very much for sharing your thoughts. 
and sorry for my describing the problem not clearly, my fault. > > My data is paired, that is 2 different diagnostic tests were performed on the same individuals. Each individual will have a test results from each of the 2 tests. Then in the end, 2 accuracy rates were calculated for the 2 tests. And I want to test if there is a significant difference in the accuracy (proportion) between the 2 tests. My understanding is that prop.test() is appropriate for 2 independent proportions, whereas in my situation, the 2 proportions are not independent calculated from "paired" data, right? > > the data would look like: > > pid test1 test2 > p1 1 0 > p2 1 1 > p3 0 1 > : > : > > 1=test is correct; 0=not correct > > from the data above, we can calculate accuracy for test1 and test2, then to compare.... > > > So mcnemar.test() is good for that, right? > > Thanks > John, From above clarifying I suggest you consider the use of kappa test. For a list of possible ways of doing it in R try: RSiteSearch("kappa",restrict="functions") HTH -- Cesar Rabak ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Confidentiality Statement: This email message, including any attachments, is for th...{{dropped:6}} From wdunlap at tibco.com Thu Sep 8 05:27:25 2011 From: wdunlap at tibco.com (William Dunlap) Date: Thu, 8 Sep 2011 03:27:25 +0000 Subject: [R] counting the duplicates in an object of list In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available
URL:

From zhenjiang.xu at gmail.com Thu Sep 8 05:33:05 2011
From: zhenjiang.xu at gmail.com (zhenjiang xu)
Date: Wed, 7 Sep 2011 23:33:05 -0400
Subject: [R] counting the duplicates in an object of list
In-Reply-To:
References:
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From ggrothendieck at gmail.com Thu Sep 8 05:35:33 2011
From: ggrothendieck at gmail.com (Gabor Grothendieck)
Date: Wed, 7 Sep 2011 23:35:33 -0400
Subject: [R] Seasonal and 11-day subset for zoo object
In-Reply-To:
References:
Message-ID:

On Wed, Sep 7, 2011 at 9:59 PM, Katrina Bennett wrote:
> I have a zooreg object and I want to be able to generate a value for seasons
> and 11-day composites paste it onto my zoo data frame, along with year,
> month and days.
>
> Right now I have the following to work from:
>
> eg. dat.zoo.mdy <- with(month.day.year(time(dat.zoo)), cbind(dat.zoo, year,
> month, day, quarter = (month - 1) %/% 3 + 1, dow =
> as.numeric(format(time(dat.zoo), "%w"))))
>
> For the seasons, I have been trying to replace 'quarter' with a seasonal
> value of "1" for Dec-Jan-Feb, "2" for Mar-Apr-May, "3" for Jun-Jul-Aug, "4"
> for Sep-Oct-Nov.
>
> dat.zoo.mdy <- with(month.day.year(time(dat.zoo)), cbind(dat.zoo, year,
> month, day,
> season=for(i in nrow(dat.zoo.mdy)) {
>         if (month[i] == 12) {
>         quarter[i]=1
>         } else if (month[i] == 3) {
>         quarter[i]=2
>         } else if (month[i] == 6) {
>         quarter[i]=3
>         } else quarter[i]=4 }, dow = as.numeric(format(time(dat.zoo),
> "%w"))))
>
> However, this gives me the error: "Error in zoo(structure(x, dim = dim(x)),
> index(x), ...) :
>   'x' : attempt to define illegal zoo object"
>
> I'd like to get an 11-day value as well to replace the dow in the first
> example, but I'm still trying to figure out if there is an easy way to do
> this in zoo.
>

dat.zoo and "11 days composite" in the question were not defined but we can get the seasons by calculating the quarter of the following month:

> d <- seq(as.Date("2011-01-01"), length = 12, by = "month")
> as.numeric(format(as.yearqtr(as.yearmon(d) + 1/12), "%q"))
[1] 1 1 2 2 2 3 3 3 4 4 4 1

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

From ahoerner at rprogress.org Thu Sep 8 07:03:47 2011
From: ahoerner at rprogress.org (andrewH)
Date: Wed, 7 Sep 2011 22:03:47 -0700 (PDT)
Subject: [R] Searching the console
Message-ID: <1315458227346-3797884.post@n4.nabble.com>

Is there any way to search the console during an interactive session? I've looked and looked, and can not find one. In some add-on package, maybe?

Sorry to be so basic, but help would be greatly appreciated.

andrewH

--
View this message in context: http://r.789695.n4.nabble.com/Searching-the-console-tp3797884p3797884.html
Sent from the R help mailing list archive at Nabble.com.

From jwiley.psych at gmail.com Thu Sep 8 07:24:20 2011
From: jwiley.psych at gmail.com (Joshua Wiley)
Date: Wed, 7 Sep 2011 22:24:20 -0700
Subject: [R] Searching the console
In-Reply-To: <1315458227346-3797884.post@n4.nabble.com>
References: <1315458227346-3797884.post@n4.nabble.com>
Message-ID:

Hi Andrew,

If you use http://ess.r-project.org/ just go to the R buffer and type: C-s (CTRL + s) which will let you search through the console. You can use C-M-s (CTRL + ALT + s) if you want to search using regular expressions.

Cheers,

Josh

On Wed, Sep 7, 2011 at 10:03 PM, andrewH wrote:
> Is there any way to search the console during an interactive session? I've
> looked and looked, and can not find one.  In some add-on package, maybe?
>
> Sorry to be so basic, but help would be greatly appreciated.
> > andrewH > > -- > View this message in context: http://r.789695.n4.nabble.com/Searching-the-console-tp3797884p3797884.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Joshua Wiley Ph.D. Student, Health Psychology Programmer Analyst II, ATS Statistical Consulting Group University of California, Los Angeles https://joshuawiley.com/ From ahoerner at rprogress.org Thu Sep 8 08:45:00 2011 From: ahoerner at rprogress.org (andrewH) Date: Wed, 7 Sep 2011 23:45:00 -0700 (PDT) Subject: [R] Searching the console In-Reply-To: References: <1315458227346-3797884.post@n4.nabble.com> Message-ID: <1315464300252-3797996.post@n4.nabble.com> Thanks, Josh! I'm using TINN-R now, but I have been thinking of switching to ESS. Though perhaps TINN-R has a similar function -- I had been looking for consol functions, rather than editor functions. andrewH -- View this message in context: http://r.789695.n4.nabble.com/Searching-the-console-tp3797884p3797996.html Sent from the R help mailing list archive at Nabble.com. From ahoerner at rprogress.org Thu Sep 8 08:52:45 2011 From: ahoerner at rprogress.org (andrewH) Date: Wed, 7 Sep 2011 23:52:45 -0700 (PDT) Subject: [R] Consistently printing the name of an object passed to a function; & a data-auditing question Message-ID: <1315464765950-3798005.post@n4.nabble.com> Dear folks-- I always seem to find that I spend more than half my time making sure my input date is in the right form, properly aligned, with no bizarre features. You know the drill: five kinds of missing values, three of them documented. An alpha mistype in one numeric field turns 30,000 numbers into factor levels. 
SPSS conversion turns 250 factors nicely into R factors, except 3 have levels instead of labels. A few columns in some years of a survey have undocumented differences in units. Halfway through a 20-year annual survey, they add two more allowable answers to a question. etc.

I'm looking for things to make my data auditing go faster. One of them is a dopy little function, testX(), bundling together a variety of r tools to tell me what is in an object. Here it is:

testX <- function(objectX, bar=TRUE) {     # A useful diagnostic function
  object.name <- deparse(substitute(objectX))
  if(bar) cat("########################\n");     # visual separation between consecutive objects.
  cat("testX(", object.name, "): "); cat("Class=", class(objectX)); cat(" Mode=", mode(objectX), "\n");
  cat("Summary:\n"); print(summary(objectX))
  cat("Structure:\n"); str(objectX);
  if (is.factor(objectX)) {cat("Levels: ", levels(objectX), "\n"); cat("Length: ", length(objectX), "\n")}
  invisible(object.name)}

This works well when I give it the name of a single object. My problem is when I try to produce descriptions of a bunch of variables in a row, such as the variables in a list of variables, or all the variables that I have clomped together in a data frame. The output is all side effects. Some ways of passing multiple variables get the name wrong, but the rest right. For example, if I have a list of variables, and do:

> lapply(varList, testX)

I get an output like this:

##################################
testX( X[[1L]] ): Class= factor Mode= numeric
Summary:
1994 1997 1999 2002 2003 2007 2009
1009 1165  985 2502 2528 2007 3013
Structure:
 Factor w/ 7 levels "1994","1997",..: 1 1 1 1 1 1 1 1 1 1 ...
Levels: 1994 1997 1999 2002 2003 2007 2009 Length: 13209 If instead, I do it with a loop through a the variable names in a data.frame, I get the name wrong _and_ it does not evaluate all the way to an object: > names(var.df) [1] "year" "YEAR" "AGE" "COHORT.5" "COHORT.10" "ETHNIC" "EDUC" "INCOME" "INTERNET" "PARTY" "IDEOL" >for (sel in 1:length(names(var.df))) testX(names(var.df)[sel]) Gives an output like this: ################################## testX( names(var.df)[sel] ): Class= character Mode= character Summary: Length Class Mode 1 character character Structure: chr "year" Or I can select the column instead of the name of the column. This gives me the right answer on the object description, but not the name, thus: > for (sel in 1:length(names(var.df))) testX(var.df[[sel]]) ################################## testX( var.df[[sel]] ): Class= integer Mode= numeric Summary: Min. 1st Qu. Median Mean 3rd Qu. Max. 1994 2002 2003 2003 2007 2009 Structure: int [1:13209] 1994 1994 1994 1994 1994 1994 1994 1994 1994 1994 ... I've tried doing various things to names(var.df)[sel] to get it closer to the object -- as.symbol, eval(substitute() ), several others, but I just get variations on the output above. So there are actually two questions here: 1. How can I write this function so that it works when I just give it an object, but I can also use it with an apply-family function and a list (or vector, or whatever) of objects, and still have it both treat the object as an object and print its name correctly? 2. How can I write the function, or write a loop, or use an apply-family function, to use this function to go through the columns of a data.frame, correctly naming and correctly describing each? Another way of asking this same question is this: I want to be able to give testX the name of an object, or a reference to a named object, via apply-family function, indexing, or whatever. (A) How can I get the name I print, object.name, to be the name of the object in both cases? 
And, (B), how can I make sure that objectX is the actual object that the name refers to, and not the name or the reference, in both cases? Finally, and this should maybe be another post, I'd love to hear if others have thought through the whole question of efficient data auditing. Is there a suite of tools, or a standard set of recommendations, that you use and like? I'd love to hear any useful advice about how to accelerate this stage of a project, and get more quickly to its statistical heart. Most sincerely, andrewH -- View this message in context: http://r.789695.n4.nabble.com/Consistently-printing-the-name-of-an-object-passed-to-a-function-a-data-auditing-question-tp3798005p3798005.html Sent from the R help mailing list archive at Nabble.com. From brionynorton at gmail.com Thu Sep 8 09:15:23 2011 From: brionynorton at gmail.com (Briony) Date: Thu, 8 Sep 2011 00:15:23 -0700 (PDT) Subject: [R] metaMDS and envfit: Help reading output In-Reply-To: <3056FED61B44C14E86DD1EE3DF337FE10167AB37F427@MEWMAD0PC01G03.accounts.wistate.us> References: <3056FED61B44C14E86DD1EE3DF337FE10167AB37F427@MEWMAD0PC01G03.accounts.wistate.us> Message-ID: <1315466123821-3798043.post@n4.nabble.com> Hi Katie, This is probably a bit late given the date of your post, but I was having similar problems with my own work and thought I'd respond anyway. I'm not sure that the script you've written here will fit 3D vectors for your 3D nmds. I tried it and it doesn't seem to work for me - it only gives 2D for the vectors. I found this: nmds3d <- metaMDS(varespec, k = 3, distance = 'bray', autotransform = FALSE) # run nmds with 3 dimensions nmds3d$stress # stress drops fit3d <- envfit(nmds3d, varechem[ ,1:4], choices = 1:3) # fit environmental vectors to 3d space ordirgl(nmds3d, envfit = fit3d) # dynamic 3D graph at http://en.wikibooks.org/wiki/R_Programming/Ordination ordirgl (in package rgl) gives a very nifty interactive 3d plot, or ordiplot3d is a static version. I hope this is useful. 
Kind regards, Briony -- View this message in context: http://r.789695.n4.nabble.com/metaMDS-and-envfit-Help-reading-output-tp3513052p3798043.html Sent from the R help mailing list archive at Nabble.com. From igors.lahanciks at gmail.com Thu Sep 8 09:58:07 2011 From: igors.lahanciks at gmail.com (Igors) Date: Thu, 8 Sep 2011 00:58:07 -0700 (PDT) Subject: [R] Overall SSR in plm package In-Reply-To: <1315399654425-3796004.post@n4.nabble.com> References: <1315399654425-3796004.post@n4.nabble.com> Message-ID: <1315468687553-3798116.post@n4.nabble.com> any ideas? -- View this message in context: http://r.789695.n4.nabble.com/Overall-SSR-in-plm-package-tp3796004p3798116.html Sent from the R help mailing list archive at Nabble.com. From igors.lahanciks at gmail.com Thu Sep 8 09:56:49 2011 From: igors.lahanciks at gmail.com (Igors) Date: Thu, 8 Sep 2011 00:56:49 -0700 (PDT) Subject: [R] function censReg in panel data setting In-Reply-To: <1315399008306-3795987.post@n4.nabble.com> References: <1315259907768-3792227.post@n4.nabble.com> <1315288271999-3792639.post@n4.nabble.com> <1315384740517-3795575.post@n4.nabble.com> <1315399008306-3795987.post@n4.nabble.com> Message-ID: <1315468609843-3798113.post@n4.nabble.com> Does censReg expect from panel data to be balanced? Because in my case it is unbalanced. Could this be a reason for errors? Best, Igors -- View this message in context: http://r.789695.n4.nabble.com/function-censReg-in-panel-data-setting-tp3792227p3798113.html Sent from the R help mailing list archive at Nabble.com. 
From wolfgang.viechtbauer at maastrichtuniversity.nl Thu Sep 8 10:24:50 2011 From: wolfgang.viechtbauer at maastrichtuniversity.nl (Viechtbauer Wolfgang (STAT)) Date: Thu, 8 Sep 2011 10:24:50 +0200 Subject: [R] suggestion for proportions In-Reply-To: <4E680CD1.80905@acm.org> References: <1315383072.86607.YahooMailNeo@web125805.mail.ne1.yahoo.com> <4E671B96020000CB00095795@med-webappgwia1.medicine.umaryland.edu> <077E31A57DA26E46AB0D493C9966AC730C359B463C@UM-MAIL4112.unimaas.nl> <077E31A57DA26E46AB0D493C9966AC730C359B468C@UM-MAIL4112.unimaas.nl> <1315425202.24963.YahooMailNeo@web125819.mail.ne1.yahoo.com> <4E680CD1.80905@acm.org> Message-ID: <077E31A57DA26E46AB0D493C9966AC730C359B4811@UM-MAIL4112.unimaas.nl> I assume you mean Cohen's kappa. This is not what the OP is asking about. The OP wants to know how to test for a difference in the proportions of 1's. Cohen's kappa will tell you what the level of agreement is between the two tests. This is something different. Also, the OP has now clarified that the data are paired. Therefore, prop.test() and binom.test() are not appropriate. So, to answer the OPs question: yes, mcnemar.test() is what you should be using. Wolfgang > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] > On Behalf Of csrabak > Sent: Thursday, September 08, 2011 02:31 > To: array chip > Cc: r-help at r-project.org > Subject: Re: [R] suggestion for proportions > > Em 7/9/2011 16:53, array chip escreveu: > > Hi all, thanks very much for sharing your thoughts. and sorry for my > describing the problem not clearly, my fault. > > > > My data is paired, that is 2 different diagnostic tests were performed > on the same individuals. Each individual will have a test results from > each of the 2 tests. Then in the end, 2 accuracy rates were calculated for > the 2 tests. And I want to test if there is a significant difference in > the accuracy (proportion) between the 2 tests. 
My understanding is that > prop.test() is appropriate for 2 independent proportions, whereas in my > situation, the 2 proportions are not independent calculated from "paired" > data, right? > > > > the data would look like: > > > > pid test1 test2 > > p1 1 0 > > p2 1 1 > > p3 0 1 > > : > > : > > > > 1=test is correct; 0=not correct > > > > from the data above, we can calculate accuracy for test1 and test2, then > to compare.... > > > > > > So mcnemar.test() is good for that, right? > > > > Thanks > > > John, > > From above clarifying I suggest you consider the use of kappa test. For > a list of possible ways of doing it in R try: > RSiteSearch("kappa",restrict="functions") > > HTH > > -- > Cesar Rabak From bt_jannis at yahoo.de Thu Sep 8 11:18:57 2011 From: bt_jannis at yahoo.de (Jannis) Date: Thu, 08 Sep 2011 11:18:57 +0200 Subject: [R] invalid strings while parsing code with inlinedocs package.skeleton.dx() Message-ID: <4E688881.9010405@yahoo.de> Dear list members, I use the package inlinedocs to create documentation for a package that I am building. The function package.skeleton.dx() is used to convert the source *.R file into the *.Rd documentation file. Usually this works like a charm, but on several occasions I got messages like this one: extra.code.docs prefixed.lines extract.xxx.chunks title.from.firstline examples.from.testfile definition.from.source edit.package.file author.from.description erase.format title.from.name examples.in.attr collapse tag.s3methods Creating directories ... Creating DESCRIPTION ... Creating Read-and-delete-me ... Copying code files ... Making help files ... Done. Further steps are described in './spectral.methods/Read-and-delete-me'. 
Modifying files automatically generated by package.skeleton: add.axis: definition title description item{side} item{trans.fun} item{label} item{\dots} seealso author format best.iter.default: definition description item{perf.matrix} details seealso author value format title determine.freq: title description item{series} item{plot.periodogram} ... Warning message: In grep(prefix, src) : input string 65 is invalid in this locale The warning message, however, gives me no hint on in which file and in which position (maybe line 65?) the invalid string can be found. On some occasions I was able to manually find these strings because I remembered which file I recently worked on. One of such errors was for example caused by a ? instead of a i (can you see the difference?). Additionally I have the impression that an error is caused while using R-2.10 and only a warning is caused with 2-12. Is there any convenient way of finding out where exactly this erroneous string is? I tried to execute package.skeleton.dx() row wise but the function quickly branches into a complex nesting of other functions. Thanks for any advice! Jannis From bt_jannis at yahoo.de Thu Sep 8 11:28:30 2011 From: bt_jannis at yahoo.de (Jannis) Date: Thu, 08 Sep 2011 11:28:30 +0200 Subject: [R] several functions in one *.R file in a R package In-Reply-To: <4E663258.8030705@gmail.com> References: <1315319180.35832.YahooMailClassic@web28207.mail.ukl.yahoo.com> <4E663258.8030705@gmail.com> Message-ID: <4E688ABE.7070505@yahoo.de> Thanks, Duncan, for your reply. You are right, the () in my code are not correct. Maybe my problem is that I do not really understand the exact effect of this dot . 
I have now tried with the following file in my /R folder in the package:

mainfunction<- function(x) {
  x2 <- .subfunction1(x)
  x3 <- .subfunction2(x2)
  x3
}

.subfunction1<- function(x) {
  x*2
}

.subfunction2<- function(x) {
  x*2
}

After I build the package and load it into R and run (for example):

mainfunction(2)

I get 8, which indicates that the functions are working. The reason that made me believe beforehand that the code is not working is that when I type the following to see the code of the function:

.subfunction1

or run:

.subfunction1(2)

I get an error that the object .subfunction1 can not be found.

Is this the desired effect of adding the dot to the function name? Or do I do something wrong here?

Thanks
Jannis

On 09/06/2011 04:46 PM, Duncan Murdoch wrote:
> On 11-09-06 10:26 AM, Jannis wrote:
>> Dear list members,
>>
>>
>> i have build a package which contains a collection of my frequently
>> used functions. To keep the code organized I have broken down some
>> rather extensive and long functions into individual steps and bundled
>> these steps in sub-functions that are called inside the main function.
>>
>> To keep an overview over which sub-function belongs to which main
>> function I saved all the respective sub-functions to the same *.R
>> file as their main-function and gave them names beginning with . to
>> somehow hide the sub-functions. The result would be one *.R file
>> in/R for each 'main-function' containing something like:
>>
>>
>> mainfunction<- function() {
>> .subfunction1()
>> .subfunction2()
>> #...
>> }
>>
>> .subfunction1()<- function() {
>> #do some stuff
>> }
>> .subfunction2()<- function() {
>> #do some more stuff
>> }
>>
>>
>> According to the way I understood the "Writing R Extensions" Manual I
>> expected this to work. When I load the package, however, I get the
>> error message that the sub-functions could not be found. Manually
>> sourcing all files in the/R directory however yields the
>> expected functionality.
>>
>> In what way am I mistaken here? Any ideas?
>
> Those definitions of .subfunction1 and .subfunction2 are not
> syntactically correct: extra parens. If that's just a typo in the
> message, then you'll have to show us real code. What you describe
> should work.
>
> Duncan Murdoch
>

From bt_jannis at yahoo.de Thu Sep 8 11:45:28 2011
From: bt_jannis at yahoo.de (Jannis)
Date: Thu, 08 Sep 2011 11:45:28 +0200
Subject: [R] invalid strings while parsing code with inlinedocs package.skeleton.dx()
In-Reply-To: <4E688881.9010405@yahoo.de>
References: <4E688881.9010405@yahoo.de>
Message-ID: <4E688EB8.7040000@yahoo.de>

It seems that I have found a (clumsy) workaround:

1. options(warn = 2)
2. Run package.skeleton.dx(); an error is now produced with the name of the file package.skeleton was actually working on.
3. The number of the erroneous input string should correspond to the line number in that file.

Just in case anybody is digging for the same solution ;-)

Jannis

On 09/08/2011 11:18 AM, Jannis wrote:
> Dear list members,
>
>
> I use the package inlinedocs to create documentation for a package
> that I am building. The function package.skeleton.dx() is used to
> convert the source *.R file into the *.Rd documentation file. Usually
> this works like a charm, but on several occasions I got messages like
> this one:
>
>
>
> extra.code.docs prefixed.lines extract.xxx.chunks title.from.firstline
> examples.from.testfile definition.from.source edit.package.file
> author.from.description erase.format title.from.name examples.in.attr
> collapse tag.s3methods
> Creating directories ...
> Creating DESCRIPTION ...
> Creating Read-and-delete-me ...
> Copying code files ...
> Making help files ...
> Done.
> Further steps are described in './spectral.methods/Read-and-delete-me'.
> Modifying files automatically generated by package.skeleton: > add.axis: definition title description item{side} item{trans.fun} > item{label} item{\dots} seealso author format > best.iter.default: definition description item{perf.matrix} details > seealso author value format title > determine.freq: title description item{series} item{plot.periodogram} > > > ... > > Warning message: > In grep(prefix, src) : input string 65 is invalid in this locale > > > The warning message, however, gives me no hint on in which file and in > which position (maybe line 65?) the invalid string can be found. On > some occasions I was able to manually find these strings because I > remembered which file I recently worked on. One of such errors was for > example caused by a ? instead of a i (can you see the difference?). > Additionally I have the impression that an error is caused while using > R-2.10 and only a warning is caused with 2-12. > > Is there any convenient way of finding out where exactly this > erroneous string is? I tried to execute package.skeleton.dx() row wise > but the function quickly branches into a complex nesting of other > functions. > > > > Thanks for any advice! > > Jannis > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From murdoch.duncan at gmail.com Thu Sep 8 11:51:35 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Thu, 8 Sep 2011 05:51:35 -0400 Subject: [R] storage and single-precision In-Reply-To: References: Message-ID: <4E689027.4090808@gmail.com> On 11-09-07 6:25 PM, Mike Miller wrote: > I'm getting the impression from on-line docs that R cannot work with > single-precision floating-point numbers, but that it has a pseudo-mode for > single precision for communication with external programs. 
> > I don't mind that R is using doubles internally, but what about storage? > If all I need to store is single-precision (32-bit), can I do that? When > it is read back into R it can be converted from single to double (back to > 64-bit). > > Furthermore, the data are numbers from 0.000 to 2.000 with no missing > values that could be stored just as accurately as unsigned 16-bit integers > from 0 to 2000. That would be the best plan for me. writeBin is quite flexible in converting between formats if you just want to store them on disk. To use nonstandard formats in memory will require external support; it's not easy. Duncan Murdoch > > It looks like the ff package allows for additional formats, so I might try > to use ff, but I still would like to get a better understanding of R's > native capabilities in regard to representations of numbers both in RAM > and in stored data files. > > Thanks in advance. > > Best, > Mike > > -- > Michael B. Miller, Ph.D. > Minnesota Center for Twin and Family Research > Department of Psychology > University of Minnesota > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
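A minimal sketch of the writeBin route suggested in the reply above, for storage on disk only. The data vector and the temporary file names are made up for illustration; the calls used are base R's writeBin(), readBin() and file.info(). R still holds the values as doubles (or 32-bit integers) in memory; only the on-disk representation changes.

```r
# Made-up data: values between 0.000 and 2.000 with three decimals.
x <- c(0, 0.001, 1.234, 2)

# Option 1: store as 4-byte (single-precision) floats.
f_single <- tempfile(fileext = ".bin")
writeBin(x, f_single, size = 4)
# Reading back converts each single to a double again.
x_single <- readBin(f_single, what = "numeric", n = length(x), size = 4)

# Option 2: scale to integers 0..2000 and store as 2-byte integers.
f_int <- tempfile(fileext = ".bin")
writeBin(as.integer(round(x * 1000)), f_int, size = 2)
# signed = FALSE reads the 2-byte values as unsigned integers.
x_int <- readBin(f_int, what = "integer", n = length(x), size = 2,
                 signed = FALSE) / 1000

file.info(f_single)$size  # 4 bytes per value
file.info(f_int)$size     # 2 bytes per value
```

The single-precision round trip is only accurate to about seven significant digits, so compare with all.equal() rather than ==. The scaled-integer route is exact here because the data have exactly three decimals.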
From murdoch.duncan at gmail.com Thu Sep 8 11:57:55 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Thu, 08 Sep 2011 05:57:55 -0400 Subject: [R] rgl 'how-to's In-Reply-To: <4E67EB14.6070109@gmail.com> References: <4E67EB14.6070109@gmail.com> Message-ID: <4E6891A3.9040704@gmail.com> On 11-09-07 6:07 PM, Dale Coons wrote: > Doing a visual graphic and trying to make it "pretty" > > Here's simple chart to play with: > > library(rgl) > dotframe<-data.frame(x=c(0,7,0,0,-7,0),y=c(0,0,7,0,0,-7),z=c(7,0,0,-7,0,0)) > dotframe plot3d(dotframe$x,dotframe$y,dotframe$z, radius=3, > type='s',col=c('red','green','blue','purple','orange','gray'), > axes=FALSE, box=FALSE, xlab='',ylab='',zlab='') > > text3d(x=7,y=0,z=0, text="hello, world",adj = 0.5, color="blue") #adds a > label at one of the points > > My questions: > > 1) is there a way to label the points (spheres in this case) so that the > label 'stays on top'? other than text3d(), which adds labels, but they > are hidden when the graph is rotated? Yes, but it's not trivial. You need to write your own mouse handler, that erases the old labels, rotates the data, and replots the labels afterwards. This would be quite feasible with a small number of labels, but maybe not if you have too many: the delays for replotting would be noticeable. > 2) can a bitmap, say, of a company or university be inserted into the > title area? I think so, by plotting a "sprite" with the bitmap as a "texture". See ?sprites3d. > 3) can a bitmap be used as the marker for a point? That also sounds like a sprite. Duncan Murdoch > > Thanks in advance for help-I learn a lot from others questions and > appreciate direction (even if it's RTFM!) > > Dale > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
From E.Vettorazzi at uke.de Thu Sep 8 11:58:13 2011 From: E.Vettorazzi at uke.de (Eik Vettorazzi) Date: Thu, 8 Sep 2011 11:58:13 +0200 Subject: [R] Searching the console In-Reply-To: <1315458227346-3797884.post@n4.nabble.com> References: <1315458227346-3797884.post@n4.nabble.com> Message-ID: <4E6891B5.9070905@uke.de> Hi Andrew, maybe history() helps you? It also allows pattern search (using grep internally). hth. On 08.09.2011 07:03, andrewH wrote: > Is there any way to search the console during an interactive session? I've > looked and looked, and can not find one. In some add-on package, maybe? > > Sorry to be so basic, but help would be greatly appreciated. > > andrewH > > -- > View this message in context: http://r.789695.n4.nabble.com/Searching-the-console-tp3797884p3797884.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -- Eik Vettorazzi Institut für Medizinische Biometrie und Epidemiologie Universitätsklinikum Hamburg-Eppendorf Martinistr. 52 20246 Hamburg T ++49/40/7410-58243 F ++49/40/7410-57790 -- Mandatory information pursuant to the Act on Electronic Commercial Registers and Cooperative Registers and the Company Register (EHUG): Universitätsklinikum Hamburg-Eppendorf; public-law corporation; place of jurisdiction: Hamburg. Board members: Prof. Dr. Jörg F. Debatin (Chairman), Dr. Alexander Kirstein, Joachim Prölß, Prof. Dr. Dr.
Uwe Koch-Gromus From murdoch.duncan at gmail.com Thu Sep 8 12:02:17 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Thu, 08 Sep 2011 06:02:17 -0400 Subject: [R] several functions in one *.R file in a R package In-Reply-To: <4E688ABE.7070505@yahoo.de> References: <1315319180.35832.YahooMailClassic@web28207.mail.ukl.yahoo.com> <4E663258.8030705@gmail.com> <4E688ABE.7070505@yahoo.de> Message-ID: <4E6892A9.3040702@gmail.com> On 11-09-08 5:28 AM, Jannis wrote: > Thanks, Duncan, for your reply. You are right, the () in my code are not > correct. Maybe my problem is that I do not really understand the exact > effect of this dot . I have no tried with the following file in my /R > folder in the package: > > > mainfunction<- function(x) { > x2<- .subfunction1(x) > x3<- .subfunction2(x2) > x3 > } > > .subfunction1<- function(x) { > x*2 > } > .subfunction2<- function(x) { > x*2 > } > > After I build the package and load it into R an run (for example): > > mainfunction(2) > > I get 8, which indicates that the functions are working. The reason that > made me believe beforehand that the code is not working is that when I > type the following to see the code of the function: > > .subfunction1 > > or run: > > .subfunction1(2) > > I get an error that the object .subfunction1 can not be found. Is this > the desired effect of adding the dot to the function name? Or do I do > something wrong here? If you chose to export those names, they would be visible. If you didn't export them, they wouldn't. The default NAMESPACE generated by package.skeleton would not export them. Duncan Murdoch > > > Thanks > Jannis > > > > > On 09/06/2011 04:46 PM, Duncan Murdoch wrote: >> On 11-09-06 10:26 AM, Jannis wrote: >>> Dear list members, >>> >>> >>> i have build a package which contains a collection of my frequently >>> used functions. 
To keep the code organized I have broken down some >>> rather extensive and long functions into individual steps and bundled >>> these steps in sub-functions that are called inside the main function. >>> >>> To keep an overview over which sub-function belongs to which main >>> function I saved all the respective sub-functions to the same *.R >>> file as their main-function and gave them names beginning with . to >>> somehow hide the sub-functions. The result would be one *.R file >>> in/R for each 'main-function' containing something like: >>> >>> >>> mainfunction<- function() { >>> .subfunction1() >>> .subfunction2() >>> #... >>> } >>> >>> .subfunction1()<- function() { >>> #do some stuff >>> } >>> .subfunction2()<- function() { >>> #do some more stuff >>> } >>> >>> >>> According to the way I understood the "Writing R Extensions" Manual I >>> expected this to work. When I load the package, however, I get the >>> error message that the sub-functions could not be found. Manually >>> sourcing all files in the/R directory however yields the >>> expected functionality. >>> >>> In what way am I mistaken here? Any ideas? >> >> Those definitions of .subfunction1 and .subfunction2 are not >> syntactically correct: extra parens. If that's just a typo in the >> message, then you'll have to show us real code. What you describe >> should work. >> >> Duncan Murdoch >> > From petr.pikal at precheza.cz Thu Sep 8 12:14:44 2011 From: petr.pikal at precheza.cz (Petr PIKAL) Date: Thu, 8 Sep 2011 12:14:44 +0200 Subject: [R] access objects In-Reply-To: References: , Message-ID: Hi Beware of shooting in your leg. Instead of poisoning your workspace with numerous objects called "obj.n" or something like that you help yourself with creating single object of type list. Parts of a list can be accessed much more easily than this kind of get construction. I use R quite a long time but I hardly remember a single event when I used such construction; I always used lists. 
In your example lll<-vector("list", 2) lll[[1]] <- 7:9 lll[[2]] <- 6:2 testf <- function(k) plot(lll[[k]]) testf(1) Regards Petr > Thanks! > > That's exactly it. > get gets it ;-) > > Berry > > > From: michael.weylandt at gmail.com > Date: Wed, 7 Sep 2011 14:54:48 -0500 > Subject: Re: [R] access objects > To: berryboessenkool at hotmail.com > CC: r-help at r-project.org > > The get() function should get you where you need. > > Michael Weylandt > > On Wed, Sep 7, 2011 at 12:53 PM, Berry Boessenkool > wrote: > > hi, > > say I have consecutively numbered objects obj1, obj2, ... in my R workspace. > > I want to acces one of them inside a function, with the number given as anargument. > > Where can I find help on how to do that? Somebody must have been trying to > do this before... > > Some keywords to start a search are appreciated as well. > > > > Here's an example, I hope it clarifies what I'm trying to do: > > > > obj1 <- 7:9 > > obj2 <- 6:2 > > testf <- function(k) plot( noquote(paste("obj", k, sep="")) ) > > testf(1) # should plot obj1 > > > > > > > > ------------------------------------- > > Berry Boessenkool > > D-14476 Potsdam (OT Golm) > > ------------------------------------- > > > > ______________________________________________ > > R-help at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
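The two approaches discussed in this thread can be put side by side in a short sketch. The helper names get_obj and get_from_list are hypothetical, chosen for illustration; get(), paste() and list indexing are base R.

```r
obj1 <- 7:9
obj2 <- 6:2

# Approach 1: build the object name as a string and look it up with get().
# envir = parent.frame() starts the lookup where the function was called.
get_obj <- function(k) get(paste("obj", k, sep = ""), envir = parent.frame())

# Approach 2: keep the vectors in a single list and index it by position.
objs <- list(7:9, 6:2)
get_from_list <- function(k) objs[[k]]

# Both routes return the same vector for the same index.
identical(get_obj(1), get_from_list(1))
```

Calling plot(get_obj(1)) would then do what the original testf(1) was meant to do. The list version avoids name construction altogether and extends naturally to lapply() over all elements.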
From pdalgd at gmail.com Thu Sep 8 13:09:12 2011 From: pdalgd at gmail.com (peter dalgaard) Date: Thu, 8 Sep 2011 13:09:12 +0200 Subject: [R] how to create data.frames from vectors with duplicates In-Reply-To: References: Message-ID: <7529FFE4-DC1E-4ED6-81C3-09D2784E6F32@gmail.com> On Sep 8, 2011, at 03:18 , zhenjiang xu wrote: > Thanks for all your replies. I am using rowsum() and it looks efficient. I > hope I could do some benchmark sometime in near future and let people know. > Or is there any benchmark result available? I'm a bit surprised that no-one thought of xtabs(x ~ y) It won't win the speed competition, though. -- Peter Dalgaard, Professor Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com From jim at bitwrit.com.au Thu Sep 8 13:36:42 2011 From: jim at bitwrit.com.au (Jim Lemon) Date: Thu, 08 Sep 2011 21:36:42 +1000 Subject: [R] Reshaping data from wide to tall format for multilevel modeling In-Reply-To: <1315404152635-3796168.post@n4.nabble.com> References: <1315404152635-3796168.post@n4.nabble.com> Message-ID: <4E68A8CA.9000201@bitwrit.com.au> On 09/08/2011 12:02 AM, dadrivr wrote: > Hi, > > I'm trying to reshape my data set from wide to tall format for multilevel > modeling. Unfortunately, the function I typically use (make.univ from the > multilevel package) does not appear to work with unbalanced data frames, > which is what I'm dealing with. 
> > Below is an example of the columns of a data frame similar to what I'm > working with: > ID a1 a2 a4 b2 b3 b4 b5 b6 > > Below is what I want the columns to be after reshaping the data to long > format: > ID a b time > > Here is an example data frame that I want to reshape: > ID<- c(1,2,3) > a1<- c(NA, rnorm(2)) > a2<- c(NA, rnorm(1), NA) > a4<- c(NA, rnorm(2)) > b2<- c(rnorm(2), NA) > b3<- rnorm(3) > b4<- NA > b5<- rnorm(3) > b6<- rnorm(3) > mydata<- as.data.frame(cbind(ID,a1,a2,a4,b2,b3,b4,b5,b6)) > > What is the best way to do this efficiently with MANY variables with widely > differing time ranges? Note that I will have to manually enter the time for > a given measurement because in the wide format, the time is in the variable > names. By the way, I have a fairly large data set, with some variables > occurring at 2 time points and other variables occurring at 20 time points. > Thanks for your help! > Hi dadrivr, I think you can do what you want using the rep_n_stack function in the prettyR package. If you want a data frame at the end, you will have to pad out your input data frame so that the lengths of the columns will be equal. You'll get lots of NAs, but without them, you won't get a data frame. mydata$a3<-NA mydata$a5<-NA mydata$a6<-NA mydata$b1<-NA mydata Now you have equal numbers of "a" and "b" columns. To reshape this into three columns is easy: rep_n_stack(mydata,to.stack=c("a1","a2","a3","a4","a5","a6", "b1","b2","b3","b4","b5","b6"),stack.names=c("ab","time")) If you want the "a" and "b" columns separate, try this: rep_n_stack(mydata,to.stack=matrix(c(2,3,10,4,11,12,13,5,6,7,8,9),nrow=2, byrow=TRUE),stack.names=c("a","time","b","time")) Currently you have to pass the column indices directly to get the correct order in the output. I hadn't anticipated the missing column problem when I wrote the function. 
Jim From pra at vfl.dk Thu Sep 8 10:56:18 2011 From: pra at vfl.dk (Peter Raundal) Date: Thu, 8 Sep 2011 01:56:18 -0700 (PDT) Subject: [R] Code for The R Book In-Reply-To: <848542.98911.qm@web57002.mail.re3.yahoo.com> References: <271421.60664.qm@web57003.mail.re3.yahoo.com> <848542.98911.qm@web57002.mail.re3.yahoo.com> Message-ID: <1315472178605-3798218.post@n4.nabble.com> Hi Paul, I'm struggling to find the R code as well on the books website. Did you managed to get it and are you willing to share it with me? BR Peter -- View this message in context: http://r.789695.n4.nabble.com/Code-for-The-R-Book-tp3091596p3798218.html Sent from the R help mailing list archive at Nabble.com. From borisberanger at gmail.com Thu Sep 8 10:42:51 2011 From: borisberanger at gmail.com (Boris Beranger) Date: Thu, 8 Sep 2011 01:42:51 -0700 (PDT) Subject: [R] generate randomly a value of a vector Message-ID: <1315471371399-3798190.post@n4.nabble.com> Hi everyone, I have a zero vector of length N and I would like to randomly allocate the value 1 to one of the values of this vector. I presume I have to use the uniform distribution but could someone tell me how I should process? Thanks in advance, Boris -- View this message in context: http://r.789695.n4.nabble.com/generate-randomly-a-value-of-a-vector-tp3798190p3798190.html Sent from the R help mailing list archive at Nabble.com. From divyamurali13 at gmail.com Thu Sep 8 11:37:58 2011 From: divyamurali13 at gmail.com (Divyam) Date: Thu, 8 Sep 2011 02:37:58 -0700 (PDT) Subject: [R] predictive modeling and extremely large data In-Reply-To: References: <1315387554818-3795674.post@n4.nabble.com> Message-ID: <1315474678631-3798294.post@n4.nabble.com> Hi Steve, Thanks for the reply. I am a complete novice to R and to using SVM as well and therefore the terms that I'd used maybe non standard(I am not sure how to call them technically). Nevertheless the number of data points I had mentioned are just instances. 
But as I mentioned, these are bound to grow perpetually and hence I am looking for a way to manage the data for training in the most effective way possible without any information loss. I did a quick search on online SVM and think it is what I am looking for. Thanks for the links. Divya -- View this message in context: http://r.789695.n4.nabble.com/predictive-modeling-and-extremely-large-data-tp3795674p3798294.html Sent from the R help mailing list archive at Nabble.com. From djandrija at gmail.com Thu Sep 8 13:50:07 2011 From: djandrija at gmail.com (andrija djurovic) Date: Thu, 8 Sep 2011 13:50:07 +0200 Subject: [R] generate randomly a value of a vector In-Reply-To: <1315471371399-3798190.post@n4.nabble.com> References: <1315471371399-3798190.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From patrick.breheny at uky.edu Thu Sep 8 14:15:45 2011 From: patrick.breheny at uky.edu (Patrick Breheny) Date: Thu, 8 Sep 2011 08:15:45 -0400 Subject: [R] generate randomly a value of a vector In-Reply-To: <1315471371399-3798190.post@n4.nabble.com> References: <1315471371399-3798190.post@n4.nabble.com> Message-ID: <4E68B1F1.7090308@uky.edu> On 09/08/2011 04:42 AM, Boris Beranger wrote: > I have a zero vector of length N and I would like to randomly allocate the > value 1 to one of the values of this vector. I presume I have to use the > uniform distribution but could someone tell me how I should process? rmultinom(m,1,rep(1,N)) where m is the number of random vectors you wish to generate in this manner. -- Patrick Breheny Assistant Professor Department of Biostatistics Department of Statistics University of Kentucky From jvadams at usgs.gov Thu Sep 8 14:25:29 2011 From: jvadams at usgs.gov (Jean V Adams) Date: Thu, 8 Sep 2011 07:25:29 -0500 Subject: [R] Question about model selection for glm -- how to select features based on BIC? 
In-Reply-To: <1315437139.61126.YahooMailClassic@web120618.mail.ne1.yahoo.com> References: <1315437139.61126.YahooMailClassic@web120618.mail.ne1.yahoo.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From pisicandru at hotmail.com Thu Sep 8 14:41:57 2011 From: pisicandru at hotmail.com (Monica Pisica) Date: Thu, 8 Sep 2011 12:41:57 +0000 Subject: [R] Getting the values out of histogram (lattice) In-Reply-To: References: , <201108312301.p7VN1O2N007997@mail12.tpg.com.au>, <4E5F112B.5090803@xtra.co.nz>, Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From maxusivanov at gmail.com Thu Sep 8 14:19:05 2011 From: maxusivanov at gmail.com (=?KOI8-R?B?7cHL08nNIOnXwc7P1w==?=) Date: Thu, 8 Sep 2011 16:19:05 +0400 Subject: [R] error in knn: too many ties in knn Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From sarah.goslee at gmail.com Thu Sep 8 14:47:01 2011 From: sarah.goslee at gmail.com (Sarah Goslee) Date: Thu, 8 Sep 2011 08:47:01 -0400 Subject: [R] Searching the console In-Reply-To: <1315458227346-3797884.post@n4.nabble.com> References: <1315458227346-3797884.post@n4.nabble.com> Message-ID: I'm not at all certain what you wish to search: R objects? Past commands? If the latter, others have offered suggestions. If the former, what about simply ls()? Or if you mean something else entirely, please clarify. Sarah On Thu, Sep 8, 2011 at 1:03 AM, andrewH wrote: > Is there any way to search the console during an interactive session? I've > looked and looked, and can not find one. ?In some add-on package, maybe? > > Sorry to be so basic, but help would be greatly appreciated. 
> > andrewH > > -- -- Sarah Goslee http://www.functionaldiversity.org From borisberanger at gmail.com Thu Sep 8 14:02:49 2011 From: borisberanger at gmail.com (Boris Beranger) Date: Thu, 8 Sep 2011 05:02:49 -0700 (PDT) Subject: [R] generate randomly a value of a vector In-Reply-To: References: <1315471371399-3798190.post@n4.nabble.com> Message-ID: <1315483369458-3798595.post@n4.nabble.com> Thank you very much Andrija, I have been do some research and was about to post the same solution. Boris -- View this message in context: http://r.789695.n4.nabble.com/generate-randomly-a-value-of-a-vector-tp3798190p3798595.html Sent from the R help mailing list archive at Nabble.com. From andrew.beckerman at gmail.com Thu Sep 8 14:22:18 2011 From: andrew.beckerman at gmail.com (Andrew Beckerman) Date: Thu, 8 Sep 2011 13:22:18 +0100 Subject: [R] predict.rma (metafor package) Message-ID: <80D51324C8D04C34A77211FBD2BB3461@gmail.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From JRadinger at gmx.at Thu Sep 8 14:53:12 2011 From: JRadinger at gmx.at (Johannes Radinger) Date: Thu, 08 Sep 2011 14:53:12 +0200 Subject: [R] Extract r.squared using cbind in lm Message-ID: <20110908125312.297740@gmx.net> Hello, I am using cbind in a lm-model. For standard lm-models the r.squared can be easily extracted with summary(model)$r.squared, but that is not working in in the case with cbind. Here an example to illustrate the problem: a <- c(1,3,5,2,5,3,1,6,7,2,3,2,6) b <- c(12,15,18,10,18,22,9,7,9,23,12,17,13) c <- c(22,26,32,33,32,28,29,37,34,29,30,32,29) data <- data.frame(a,b,c) model_a <-lm(b~a,data=data) model_b <-lm(cbind(b,c)~a,data=data) summary(model_a)$r.squared summary(model_b)$r.squared How can I access r.squared in my case? Is there any option? In the end, I want a dataframe containing the the intercept, slope, p-value and r.squared for all Y's of my regression. 
thank you Johannes -- From andy_liaw at merck.com Thu Sep 8 14:58:37 2011 From: andy_liaw at merck.com (Liaw, Andy) Date: Thu, 8 Sep 2011 08:58:37 -0400 Subject: [R] randomForest memory footprint In-Reply-To: References: Message-ID: It looks like you are building a regression model. With such a large number of rows, you should try to limit the size of the trees by setting nodesize to something larger than the default (5). The issue, I suspect, is the fact that the size of the largest possible tree has about 2*nodesize nodes, and each node takes a row in a matrix to store. Multiply that by the number of trees you are trying to build, and you see how the memory can be gobbled up quickly. Boosted trees don't usually run into this problem because one usually boost very small trees (usually no more than 10 terminal nodes per tree). Best, Andy > -----Original Message----- > From: r-help-bounces at r-project.org > [mailto:r-help-bounces at r-project.org] On Behalf Of John Foreman > Sent: Wednesday, September 07, 2011 2:46 PM > To: r-help at r-project.org > Subject: [R] randomForest memory footprint > > Hello, I am attempting to train a random forest model using the > randomForest package on 500,000 rows and 8 columns (7 predictors, 1 > response). The data set is the first block of data from the UCI > Machine Learning Repo dataset "Record Linkage Comparison Patterns" > with the slight modification that I dropped two columns with lots of > NA's and I used knn imputation to fill in other gaps. > > When I load in my dataset, R uses no more than 100 megs of RAM. I'm > running a 64-bit R with ~4 gigs of RAM available. When I execute the > randomForest() function, however I get memory complaints. Example: > > > summary(mydata1.clean[,3:10]) > cmp_fname_c1 cmp_lname_c1 cmp_sex cmp_bd > cmp_bm cmp_by cmp_plz is_match > Min. :0.0000 Min. :0.0000 Min. :0.0000 Min. :0.0000 > Min. :0.0000 Min. :0.0000 Min. 
:0.00000 FALSE:572820 > 1st Qu.:0.2857 1st Qu.:0.1000 1st Qu.:1.0000 1st Qu.:0.0000 > 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.00000 TRUE : 2093 > Median :1.0000 Median :0.1818 Median :1.0000 Median :0.0000 > Median :0.0000 Median :0.0000 Median :0.00000 > Mean :0.7127 Mean :0.3156 Mean :0.9551 Mean :0.2247 > Mean :0.4886 Mean :0.2226 Mean :0.00549 > 3rd Qu.:1.0000 3rd Qu.:0.4286 3rd Qu.:1.0000 3rd Qu.:0.0000 > 3rd Qu.:1.0000 3rd Qu.:0.0000 3rd Qu.:0.00000 > Max. :1.0000 Max. :1.0000 Max. :1.0000 Max. :1.0000 > Max. :1.0000 Max. :1.0000 Max. :1.00000 > > mydata1.rf.model2 <- randomForest(x = > mydata1.clean[,3:9],y=mydata1.clean[,10],ntree=100) > Error: cannot allocate vector of size 877.2 Mb > In addition: Warning messages: > 1: In dim(data) <- dim : > Reached total allocation of 3992Mb: see help(memory.size) > 2: In dim(data) <- dim : > Reached total allocation of 3992Mb: see help(memory.size) > 3: In dim(data) <- dim : > Reached total allocation of 3992Mb: see help(memory.size) > 4: In dim(data) <- dim : > Reached total allocation of 3992Mb: see help(memory.size) > > Other techniques such as boosted trees handle the data size just fine. > Are there any parameters I can adjust such that I can use a value of > 100 or more for ntree? > > Thanks, > John > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
> Notice: This e-mail message, together with any attachme...{{dropped:11}} From roger.bos at rothschild.com Thu Sep 8 15:03:12 2011 From: roger.bos at rothschild.com (Bos, Roger) Date: Thu, 8 Sep 2011 09:03:12 -0400 Subject: [R] How to specify a variable name in the regression formula without hard coding it Message-ID: I have a matrix called mat and y is the column number of my response and x is a vector of the column numbers of my terms. The variable name of y can change, so I don't want to hardcode it. I can find out the name as follows: > names(mat)[y] [1] "er12.l" Then I can run the regression by hard coding the variable name as follows: > mod <- try(rlm(er12.l ~ ., data=mat[zidx, c(y, x)]), silent=TRUE) But how would I do so without hard coding the name er12.l? I set up a reproducible example. In the following my regression formula is aa ~ bb + cc, or more simply aa ~ . How can I use the name of the first column without hard coding aa? dat <- data.frame(aa=runif(50), bb=runif(50), cc=runif(50)) names(dat)[1] lm(aa ~ ., data=dat[1:10,]) Thanks, Roger *************************************************************** This message is for the named person's use only. It may\...{{dropped:14}} From francogrex at mail.com Thu Sep 8 15:08:19 2011 From: francogrex at mail.com (francogrex) Date: Thu, 8 Sep 2011 06:08:19 -0700 (PDT) Subject: [R] 3D plot RGL Message-ID: <1315487299225-3798754.post@n4.nabble.com> Hi, anyone has experience with 3D plot (ex: in package RGL) I have a question, I draw a 3D plot of country, year and sales in z axis but when the type is "h" then it's ok but when I want to link the points and type is 'l' lines it's a mess Is there a way to link the points only in one direction? For example a unique line from each country through each year? 
The code example is below library(rgl) data=read.csv("c:/datout.csv", header=T) Colors=as.vector(data[,1]) dat=as.vector(data[,2]) Year=as.vector(data[,3]) Country=as.vector(data[,4]) Sales=as.vector(data[,5]) Sales=as.matrix(Sales) plot3d(Year, dat, Sales, type="h", axes=F, lwd=5, xlab="Year", ylab="Country", zlab="Sales", col=Colors, main="Typherix: Country/Year/Sales") ##replacing type="l" makes a connections all around, I want only one per country axis3d("x", nticks=10, cex=0.7) axis3d("z", nticks=5, cex=0.7, las=2) axis3d("y", labels=Country, las=2, nticks=20, cex=0.5) -- View this message in context: http://r.789695.n4.nabble.com/3D-plot-RGL-tp3798754p3798754.html Sent from the R help mailing list archive at Nabble.com. From pdalgd at gmail.com Thu Sep 8 15:10:02 2011 From: pdalgd at gmail.com (peter dalgaard) Date: Thu, 8 Sep 2011 15:10:02 +0200 Subject: [R] Extract r.squared using cbind in lm In-Reply-To: <20110908125312.297740@gmx.net> References: <20110908125312.297740@gmx.net> Message-ID: On Sep 8, 2011, at 14:53 , Johannes Radinger wrote: > Hello, > > I am using cbind in a lm-model. For standard lm-models > the r.squared can be easily extracted with summary(model)$r.squared, > but that is not working in in the case with cbind. > > Here an example to illustrate the problem: > a <- c(1,3,5,2,5,3,1,6,7,2,3,2,6) > b <- c(12,15,18,10,18,22,9,7,9,23,12,17,13) > c <- c(22,26,32,33,32,28,29,37,34,29,30,32,29) > > data <- data.frame(a,b,c) > > model_a <-lm(b~a,data=data) > model_b <-lm(cbind(b,c)~a,data=data) > > summary(model_a)$r.squared > summary(model_b)$r.squared > > > How can I access r.squared in my case? Is there any option? > In the end, I want a dataframe containing the the intercept, > slope, p-value and r.squared for all Y's of my regression. 
>

summary(model_b) is a list of one-dimensional summaries, so extract for each element:

> summary(model_b)$`Response b`$r.squared
[1] 0.03650572

Or you can get fancy and do things along the following lines:

> lapply(summary(model_b),"[[","r.squared")
$`Response b`
[1] 0.03650572

$`Response c`
[1] 0.348667


> thank you
> Johannes

--
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com

From roger.bos at rothschild.com Thu Sep 8 15:13:16 2011
From: roger.bos at rothschild.com (Bos, Roger)
Date: Thu, 8 Sep 2011 09:13:16 -0400
Subject: [R] Variable scoping question
Message-ID:

I modified an example in the object.size help page to create a function
I want to be able to run:

"mysize" <- function() {
    z <- sapply(ls(), function(w) object.size(get(w)))
    as.matrix(rev(sort(z))[1:5])
}
mysize()

When I test the lines inside the function it works fine:

> z <- sapply(ls(), function(w) object.size(get(w)))
> as.matrix(rev(sort(z))[1:5])
             [,1]
mat     166344288
mod     130794704
zidx       799664
wfidx      799664
megacap    799664
>

But when I try to run the function, it produces an error:

> "mysize" <- function() {
+     z <- sapply(ls(), function(w) object.size(get(w)))
+     as.matrix(rev(sort(z))[1:5])
+ }
> mysize()
Error in rev(sort(z)) :
  error in evaluating the argument 'x' in selecting a method for function 'rev': Error in sort.int(x, na.last = na.last, decreasing = decreasing, ...) :
  'x' must be atomic
>

It must be a variable scoping problem, but I am not sure how to tackle it.

Thanks,

Roger

From petr.pikal at precheza.cz Thu Sep 8 15:18:41 2011
From: petr.pikal at precheza.cz (Petr PIKAL)
Date: Thu, 8 Sep 2011 15:18:41 +0200
Subject: [R] Extract r.squared using cbind in lm
In-Reply-To: <20110908125312.297740@gmx.net>
References: <20110908125312.297740@gmx.net>
Message-ID:

Hi

>
> Hello,
>
> I am using cbind in a lm-model. For standard lm-models
> the r.squared can be easily extracted with summary(model)$r.squared,
> but that is not working in the case with cbind.
>
> Here an example to illustrate the problem:
> a <- c(1,3,5,2,5,3,1,6,7,2,3,2,6)
> b <- c(12,15,18,10,18,22,9,7,9,23,12,17,13)
> c <- c(22,26,32,33,32,28,29,37,34,29,30,32,29)
>
> data <- data.frame(a,b,c)
>
> model_a <-lm(b~a,data=data)
> model_b <-lm(cbind(b,c)~a,data=data)
>
> summary(model_a)$r.squared
> summary(model_b)$r.squared
>
>
> How can I access r.squared in my case? Is there any option?

> summary(model_b)[[1]]$r.squared
[1] 0.03650572

The result is a list of 2 models, one for response b and one for response c.
You can get the values for both with lapply or sapply:

sapply(summary(model_b), "[", 8)

Putting it all together in a data frame is a matter of unlisting the results
and transforming them according to your wish; see ?unlist, ?coef

Regards
Petr

> In the end, I want a dataframe containing the intercept,
> slope, p-value and r.squared for all Y's of my regression.
>
> thank you
> Johannes
> --
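
The reply above leaves the final data frame as an exercise. A minimal sketch of one way to assemble it, assuming the a, b, c vectors from the original post; the column names intercept, slope and p.value are illustrative choices, not something given in the thread, and the p-value taken here is the one for the slope on a:

```r
# Data and model from the original post
a <- c(1,3,5,2,5,3,1,6,7,2,3,2,6)
b <- c(12,15,18,10,18,22,9,7,9,23,12,17,13)
c <- c(22,26,32,33,32,28,29,37,34,29,30,32,29)
data <- data.frame(a, b, c)
model_b <- lm(cbind(b, c) ~ a, data = data)

# summary() of a multi-response fit gives one summary.lm per response
s <- summary(model_b)

# Build one row per response and stack the rows.
# Column names below are assumptions made for this sketch.
res <- do.call(rbind, lapply(s, function(x) {
  co <- coef(x)  # coefficient matrix: Estimate, Std. Error, t value, Pr(>|t|)
  data.frame(intercept = co["(Intercept)", "Estimate"],
             slope     = co["a", "Estimate"],
             p.value   = co["a", "Pr(>|t|)"],
             r.squared = x$r.squared)
}))
res
```

Each row of res corresponds to one response of the cbind() fit, so the same code should carry over to more than two responses without changes.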
From dwinsemius at comcast.net Thu Sep 8 15:21:25 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Thu, 8 Sep 2011 09:21:25 -0400
Subject: [R] How to specify a variable name in the regression formula without hard coding it
In-Reply-To:
References:
Message-ID: <9F1F6CBC-FE32-43C4-805F-99F3AA85CC98@comcast.net>

On Sep 8, 2011, at 9:03 AM, Bos, Roger wrote:

> I have a matrix called mat and y is the column number of my response and
> x is a vector of the column numbers of my terms. The variable name of y
> can change, so I don't want to hardcode it. I can find out the name as
> follows:
>
>> names(mat)[y]
> [1] "er12.l"
>
> Then I can run the regression by hard coding the variable name as
> follows:
>
>> mod <- try(rlm(er12.l ~ ., data=mat[zidx, c(y, x)]), silent=TRUE)
>
> But how would I do so without hard coding the name er12.l?
>
> I set up a reproducible example. In the following my regression formula
> is aa ~ bb + cc, or more simply aa ~ .
> How can I use the name of the first column without hard coding aa?
>
> dat <- data.frame(aa=runif(50), bb=runif(50), cc=runif(50))

> dep <- names(dat)[1]; indep <- names(dat)[2]
> lm(aa ~ ., data=dat[1:10, c(dep,indep)])

Call:
lm(formula = aa ~ ., data = dat[1:10, c(dep, indep)])

Coefficients:
(Intercept)           bb
    0.58908      0.03389

> Thanks,
>
> Roger

David Winsemius, MD
West Hartford, CT

From roger.bos at rothschild.com Thu Sep 8 15:29:10 2011