From sarah.goslee at gmail.com Sat Jan 1 00:06:44 2011 From: sarah.goslee at gmail.com (Sarah Goslee) Date: Fri, 31 Dec 2010 18:06:44 -0500 Subject: [R] Sweave for "big" data analysis In-Reply-To: References: Message-ID: My very simple approach is to check if the output file exists (within the Sweave file), and run the time-consuming bits only if it does not. As Uwe says, there are more sophisticated approaches too. Sarah On Fri, Dec 31, 2010 at 3:35 PM, Lars Bishop wrote: > Hi, > > Maybe I'm missing the point here...but let's suppose you are working with > "large" data sets and using functions that take a significant amount of time > to run in R. I woulnd't like to run these functions every time I call > Sweave("myfile.Rnw") within R. What is the "common" practice to use Sweave > in these situations. I would just run the function once, save the results > and only load them each time I run Sweave on the .Rnw file. Makes sense? > > Sorry, the question seems silly, but I'd appreciate your thoughts. > > Thanks, > Lars. > -- Sarah Goslee http://www.functionaldiversity.org From xie at yihui.name Sat Jan 1 00:20:34 2011 From: xie at yihui.name (Yihui Xie) Date: Fri, 31 Dec 2010 17:20:34 -0600 Subject: [R] Sweave for "big" data analysis In-Reply-To: References: Message-ID: I still recommend the pgfSweave package (as usual) -- you can cache both data objects (using cacheSweave) and graphics (using pgf). Regards, Yihui -- Yihui Xie Phone: 515-294-2465 Web: http://yihui.name Department of Statistics, Iowa State University 2215 Snedecor Hall, Ames, IA On Fri, Dec 31, 2010 at 2:35 PM, Lars Bishop wrote: > Hi, > > Maybe I'm missing the point here...but let's suppose you are working with > "large" data sets and using functions that take a significant amount of time > to run in R. I woulnd't like to run these functions every time I call > Sweave("myfile.Rnw") within R. What is the "common" practice to use Sweave > in these situations. 
I would just run the function once, save the results > and only load them each time I run Sweave on the .Rnw file. Makes sense? > > Sorry, the question seems silly, but I'd appreciate your thoughts. > > Thanks, > Lars. > > ? ? ? ?[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From murdoch.duncan at gmail.com Sat Jan 1 00:40:16 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Fri, 31 Dec 2010 18:40:16 -0500 Subject: [R] Sweave for "big" data analysis In-Reply-To: References: Message-ID: <4D1E69E0.6010405@gmail.com> On 31/12/2010 3:35 PM, Lars Bishop wrote: > Hi, > > Maybe I'm missing the point here...but let's suppose you are working with > "large" data sets and using functions that take a significant amount of time > to run in R. I woulnd't like to run these functions every time I call > Sweave("myfile.Rnw") within R. What is the "common" practice to use Sweave > in these situations. I would just run the function once, save the results > and only load them each time I run Sweave on the .Rnw file. Makes sense? > > Sorry, the question seems silly, but I'd appreciate your thoughts. As others have said, there are packages that provide caching. I haven't used them, because I like to keep my projects as self-contained as possible: adding a dependency on one of those packages is undesirable[1]. What I do in the case where there are time consuming calculations is to do all the calculations in a script, and save the results (using save()). Then the Sweave document will load the objects (using load()) and do post-processing, plotting, etc. Duncan Murdoch 1. I do generally write things that are dependent on my own patchDVI package, and curse myself for the dependency all the time. 
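A minimal sketch of the "compute once, then load" pattern described above, combining the file-existence check with save() and load(). The file name, the object name 'results', and the function slow_analysis() are hypothetical stand-ins, not taken from any poster's code.

```r
# Sketch only: cache an expensive result on disk and reuse it on later runs.
# 'cache_file', 'slow_analysis' and 'results' are illustrative names.
cache_file <- file.path(tempdir(), "slow_results.RData")

slow_analysis <- function() {
    Sys.sleep(0.1)                      # stand-in for a long computation
    colMeans(matrix(1:20, nrow = 4))
}

if (file.exists(cache_file)) {
    load(cache_file)                    # restores the object 'results'
} else {
    results <- slow_analysis()
    save(results, file = cache_file)    # written once, reused afterwards
}
```

Placed in a code chunk of the .Rnw file (or in a separate script), this would rerun the slow step only when the cache file is missing; deleting the file forces a recomputation.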
From mtmorgan at fhcrc.org  Sat Jan  1 01:07:07 2011
From: mtmorgan at fhcrc.org (Martin Morgan)
Date: Fri, 31 Dec 2010 16:07:07 -0800
Subject: [R] python-like dictionary for R
In-Reply-To:
References: <4D12BC84.7090304@fhcrc.org>
Message-ID: <4D1E702B.1020301@fhcrc.org>

On 12/30/2010 02:30 PM, Paul Rigor wrote:
> Thanks gang,
> I'll work with named vectors and concatenate as needed.

This might be ok for small problems, but concatenation is an inefficient
R pattern -- the objects being concatenated are copied in full, so the
vector becomes longer, and the concatenation slower, with each new key.
With

  a <- integer(); t0 <- Sys.time()
  for (i in seq_len(1e6)) {
      a <- c(a, i)
      if (0 == i %% 10000)
          print(i / as.numeric(Sys.time() - t0))
  }

we have, in 'appends per second'

[1] 3236.76
[1] 2425.111
[1] 1757.52
[1] 1331.846

We don't really have a dictionary here, either, as the 'key' values are
not stored. Phil's suggestion suffers from the same type of issue, where
the addition of new keys implies growing (reallocating) the vector.

  a <- integer(); t0 <- Sys.time()
  for (i in seq_len(1e6)) {
      key <- as.character(i)
      a[[key]] <- i
      if (0 == i %% 10000)
          print(i / as.numeric(Sys.time() - t0))
  }

[1] 12659.18
[1] 9516.288
[1] 6821.47
[1] 5907.782

Better to use an environment (and live with reference semantics)

  e <- new.env(parent=emptyenv()); t0 <- Sys.time()
  for (i in seq_len(1e6)) {
      key <- as.character(i)
      e[[key]] <- i
      if (0 == i %% 10000)
          print(i / as.numeric(Sys.time() - t0))
  }

with

[1] 20916.56
[1] 21421.85
[1] 21762.04
[1] 21207.69
[1] 21239.19

The usual alternative to the concatenation pattern is
'pre-allocate-and-fill'

  x <- integer(1e6); t0 <- Sys.time()
  for (i in seq_len(1e6)) {
      ???
  }

but this doesn't work with key/value pairs because there is no sense (or
is there?) in which the keys can be 'pre-allocated'.
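As an aside on the environment approach above, here is a small sketch of an environment used as a key/value store. It uses only base R; the helper dict_get() and the keys are hypothetical, added purely for illustration.

```r
# Sketch only: an environment as a mutable dictionary (reference semantics).
d <- new.env(hash = TRUE, parent = emptyenv())

# Insert or update entries.
assign("apple", 1L, envir = d)
d[["banana"]] <- 2L

# Hypothetical helper: look up a key, returning 'default' if it is absent.
dict_get <- function(env, key, default = NULL) {
    if (exists(key, envir = env, inherits = FALSE)) {
        get(key, envir = env, inherits = FALSE)
    } else {
        default
    }
}

dict_get(d, "banana")        # value stored under "banana"
dict_get(d, "cherry", NA)    # falls back to the default

keys <- sort(ls(d))          # all keys currently stored
rm("apple", envir = d)       # delete a key
```

Because environments are not copied on modification, a function that receives d and assigns into it changes the caller's dictionary as well; that is the reference semantics mentioned above.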
Creating the dictionary in one go is very efficient

> system.time(d <-
+     structure(seq_len(1e6), .Names=as.character(seq_len(1e6))))
   user  system elapsed
  0.417   0.002   0.419

Martin

> Paul
>
> On Thu, Dec 23, 2010 at 7:39 AM, Seth Falcon wrote:
>
> > On Wed, Dec 22, 2010 at 7:05 PM, Martin Morgan wrote:
> > > On 12/22/2010 05:49 PM, Paul Rigor wrote:
> > >> Hi,
> > >>
> > >> I was wondering if anyone has played around this this package called
> > >> "rdict"? It attempts to implement a hash table in R using skip lists. Just
> > >> came across it while trying to look for simpler text manipulation methods:
> > >>
> > >> http://userprimary.net/posts/2010/05/29/rdict-skip-list-hash-table-for-R/
> > >
> > > kind of an odd question, so kind of an odd answer.
> > >
> > > I'd say this was an implementation of skip lists in C with an R
> > > interface.
> >
> > I had to play around with the rdict package in order to write it, but
> > haven't used it much since :-P
> > Be sure to look at R's native environment objects which provide a hash
> > table structure and are suitable for many uses.
> >
> > + seth
> >
> > --
> > Seth Falcon | @sfalcon | http://userprimary.net/
>

--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

From Bill.Venables at csiro.au  Sat Jan  1 01:32:37 2011
From: Bill.Venables at csiro.au (Bill.Venables at csiro.au)
Date: Sat, 1 Jan 2011 11:32:37 +1100
Subject: [R] Changing column names
In-Reply-To: <188671.64978.qm@web120302.mail.ne1.yahoo.com>
References: <188671.64978.qm@web120302.mail.ne1.yahoo.com>
Message-ID: <1BDAE2969943D540934EE8B4EF68F95FB27A48C238@EXNSW-MBX03.nexus.csiro.au>

You don't give us much to go on, but some variant of

  country <- c("US", "France", "UK", "NewZealand", "Germany", "Austria", "Italy", "Canada")
  result <- read.csv("result.csv", header = FALSE)
  names(result) <- country

should do what you want.
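One possible variant of the renaming approach above, written as a self-contained round trip. It assumes the CSV file was written with a header row (hence header = TRUE here) and that the country vector is already in column order; the temporary file and the shortened three-country data are hypothetical.

```r
# Sketch only: rename the columns of a CSV file after it has been written.
country <- c("US", "France", "UK")          # illustrative, shortened list

# Stand-in for the file produced by the other R process.
path <- tempfile(fileext = ".csv")
write.csv(data.frame(var1 = c(25, 80, 135),
                     var2 = c(45, 132, 11),
                     var3 = c(29, 83, 74)),
          path, row.names = FALSE)

result <- read.csv(path, header = TRUE)

# Guard against a mismatch between the names and the columns.
stopifnot(length(country) == ncol(result))
names(result) <- country

unlink(path)
```

The length check matters when countries are added or removed: the assignment to names() does not verify that each name belongs to the column it lands on, so the order of country still has to come from whatever process produced the file.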
________________________________________
From: r-help-bounces at r-project.org [r-help-bounces at r-project.org] On Behalf Of Vincy Pyne [vincy_pyne at yahoo.ca]
Sent: 31 December 2010 16:07
To: r-help at r-project.org
Subject: [R] Changing column names

Dear R helpers

Wish you all a very Happy and Prosperous New Year 2011.

I have following query.

country = c("US", "France", "UK", "NewZealand", "Germany", "Austria", "Italy", "Canada")

Through some other R process, the result.csv file is generated as

result.csv

   var1 var2 var3 var4 var5 var6 var7 var8
1    25   45   29   92  108  105   65   56
2    80  132   83   38   38   11   47   74
3   135   11   74   56   74   74   74   29

I need the country names to be column heads i.e. I need an output like

> result_new

    US France  UK NewZealand Germany Austria Italy Canada
1   25     45  29         92     108     105    65     56
2   80    132  83         38      38      11    47     74
3  135     11  74         56      74      74    74     29

The number of countries i.e. length(country) matches with total number
of variables (i.e. no of columns in 'result.csv').

One way of doing this is to use country names as column names while
writing the 'result.csv' file.

write.csv(data.frame(US = ..........., France = .......), 'result.csv', row.names = FALSE)

However, the problem is I don't know in what order the country names
will appear and also there could be addition or deletion of some
country names. Also, if there are say 150 country names, the above way
(i.e. writing.csv) of defining the column names is not practical.

Basically I want to change the column heads after the 'result.csv' is
generated.

Kindly guide.

Regards

Vincy

        [[alternative HTML version deleted]]

From bbolker at gmail.com  Sat Jan  1 02:03:59 2011
From: bbolker at gmail.com (Ben Bolker)
Date: Sat, 1 Jan 2011 01:03:59 +0000 (UTC)
Subject: [R] Sweave for "big" data analysis
References: <4D1E69E0.6010405@gmail.com>
Message-ID:

Duncan Murdoch gmail.com> writes:

> As others have said, there are packages that provide caching.
> > I haven't used them, because I like to keep my projects as > self-contained as possible: adding a dependency on one of those > packages is undesirable[1]. What I do in the case where there are time > consuming calculations is to do all the calculations in a script, and > save the results (using save()). Then the Sweave document will load the > objects (using load()) and do post-processing, plotting, etc. > Using the make/Makefile system is a good idea for more rigor/reproducibility here. From diasandre at gmail.com Sat Jan 1 04:59:35 2011 From: diasandre at gmail.com (ADias) Date: Fri, 31 Dec 2010 19:59:35 -0800 (PST) Subject: [R] Silhouette function problem In-Reply-To: References: <1293759372129-3169027.post@n4.nabble.com> <1293809653169-3169522.post@n4.nabble.com> Message-ID: <1293854375836-3169962.post@n4.nabble.com> Hello, thank you all. I have been able to solve my problem with your help. The problem I am trying to solve is: I am working on a clustering method to group a data base. At the moment I am using the clustering hierarchical method and trying to get to the best K group value via the silhouette function. My database has 569 observations and the dendogram is not readable as you can see on the picture attached. So I am trying to find a way to get to a conclusion without the use of the dendogram. thanks again, Regards, A.Dias http://r.789695.n4.nabble.com/file/n3169962/dendogram.jpeg -- View this message in context: http://r.789695.n4.nabble.com/Silhouette-function-problem-tp3169027p3169962.html Sent from the R help mailing list archive at Nabble.com. From htr at udel.edu Sat Jan 1 06:41:10 2011 From: htr at udel.edu (H. T. 
Reynolds)
Date: Sat, 1 Jan 2011 00:41:10 -0500 (EST)
Subject: [R] Retrieving Factors with Levels Ordered
Message-ID: <20110101004110.HSN89098@ms1.nss.udel.edu>

Hello (and Happy New Year),

When I create a factor with labels in the order I want, write the data
as a text file, and then retrieve them, the factor levels are no longer
in the proper order.

Here is what I do (I tried many variations):

# educ is a numeric vector with 1,001 observations.
# There is one NA

# Use educ to create a factor

feducord <- factor(educ, labels = c('Elem', 'Mid', 'HS',
    'Bus', 'Some', 'Col', 'Post'), ordered = TRUE)

levels(feducord)
[1] "Elem" "Mid"  "HS"   "Bus"  "Some" "Col"  "Post"

table(feducord)
feducord
Elem  Mid   HS  Bus Some  Col Post
  30   90  303  108  236  144   89

# The above is what I want. The frequencies agree with
# the codebook

# Make a data frame and save it. (I want a text file.)

testdf <- data.frame(feducord)
str(testdf)
'data.frame':   1001 obs. of  1 variable:
 $ feducord: Ord.factor w/ 7 levels "Elem"<"Mid"<"HS"<..: 5 6 5 7 3 4 3 3 3 5 ...
write.table(testdf, file = 'Junkarea/test.txt')

# So far, so good.

rm(testdf, feducord)

# Go away.
# Come back later to retrieve the data.

testdf <- read.table(file = 'Junkarea/test.txt')

# But levels are no longer ordered

str(testdf)
'data.frame':   1001 obs. of  1 variable:
 $ feducord: Factor w/ 7 levels "Bus","Col","Elem",..: 7 2 7 6 4 1 4 4 4 7

table(testdf$feducord)
 Bus  Col Elem   HS  Mid Post Some
 108  144   30  303   90   89  236

# The frequencies are correct, but the ordering is wrong.

Clearly I am missing something obvious, but I can't see it. If I save
"feducord" and load it, the order of the levels is as it should be. But
I don't know why writing to a text file should change anything. Any help
would be greatly appreciated.

(You're right, I don't have anything better to do on New Year's eve.)
From berwin at maths.uwa.edu.au  Sat Jan  1 07:59:19 2011
From: berwin at maths.uwa.edu.au (Berwin A Turlach)
Date: Sat, 1 Jan 2011 14:59:19 +0800
Subject: [R] Retrieving Factors with Levels Ordered
In-Reply-To: <20110101004110.HSN89098@ms1.nss.udel.edu>
References: <20110101004110.HSN89098@ms1.nss.udel.edu>
Message-ID: <20110101145919.793ca2b9@absentia>

G'day H.T.

On Sat, 1 Jan 2011 00:41:10 -0500 (EST)
"H. T. Reynolds" wrote:

> When I create a factor with labels in the order I want, write the
> data as a text file, and then retrieve them, the factor levels are no
> longer in the proper order.

Not surprisingly. :)

[..big snip..]
> testdf <- read.table(file = 'Junkarea/test.txt')

Did you look at the file Junkarea/test.txt, e.g. with a text editor?

You will see that write.table() stores only the observed values but no
information about what the mode of each variable in the data frame is.
In particular, it doesn't store that a variable is a factor and
definitely not the levels and their ordering. Actually, write.table()
saves a factor by writing out the observed labels as character strings.
Only because read.table() by default turns character data into factors
(a behaviour that some useRs don't like and why the option
stringsAsFactors exists) you end up with a factor again.

> # But levels are no longer ordered

But the help file of factor (see ?factor) states in the warning section:

     The levels of a factor are by default sorted, but the sort order
     may well depend on the locale at the time of creation, and should
     not be assumed to be ASCII.

Thus, arguably, the levels are ordered, just not the order you want. :)

> Clearly I am missing something obvious, but I can't see it. If I save
> "feducord" and load it, the order of the levels is as it should be.
> But I don't know why writing to a test file should change anything.
> Any help would be greatly appreciated.
If you save() and load() an object, then it is saved in binary format,
thus much more information about it can be stored. Indeed, I would
expect that a faithful internal representation of the object is stored
in the binary format so that a save(), rm() and load() would restore
exactly the same object.

Saving objects in text formats is prone to loss of information. E.g.,
as you experience with write.table() and read.table() no information
about ordering of levels is stored using these functions. If care is
not taken when writing numbers to text files, the internal
representation can change, e.g. :

R> x <- data.frame(x=seq(from=0, to=1, length=11))
R> write.table(x, file="/tmp/junk1")
R> y <- read.table("/tmp/junk1")
R> identical(x,y)
[1] FALSE
R> x-y
              x
1  0.000000e+00
2  0.000000e+00
3  0.000000e+00
4  5.551115e-17
5  0.000000e+00
6  0.000000e+00
7  1.110223e-16
8  1.110223e-16
9  0.000000e+00
10 0.000000e+00
11 0.000000e+00

Having said all this, if you want to save your data in a text file with
a representation that remembers the ordering of the factor levels, look
at dput():

R> fac <- gl(2,4, labels=c("White", "Black"))
R> fac
[1] White White White White Black Black Black Black
Levels: White Black
R> write.table(fac, file="/tmp/junk")
R> str(read.table("/tmp/junk"))
'data.frame':   8 obs. of  1 variable:
 $ x: Factor w/ 2 levels "Black","White": 2 2 2 2 1 1 1 1
R> dput(fac, file="/tmp/junk")
R> str(dget(file="/tmp/junk"))
 Factor w/ 2 levels "White","Black": 1 1 1 1 2 2 2 2

It might just be that the text representation used by dput() is not
particularly digestible for the human eye. :)

> (You're right, I don't have anything better to do on New Year's eve.)

New Year's eve? The first day of the new year is already nearly
over! :)

HTH.
Cheers, Berwin ========================== Full address ============================ Berwin A Turlach Tel.: +61 (8) 6488 3338 (secr) School of Maths and Stats (M019) +61 (8) 6488 3383 (self) The University of Western Australia FAX : +61 (8) 6488 1028 35 Stirling Highway Crawley WA 6009 e-mail: berwin at maths.uwa.edu.au Australia http://www.maths.uwa.edu.au/~berwin From erich.neuwirth at univie.ac.at Sat Jan 1 10:24:53 2011 From: erich.neuwirth at univie.ac.at (Erich Neuwirth) Date: Sat, 01 Jan 2011 10:24:53 +0100 Subject: [R] Trying to extract an algorithm from a function In-Reply-To: References: Message-ID: <4D1EF2E5.7040902@univie.ac.at> vars:::predict.vec2var vars:::predict.varest On 12/29/2010 9:31 PM, CALEF ALEJANDRO RODRIGUEZ CUEVAS wrote: > Hi, I'm using package "vars" and I'm trying to extract the algorithm that > function "predict" contained in that package in order to understand how does > it work. > > When I type function "VAR" then all its algorithm appears in R, however if I > try to do the same with "predict" nothing happens...Is there any possible > way to extract the algorithm? > > Thanks a lot. > > Regards > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From landronimirc at gmail.com Sat Jan 1 13:54:47 2011 From: landronimirc at gmail.com (Liviu Andronic) Date: Sat, 1 Jan 2011 13:54:47 +0100 Subject: [R] Sweave for "big" data analysis In-Reply-To: References: Message-ID: (slightly off-topic) Hello On Sat, Jan 1, 2011 at 12:20 AM, Yihui Xie wrote: > I still recommend the pgfSweave package (as usual) -- you can cache > both data objects (using cacheSweave) and graphics (using pgf). > Do these packages re-implement Sweave, or just use it as a dependency? 
For example, if Sweave gets updated, will the changes be automatically available for pgfSweave users or not? Regards Liviu > Regards, > Yihui > -- > Yihui Xie > Phone: 515-294-2465 Web: http://yihui.name > Department of Statistics, Iowa State University > 2215 Snedecor Hall, Ames, IA > > > > On Fri, Dec 31, 2010 at 2:35 PM, Lars Bishop wrote: >> Hi, >> >> Maybe I'm missing the point here...but let's suppose you are working with >> "large" data sets and using functions that take a significant amount of time >> to run in R. I woulnd't like to run these functions every time I call >> Sweave("myfile.Rnw") within R. What is the "common" practice to use Sweave >> in these situations. I would just run the function once, save the results >> and only load them each time I run Sweave on the .Rnw file. Makes sense? >> >> Sorry, the question seems silly, but I'd appreciate your thoughts. >> >> Thanks, >> Lars. >> >> ? ? ? ?[[alternative HTML version deleted]] >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Do you know how to read? http://www.alienetworks.com/srtest.cfm http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader Do you know how to write? 
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail

From lawrence.michael at gene.com  Sat Jan  1 14:45:50 2011
From: lawrence.michael at gene.com (Michael Lawrence)
Date: Sat, 1 Jan 2011 05:45:50 -0800
Subject: [R] RGtk2 compilation problem
In-Reply-To:
References:
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From adamlcarr at yahoo.com  Sat Jan  1 14:59:12 2011
From: adamlcarr at yahoo.com (Adam Carr)
Date: Sat, 1 Jan 2011 05:59:12 -0800 (PST)
Subject: [R] pdf() Export Problem: Circles Interpreted as Fonts from ggplot2 Graphics
In-Reply-To: <2A98F8AC-4984-4747-B418-F222EBBEB4E1@comcast.net>
References: <590515.14862.qm@web35304.mail.mud.yahoo.com> <2A98F8AC-4984-4747-B418-F222EBBEB4E1@comcast.net>
Message-ID: <985889.54642.qm@web35307.mail.mud.yahoo.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From bbolker at gmail.com  Sat Jan  1 16:08:07 2011
From: bbolker at gmail.com (Ben Bolker)
Date: Sat, 1 Jan 2011 15:08:07 +0000 (UTC)
Subject: [R] Retrieving Factors with Levels Ordered
References: <20110101004110.HSN89098@ms1.nss.udel.edu> <20110101145919.793ca2b9@absentia>
Message-ID:

Berwin A Turlach maths.uwa.edu.au> writes:

>
> G'day H.T.
>
> On Sat, 1 Jan 2011 00:41:10 -0500 (EST)
> "H. T. Reynolds" udel.edu> wrote:
>
> > When I create a factor with labels in the order I want, write the
> > data as a text file, and then retrieve them, the factor levels are no
> > longer in the proper order.
>
> Not surprisingly. :)
>
> [..big snip..]
> > testdf <- read.table(file = 'Junkarea/test.txt')
>

Thanks to Berwin for the very thorough explanation.
At a more pragmatic level, I find that

  testdf$myfactor <- factor(testdf$myfactor,levels=unique(as.character(testdf$myfactor)))

generally works OK to reorder the factor levels into the order in which
they appear in the data set ...
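When the intended order is known in advance (for example from a codebook), another option is to read the column as plain text and rebuild the factor with explicit levels. The sketch below assumes a small made-up sample and a temporary file; only the level labels are taken from the thread.

```r
# Sketch only: round-trip an ordered factor through a text file and
# restore its levels explicitly after reading.
edu_levels <- c("Elem", "Mid", "HS", "Bus", "Some", "Col", "Post")

# Illustrative data, not the original 1,001 observations.
orig <- data.frame(
    feducord = factor(c("Some", "Col", "HS", "Elem", "Post"),
                      levels = edu_levels, ordered = TRUE)
)

path <- tempfile(fileext = ".txt")
write.table(orig, file = path)

# Read the labels as character, then rebuild the ordered factor.
back <- read.table(path, stringsAsFactors = FALSE)
back$feducord <- factor(back$feducord, levels = edu_levels, ordered = TRUE)

unlink(path)
```

Unlike ordering by first appearance, this also keeps levels that happen not to occur in the data, at the cost of storing the level vector somewhere outside the text file.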
From erich.neuwirth at univie.ac.at Sat Jan 1 16:20:32 2011 From: erich.neuwirth at univie.ac.at (Erich Neuwirth) Date: Sat, 01 Jan 2011 16:20:32 +0100 Subject: [R] RExcel doesn't get active dataset In-Reply-To: <1294286873-1292876652-cardhu_decombobulator_blackberry.rim.net-552722438-@b26.c16.bise7.blackberry> References: <1294286873-1292876652-cardhu_decombobulator_blackberry.rim.net-552722438-@b26.c16.bise7.blackberry> Message-ID: <4D1F4640.7020005@univie.ac.at> Please subscribe to the rcom mailing list at rcom.univie.ac.at RExcel questions are handled on that list. On 12/20/2010 9:24 PM, jryan.daniels at gmail.com wrote: > Hey everyone > > When I try to 'get active dataframe' from the context menu in excel using Rcmdr menus in excel, only the header and first row of data gets copied to the excel spread sheet. No clue why this is happening. I followed the instructions from 'R through Excel' pg 25-26. I'm using excel 2010, Rcmdr v 1.6-0 and R 2.11.1. > > Much appreciate any help > > > Sent via my BlackBerry from Vodacom - let your email find you! > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From dwinsemius at comcast.net Sat Jan 1 17:03:59 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Sat, 1 Jan 2011 11:03:59 -0500 Subject: [R] Retrieving Factors with Levels Ordered In-Reply-To: <20110101004110.HSN89098@ms1.nss.udel.edu> References: <20110101004110.HSN89098@ms1.nss.udel.edu> Message-ID: On Jan 1, 2011, at 12:41 AM, H. T. Reynolds wrote: > Hello (and Happy New Year), > > When I create a factor with labels in the order I want, write the > data as a text file, Why? What is the reason for this process. > and then retrieve them, the factor levels are no longer in the > proper order. 
Two further ideas to those offered by Turlach and Bolker: You can name them with leading digits that ascend in the desired order or I have seen described ( but not found a fully worked example despite what I thought was an adequate search) the use of an as() method which in this instance might apply as.factor with your own level specification while reading with colClasses. > > Here is what I do (I tried many variations): > > # educ is a numeric vector with 1,001 observations. > # There is one NA > > # Use educ to create a factor > > feducord <- factor(educ, labels = c('Elem', 'Mid', 'HS', > + 'Bus', 'Some', 'Col', 'Post'), ordered = T) > > levels(feducord) > [1] "Elem" "Mid" "HS" "Bus" "Some" "Col" "Post" > > table(feducord) > feducord > Elem Mid HS Bus Some Col Post > 30 90 303 108 236 144 89 > > # The above is what I want. The frequencies agree with > # the codebook > > # Make a data frame and save it. (I want a text file.) > > testdf <- data.frame(feducord) > str(testdf) > 'data.frame': 1001 obs. of 1 variable: > $ feducord: Ord.factor w/ 7 levels "Elem"<"Mid"<"HS"<..: > 5 6 5 7 3 4 3 3 3 5 ... > write.table(testdf, file = 'Junkarea/test.txt') > > # So far, so good. > > rm(testdf, feducord) > > # Go away. > # Come back later to retrieve the data. > > testdf <- read.table(file = 'Junkarea/test.txt') > > # But levels are no longer ordered > > str(testdf) > 'data.frame': 1001 obs. of 1 variable: > $ feducord: Factor w/ 7 levels "Bus","Col","Elem",..: > 7 2 7 6 4 1 4 4 4 7 > > table(testdf$feducord) > Bus Col Elem HS Mid Post Some > 108 144 30 303 90 89 236 > > # The frequencies are correct, but the ordering is wrong. > > Clearly I am missing something obvious, but I can't see it. If I > save "feducord" and load it, the order of the levels is as it should > be. But I don't know why writing to a test file should change > anything. Any help would be greatly appreciated. > > (You're right, I don't have anything better to do on New Year's eve.) 
> > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT From dwinsemius at comcast.net Sat Jan 1 17:09:23 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Sat, 1 Jan 2011 11:09:23 -0500 Subject: [R] pdf() Export Problem: Circles Interpreted as Fonts from ggplot2 Graphics In-Reply-To: <985889.54642.qm@web35307.mail.mud.yahoo.com> References: <590515.14862.qm@web35304.mail.mud.yahoo.com> <2A98F8AC-4984-4747-B418-F222EBBEB4E1@comcast.net> <985889.54642.qm@web35307.mail.mud.yahoo.com> Message-ID: <9FC6BE25-4A65-41E3-967B-4BBFDAD6D5E0@comcast.net> I thought Hadley's response was more definitive, but I did go on to test my alternate character strategy in ggplot and it did succeed. Whether you could get coloring or sixing that was appropriate I cannot say, since I figured the non-dingbatting option was more general. -- David On Jan 1, 2011, at 8:59 AM, Adam Carr wrote: > Neglected to reply to all. Sorry. > > > > ----- Forwarded Message ---- > From: Adam Carr > To: David Winsemius > Sent: Sat, January 1, 2011 8:58:26 AM > Subject: Re: [R] pdf() Export Problem: Circles Interpreted as Fonts > from ggplot2 > Graphics > > > Hello David: > > Thanks for the reply and for the suggestion on an alternative > character. I will > try this today and see what happens. > > As I searched for solutions to this a more experienced graphics editor > recommended a package called Xara Photo and Graphic Designer 6. This > package, > which has an open-source version for Linux, imported the PDF without > any font > interpretation difficulties. The text editing process required about > thirty > seconds. 
> > Happy New Year, > > Adam > > > > > ________________________________ > From: David Winsemius > > Cc: r-help at r-project.org > Sent: Thu, December 30, 2010 7:07:30 PM > Subject: Re: [R] pdf() Export Problem: Circles Interpreted as Fonts > from ggplot2 > Graphics > > You could try using the Symbol font's solid circle as pch , octmode > 267, if I > am reading the output from the TestChars function on the points help > page > correctly. > > BTW, I opened your document in GIMP and it shows "q"'s as well. > > --david. > > > On Dec 30, 2010, at 5:59 PM, Adam Carr wrote: > >> Good Evening: >> >> I am putting together a large report with plots created in R, V >> 2.12.0. Most > of >> the plots are created using ggplot2 V0.8.9. I use R's pdf() command >> to export >> the plot to a pdf file. I am exporting the plots and attempting to >> edit the >> title text in Inkscape primarily because ggplot2 does not support >> superscript >> or >> subscript formatting in the title text. For the report I am working >> on these >> formats are essential. >> >> >> I am running the R version mentioned above and Inkscape 0.48 on a >> Windows XP >> machine with the following system details: >> >> OS Name Microsoft Windows XP Professional >> Version 5.1.2600 Service Pack 3 Build 2600 >> System Type X86-based PC >> Processor x86 Family 6 Model 15 Stepping 11 GenuineIntel ~1995 Mhz >> BIOS Version/Date LENOVO 7LETB7WW (2.17 ), 4/25/2008 >> Total Physical Memory 4,096.00 MB >> Available Physical Memory 1.62 GB >> Total Virtual Memory 2.00 GB >> Available Virtual Memory 1.96 GB >> Page File Space 8.69 GB >> >> I do not think this is a ggplot2-specific problem. >> >> I use a simple version of the pdf() command to export the file that >> includes >> the >> file name and path only. The PDF looks fine actually, it is the >> restriction on >> text editing caused by Adobe's intepretation of the graphic that is >> the >> problem. >> >> I have attached two files to this email: >> >> 1. 
An R-exported pdf file exactly as it looks as opened in Adobe >> Reader V9. >> This >> file is named exportforinkscapeforum.pdf. >> >> 2. An example of the way the plot appears after I import it into >> Inkscape. > This >> file is named Example of How Imported File Appears in Inkscape.pdf. >> >> The problem I have is that when I import the pdf into Inkscape the >> solid, >> filled >> circles on the plot are converted to the lower case letter q. I >> read about >> similar problems on R-help.org and other R-related sites, but the >> descriptions >> I >> found seemed to indicate that the lower case q was visible in the >> pdf file > when >> opened with Adobe or other viewers. This does not seem to be my >> problem. >> >> >> I posted this problem to the Inkscape forum and received a reply >> suggesting >> that >> Adobe is interpreting the solid, filled circles not as solid, >> filled circles >> but >> as font objects. The user who replied suggested that I look for the >> Zpf > Dingbat >> font embedded in the PDF and it is in fact there. This is the font >> Adobe is >> applying to my solid, filled circles. Apparently there are known >> issues with >> Inkscape's ability to import fonts via PDF and the problem is >> documented on >> their bug list. >> >> The Inkscape user asked if there was any way that R could be >> coerced to use >> actual circles or paths for the points. I am not aware of a way to >> do this so >> any input from anyone here would be greatly appreciated. >> >> To briefly return to my main problem: if there is another way to >> edit the main >> title text to include a superscripted character (in my particular >> case it is >> Unicode character 00AE, the registered trademark sign) I would >> appreciate the >> insight. >> >> >> Any help on this issue would be appreciated. 
>> >> Adam >> >> >> > Inkscape >> .pdf >> > >> < >> exportforinkscapeforum >> .pdf>______________________________________________ >> >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > David Winsemius, MD > West Hartford, CT > > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT From xie at yihui.name Sat Jan 1 17:56:39 2011 From: xie at yihui.name (Yihui Xie) Date: Sat, 1 Jan 2011 10:56:39 -0600 Subject: [R] Sweave for "big" data analysis In-Reply-To: References: Message-ID: AFAIK, these packages often extend a part of the Sweave driver, which consists of several components, e.g. only modify the 'RweaveLatexRuncode' part in > RweaveLatex function () { list(setup = RweaveLatexSetup, runcode = RweaveLatexRuncode, writedoc = RweaveLatexWritedoc, finish = RweaveLatexFinish, checkopts = RweaveLatexOptions) } So they are based on Sweave instead of a re-implementation. However, it is unpredictable whether they will be updated when Sweave gets updated (depending on the package authors), because (IMHO) Sweave was not designed to be readily extensible (it is, however, extensible to some degree), i.e. if we want to replace or modify a very small part of its functionality, we have to copy all the source code and do our modification. A typical example is that currently it is difficult to add an extra graphics device such as png() to Sweave without touching the source code of Sweave. 
We might rely on some tools to update our revisions automatically, such as SVN (merge the changes), but I have no idea of the difficulties. What I can see is the pgfSweave authors have been trying to keep up with the updates of Sweave from R core. Regards, Yihui -- Yihui Xie Phone: 515-294-2465 Web: http://yihui.name Department of Statistics, Iowa State University 2215 Snedecor Hall, Ames, IA On Sat, Jan 1, 2011 at 6:54 AM, Liviu Andronic wrote: > (slightly off-topic) > Hello > > > On Sat, Jan 1, 2011 at 12:20 AM, Yihui Xie wrote: >> I still recommend the pgfSweave package (as usual) -- you can cache >> both data objects (using cacheSweave) and graphics (using pgf). >> > Do these packages re-implement Sweave, or just use it as a dependency? > For example, if Sweave gets updated, will the changes be automatically > available for pgfSweave users or not? > > Regards > Liviu > > >> Regards, >> Yihui >> -- >> Yihui Xie >> Phone: 515-294-2465 Web: http://yihui.name >> Department of Statistics, Iowa State University >> 2215 Snedecor Hall, Ames, IA >> >> >> >> On Fri, Dec 31, 2010 at 2:35 PM, Lars Bishop wrote: >>> Hi, >>> >>> Maybe I'm missing the point here...but let's suppose you are working with >>> "large" data sets and using functions that take a significant amount of time >>> to run in R. I woulnd't like to run these functions every time I call >>> Sweave("myfile.Rnw") within R. What is the "common" practice to use Sweave >>> in these situations. I would just run the function once, save the results >>> and only load them each time I run Sweave on the .Rnw file. Makes sense? >>> >>> Sorry, the question seems silly, but I'd appreciate your thoughts. >>> >>> Thanks, >>> Lars. >>> >>> ? ? ? 
?[[alternative HTML version deleted]] >>> >>> ______________________________________________ >>> R-help at r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >>> >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > > > -- > Do you know how to read? > http://www.alienetworks.com/srtest.cfm > http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader > Do you know how to write? > http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail > From charlene.cosandier at gmail.com Sat Jan 1 16:09:38 2011 From: charlene.cosandier at gmail.com (=?ISO-8859-1?Q?Charl=E8ne_Cosandier?=) Date: Sat, 1 Jan 2011 16:09:38 +0100 Subject: [R] robust standard error of an estimator Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From diasandre at gmail.com Sat Jan 1 18:11:51 2011 From: diasandre at gmail.com (ADias) Date: Sat, 1 Jan 2011 09:11:51 -0800 (PST) Subject: [R] How to make this script ask again Message-ID: <1293901911214-3170243.post@n4.nabble.com> Hi, as an example I have made this script to give the user the answer if a number is odd or even: { cat("Please, enter a number (Zero ends)") n<-scan(n=1) if(n==0)break i<-("The number is odd") p<-("The number is even") if (n%%2==0) p else i } If you run this script it will only work once, I mean, after it gives you the answer is won't ask for another number. You need to run the script all over again. How could I change it in order to make it ask me for another number without having to run the all script again? 
I have tried with the "repeat" but it doesn't work

repeat {
cat("Please, enter a number (Zero ends)")
n<-scan(n=1)
if(n==0)break
i<-("The number is odd")
p<-("The number is even")
if (n%%2==0)
p else i
}

thanks,

Regards,
ADias
--
View this message in context: http://r.789695.n4.nabble.com/How-to-make-this-script-ask-again-tp3170243p3170243.html
Sent from the R help mailing list archive at Nabble.com.

From savicky at cs.cas.cz Sat Jan 1 13:59:09 2011
From: savicky at cs.cas.cz (Petr Savicky)
Date: Sat, 1 Jan 2011 13:59:09 +0100
Subject: [R] python-like dictionary for R
In-Reply-To: <4D1E702B.1020301@fhcrc.org>
References: <4D12BC84.7090304@fhcrc.org> <4D1E702B.1020301@fhcrc.org>
Message-ID: <20110101125908.GA24800@cs.cas.cz>

On Fri, Dec 31, 2010 at 04:07:07PM -0800, Martin Morgan wrote:
[...]
> Better to use an environment (and live with reference semantics)
>
> e <- new.env(parent=emptyenv()); t0 <- Sys.time()
> for (i in seq_len(1e6)) {
>     key <- as.character(i)
>     e[[key]] <- i
>     if (0 == i %% 10000)
>         print(i / as.numeric(Sys.time() - t0))
> }

There is a related thread on R-devel. In particular

  https://stat.ethz.ch/pipermail/r-devel/2010-December/059526.html

also suggests using an environment, but with hash=TRUE. This option
seems to allow faster access. Using the code

  e <- new.env(parent=emptyenv(), hash=TRUE)
  for (k in seq_len(10)) {
      s <- 0
      ti <- system.time(
          for (i in seq_len(5000)) {
              key <- as.character(10*i + k)
              e[[key]] <- i
              s <- s + e[[key]]
          }
      )
      print(unname(ti["user.self"]))
  }

I get, for example

  [1] 0.146
  [1] 0.149
  [1] 0.151
  [1] 0.13
  [1] 0.15
  [1] 0.131
  [1] 0.131
  [1] 0.115
  [1] 0.111
  [1] 0.143

and with

  e <- new.env(parent=emptyenv())

for example

  [1] 0.247
  [1] 0.453
  [1] 0.661
  [1] 0.868
  [1] 1.079
  [1] 1.29
  [1] 1.507
  [1] 1.724
  [1] 1.947
  [1] 2.172

Petr Savicky.
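As a small illustration of the environment-as-dictionary idea discussed in this thread, here is a minimal sketch. The helper names (dict_new, dict_set, dict_get, dict_keys) are made up for this example and are not part of base R; only new.env(), assign(), get(), exists() and ls() are. Note also that the documentation of recent R versions lists hash = TRUE as the default for new.env(), so the explicit argument may matter less today than it did in the timings above.

```r
# Hypothetical helpers (not from the thread): a tiny string-keyed dictionary
# built on a hashed environment.
dict_new <- function() new.env(parent = emptyenv(), hash = TRUE)

dict_set <- function(d, key, value) {
  assign(key, value, envir = d)   # environments are modified in place
  invisible(d)
}

dict_get <- function(d, key, default = NULL) {
  if (exists(key, envir = d, inherits = FALSE)) {
    get(key, envir = d, inherits = FALSE)
  } else {
    default                       # key not present: fall back to the default
  }
}

dict_keys <- function(d) ls(d, all.names = TRUE)

d <- dict_new()
for (i in 1:3) dict_set(d, as.character(10 * i), i)

dict_get(d, "20")                  # value stored under key "20"
dict_get(d, "99", default = NA)    # missing key returns the default
dict_keys(d)                       # all keys currently stored
```

Because environments have reference semantics, dict_set() changes d without reassigning it, which is the trade-off mentioned in the quoted message.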
From rstuff.miles at gmail.com Sat Jan 1 18:41:06 2011 From: rstuff.miles at gmail.com (Andrew Miles) Date: Sat, 1 Jan 2011 12:41:06 -0500 Subject: [R] robust standard error of an estimator In-Reply-To: References: Message-ID: <67C38241-60C7-46DA-87C9-220920CF6771@gmail.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From htr at udel.edu Sat Jan 1 19:59:45 2011 From: htr at udel.edu (H. T. Reynolds) Date: Sat, 1 Jan 2011 13:59:45 -0500 (EST) Subject: [R] Retrieving Factors with Levels Ordered In-Reply-To: References: <20110101004110.HSN89098@ms1.nss.udel.edu> Message-ID: <20110101135945.HSO86570@ms1.nss.udel.edu> Thanks to one and all. I now have a better understanding of the situation. ---- Original message ---- >Date: Sat, 1 Jan 2011 11:03:59 -0500 >From: David Winsemius >Subject: Re: [R] Retrieving Factors with Levels Ordered >To: htr at UDel.Edu >Cc: r-help at r-project.org > > >On Jan 1, 2011, at 12:41 AM, H. T. Reynolds wrote: > >> Hello (and Happy New Year), >> >> When I create a factor with labels in the order I want, write the >> data as a text file, > >Why? What is the reason for this process. > >> and then retrieve them, the factor levels are no longer in the >> proper order. > >Two further ideas to those offered by Turlach and Bolker: >You can name them with leading digits that ascend in the desired order >or >I have seen described ( but not found a fully worked example despite >what I thought was an adequate search) the use of an as() method >which in this instance might apply as.factor with your own level >specification while reading with colClasses. 
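For the factor-level question quoted above, one common way to get the intended order back after a round trip through a text file is to re-impose the levels explicitly when reading. This is only a sketch under assumed data: the data frame, the column name dose and the level vector lev are hypothetical, not taken from the original post.

```r
# Hypothetical data: a factor whose levels have a meaningful,
# non-alphabetical order.
lev <- c("low", "medium", "high")
d <- data.frame(id = 1:4,
                dose = factor(c("medium", "low", "high", "low"), levels = lev))

f <- tempfile(fileext = ".txt")
write.table(d, f, sep = "\t", row.names = FALSE, quote = FALSE)

# A text file stores only the labels, so the level order is not preserved;
# read the column as character and rebuild the factor with explicit levels.
d2 <- read.table(f, header = TRUE, sep = "\t", stringsAsFactors = FALSE)
d2$dose <- factor(d2$dose, levels = lev)

levels(d2$dose)
```

If a text file is not actually required, saving the object in R's own format with save()/load() keeps the factor, including its level order, unchanged.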
> > From vseabra at uol.com.br Sat Jan 1 20:15:15 2011 From: vseabra at uol.com.br (Victor F Seabra) Date: Sat, 1 Jan 2011 17:15:15 -0200 Subject: [R] Plot symbols: How to plot (and save) a graphic symbols originating from a table Message-ID: <4d1f7d43b8c25_4bd510c20e74153@weasel15.tmail> Dear all, Please, I have a doubt regarding symbol plotting with data originating from a table. Please, see below: I have a tab delimited file called table1.txt with 4 columns: ypos animal var1 var2 5 cat gina <= lady gina \u2264 lady 7 dog bill >= tony bill \u2265 tony 9 fish dude <= bro dude \u2264 bro #I then load in the data to R: table1<-read.table("table1.txt", header=TRUE, sep="\t") #if I take a look at the table I realize that \u2264 was replaced by \\u2264 table1 #So, if i try to plot the data #instead of greater/equal or lesser/equal I get #a text string plotted "\u2265" plot(1:1,col="white",xlim=c(1,10),ylim=c(1,10),ylab="",axes=FALSE,xlab="") text(y=table1$ypos,x=2,table1$animal) text(y=table1$ypos,x=4,table1$var1) text(y=table1$ypos,x=8,table1$var2) #this can be fixed if I manually erase the extra "\" on var2 fix(table1) plot(1:1,col="white",xlim=c(1,10),ylim=c(1,10),ylab="",axes=FALSE,xlab="") text(y=table1$ypos,x=2,table1$animal) text(y=table1$ypos,x=4,table1$var1) text(y=table1$ypos,x=8,table1$var2) #However if I save the graph to a ps file, it shows the "<=" sign as "..." postscript("teste3.ps", width = 22, height = 11.5,pointsize=24,paper="special",bg="transparent") plot(1:1,col="white",xlim=c(1,10),ylim=c(1,10),ylab="",axes=FALSE,xlab="") text(y=table1$ypos,x=2,table1$animal) text(y=table1$ypos,x=4,table1$var1) text(y=table1$ypos,x=8,table1$var2) dev.off() #My solution was to plot "<" or ">" instead of "<=" and ">=" # and then plot an hifen under the "<" or the ">" sign. # This worked to fix both problems, but is hard to do and # impossible to automate (or at least very difficult) #Please, does anyone know a better approach? 
#thanks in advance Victor Faria Seabra, MD vseabra at uol.com.br From dwinsemius at comcast.net Sat Jan 1 20:27:16 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Sat, 1 Jan 2011 14:27:16 -0500 Subject: [R] How to make this script ask again In-Reply-To: <1293901911214-3170243.post@n4.nabble.com> References: <1293901911214-3170243.post@n4.nabble.com> Message-ID: On Jan 1, 2011, at 12:11 PM, ADias wrote: > > Hi, > > as an example I have made this script to give the user the answer if a > number is odd or even: > > { > cat("Please, enter a number (Zero ends)") > n<-scan(n=1) > if(n==0)break > i<-("The number is odd") > p<-("The number is even") > if (n%%2==0) > p else i > } > > If you run this script it will only work once, I mean, after it > gives you > the answer is won't ask for another number. You need to run the > script all > over again. How could I change it in order to make it ask me for > another > number without having to run the all script again? > > I have tried with the "repeat" but it doesn't work > > repeat { > cat("Please, enter a number (Zero ends)") > n<-scan(n=1) Why do you set n=1 if you want more than one value? > if(n==0)break > i<-("The number is odd") > p<-("The number is even") > if (n%%2==0) > p else i > } ?Control -- -- David > > thanks, > > Regards, > ADias > -- > View this message in context: http://r.789695.n4.nabble.com/How-to-make-this-script-ask-again-tp3170243p3170243.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
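For the loop question in this thread, a minimal non-interactive sketch may help. Values are auto-printed only at top level, so inside repeat (or any other loop) a result needs an explicit print() or cat() to be shown. The function name parity_message and the inputs vector, which stands in for repeated calls to scan(n=1), are assumptions made for illustration.

```r
# Hypothetical helper: classify one number as odd or even.
parity_message <- function(n) {
  if (n %% 2 == 0) "The number is even" else "The number is odd"
}

# Stand-in for interactive input; in an interactive session each value
# could come from scan(n = 1) instead.
inputs <- c(1, 2, 3, 0)

out <- character(0)
idx <- 0
repeat {
  idx <- idx + 1
  n <- inputs[idx]
  if (n == 0) break        # zero ends the loop
  msg <- parity_message(n)
  print(msg)               # explicit print: nothing is auto-printed in a loop
  out <- c(out, msg)       # also keep the messages for later use
}
```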
David Winsemius, MD West Hartford, CT From jholtman at gmail.com Sat Jan 1 20:27:58 2011 From: jholtman at gmail.com (jim holtman) Date: Sat, 1 Jan 2011 14:27:58 -0500 Subject: [R] How to make this script ask again In-Reply-To: <1293901911214-3170243.post@n4.nabble.com> References: <1293901911214-3170243.post@n4.nabble.com> Message-ID: your example works fine for me: > repeat { + cat("Please, enter a number (Zero ends)") + n<-scan(n=1) + if(n==0)break + i<-("The number is odd") + p<-("The number is even") + if (n%%2==0) + p else i + } Please, enter a number (Zero ends)1: 1 Read 1 item Please, enter a number (Zero ends)1: 2 Read 1 item Please, enter a number (Zero ends)1: 3 Read 1 item Please, enter a number (Zero ends)1: 0 Read 1 item > now if you want the answer, you have to use print: > repeat { + cat("Please, enter a number (Zero ends)") + n<-scan(n=1) + if(n==0)break + i<-("The number is odd") + p<-("The number is even") + if (n%%2==0) + print(p) else print(i) + } Please, enter a number (Zero ends)1: 1 Read 1 item [1] "The number is odd" Please, enter a number (Zero ends)1: 2 Read 1 item [1] "The number is even" Please, enter a number (Zero ends)1: 3 Read 1 item [1] "The number is odd" Please, enter a number (Zero ends)1: 4 Read 1 item [1] "The number is even" Please, enter a number (Zero ends)1: 5 Read 1 item [1] "The number is odd" Please, enter a number (Zero ends)1: 0 Read 1 item > On Sat, Jan 1, 2011 at 12:11 PM, ADias wrote: > > Hi, > > as an example I have made this script to give the user the answer if a > number is odd or even: > > ?{ > cat("Please, enter a number (Zero ends)") > n<-scan(n=1) > if(n==0)break > i<-("The number is odd") > p<-("The number is even") > if (n%%2==0) > p else i > } > > If you run this script it will only work once, I mean, after it gives you > the answer is won't ask for another number. You need to run the script all > over again. 
How could I change it in order to make it ask me for another > number without having to run the all script again? > > I have tried with the "repeat" but it doesn't work > > ?repeat { > cat("Please, enter a number (Zero ends)") > n<-scan(n=1) > if(n==0)break > i<-("The number is odd") > p<-("The number is even") > if (n%%2==0) > p else i > } > > thanks, > > Regards, > ADias > -- > View this message in context: http://r.789695.n4.nabble.com/How-to-make-this-script-ask-again-tp3170243p3170243.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? From vseabra at uol.com.br Sat Jan 1 20:40:58 2011 From: vseabra at uol.com.br (Victor F Seabra) Date: Sat, 1 Jan 2011 17:40:58 -0200 Subject: [R] Plot symbols: How to plot (and save) a graphic symbols originating from a table In-Reply-To: <4d1f7d43b8c25_4bd510c20e74153@weasel15.tmail> References: <4d1f7d43b8c25_4bd510c20e74153@weasel15.tmail> Message-ID: <4d1f834a26a5e_75b110c20e7414b@weasel15.tmail> sorry, I guess the table was a little confusing, I've quoted each cell to facilitate reading and attached a copy of the database file. ypos animal var1 var2 "5" "cat" "gina <= lady" "gina \u2264 lady" "7" "dog" "bill >= tony" "bill \u2265 tony" "9" "fish" "dude <= bro" "dude \u2264 bro" By the way, I'm running R on windows and didn't try any of this on Linux. thanks in advance, Victor ? ? 
Victor Faria Seabra Email: vseabra at uol.com.br _________________________________________________________________ Em 01/01/2011 17:15, Victor F Seabra < vseabra at uol.com.br > escreveu: Dear all, Please, I have a doubt regarding symbol plotting with data originating from a table. Please, see below: I have a tab delimited file called table1.txt with 4 columns: ypos animal var1 var2 5 cat gina <= lady gina \u2264 lady 7 dog bill >= tony bill \u2265 tony 9 fish dude <= bro dude \u2264 bro #I then load in the data to R: table1<-read.table("table1.txt", header=TRUE, sep="\t") #if I take a look at the table I realize that \u2264 was replaced by \\u2264 table1 #So, if i try to plot the data #instead of greater/equal or lesser/equal I get #a text string plotted "\u2265" plot(1:1,col="white",xlim=c(1,10),ylim=c(1,10),ylab="",axes=FALSE,xlab="") text(y=table1$ypos,x=2,table1$animal) text(y=table1$ypos,x=4,table1$var1) text(y=table1$ypos,x=8,table1$var2) #this can be fixed if I manually erase the extra "\" on var2 fix(table1) plot(1:1,col="white",xlim=c(1,10),ylim=c(1,10),ylab="",axes=FALSE,xlab="") text(y=table1$ypos,x=2,table1$animal) text(y=table1$ypos,x=4,table1$var1) text(y=table1$ypos,x=8,table1$var2) #However if I save the graph to a ps file, it shows the "<=" sign as "..." postscript("teste3.ps", width = 22, height = 11.5,pointsize=24,paper="special",bg="transparent") plot(1:1,col="white",xlim=c(1,10),ylim=c(1,10),ylab="",axes=FALSE,xlab="") text(y=table1$ypos,x=2,table1$animal) text(y=table1$ypos,x=4,table1$var1) text(y=table1$ypos,x=8,table1$var2) dev.off() #My solution was to plot "<" or ">" instead of "<=" and ">=" # and then plot an hifen under the "<" or the ">" sign. # This worked to fix both problems, but is hard to do and # impossible to automate (or at least very difficult) #Please, does anyone know a better approach? 
#thanks in advance Victor Faria Seabra, MD vseabra at uol.com.br ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: table1.txt URL: From f.harrell at vanderbilt.edu Sat Jan 1 20:48:21 2011 From: f.harrell at vanderbilt.edu (Frank Harrell) Date: Sat, 1 Jan 2011 11:48:21 -0800 (PST) Subject: [R] robust standard error of an estimator In-Reply-To: <67C38241-60C7-46DA-87C9-220920CF6771@gmail.com> References: <67C38241-60C7-46DA-87C9-220920CF6771@gmail.com> Message-ID: <1293911301358-3170363.post@n4.nabble.com> Is the (non-clustered) sandwich estimator really robust to autocorrelation? Thanks Frank ----- Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/robust-standard-error-of-an-estimator-tp3170257p3170363.html Sent from the R help mailing list archive at Nabble.com. From vseabra at uol.com.br Sat Jan 1 21:00:06 2011 From: vseabra at uol.com.br (Victor F Seabra) Date: Sat, 1 Jan 2011 18:00:06 -0200 Subject: [R] Plot symbols: How to plot (and save) a graphic with symbols originating from a table In-Reply-To: <4d1f8744c6a07_147010c20e74181@weasel15.tmail> References: <4d1f7d43b8c25_4bd510c20e74153@weasel15.tmail> <4d1f834a26a5e_75b110c20e7414b@weasel15.tmail> <4d1f8744c6a07_147010c20e74181@weasel15.tmail> Message-ID: <4d1f87c61d0ce_18e310c20e74197@weasel15.tmail> Dear all, Please, I have a doubt regarding symbols plotting when the data originates from a table (i.e. is not manually fed into the "text" function) Please, see below: I have a tab delimited file called table1.txt with 4 columns. 
(I wasn't sure on how to attach the table to this post, so I included the data below) ypos animal var1 var2 5 cat gina <= lady gina \u2264 lady 7 dog bill >= tony bill \u2265 tony 9 fish dude <= bro dude \u2264 bro # I load in the data: table1<-read.table("table1.txt", header=TRUE, sep="\t") # # If I take a look at the table table1 # I realize that \u2264 was replaced by \\u2264 # # So, when I plot the data plot(1:1,col="white",xlim=c(1,10),ylim=c(1,10),ylab="",axes=FALSE,xlab="") text(y=table1$ypos,x=2,table1$animal) text(y=table1$ypos,x=4,table1$var1) text(y=table1$ypos,x=8,table1$var2) # # Instead of "<=" or ">=", the text string "\u2265" is plotted # This first problem can be fixed by manually erasing the extra "\" on var2 fix(table1) # # However, while saving the graph to a ps file, the "<=" sign is replaced by "..." postscript("graph1.ps", width = 22, height = 11.5,pointsize=24,paper="special",bg="transparent") plot(1:1,col="white",xlim=c(1,10),ylim=c(1,10),ylab="",axes=FALSE,xlab="") text(y=table1$ypos,x=2,table1$animal) text(y=table1$ypos,x=4,table1$var1) text(y=table1$ypos,x=8,table1$var2) dev.off() # # # A solution would be to plot "<" or ">" instead of "<=" and ">=" signs # and then plot an hifen under the "<" or the ">" sign. # This approach fixes both problems, but is hard to do and # very difficult to automate # #Please, does anyone know a better way? #thanks in advance # #Victor Faria Seabra, MD #vseabra@ uol.com.br From dwinsemius at comcast.net Sat Jan 1 20:59:16 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Sat, 1 Jan 2011 14:59:16 -0500 Subject: [R] Plot symbols: How to plot (and save) a graphic symbols originating from a table In-Reply-To: <4d1f7d43b8c25_4bd510c20e74153@weasel15.tmail> References: <4d1f7d43b8c25_4bd510c20e74153@weasel15.tmail> Message-ID: On Jan 1, 2011, at 2:15 PM, Victor F Seabra wrote: > Dear all, > > Please, I have a doubt regarding symbol plotting > with data originating from a table. 
I would say it has very little to do with the data structure and everything to do with the encodings, font conventions of console output, and the defaults for graphical devices. (I'm using a Mac in an English locale, and you have not provided any of the requested information about your setup.) > > Please, see below: > > I have a tab delimited file called table1.txt with 4 columns: > > ypos animal var1 var2 > 5 cat gina <= lady gina \u2264 lady > 7 dog bill >= tony bill \u2265 tony > 9 fish dude <= bro dude \u2264 bro > > #I then load in the data to R: > table1<-read.table("table1.txt", header=TRUE, sep="\t") > > #if I take a look at the table I realize that \u2264 was replaced by > \\u2264 > table1 > No. You are more likely seeing how R presents what it did with \u2264 with its default method for printing to the console. I see: > table1 ypos animal var1 var2 1 5 cat gina <= lady gina ? lady 2 7 dog bill >= tony bill ? tony 3 9 fish dude <= bro dude ? bro Subject, of course, to how emailers handle the \u2264 character. > str(table1) 'data.frame': 3 obs. of 4 variables: $ ypos : num 5 7 9 $ animal: Factor w/ 3 levels "cat","dog","fish": 1 2 3 $ var1 : Factor w/ 3 levels "bill >= tony",..: 3 1 2 $ var2 : Factor w/ 3 levels "bill ? tony",..: 3 1 2 > #So, if i try to plot the data > #instead of greater/equal or lesser/equal I get > #a text string plotted "\u2265" > plot > (1:1,col="white",xlim=c(1,10),ylim=c(1,10),ylab="",axes=FALSE,xlab="") > text(y=table1$ypos,x=2,table1$animal) > text(y=table1$ypos,x=4,table1$var1) > text(y=table1$ypos,x=8,table1$var2) > > #this can be fixed if I manually erase the extra "\" on var2 > fix(table1) I'm confused. You are starting with a factor variable whose levels have some higher order numbers in the character vector, and then you didn't assign the results of the fix() operation to an R object. Why should that do _anything_? 
> plot > (1:1,col="white",xlim=c(1,10),ylim=c(1,10),ylab="",axes=FALSE,xlab="") > text(y=table1$ypos,x=2,table1$animal) > text(y=table1$ypos,x=4,table1$var1) > text(y=table1$ypos,x=8,table1$var2) > > #However if I save the graph to a ps file, it shows the "<=" sign as > "..." > postscript("teste3.ps", width = 22, height = > 11.5,pointsize=24,paper="special",bg="transparent") > plot > (1:1,col="white",xlim=c(1,10),ylim=c(1,10),ylab="",axes=FALSE,xlab="") > text(y=table1$ypos,x=2,table1$animal) > text(y=table1$ypos,x=4,table1$var1) > text(y=table1$ypos,x=8,table1$var2) > dev.off() > That must be the glyph for that number in the default font for your pdf device (as it is for mine once I change the width settings so it can be seen after conversion to pdf.) ?Encoding # might be a useful place to start, followed by... ?Devices ?ps.options > > #My solution was to plot "<" or ">" instead of "<=" and ">=" > # and then plot an hifen under the "<" or the ">" sign. > # This worked to fix both problems, but is hard to do and > # impossible to automate (or at least very difficult) > > #Please, does anyone know a better approach? To accomplish what end? You have not described what you are trying to actually do. Is this text supposed to be plotted inside the plotting area or are you going to be using it as axis labels? There is a variety of approaches (especially the plotmath expression option) that can be used depending on the ultimate objective. 
?plotmath

Compare:

plot(NULL, xlim=c(0,1), ylim=c(0,1))
text(0.5,0.5, as.expression(as.character(table1$var2[1])) )
text(0.5,0.6, label=expression(gina <= lady) )

> #thanks in advance
>
> Victor Faria Seabra, MD
> vseabra at uol.com.br

--
David Winsemius, MD
West Hartford, CT

From msharp at sfbr.org Sat Jan 1 18:56:21 2011
From: msharp at sfbr.org (Mark Sharp)
Date: Sat, 1 Jan 2011 11:56:21 -0600
Subject: [R] forming function arguments from strings
Message-ID: <5FB1A884-AFCD-4DAF-B171-CBC9249A49D6@sfbr.org>

I am wanting to change arguments to a function dynamically. For example, in
making a call to qplot, I want to dynamically define all of the arguments so
that I can create the plot dependent on user input. I have played with eval()
a bit, but have had no success.

Mark Sharp

From ligges at statistik.tu-dortmund.de Sat Jan 1 21:05:05 2011
From: ligges at statistik.tu-dortmund.de (Uwe Ligges)
Date: Sat, 01 Jan 2011 21:05:05 +0100
Subject: [R] forming function arguments from strings
In-Reply-To: <5FB1A884-AFCD-4DAF-B171-CBC9249A49D6@sfbr.org>
References: <5FB1A884-AFCD-4DAF-B171-CBC9249A49D6@sfbr.org>
Message-ID: <4D1F88F1.40806@statistik.tu-dortmund.de>

On 01.01.2011 18:56, Mark Sharp wrote:
> I am wanting to change arguments to a function dynamically. For example, in
> making a call to qplot, I want to dynamically define all of the arguments so
> that I can create the plot dependent on user input. I have played with eval()
> a bit, but have had no success.

If passing the arguments is not sufficient, then you may want to take a look
at ?do.call

Uwe Ligges

>
> Mark Sharp
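The do.call() suggestion above can be sketched as follows. The example uses the base function seq() rather than qplot so that it runs without extra packages; the function name and argument values are hypothetical stand-ins for whatever would be assembled from user input.

```r
# Function name and arguments assembled at run time (hypothetical values).
fun_name <- "seq"
args <- list(from = 1, to = 10)
args[["by"]] <- 3                  # add or change an argument dynamically

# do.call() accepts a function or its name, plus a (named) list of arguments;
# this call is equivalent to seq(from = 1, to = 10, by = 3).
res <- do.call(fun_name, args)
res
```

The same pattern should carry over to plotting functions, although functions that capture some arguments unevaluated (as qplot does for its aesthetics) may need extra care when the arguments start out as strings.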
From baptiste.auguie at googlemail.com Sat Jan 1 21:10:33 2011 From: baptiste.auguie at googlemail.com (baptiste auguie) Date: Sat, 1 Jan 2011 21:10:33 +0100 Subject: [R] forming function arguments from strings In-Reply-To: <5FB1A884-AFCD-4DAF-B171-CBC9249A49D6@sfbr.org> References: <5FB1A884-AFCD-4DAF-B171-CBC9249A49D6@sfbr.org> Message-ID: See aes_string(), perhaps. baptiste On 1 January 2011 18:56, Mark Sharp wrote: > I am wanting to change arguments to a function dynamically. For example, in making a call to qplot, I want to dynamically define all of the arguments so that I can create the plot dependent on user input. I have played with eval() a bit, but have had no success. > > Mark Sharp > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From vseabra at uol.com.br Sat Jan 1 21:26:04 2011 From: vseabra at uol.com.br (Victor F Seabra) Date: Sat, 1 Jan 2011 18:26:04 -0200 Subject: [R] Plot symbols: How to plot (and save) a graphic with symbols originating from a table In-Reply-To: <4d1f87c61d0ce_18e310c20e74197@weasel15.tmail> References: <4d1f7d43b8c25_4bd510c20e74153@weasel15.tmail> <4d1f834a26a5e_75b110c20e7414b@weasel15.tmail> <4d1f8744c6a07_147010c20e74181@weasel15.tmail> <4d1f87c61d0ce_18e310c20e74197@weasel15.tmail> Message-ID: <4d1f8ddc64e7e_460b10c20e74292@weasel15.tmail> Thanks for your quick response, I'm using Vista and everything in my system is in English It's not just the way I view it, it's really \ \ u2264 since I can delete one of the \ and then it becomes \? u? 
2264, which is further replaced by the fix(table1) statement, was just to manually go from cell to cell erasing the extra \ in table1 As for the code you sent me: plot(NULL, xlim=c(0,1), ylim=c(0,1)) text(0.5,0.5, as.expression(as.character(table1$var2[1])) ) text(0.5,0.6, label=expression(gina <= lady) ) the first statement plots gina \ u 2264 lady and the second statement plots gina <= lady (with the right symbol character, but it's not retrieving the data from the table as I need) this text is supposed to be plotted inside the plot area (but sometimes I might have 3 graphics with 20 symbols in each that need to be plotted) I will use his to label sub-group names in forest plots ____________________________________________ > Dear all, > > Please, I have a doubt regarding symbol plotting > with data originating from a table. I would say it has very little to do with the data structure and everything to do with the encodings, font conventions of console output, and the defaults for graphical devices. (I'm using a Mac in an English locale, and you have not provided any of the requested information about your setup.) > > Please, see below: > > I have a tab delimited file called table1.txt with 4 columns: > > ypos animal var1 var2 > 5 cat gina <= lady gina \u2264 lady > 7 dog bill >= tony bill \u2265 tony > 9 fish dude <= bro dude \u2264 bro > > #I then load in the data to R: > table1<-read.table("table1.txt", header=TRUE, sep="\t") > > #if I take a look at the table I realize that \u2264 was replaced by > \\u2264 > table1 > No. You are more likely seeing how R presents what it did with \u2264 with its default method for printing to the console. I see: > table1 ypos animal var1 var2 1 5 cat gina <= lady gina ? lady 2 7 dog bill >= tony bill ? tony 3 9 fish dude <= bro dude ? bro Subject, of course, to how emailers handle the \u2264 character. > str(table1) 'data.frame': 3 obs. 
of 4 variables: $ ypos : num 5 7 9 $ animal: Factor w/ 3 levels "cat","dog","fish": 1 2 3 $ var1 : Factor w/ 3 levels "bill >= tony",..: 3 1 2 $ var2 : Factor w/ 3 levels "bill ? tony",..: 3 1 2 > #So, if i try to plot the data > #instead of greater/equal or lesser/equal I get > #a text string plotted "\u2265" > plot > (1:1,col="white",xlim=c(1,10),ylim=c(1,10),ylab="",axes=FALSE,xlab="") > text(y=table1$ypos,x=2,table1$animal) > text(y=table1$ypos,x=4,table1$var1) > text(y=table1$ypos,x=8,table1$var2) > > #this can be fixed if I manually erase the extra "\" on var2 > fix(table1) I'm confused. You are starting with a factor variable whose levels have some higher order numbers in the character vector, and then you didn't assign the results of the fix() operation to an R object. Why should that do _anything_? > plot > (1:1,col="white",xlim=c(1,10),ylim=c(1,10),ylab="",axes=FALSE,xlab="") > text(y=table1$ypos,x=2,table1$animal) > text(y=table1$ypos,x=4,table1$var1) > text(y=table1$ypos,x=8,table1$var2) > > #However if I save the graph to a ps file, it shows the "<=" sign as > "..." > postscript("teste3.ps", width = 22, height = > 11.5,pointsize=24,paper="special",bg="transparent") > plot > (1:1,col="white",xlim=c(1,10),ylim=c(1,10),ylab="",axes=FALSE,xlab="") > text(y=table1$ypos,x=2,table1$animal) > text(y=table1$ypos,x=4,table1$var1) > text(y=table1$ypos,x=8,table1$var2) > dev.off() > That must be the glyph for that number in the default font for your pdf device (as it is for mine once I change the width settings so it can be seen after conversion to pdf.) ?Encoding # might be a useful place to start, followed by... ?Devices ?ps.options > > #My solution was to plot "<" or ">" instead of "<=" and ">=" > # and then plot an hifen under the "<" or the ">" sign. > # This worked to fix both problems, but is hard to do and > # impossible to automate (or at least very difficult) > > #Please, does anyone know a better approach? To accomplish what end? 
You have not described what you are trying to actually do. Is this text supposed to be plotted inside the plotting area or are you going to be using it as axis labels? There is a variety of approaches (especially the plotmath expression option) that can be used depending on the ultimate objective. ?plotmath Compare: plot(NULL, xlim=c(0,1), ylim=c(0,1)) text(0.5,0.5, as.expression(as.character(table1$var2[1])) ) text(0.5,0.6, label=expression(gina <= lady) ) > #thanks in advance > > Victor Faria Seabra, MD > vseabra at uol.com.br -- David Winsemius, MD West Hartford, CT Victor Faria Seabra Email: vseabra at uol.com.br ____________________________________________ Em 01/01/2011 18:00, Victor F Seabra < vseabra at uol.com.br > escreveu: Dear all, Please, I have a doubt regarding symbols plotting when the data originates from a table (i.e. is not manually fed into the "text" function) Please, see below: I have a tab delimited file called table1.txt with 4 columns. (I wasn't sure on how to attach the table to this post, so I included the data below) ypos animal var1 var2 5 cat gina <= lady gina \u2264 lady 7 dog bill >= tony bill \u2265 tony 9 fish dude <= bro dude \u2264 bro # I load in the data: table1<-read.table("table1.txt", header=TRUE, sep="\t") # # If I take a look at the table table1 # I realize that \u2264 was replaced by \\u2264 # # So, when I plot the data plot(1:1,col="white",xlim=c(1,10),ylim=c(1,10),ylab="",axes=FALSE,xlab="") text(y=table1$ypos,x=2,table1$animal) text(y=table1$ypos,x=4,table1$var1) text(y=table1$ypos,x=8,table1$var2) # # Instead of "<=" or ">=", the text string "\u2265" is plotted # This first problem can be fixed by manually erasing the extra "\" on var2 fix(table1) # # However, while saving the graph to a ps file, the "<=" sign is replaced by "..." 
postscript("graph1.ps", width = 22, height = 11.5,pointsize=24,paper="special",bg="transparent") plot(1:1,col="white",xlim=c(1,10),ylim=c(1,10),ylab="",axes=FALSE,xlab="") text(y=table1$ypos,x=2,table1$animal) text(y=table1$ypos,x=4,table1$var1) text(y=table1$ypos,x=8,table1$var2) dev.off() # # # A solution would be to plot "<" or ">" instead of "<=" and ">=" signs # and then plot an hifen under the "<" or the ">" sign. # This approach fixes both problems, but is hard to do and # very difficult to automate # #Please, does anyone know a better way? #thanks in advance # #Victor Faria Seabra, MD #vseabra@ uol.com.br ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. From vseabra at uol.com.br Sat Jan 1 21:39:27 2011 From: vseabra at uol.com.br (Victor F Seabra) Date: Sat, 1 Jan 2011 18:39:27 -0200 Subject: [R] Plot symbols: How to plot (and save) a graphic with symbols originating from a table In-Reply-To: <4d1f8ddc64e7e_460b10c20e74292@weasel15.tmail> References: <4d1f7d43b8c25_4bd510c20e74153@weasel15.tmail> <4d1f834a26a5e_75b110c20e7414b@weasel15.tmail> <4d1f8744c6a07_147010c20e74181@weasel15.tmail> <4d1f87c61d0ce_18e310c20e74197@weasel15.tmail> <4d1f8ddc64e7e_460b10c20e74292@weasel15.tmail> Message-ID: <4d1f90ff674cb_5d3e10c20e74165@weasel15.tmail> Please, you need to read.table to R as my problems start there If you want you can download the txt file at the link below https://docs.google.com/leaf?id=0BweZgDxYn9BkNmY1NThmMDEtOWUxZS00MzE0LTk0NTQ tYzdlODMxODgxYzJh&hl=en thanks, Victor From bkidd at stanford.edu Sat Jan 1 21:42:50 2011 From: bkidd at stanford.edu (Brian Kidd) Date: Sat, 1 Jan 2011 12:42:50 -0800 Subject: [R] exit and save data from the terminal where command is frozen Message-ID: <6334F514-9E3B-4987-B161-161BC8C7B8CA@stanford.edu> 
Hello, Is there any way to stop an R execution in a terminal where the previous command appears to have caused the code to hang? I basically need to get back to the prompt. Note that control-c doesn't restore the prompt and running top -o cpu from another terminal shows R at >99% CPU usage. While closing the terminal would work, I'd really like to save the data and the history file if I can. Is the temporary history list and/or variables from the current session located somewhere on my machine that I could get to and save before closing the terminal? Or is this a lost cause? Thanks, -Brian From murdoch.duncan at gmail.com Sat Jan 1 22:00:52 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Sat, 01 Jan 2011 16:00:52 -0500 Subject: [R] exit and save data from the terminal where command is frozen In-Reply-To: <6334F514-9E3B-4987-B161-161BC8C7B8CA@stanford.edu> References: <6334F514-9E3B-4987-B161-161BC8C7B8CA@stanford.edu> Message-ID: <4D1F9604.7080007@gmail.com> On 01/01/2011 3:42 PM, Brian Kidd wrote: > Hello, > > Is there any way to stop an R execution in a terminal where the previous command appears to have caused the code to hang? I basically need to get back to the prompt. > > Note that control-c doesn't restore the prompt and running top -o cpu from another terminal shows R at>99% CPU usage. While closing the terminal would work, I'd really like to save the data and the history file if I can. Is the temporary history list and/or variables from the current session located somewhere on my machine that I could get to and save before closing the terminal? Or is this a lost cause? There's probably no way if Ctrl-C doesn't work. You might be able to attach gdb to the process and see what's going on in there, but that's not too likely to be successful. Next time it would be a good idea to save results on a regular basis so that you can restart from a more recent save point. 
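A minimal sketch of the kind of periodic saving suggested above, assuming the long-running work is a loop. The names slow_step, n_total and checkpoint_file are hypothetical placeholders and do not come from the original post:

```r
# Hypothetical stand-in for an expensive computation (assumption, not from the post).
slow_step <- function(i) {
  i^2
}

# Placeholder location for the checkpoint; any writable path would do.
checkpoint_file <- file.path(tempdir(), "results_checkpoint.RData")

# Resume from an earlier checkpoint if one exists, otherwise start fresh.
if (file.exists(checkpoint_file)) {
  load(checkpoint_file)            # restores 'results' and 'last_done'
} else {
  results <- numeric(0)
  last_done <- 0
}

n_total <- 20
if (last_done < n_total) {
  for (i in (last_done + 1):n_total) {
    results[i] <- slow_step(i)
    last_done <- i
    # Write a checkpoint every 5 iterations so a hung session loses little work.
    if (i %% 5 == 0) {
      save(results, last_done, file = checkpoint_file)
    }
  }
}
```

If the session has to be killed, rerunning the same script picks up after the last saved iteration instead of starting over.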
Duncan Murdoch From adamlcarr at yahoo.com Sat Jan 1 22:07:41 2011 From: adamlcarr at yahoo.com (Adam Carr) Date: Sat, 1 Jan 2011 13:07:41 -0800 (PST) Subject: [R] pdf() Export Problem: Circles Interpreted as Fonts from ggplot2 Graphics In-Reply-To: References: <590515.14862.qm@web35304.mail.mud.yahoo.com> Message-ID: <5108.57600.qm@web35308.mail.mud.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From adamlcarr at yahoo.com Sat Jan 1 22:11:28 2011 From: adamlcarr at yahoo.com (Adam Carr) Date: Sat, 1 Jan 2011 13:11:28 -0800 (PST) Subject: [R] pdf() Export Problem: Circles Interpreted as Fonts from ggplot2 Graphics In-Reply-To: <9FC6BE25-4A65-41E3-967B-4BBFDAD6D5E0@comcast.net> References: <590515.14862.qm@web35304.mail.mud.yahoo.com> <2A98F8AC-4984-4747-B418-F222EBBEB4E1@comcast.net> <985889.54642.qm@web35307.mail.mud.yahoo.com> <9FC6BE25-4A65-41E3-967B-4BBFDAD6D5E0@comcast.net> Message-ID: <940454.48210.qm@web35304.mail.mud.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From vseabra at uol.com.br Sat Jan 1 23:05:12 2011 From: vseabra at uol.com.br (Victor F Seabra) Date: Sat, 1 Jan 2011 20:05:12 -0200 Subject: [R] problem with read.table Message-ID: <4d1fa518b02e6_25d10c20e74139@weasel15.tmail> Dear all, Please, when I use the command: table1<-read.table("table1.txt", header=TRUE, sep="\t") cells that contain \ u 2264 (corresponding to <= sign) get imported as \ \ u2264. which causes the need to manually edit each cell using the fix command. Is there a way to fix that? 
thanks,
Victor

From ggrothendieck at gmail.com Sat Jan 1 23:08:35 2011
From: ggrothendieck at gmail.com (Gabor Grothendieck)
Date: Sat, 1 Jan 2011 17:08:35 -0500
Subject: [R] Plot symbols: How to plot (and save) a graphic with symbols originating from a table
In-Reply-To: <4d1f87c61d0ce_18e310c20e74197@weasel15.tmail>
References: <4d1f7d43b8c25_4bd510c20e74153@weasel15.tmail> <4d1f834a26a5e_75b110c20e7414b@weasel15.tmail> <4d1f8744c6a07_147010c20e74181@weasel15.tmail> <4d1f87c61d0ce_18e310c20e74197@weasel15.tmail>
Message-ID:

On Sat, Jan 1, 2011 at 3:00 PM, Victor F Seabra wrote:
>
> Dear all,
> Please, I have a doubt regarding symbols plotting
> when the data originates from a table
> (i.e. is not manually fed into the "text" function)
> Please, see below:
> I have a tab delimited file called table1.txt with 4 columns.
> (I wasn't sure on how to attach the table to this post, so I included the
> data below)
> ypos animal var1 var2
> 5 cat gina <= lady gina \u2264 lady
> 7 dog bill >= tony bill \u2265 tony
> 9 fish dude <= bro dude \u2264 bro
> # I load in the data:
> table1<-read.table("table1.txt", header=TRUE, sep="\t")
> #
> # If I take a look at the table
> table1
> # I realize that \u2264 was replaced by \\u2264
> #
> # So, when I plot the data
> plot(1:1,col="white",xlim=c(1,10),ylim=c(1,10),ylab="",axes=FALSE,xlab="")
> text(y=table1$ypos,x=2,table1$animal)
> text(y=table1$ypos,x=4,table1$var1)
> text(y=table1$ypos,x=8,table1$var2)
> #
> # Instead of "<=" or ">=", the text string "\u2265" is plotted
> # This first problem can be fixed by manually erasing the extra "\" on var2
> fix(table1)
> #
> # However, while saving the graph to a ps file, the "<=" sign is replaced by
> "..."
> postscript("graph1.ps", width = 22, height =
> 11.5,pointsize=24,paper="special",bg="transparent")
> plot(1:1,col="white",xlim=c(1,10),ylim=c(1,10),ylab="",axes=FALSE,xlab="")
> text(y=table1$ypos,x=2,table1$animal)
> text(y=table1$ypos,x=4,table1$var1)
> text(y=table1$ypos,x=8,table1$var2)
> dev.off()
> #
> #
> # A solution would be to plot "<" or ">" instead of "<=" and ">=" signs
> # and then plot an hifen under the "<" or the ">" sign.
> # This approach fixes both problems, but is hard to do and
> # very difficult to automate
> #
> #Please, does anyone know a better way?
> #thanks in advance
> #
> #Victor Faria Seabra, MD
> #vseabra@ uol.com.br

Try this which illustrates the idea without all the distraction of the inessential elements:

plot(0, main = eval(parse(text = sprintf("'%s'", "a\\u2264b"))))

That is, within the context of your post the last line would be replaced with:

text(y=table1$ypos, x = 8, eval(parse(text = sprintf("'%s'", table1$var2))))

Whether this works may depend on the OS and device that you use -- it does work on my Windows Vista system.

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

From vseabra at uol.com.br Sat Jan 1 23:12:45 2011
From: vseabra at uol.com.br (Victor F Seabra)
Date: Sat, 1 Jan 2011 20:12:45 -0200
Subject: [R] problem with postscript command
Message-ID: <4d1fa6ddc004b_11c510c20e741c0@weasel15.tmail>

please, when i use the command postscript, the symbol "<="(less than or equal to) is replaced by "..." (ellipsis)
how can I fix that?
postscript("plot1.ps", width = 22, height = 11.5,pointsize=24,paper="special",bg="transparent")
plot(NULL,xlim=c(1,10),ylim=c(1,10))
text(5,5,"\u2264")
dev.off()

I'm using Windows Vista, my system is in English, R v2.8.1

thanks,
Victor

From bbolker at gmail.com Sun Jan 2 00:04:22 2011
From: bbolker at gmail.com (Ben Bolker)
Date: Sat, 1 Jan 2011 23:04:22 +0000 (UTC)
Subject: [R] problem with postscript command
References: <4d1fa6ddc004b_11c510c20e741c0@weasel15.tmail>
Message-ID:

Victor F Seabra <vseabra at uol.com.br> writes:

> please, when i use the command postscript, the symbol "<="(less than or
> equal to) is replaced by "..." (ellipsis)
> how can I fix that?
> postscript("plot1.ps", width = 22, height =
> 11.5,pointsize=24,paper="special",bg="transparent")
> plot(NULL,xlim=c(1,10),ylim=c(1,10))
> text(5,5,"\u2264")
> dev.off()
> I'm using windows vista, my system is in English, R v2.8.1
> thanks, Victor

This kind of problem is likely to be a giant headache to solve, and if it is solvable at all, the solution will likely be very specific to your particular system (encoding, OS version, etc.). You will probably have better luck telling us what your ultimate goal is.

One way to accomplish the plot attempted above:

postscript("plot1.ps", width = 22, height = 11.5,pointsize=24,paper="special", bg="transparent")
plot(NULL,xlim=c(1,10),ylim=c(1,10))
## see ?plotmath
## (need 'empty' expressions before and after <=)
text(5,5,expression({}<={}))
dev.off()

By the way, PostScript might not support transparency ... ?

For your previous question (why does read.table not work the way you expected when importing \u2264?): reread the detailed answers you got previously, which distinguish between how the character is stored within R and how it is printed on the console.
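A small sketch of that stored-versus-printed distinction, assuming the file held the six literal characters \u2264 rather than an actual less-than-or-equal sign. The object names stored, real, column and converted are illustrative only:

```r
# 'stored' mimics what read.table() hands back when the file contains the
# literal characters backslash, u, 2, 2, 6, 4 (an assumption about the file).
stored <- "gina \\u2264 lady"

nchar(stored)        # 16: the backslash is ONE stored character
print(stored)        # print() escapes the backslash, so two are displayed
cat(stored, "\n")    # cat() writes the raw characters: a single backslash

# A genuine less-than-or-equal sign is one character, not six.
real <- "gina \u2264 lady"
nchar(real)          # 11
identical(stored, real)   # FALSE

# Converting a whole column: evaluate each element on its own, because
# eval() applied to several parsed expressions returns only the last value.
# (This simple approach assumes the strings contain no single quotes.)
column <- c("gina \\u2264 lady", "bill \\u2265 tony")
converted <- vapply(
  column,
  function(s) eval(parse(text = sprintf("'%s'", s))),
  character(1),
  USE.NAMES = FALSE
)
nchar(converted)     # 11 11
```

So the doubled backslash seen at the console is only how print() displays a single stored backslash. Whether the converted symbol then renders on a given graphics device is a separate question of fonts and encoding.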
good luck Ben Bolker From gajahorvat at hotmail.com Sat Jan 1 22:25:46 2011 From: gajahorvat at hotmail.com (gaja) Date: Sat, 1 Jan 2011 13:25:46 -0800 (PST) Subject: [R] Problem with uploading library Message-ID: <1293917146033-3170455.post@n4.nabble.com> Hello. I'm new on that forum, so I would use some help... I'm programing with R program, have some exercise to do. So here it goes. 1. I have to upload two libraryes "sp" and "maptools" 2. This command is troubling me : towns <- readShapeSpatial("OB/OB.shp") 3. When I'm pressing f5, there is an error called: Error: could not find function "readShapeSpatial" So, I would use some help. Thanx in advance. Gaja -- View this message in context: http://r.789695.n4.nabble.com/Problem-with-uploading-library-tp3170455p3170455.html Sent from the R help mailing list archive at Nabble.com. From mrkdennehy at gmail.com Sun Jan 2 01:43:02 2011 From: mrkdennehy at gmail.com (markdx) Date: Sat, 1 Jan 2011 16:43:02 -0800 (PST) Subject: [R] Example in "Applied Spatial Analysis with R" giving error Message-ID: <1293928982199-3170586.post@n4.nabble.com> Example from page 29, Chapter 2.3 from "Applied Spatial Analysis with R" I am new to R...just trying to replicate the example from the book. > m <- matrix(c(0,0,1,1), ncol = 2, dimnames = list(NULL, + c ("min", > "max"))) Error in +c("min", "max") : invalid argument to unary operator Thoughts? -- View this message in context: http://r.789695.n4.nabble.com/Example-in-Applied-Spatial-Analysis-with-R-giving-error-tp3170586p3170586.html Sent from the R help mailing list archive at Nabble.com. 
From vseabra at uol.com.br Sat Jan 1 23:23:08 2011 From: vseabra at uol.com.br (Victor F Seabra) Date: Sat, 1 Jan 2011 20:23:08 -0200 Subject: [R] problem with read.table In-Reply-To: <4d1fa518b02e6_25d10c20e74139@weasel15.tmail> References: <4d1fa518b02e6_25d10c20e74139@weasel15.tmail> Message-ID: <4d1fa94ccc7d7_27b110c20e74160@weasel15.tmail> This was already solved in another post: [1]Plot symbols: How to plot (and save) a graphic with symbols originating from a table Sorry about that, the first post was too big as I was making more than 1 question, so I decided to break it down _________________________________________________________________ Em 01/01/2011 20:05, Victor F Seabra < vseabra at uol.com.br > escreveu: Dear all, Please, when I use the command: table1<-read.table("table1.txt", header=TRUE, sep="\t") cells that contain \ u 2264 (corresponding to <= sign) get imported as \ \ u2264. which causes the need to manually edit each cell using the fix command. Is there a way to fix that? thanks, Victor ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. References 1. http://r.789695.n4.nabble.com/Plot-symbols-How-to-plot-and-save-a-graphic-with-symbols-originating-from-a-table-tp3170327p3170327.html From benjamin.ward at bathspa.org Sun Jan 2 02:16:54 2011 From: benjamin.ward at bathspa.org (Axolotl9250) Date: Sat, 1 Jan 2011 17:16:54 -0800 (PST) Subject: [R] Windows 7, R and org-babel issue Message-ID: <1293931014697-3170598.post@n4.nabble.com> Hi, I'm hoping somebody can help me with this I'm a little stuck, I've followed the method as closely to the Linux way to get org mode exporting to pdf through latex my work with R. 
I have emacs extracted and installed to Program Files, and ran "addpm", I have Rterm in my path, and I extracted ESS and Org 7.4 to Program Files\emacs-22.3\site lisp. I've edited .emacs.d\init.el (a .emacs file was too awkward with windows way of naming files) to include the strings required to load ESS: (require 'ess-site), and then all the lines on the org-documentation required to activate org mode 7.4, and I went into the customization options in Emacs and made it so babel would load the R language. Normally this is enough to get the thing working, then when I choose to export to PDF through latex (pdflatex - 3 times in the customization options, as is default) I don't get any of the R output in my PDF or any of the echoed commands. However I don't get any errors upon export. The intermediate .tex files don't show any entry of verbatim at all, yet it asks me if I want to evaluate each R chunk. My Log files produced or the AUC file don't show any errors and my TeXLive tree is up to date. Hopefully some fellow Windows 7 org and R user will be able to help me out on this. Cheers, Ben. -- View this message in context: http://r.789695.n4.nabble.com/Windows-7-R-and-org-babel-issue-tp3170598p3170598.html Sent from the R help mailing list archive at Nabble.com. From t at biegner.com Sun Jan 2 02:28:09 2011 From: t at biegner.com (Thorsten Biegner) Date: Sun, 2 Jan 2011 02:28:09 +0100 Subject: [R] Clusteranalysis Chi-square test and SingleLinkage Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From jwiley.psych at gmail.com Sun Jan 2 02:54:27 2011 From: jwiley.psych at gmail.com (Joshua Wiley) Date: Sat, 1 Jan 2011 17:54:27 -0800 Subject: [R] Problem with uploading library In-Reply-To: <1293917146033-3170455.post@n4.nabble.com> References: <1293917146033-3170455.post@n4.nabble.com> Message-ID: Dear Gaja, If "sp" and "maptools" are not already installed, you'll need to do that. 
It is often recommended that you update your existing packages before installing new ones. Once that is done (or if you have already), then you just need to load the packages, and then R will be able to find the readShapeSpatial() function (it is in the 'maptools' package). Something like this should get you going:

########################
update.packages()
install.packages(c("sp", "maptools"))
require(sp)
require(maptools)
?readShapeSpatial #documentation
towns <- readShapeSpatial("OB/OB.shp")
############################

Hope that helps,

Josh

On Sat, Jan 1, 2011 at 1:25 PM, gaja wrote:
>
> Hello.
>
> I'm new on that forum, so I would use some help... I'm programing with R
> program, have some exercise to do.
>
> So here it goes.
>
> 1. I have to upload two libraryes "sp" and "maptools"
> 2. This command is troubling me : towns <- readShapeSpatial("OB/OB.shp")
> 3. When I'm pressing f5, there is an error called: Error: could not find
> function "readShapeSpatial"
>
> So, I would use some help. Thanx in advance.
>
> Gaja
> --
> View this message in context: http://r.789695.n4.nabble.com/Problem-with-uploading-library-tp3170455p3170455.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Joshua Wiley
Ph.D.
Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ From bbolker at gmail.com Sun Jan 2 03:25:16 2011 From: bbolker at gmail.com (Ben Bolker) Date: Sun, 2 Jan 2011 02:25:16 +0000 (UTC) Subject: [R] Example in " Applied Spatial Analysis with R" giving error References: <1293928982199-3170586.post@n4.nabble.com> Message-ID: markdx gmail.com> writes: > > > Example from page 29, Chapter 2.3 from "Applied Spatial Analysis with R" > > I am new to R...just trying to replicate the example from the book. > > > m <- matrix(c(0,0,1,1), ncol = 2, dimnames = list(NULL, + c ("min", > > "max"))) > Error in +c("min", "max") : invalid argument to unary operator Omit the "+", which is a 'continuation character' indicating that the command in the book was carried over onto a new line. In general line breaks at 'sensible' places are fine: m <- matrix(c(0,0,1,1), ncol = 2, dimnames = list(NULL, c ("min", "max"))) From jim at bitwrit.com.au Sun Jan 2 05:32:05 2011 From: jim at bitwrit.com.au (Jim Lemon) Date: Sun, 02 Jan 2011 15:32:05 +1100 Subject: [R] Problem with uploading library In-Reply-To: References: <1293917146033-3170455.post@n4.nabble.com> Message-ID: <4D1FFFC5.3030504@bitwrit.com.au> On 01/02/2011 12:54 PM, Joshua Wiley wrote: > Dear Gaja, > > If "sp" and "maptools" are not already installed, you'll need to do > that. It is often recommended that you update your existing packages > before installing new ones. Once that is done (or if you have > already), then you just need to load the packages, and then R will be > able to find the readShapeSpatial() function (it is in the 'maptools' > package). 
Something like this should get you going: > > ######################## > update.packages() > install.packages(c("sp", "maptools")) > require(sp) > require(maptools) > ?readShapeSpatial #documentation > towns<- readShapeSpatial("OB/OB.shp") > ############################ > Hi, I have also been trying to read a shapefile: SS10aAust.shp with readOGR (rgdal) and readShapeSpatial (maptools) with absolutely no success, despite having long lost count of the help pages I have read seeking an example of how to do this: readOGR("/home/jim/research/ombo/mapping/SSD10aAust.shp", layer="cities") Error in ogrInfo(dsn = dsn, layer = layer, input_field_name_encoding = input_field_name_encoding) : Cannot open layer I realize that one has to supply a layer here, but searching through the four files that were extracted from the ZIP file, there is no mention of layers. Obviously guessing on the basis of other examples doesn't work either. ozmap<-readShapeSpatial( "/home/jim/research/ombo/mapping/SSD10aAust.shp") Error in getinfo.shape(fn) : Error opening SHP file All of the examples seem to deal with a different type of shape file that follows whatever rules are operating in these two functions. All of the examples and tutorials I have found on the Web are similarly inscrutable. I guess I should start with asking how does one discover what layers are included in a shapefile? One step at a time... Jim From diasandre at gmail.com Sun Jan 2 03:00:24 2011 From: diasandre at gmail.com (ADias) Date: Sat, 1 Jan 2011 18:00:24 -0800 (PST) Subject: [R] How to make this script ask again In-Reply-To: References: <1293901911214-3170243.post@n4.nabble.com> Message-ID: <1293933624491-3170611.post@n4.nabble.com> thank you for the answers, problem solved Regards, ADias. -- View this message in context: http://r.789695.n4.nabble.com/How-to-make-this-script-ask-again-tp3170243p3170611.html Sent from the R help mailing list archive at Nabble.com. 
From tk725472 at albany.edu Sun Jan 2 04:42:42 2011 From: tk725472 at albany.edu (Nissim Kaufmann) Date: Sat, 1 Jan 2011 22:42:42 -0500 (EST) Subject: [R] The Percentile of a User-Defined pdf Message-ID: <32618.173.68.34.215.1293939762.squirrel@webmail.albany.edu> I would like to give a probability distribution function of a function of (x,y) on the half-plane y>0, and a constant 0 References: <1293917146033-3170455.post@n4.nabble.com> <4D1FFFC5.3030504@bitwrit.com.au> Message-ID: <4D202A48.3040203@bitwrit.com.au> On 01/02/2011 04:18 PM, Joshua Wiley wrote: > ... > This makes me wonder if you could somehow have it return any or all layers? > One of the first things I did was look for a function that would return all the information I needed to know. I thought that "ogrInfo" might do the trick, but it doesn't. >... > When I was browsing, I ran across this whitepaper on shapefiles: > http://www.esri.com/library/whitepapers/pdfs/shapefile.pdf > Thanks for that, I now know quite a bit more about shapefiles, but no mention of layers. > perhaps it will be useful/mean something to you. > I'm a bit further down the track, I'll keep looking. Thanks. Jim From s1327720 at student.rug.nl Sun Jan 2 10:09:00 2011 From: s1327720 at student.rug.nl (Sarah) Date: Sun, 2 Jan 2011 01:09:00 -0800 (PST) Subject: [R] dataframe, simulating data In-Reply-To: <20101231135635.GA11352@cs.cas.cz> References: <1293789078403-3169246.post@n4.nabble.com> <20101231110742.GA26395@cs.cas.cz> <1293800708919-3169354.post@n4.nabble.com> <20101231135635.GA11352@cs.cas.cz> Message-ID: <1293959340469-3170764.post@n4.nabble.com> Thanks, Petr! With your script my problem is solved. David, thanks for your help and time as well!! I really appreciate it. -- View this message in context: http://r.789695.n4.nabble.com/dataframe-simulating-data-tp3169246p3170764.html Sent from the R help mailing list archive at Nabble.com. 
From dieter.menne at menne-biomed.de Sun Jan 2 10:50:19 2011 From: dieter.menne at menne-biomed.de (Dieter Menne) Date: Sun, 2 Jan 2011 01:50:19 -0800 (PST) Subject: [R] The Percentile of a User-Defined pdf In-Reply-To: <32618.173.68.34.215.1293939762.squirrel@webmail.albany.edu> References: <32618.173.68.34.215.1293939762.squirrel@webmail.albany.edu> Message-ID: <1293961819968-3170786.post@n4.nabble.com> Nissim Kaufmann wrote: > > > J=sapply(xc, function(xc) {integrate(function(x) { > sapply(y, function(x) { > integrate(function(y) { > sapply(x, function(y) 1/(1+x^2+y^2)) > }, -c, c)$value > }) > }, -c, xc)$value > }) > > Once you are inside the first "{", R only knows about the x it received in as a parameter (assuming there is no global y). I have not checked the details of the code (it looks overly complicated), but sapply(x, function(y) { could work. In theory, keeping function(x) would also work, but I suggest that you use another variable name in inner nests to avoid confusion (human, not CPU). My favorite step in developing nested xapply is to first remove all code and only put a print(str(x)) inside the function, so I can clearly see what is passed in. Dieter -- View this message in context: http://r.789695.n4.nabble.com/The-Percentile-of-a-User-Defined-pdf-tp3170672p3170786.html Sent from the R help mailing list archive at Nabble.com. From ata.sonu at gmail.com Sun Jan 2 12:32:35 2011 From: ata.sonu at gmail.com (ATANU) Date: Sun, 2 Jan 2011 03:32:35 -0800 (PST) Subject: [R] changing method of estimation in GLM Message-ID: <1293967955520-3170836.post@n4.nabble.com> can anyone tell me how can i control the method of estimation (i.e. scoring method or Newton raphson method) in glm and compute deviance function ? -- View this message in context: http://r.789695.n4.nabble.com/changing-method-of-estimation-in-GLM-tp3170836p3170836.html Sent from the R help mailing list archive at Nabble.com. 
From gajahorvat at hotmail.com Sun Jan 2 12:49:15 2011 From: gajahorvat at hotmail.com (gaja) Date: Sun, 2 Jan 2011 03:49:15 -0800 (PST) Subject: [R] Problem with uploading library In-Reply-To: <4D1FFFC5.3030504@bitwrit.com.au> References: <1293917146033-3170455.post@n4.nabble.com> <4D1FFFC5.3030504@bitwrit.com.au> Message-ID: <1293968955038-3170840.post@n4.nabble.com> wow guys, thanx for so quick reply. I'm in the office now, as soon as I get home, I'll try to do it! Many thanx -- View this message in context: http://r.789695.n4.nabble.com/Problem-with-uploading-library-tp3170455p3170840.html Sent from the R help mailing list archive at Nabble.com. From ligges at statistik.tu-dortmund.de Sun Jan 2 15:51:03 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Sun, 02 Jan 2011 15:51:03 +0100 Subject: [R] Clusteranalysis Chi-square test and SingleLinkage In-Reply-To: References: Message-ID: <4D2090D7.2000604@statistik.tu-dortmund.de> On 02.01.2011 02:28, Thorsten Biegner wrote: > Hi > > The short version of my questions is this: > > How can I run a chi-square test over a matrix (table) to get the distanaces > between rows and then run a SingleLinkage (or other fusion algorithm over > the resulting table? > > ------------ > > The long-version of my question: > > My data consists of different data of different countries so I have stuff > like how many people can read, write in X,Y,Z countries and then percentages > for each country. And I want to find out which countries might be similar by > doing a cluster analysis. > > So first I want to take the data which would look something like this: > > Plastikbecher Kartonbox Papier > Rama 24 65 12 > Homa 83 30 21 > Flora 75 28 22 > SB 35 55 21 > Holl. Butter 20 40 75 > > And then run a chi-square test over it (I think that makes the most sense or > does anybody think something different)? > > So for that I will put each row with every other row in a single different > matrix (mat1) and use the use the chisq.test. 
> > So mat 1 would for example looks like this: > > Plastikbecher Kartonbox Papier > Rama 24 65 12 > Flora 75 28 22 > > And then I would run matResult[1,3]<- sqrt(chisq.test(mat1)[[1]]) > > So in the end I would get a matrix like this: > Rama Homa Flora SB HollButter > Rama 0.000 6.642 6.470 2.209 6.931 > Homa 6.642 0.000 0.430 4.994 8.387 > Flora 6.470 0.430 0.000 4.754 7.941 > SB 2.209 4.994 4.754 0.000 5.901 > HollButter 6.931 8.387 7.941 5.901 0.000 > > So here is my question: > How can I run a single linkage algorithm over this matrix? > > I thought a good stating point might be "hclust" > > hclust(d, method = "complete", members=NULL) > > But the R reference says d must be "a dissimilarity structure as produced by > dist." > > But the dist function does not have a method chisquared-test or something > similar. Well, there is as.dist, so just use: hclust(as.dist(matResult), .......) Uwe Ligges > So does anybody have an idea how I can do a clusteranalysis with a > chi-squared test and then use a fusion algorithm to join the clusters? > > Thanks > > Thorsten > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From ligges at statistik.tu-dortmund.de Sun Jan 2 15:52:49 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Sun, 02 Jan 2011 15:52:49 +0100 Subject: [R] changing method of estimation in GLM In-Reply-To: <1293967955520-3170836.post@n4.nabble.com> References: <1293967955520-3170836.post@n4.nabble.com> Message-ID: <4D209141.9090403@statistik.tu-dortmund.de> On 02.01.2011 12:32, ATANU wrote: > > can anyone tell me how can i control the method of estimation (i.e. scoring > method or Newton raphson method) in glm and compute deviance function ? See ?glm. 
With method="model.frame" the model frame is returned and you can do any fitting yourself. Default is scoring (in particular: IWLS). Uwe Ligges From bbolker at gmail.com Sun Jan 2 16:03:07 2011 From: bbolker at gmail.com (Ben Bolker) Date: Sun, 2 Jan 2011 15:03:07 +0000 (UTC) Subject: [R] changing method of estimation in GLM References: <1293967955520-3170836.post@n4.nabble.com> Message-ID: ATANU gmail.com> writes: > can anyone tell me how can i control the method of estimation (i.e. scoring > method or Newton raphson method) in glm and compute deviance function ? I don't think you can; you would have to write your own, although you can take advantage of the framework of the current glm() [for interpreting formulas, etc.]: see the "method" argument in ?glm. You might also find install.packages("sos"); library("sos"); findFn("glm newton-raphson") helpful; it finds a few special cases (logistic regression via Newton-Raphson) in the logistf and glmmAK packages. From msamtani at gmail.com Sun Jan 2 16:08:46 2011 From: msamtani at gmail.com (mahesh samtani) Date: Sun, 2 Jan 2011 10:08:46 -0500 Subject: [R] How to compute the density of a variable that follows a proportional error distribution Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dwinsemius at comcast.net Sun Jan 2 18:05:27 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Sun, 2 Jan 2011 12:05:27 -0500 Subject: [R] The Percentile of a User-Defined pdf In-Reply-To: <32618.173.68.34.215.1293939762.squirrel@webmail.albany.edu> References: <32618.173.68.34.215.1293939762.squirrel@webmail.albany.edu> Message-ID: On Jan 1, 2011, at 10:42 PM, Nissim Kaufmann wrote: > I would like to give a probability distribution function of a > function of > (x,y) on the half-plane y>0, and a constant 0 to know > the c percentile of the marginal distribution of x. 
I have tried
> along
> the lines of the following but I keep getting errors:
>
> # SIMPLIFIED PROBLEM
> # The plan is to solve for the .975 percentile "xc" of the marginal x
> distribution of the pdf (say it is proportional to 1/(1+x^2+y^2) for
> simplicity) which has support on the real plane.
> # The function 1/(1+x^2+y^2) has value (normalization constant)
> approximately equal to "I" that I was able to
> # program with no problem, as shown below.
>
> # Approximate I, the normalization constant.
> # This works fine.
> c=1e+3 #the bound of integration in the three directions; if I use
> Inf I
> get error.
>
> I=integrate(function(y) {

It's a bad idea to create a new object named "I". There is already a function of that name in R. It might or might not cause problems in this case, depending on whether any of the internal functions might be calling it.

> sapply(y, function(y) {
> integrate(function(x) {
> sapply(x, function(x) 1/(1+x^2+y^2))
> }, -c, c)$value

It's also a bad idea to use "-c" and "c" as your names for limits of integration (or for any other variable). "c" is a rather fundamental R function. It may not confuse the interpreter, but it will at the very least confuse the humans who attempt to understand it, and it will give error messages that are difficult to interpret.

> })
> }, -c, c)
>
> # Preliminary step -- define function J as an integral up to
> variable xc.
> # I am still stuck on this step -- R says
> # "Error in is.vector(X): object 'y' not found."
>
> J=sapply(xc, function(xc) {integrate(function(x) {
> sapply(y, function(x) {
> integrate(function(y) {
> sapply(x, function(y) 1/(1+x^2+y^2))
> }, -c, c)$value
> })
> }, -c, xc)$value
> })

In addition to the debugging strategy suggested by Menne, from the console you can issue a traceback() call immediately after a call and see the sequence of calls up to the error. You can also place a browser() call inside the sapply calls, which will give you the capability of inspecting the local environment inside the call.
?browser > > # Final step -- solve for .975 percentile of the above function J > uniroot( > sapply(xc, function(xc) {integrate(function(x) { > sapply(y, function(x) { > integrate(function(y) { > sapply(x, function(y) 1/(1+x^2+y^2)) > }, -c, c)$value > }) > }, -c, xc)$value > }) > )/I-.975, > lower=-c, upper=c, tol=1e-10)$root > > I don't have much programming experience. Thank you for your help. > > Nissim Kaufmann > University at Albany > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT From vseabra at uol.com.br Sun Jan 2 18:34:08 2011 From: vseabra at uol.com.br (Victor F Seabra) Date: Sun, 2 Jan 2011 15:34:08 -0200 Subject: [R] Please, need help with a plot Message-ID: <4d20b7104796e_63e055fae74160@weasel21.tmail> Please, I wonder if someone knows how to add the less than or equal to symbol in the plot generated by the code below: var1<-c('age <= 3','age <= 7','age <= 10','age <= 11','age <= 20','age <= 25','age <= 30','age <= 45','age <= 50','age < 55','age >= 55') var2<-c(3.8,5.4,3.7,3.8,5.9,6.4,7.2,8.4,10.5,1.3,0.7) table1<-data.frame(var1, var2) plot(x=table1$var2,y=1:11,xlim=c(0,20),pch=20) text(x=table1$var2,y=1:11,table1$var1,pos=4) title(x=15,y=5,expression("how to substitute the < = with the " <= "symbol"),font=5) Please, note that the data must come from a table (not manually fed in a text command) I received help yesterday and learned a fix using \u2264, eval, parse, and sprintf? but the symbols generated by this fix are not exported to an .EPS file Kind regards, Victor Faria Seabra, MD From jukka.koskela at helsinki.fi Sun Jan 2 18:37:24 2011 From: jukka.koskela at helsinki.fi (Jukka Koskela) Date: Sun, 02 Jan 2011 19:37:24 +0200 Subject: [R] Class "coef.mer" into a data.frame? 
Message-ID: <20110102193724.19252rgpnc2wwtzo.jtkoskel@webmail.helsinki.fi> Thank you! This was exactly what I was looking for. Jukka Quoting "David Winsemius" : > > On Dec 31, 2010, at 2:49 AM, Jukka Koskela wrote: > >> Hello, >> >> Could somebody please tell me what I am doing wrong in the following? >> >> I try to extract coefficients (using arm-package) from the lmer >> function, but I get the >> following warning: >> a<-data.frame(coef(res)) >> Error in as.data.frame.default(x[[i]], optional = TRUE, >> stringsAsFactors = stringsAsFactors) : >> cannot coerce class "coef.mer" into a data.frame >> >> I think I have done it before the same way and it has worked, but >> not any more.. > > Are you sure you were not looking at just one particular element of > a coef.mer object? > >> fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy) > >> a<-data.frame(coef(fm1)) > Error in as.data.frame.default(x[[i]], optional = TRUE, > stringsAsFactors = stringsAsFactors) : > cannot coerce class '"coef.mer"' into a data.frame >> coef(fm1) > $Subject > (Intercept) Days > 308 253.6623 19.6665383 > 309 211.0067 1.8461362 > 310 212.4446 5.0170766 > 330 275.0971 5.6535771 > > . > . > >> a<-data.frame(coef(fm1)[[1]]) >> a > X.Intercept.
Days > 308 253.6623 19.6665383 > 309 211.0067 1.8461362 > 310 212.4446 5.0170766 > 330 275.0971 5.6535771 > 331 273.6664 7.3980095 > > > -- > > David Winsemius, MD > West Hartford, CT > > > From bbolker at gmail.com Sun Jan 2 19:15:31 2011 From: bbolker at gmail.com (Ben Bolker) Date: Sun, 2 Jan 2011 18:15:31 +0000 (UTC) Subject: [R] Please, need help with a plot References: <4d20b7104796e_63e055fae74160@weasel21.tmail> Message-ID: Victor F Seabra uol.com.br> writes: > > > Please, I wonder if someone knows how to add the > less than or equal to symbol in the plot generated by the code below: > var1<-c('age <= 3','age <= 7','age <= 10','age <= 11','age <= 20','age <= > 25','age <= 30','age <= 45','age <= 50','age < 55','age >= 55') > var2<-c(3.8,5.4,3.7,3.8,5.9,6.4,7.2,8.4,10.5,1.3,0.7) > table1<-data.frame(var1, var2) > plot(x=table1$var2,y=1:11,xlim=c(0,20),pch=20) > text(x=table1$var2,y=1:11,table1$var1,pos=4) > title(x=15,y=5,expression("how to substitute the < = with the " <= > "symbol"),font=5) > Please, note that the data must come from a table (not manually fed in a > text command) > I received help yesterday and learned a fix using > \u2264, eval, parse, and sprintf? but the symbols generated by this fix are > not exported to an .EPS file > Kind regards, This is a little bit more 'magic' than I would like, but seems to work. Perhaps someone else can suggest a cleaner solution. 
ages <- gsub("[^0-9]+","",table1$var1) rel <- gsub("age\\s*([=<>]+)\\s*[0-9]+","\\1",table1$var1,perl=TRUE) with(table1,plot(var2,1:11,xlim=c(0,20),pch=20)) invisible(with(table1, mapply(function(x,y,a,r) { text(x=x,y=y, switch(r, `<=`=bquote(age <= .(a)), `<`=bquote(age < .(a)), `>=`=bquote(age >= .(a))), pos=4)}, var2,1:11,ages,rel))) From michcurran at yahoo.com Sun Jan 2 20:14:22 2011 From: michcurran at yahoo.com (michael curran) Date: Sun, 2 Jan 2011 11:14:22 -0800 (PST) Subject: [R] filehash for big data Message-ID: <922131.53919.qm@web114703.mail.gq1.yahoo.com> Hi all, I am trying to use the filehash library to analyze a 5M by 20 matrix with both double and string data types. After consulting a few tutorials online, it seems as though one needs to first read the data into R; then create an R object; and then assign that object a location in my computer via filehash. It seems like the benefit of this is minimizing memory allocation when running subsequent analysis (e.g., descriptive statistics, regressions, etc.) . My question is: what happens if R chokes when trying to read in the data (i.e., step 1)? Is there another library I can use to get the data read in or, alternatively, am I misunderstanding the complete functionality of the filehash library and what it can do? Apologies if this a basic question--usually I work with considerably smaller data frames and don't have much experience with memory issues and R. Thanks in advance for any advice/pointers. Best, Mike From BChiquoine at tiff.org Sun Jan 2 20:35:37 2011 From: BChiquoine at tiff.org (Chiquoine, Ben) Date: Sun, 2 Jan 2011 19:35:37 +0000 Subject: [R] Probably with default library tree Message-ID: <60E01F7C81AAAE468B66CD1A361EFDB4AB1B30@VSW8EXCH2.tiff.local> An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From dwinsemius at comcast.net Sun Jan 2 22:30:14 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Sun, 2 Jan 2011 16:30:14 -0500 Subject: [R] Please, need help with a plot In-Reply-To: References: <4d20b7104796e_63e055fae74160@weasel21.tmail> Message-ID: <5630327E-CC9A-4CF9-AFDE-E12360AB6CB6@comcast.net> On Jan 2, 2011, at 1:15 PM, Ben Bolker wrote: > Victor F Seabra uol.com.br> writes: > >> >> >> Please, I wonder if someone knows how to add the >> less than or equal to symbol in the plot generated by the code >> below: >> var1<-c('age <= 3','age <= 7','age <= 10','age <= 11','age <= >> 20','age <= >> 25','age <= 30','age <= 45','age <= 50','age < 55','age >= 55') >> var2<-c(3.8,5.4,3.7,3.8,5.9,6.4,7.2,8.4,10.5,1.3,0.7) >> table1<-data.frame(var1, var2) >> plot(x=table1$var2,y=1:11,xlim=c(0,20),pch=20) >> text(x=table1$var2,y=1:11,table1$var1,pos=4) >> title(x=15,y=5,expression("how to substitute the < = with the >> " <= >> "symbol"),font=5) >> Please, note that the data must come from a table (not manually >> fed in a >> text command) >> I received help yesterday and learned a fix using >> \u2264, eval, parse, and sprintf? but the symbols generated by >> this fix are >> not exported to an .EPS file >> Kind regards, > > > This is a little bit more 'magic' than I would like, but seems > to work. Perhaps someone else can suggest a cleaner solution. Here's the best I could come up with, but I will admit that there were many failed attempts before success:
expr.vec <- as.expression(parse(text=table1$var1))
plot(x=table1$var2, y=1:11, xlim=c(0,20), pch=20)
text(x=table1$var2, y=1:11, labels=expr.vec, pos=4)
title(x=15, y=5, expression("Yet another way to process strings with operators like '<='"))
(The title expression works on my machine, but perhaps not on the OP's machine, given differences in encoding that have so far been exhibited.)
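The parse(text=) idea above can be tried end to end with a small sketch. It assumes three made-up labels in place of table1$var1 and writes to a temporary EPS file through the standard postscript() device; the names labels_chr, values, and eps_file are illustrative only and do not come from the thread. Since plotmath draws the relational operators itself instead of relying on a Unicode character, the symbols should survive EPS export, though that is worth confirming on your own setup.

```r
# Hypothetical stand-ins for table1$var1 and table1$var2 from the thread
labels_chr <- c("age <= 3", "age < 55", "age >= 55")
values <- c(3.8, 1.3, 0.7)

# parse() turns each string into an unevaluated call, one per element;
# as.character() guards against the labels being stored as a factor
label_expr <- parse(text = as.character(labels_chr))

# Write the plot to a temporary single-page EPS file
eps_file <- tempfile(fileext = ".eps")
postscript(eps_file, onefile = FALSE, horizontal = FALSE,
           paper = "special", width = 6, height = 4)
plot(x = values, y = seq_along(values), xlim = c(0, 20), pch = 20)
# An expression vector passed as labels is rendered with plotmath
text(x = values, y = seq_along(values), labels = label_expr, pos = 4)
invisible(dev.off())
```

Opening the file named by eps_file in a PostScript viewer is the simplest way to confirm that the comparison symbols were drawn.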
> > ages <- gsub("[^0-9]+","",table1$var1) > rel <- gsub("age\\s*([=<>]+)\\s*[0-9]+","\\1",table1$var1,perl=TRUE) > > with(table1,plot(var2,1:11,xlim=c(0,20),pch=20)) > invisible(with(table1, > mapply(function(x,y,a,r) { > text(x=x,y=y, > switch(r, > `<=`=bquote(age <= .(a)), > `<`=bquote(age < .(a)), > `>=`=bquote(age >= .(a))), > pos=4)}, > var2,1:11,ages,rel))) > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT From jholtman at gmail.com Sun Jan 2 23:08:00 2011 From: jholtman at gmail.com (jim holtman) Date: Sun, 2 Jan 2011 17:08:00 -0500 Subject: [R] filehash for big data In-Reply-To: <922131.53919.qm@web114703.mail.gq1.yahoo.com> References: <922131.53919.qm@web114703.mail.gq1.yahoo.com> Message-ID: Exactly how do you want to work with this data? How do you want it organized? What is the structure of the file that you want to read in? What types of analysis are you going to do? Does all the data have to be in memory at once, or can you construct your analysis to do it in pieces and the aggregate the summary data? There is some missing information before trying to propose a solution. For example, do you need all the data in memory at one time (if it is all doubles, you would need 800MB for a single copy). Are you running on a 64-bit version of the operating system? If so, I would suggest that you have at least 4GB of real memory for R so that you could have multiple copies that will probably be created by some of the processing. Why are you considering filehash and not a relational database to store/extract the data? You can always read in a portion of the data and then transfer it to the appropriate storage type. 
No reason for R to "choke" reading in the data if you have structured the input/output files appropriately. On Sun, Jan 2, 2011 at 2:14 PM, michael curran wrote: > Hi all, > > I am trying to use the filehash library to analyze a 5M by 20 matrix with both > double and string data types. > > > After consulting a few tutorials online, it seems as though one needs to first > read the data into R; then create an R object; and then assign that object a > location in my computer via filehash. It seems like the benefit of this is > minimizing memory allocation when running subsequent analysis (e.g., descriptive > > statistics, regressions, etc.) . > > > My question is: what happens if R chokes when trying to read in the data (i.e., > step 1)? Is there another library I can use to get the data read in or, > alternatively, am I misunderstanding the complete functionality of the filehash > library and what it can do? > > > Apologies if this a basic question--usually I work with considerably smaller > data frames and don't have much experience with memory issues and R. > > > Thanks in advance for any advice/pointers. > > Best, Mike > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? 
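The suggestion above to do the analysis in pieces and then aggregate the summary data might look like the following minimal sketch. It assumes a CSV file with a header row and a numeric column called x; the demo file, the column names, and the chunk size are hypothetical stand-ins, and a real 5M-row file would simply replace demo_file. Only one chunk is held in memory at a time, and a running sum and count are enough to recover the overall mean.

```r
# Build a small demo file standing in for the large data set (assumption)
demo_file <- tempfile(fileext = ".csv")
set.seed(1)
demo <- data.frame(id = 1:1000, x = rnorm(1000))
write.csv(demo, demo_file, row.names = FALSE)

chunk_size <- 250
con <- file(demo_file, open = "rt")

# Read the header line once and strip the quotes that write.csv adds
col_names <- gsub('"', "", strsplit(readLines(con, n = 1), ",")[[1]])

n_total <- 0
sum_x <- 0
repeat {
  # An open connection keeps its position, so each call reads the next chunk;
  # read.csv() signals an error once no lines are left, which ends the loop
  chunk <- tryCatch(
    read.csv(con, header = FALSE, nrows = chunk_size, col.names = col_names),
    error = function(e) NULL
  )
  if (is.null(chunk) || nrow(chunk) == 0) break
  n_total <- n_total + nrow(chunk)
  sum_x <- sum_x + sum(chunk$x)
  if (nrow(chunk) < chunk_size) break
}
close(con)

mean_x <- sum_x / n_total
```

The same loop could instead append each chunk to a database table or a filehash store, which is closer to what the original question had in mind.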
From jwiley.psych at gmail.com Sun Jan 2 23:22:40 2011 From: jwiley.psych at gmail.com (Joshua Wiley) Date: Sun, 2 Jan 2011 14:22:40 -0800 Subject: [R] Probably with default library tree In-Reply-To: <60E01F7C81AAAE468B66CD1A361EFDB4AB1B30@VSW8EXCH2.tiff.local> References: <60E01F7C81AAAE468B66CD1A361EFDB4AB1B30@VSW8EXCH2.tiff.local> Message-ID: Hi Ben, The documentation for ?.libPaths makes me think you might be able to get somewhere by setting the Windows environment variables R_LIBS and/or R_LIBS_USER. See also: http://cran.r-project.org/bin/windows/base/rw-FAQ.html#Packages Another option would be to create a user profile and specify the library path there directly. You can do this by creating and customizing the Rprofile.site file or a .Rprofile file. See: http://cran.r-project.org/doc/manuals/R-intro.html#Customizing-the-environment I suspect, though I do not know, that the alternate library path (which you do not want) was created because in Windows 7, the Program Files directory requires elevated privileges. If you installed R in a different directory (I typically use C:\R\), I think this problem might go away (it's also easier to type if you're running any R scripts etc. from the command prompt). Cheers, Josh On Sun, Jan 2, 2011 at 11:35 AM, Chiquoine, Ben wrote: > Hi, > > > > I just installed R on a new Windows 7 machine and am having a problem with the default libraries. The default libraries are not what I want them to be so when I say install.packages("XXX") the packages don't install where I want them to. Ideally everything would install to the same location as the base packages. When I look at my library paths I get the following. > > > >> .libPaths() > [1] "C:\\Users\\Ben\\Documents/R/win-library/2.12" > [2] "C:/PROGRA~1/R/R-212~1.1/library" > > > > if I type > > > >> .libPaths(.libPaths()[2]) > >> .libPaths() > [1] "C:/PROGRA~1/R/R-212~1.1/library" > > This is what I want it to say and packages are now installed where I want them to be.
Unfortunately, when I restart R, .libPaths() defaults back to the original two paths. So the real question is how do I permanently set my directory to be the one in which the base files are installed? > > Thanks in advance for your help! > > > > Ben > > -- Joshua Wiley Ph.D.
Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ From ggrothendieck at gmail.com Sun Jan 2 23:31:00 2011 From: ggrothendieck at gmail.com (Gabor Grothendieck) Date: Sun, 2 Jan 2011 17:31:00 -0500 Subject: [R] Probably with default library tree In-Reply-To: <60E01F7C81AAAE468B66CD1A361EFDB4AB1B30@VSW8EXCH2.tiff.local> References: <60E01F7C81AAAE468B66CD1A361EFDB4AB1B30@VSW8EXCH2.tiff.local> Message-ID: On Sun, Jan 2, 2011 at 2:35 PM, Chiquoine, Ben wrote: > Hi, > > > > I just installed R on a new Windows 7 machine and am having a problem with the default libraries. The default libraries are not what I want them to be so when I say install.packages("XXX") the packages don't install where I want them to. Ideally everything would install to the same location as the base packages. When I look at my library paths I get the following. > > > >> .libPaths() > [1] "C:\\Users\\Ben\\Documents/R/win-library/2.12" > [2] "C:/PROGRA~1/R/R-212~1.1/library" > > It's normally better to have your library in your own user area (which is what it's doing for you) so that you don't need administrative privileges to install packages. Unless there is some good reason not to want the default setup, you should think about just going with it. -- Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com From mgogos1 at gmail.com Mon Jan 3 00:28:12 2011 From: mgogos1 at gmail.com (Mkip) Date: Sun, 2 Jan 2011 18:28:12 -0500 Subject: [R] iPhone 3G App For R? Message-ID: <5C2A1ADA-9DF3-4489-BCEE-47FB639200F4@gmail.com> Does anyone know if a free iPhone 3G app for R is available now? Thanks, Matilda Gogos From chethanuniversal at gmail.com Mon Jan 3 02:12:21 2011 From: chethanuniversal at gmail.com (Chethan S) Date: Mon, 3 Jan 2011 06:42:21 +0530 Subject: [R] Modules for using geostatistics for image classification Message-ID: An embedded and charset-unspecified text was scrubbed...
Name: not available URL: From benjamin.ward at bathspa.org Mon Jan 3 03:25:15 2011 From: benjamin.ward at bathspa.org (Ben Ward) Date: Mon, 3 Jan 2011 02:25:15 +0000 Subject: [R] iPhone 3G App For R? In-Reply-To: <5C2A1ADA-9DF3-4489-BCEE-47FB639200F4@gmail.com> References: <5C2A1ADA-9DF3-4489-BCEE-47FB639200F4@gmail.com> Message-ID: On 02/01/2011 23:28, Mkip wrote: > Does anyone know if a free iphone 3G app for R is available now? > > Thanks, > Matilda Gogos > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > I remember looking into it last academic year, because I knew there were SQLite programs on iPhone and I was interested in remote data collection with iPhone and initial analysis and viewing of data in R. But as I remember there isn't one yet. There may be some scope for R in android phones, because I read somewhere that some allow access to a terminal, which might alleviate the need for coding a gui for the phone. But I'm not an app coder so I wouldn't know the specifics of such a task and phone. From shigesong at gmail.com Mon Jan 3 03:28:22 2011 From: shigesong at gmail.com (Shige Song) Date: Sun, 2 Jan 2011 21:28:22 -0500 Subject: [R] iPhone 3G App For R? In-Reply-To: References: <5C2A1ADA-9DF3-4489-BCEE-47FB639200F4@gmail.com> Message-ID: Does iphone even support the GNU tool chain? Shige On Sun, Jan 2, 2011 at 9:25 PM, Ben Ward wrote: > On 02/01/2011 23:28, Mkip wrote: >> >> Does anyone know if a free iphone 3G app for R is available now? 
>> >> Thanks, >> Matilda Gogos >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> >> > I remember looking into it last academic year, because I knew there were > SQLite programs on iPhone and I was interested in remote data collection > with iPhone and initial analysis and viewing of data in R. But as I remember > there isn't one yet. There may be some scope for R in android phones, > because I read somewhere that some allow access to a terminal, which might > alleviate the need for coding a gui for the phone. But I'm not an app coder > so I wouldn't know the specifics of such a task and phone. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From Roy.Mendelssohn at noaa.gov Mon Jan 3 03:35:43 2011 From: Roy.Mendelssohn at noaa.gov (Roy Mendelssohn) Date: Sun, 2 Jan 2011 18:35:43 -0800 Subject: [R] iPhone 3G App For R? In-Reply-To: References: <5C2A1ADA-9DF3-4489-BCEE-47FB639200F4@gmail.com> Message-ID: Check the archive for r-sig-mac (https://stat.ethz.ch/pipermail/r-sig-mac/). There has been extensive discussion about this. If memory serves (and it rarely does anymore :-) ) the issue is mainly licensing, there already exists the ability to compile to the ARM processor. -Roy M. On Jan 2, 2011, at 6:28 PM, Shige Song wrote: > Does iphone even support the GNU tool chain? > > Shige > > On Sun, Jan 2, 2011 at 9:25 PM, Ben Ward wrote: >> On 02/01/2011 23:28, Mkip wrote: >>> >>> Does anyone know if a free iphone 3G app for R is available now? 
>>> >>> Thanks, >>> Matilda Gogos >>> >>> ______________________________________________ >>> R-help at r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide >>> http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >>> >>> >> I remember looking into it last academic year, because I knew there were >> SQLite programs on iPhone and I was interested in remote data collection >> with iPhone and initial analysis and viewing of data in R. But as I remember >> there isn't one yet. There may be some scope for R in android phones, >> because I read somewhere that some allow access to a terminal, which might >> alleviate the need for coding a gui for the phone. But I'm not an app coder >> so I wouldn't know the specifics of such a task and phone. >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. ********************** "The contents of this message do not reflect any position of the U.S. Government or NOAA." ********************** Roy Mendelssohn Supervisory Operations Research Analyst NOAA/NMFS Environmental Research Division Southwest Fisheries Science Center 1352 Lighthouse Avenue Pacific Grove, CA 93950-2097 e-mail: Roy.Mendelssohn at noaa.gov (Note new e-mail address) voice: (831)-648-9029 fax: (831)-648-8440 www: http://www.pfeg.noaa.gov/ "Old age and treachery will overcome youth and skill." 
"From those who have been given much, much will be expected" From dwinsemius at comcast.net Mon Jan 3 04:35:03 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Sun, 2 Jan 2011 22:35:03 -0500 Subject: [R] iPhone 3G App For R? In-Reply-To: References: <5C2A1ADA-9DF3-4489-BCEE-47FB639200F4@gmail.com> Message-ID: <20BC677F-BD1E-49F8-9983-2A650C173D5E@comcast.net> On Jan 2, 2011, at 9:35 PM, Roy Mendelssohn wrote: > Check the archive for r-sig-mac (https://stat.ethz.ch/pipermail/r-sig-mac/ > ). There has been extensive discussion about this. If memory > serves (and it rarely does anymore :-) ) the issue is mainly > licensing, there already exists the ability to compile to the ARM > processor. > For R to be ported to the IPhone or iPad, there would needed to be approval by Apple which has rather emphatically stated that it does not want languages or alternate development systems ported. I believe that terminal emulators already exist if you wanted to configure your computer as a server. So depending on the precise meaning of the phrase "app for R" the answer could be yes or no. The modifier "3G" would appear to be meaningless in this discussion. > -Roy M. > > > On Jan 2, 2011, at 6:28 PM, Shige Song wrote: > >> Does iphone even support the GNU tool chain? >> >> Shige >> >> On Sun, Jan 2, 2011 at 9:25 PM, Ben Ward >> wrote: >>> On 02/01/2011 23:28, Mkip wrote: >>>> >>>> Does anyone know if a free iphone 3G app for R is available now? >>>> >>>> Thanks, >>>> Matilda Gogos >>>> >>> I remember looking into it last academic year, because I knew >>> there were >>> SQLite programs on iPhone and I was interested in remote data >>> collection >>> with iPhone and initial analysis and viewing of data in R. But as >>> I remember >>> there isn't one yet. There may be some scope for R in android >>> phones, >>> because I read somewhere that some allow access to a terminal, >>> which might >>> alleviate the need for coding a gui for the phone. 
But I'm not an >>> app coder >>> so I wouldn't know the specifics of such a task and phone. -- > David Winsemius, MD West Hartford, CT From jtr4v at yahoo.com Mon Jan 3 04:46:51 2011 From: jtr4v at yahoo.com (Justin Reese) Date: Sun, 2 Jan 2011 19:46:51 -0800 (PST) Subject: [R] Using PCA to correct p-values from snpMatrix Message-ID: <822002.25350.qm@web110814.mail.gq1.yahoo.com> Hi R-help folks, I have been doing some single SNP association work using snpMatrix. This works well, but produces a lot of false positives, because of population structure in my data. I would like to correct the p-values (which snpMatrix gives me) for population structure, possibly using principal component analysis (PCA). My data is complicated, so here's a simple example of what I'd like to do:
# 3x8 matrix of example snp data, 1 = allele A, -1 = allele B, 0 = hetero
snp.data = matrix( c(0,1,0,-1,-1,1,1,-1,0,1,1,0,0,1,0,-1,-1,NA,0,-1,0,0,1,0), nrow=3,
  dimnames = list( c("bob", "frita", "trudy"), c("snp1", "snp2", "snp3", "snp4", "snp5","snp6", "snp7", "snp8") ) )
# phenotype data - resistant or susceptible to zombie infection
phenotype.data = matrix( c("bob", "frita", "trudy", "resistant", "susceptible", "resistant"), nrow=3,
  dimnames = list( c("bob", "frita", "trudy"), c("rowNames", "cc") ) )
library("snpMatrix")
# add one in the following line so genotypes are 0,1 or 2, so single.snp.tests() method doesn't complain
snp.matrix <- as(snp.data + 1,'snp.matrix')
single.snp.assoc <- single.snp.tests(cc, data=as.data.frame(phenotype.data), snp.data=snp.matrix )
Okay, so now I have p-values for the association between SNPs and resistance to zombie infection. I do this for PCA:
snp.data = replace( snp.data, is.na(snp.data), 0) # workaround, b/c I can't get prcomp to ignore NAs no matter what I do
pca = prcomp(snp.data)
So, the question is, can I use the PCA data to correct my p-values (in single.snp.assoc) for population structure?
In my real data, population structure is causing a lot of type I errors (false positive SNPs). I have read of some standalone software that does this sort of thing, for example EIGENSTRAT: http://www.biostat.jhsph.edu/~iruczins/teaching/misc/gwas/papers/price2006.pdf But I'd like to stick to R if possible. Any advice/comments welcome. From david.crow at cide.edu Mon Jan 3 05:41:05 2011 From: david.crow at cide.edu (David Crow) Date: Sun, 2 Jan 2011 22:41:05 -0600 Subject: [R] Logical Indicator for Warning and Error Messages? Message-ID: <000001cbab00$6edba830$4c92f890$@crow@cide.edu> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dwinsemius at comcast.net Mon Jan 3 06:50:25 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Mon, 3 Jan 2011 00:50:25 -0500 Subject: [R] Logical Indicator for Warning and Error Messages? In-Reply-To: <000001cbab00$6edba830$4c92f890$@crow@cide.edu> References: <000001cbab00$6edba830$4c92f890$@crow@cide.edu> Message-ID: <9DBD0E87-9F7B-4981-B5DE-5A76EB6F4287@comcast.net> On Jan 2, 2011, at 11:41 PM, David Crow wrote: > Dear R Community- > > > > Is there a logical variable indicating the presence of a warning or > error > message after executing a command? I'm bootstrapping a logit model > (1,000 > iterations, relevant code posted below), but the model fails to > converge on > many iterations. (I'm guessing that a small sample size and the > combination > of sampling with replacement, high item missingness, and list-wise > deletion > leads to a sparse data problem for some iterations.) > > > > Even for the iterations that fail to converge, though, the program > produces > parameter estimates. I would like to remove these parameter > estimates from > the N x k matrix containing the bootstrapped parameter estimates.
What I > was > thinking of is adding a line of code to check if there is a warning > or error > message present, recording the iteration numbers where there are > error/warning messages, and replacing the parameters for those > iterations > with NA's. I've checked the help pages for the 'warning', > 'getOption', and > other commands, but it's not immediately apparent to me how to check > for the > presence of warning or error messages.
?options # says that if "warn" is >= 2, warnings are turned into errors
?try # for handling of errors
-- David. > > > > Here's the relevant portion of the code (I've omitted the parts that > recover > parameter and fit estimates for each iteration): > > > > ************************************* > > [Code Starts Here] > > > > #Create Objects > > N = 1000 #1000 iterations > > set.seed(7843) > > seed = sample(10000000, 1000) > > > > #1000 Iterations > > for (i in 1:N){ > > > > #Bootstrap (Sampling with Replacement) > > set.seed(seed[i]) > > pre <- subset(cainit, time==1) > > sampt1 <- sort(sample(nrow(pre), nrow(pre), replace=T)) > > sampt1 <- pre[sampt1,] > > > > > > #T1 Model of Vote Intention > > try(T1A <- glm(prop1a ~ educ + age + inc + dem + appgov + appleg + > budget + > co1a + > > prop1b + prop1d,family=binomial(link="logit"), > > data=sampt1, na.action=na.exclude)) > > > > ##[NOTE: the 'lrm' command is part of the 'Design' package] > > try(T1a <- lrm(prop1a ~ educ + age + inc + dem + appgov + appleg + > budget + > co1a + > > prop1b + prop1d, data=sampt1)) > > } > > > > [Code Ends Here] > > ****************************************** > > > > > > > > The following set of warning/error messages is repeated about a > dozen times > at the end of the loop: > > > > Error in lrm(prop1a ~ educ + age + inc + dem + appgov + appleg + > budget + : > > > Unable to fit model using 'lrm.fit'
> > In addition: Warning messages: > > 1: glm.fit: fitted probabilities numerically 0 or 1 occurred > > 2: glm.fit: fitted probabilities numerically 0 or 1 occurred > > 3: glm.fit: algorithm did not converge > > 4: glm.fit: fitted probabilities numerically 0 or 1 occurred > > > > > > > > Any help would be greatly appreciated. > > > > Happy New Year! > > David > > > > ============================== > > David Crow > > Profesor-Investigador / Assistant Professor > > Divisi?n de Estudios Internacionales > > Centro de Investigaci?n y Docencia Econ?micas > > > Carretera M?xico-Toluca 3655 > Col. Lomas de Santa Fe > 01210 M?xico, D.F. > Tel.: (+011 52 55) 5727-9800, ext. 2152 > > > > ============================== > > > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT From marchywka at hotmail.com Mon Jan 3 03:49:10 2011 From: marchywka at hotmail.com (Mike Marchywka) Date: Sun, 2 Jan 2011 21:49:10 -0500 Subject: [R] Modules for using geostatistics for image classification In-Reply-To: References: Message-ID: > From: chethanuniversal at gmail.com > Date: Mon, 3 Jan 2011 06:42:21 +0530 > To: r-sig-geo at r-project.org; r-help at r-project.org > Subject: [R] Modules for using geostatistics for image classification > > Hello everyone! > > I am using GRASS with spgrass6 for my work. I will be using variograms in > the process of landsat image classification. I am quite ok with GRASS but am > finding R really tough. I understand that spgrass6 is a link between GRASS > and R which can read and write raster/vector layers. Out of really many > packages in R, for generating variograms out of landsat images which > packages of R can be used? 
I want to go for image segmentation after that - > i.e., for further work I will be using object oriented classification. > I had no idea what this is so I thought I would use my usual strategy and type "R foo" into google and it seems to produce a lot of hits relevant to R, http://www.google.com/#sclient=psy&hl=en&q=R+variogram I just suggest this since often you can find peers and a broader range or useful pages than just the R docs or other genre that you may get from more restricted sources. > Regards, > > Chethan S. From yingfeng.zheng at gmail.com Mon Jan 3 04:36:16 2011 From: yingfeng.zheng at gmail.com (Yingfeng Zheng) Date: Mon, 3 Jan 2011 11:36:16 +0800 Subject: [R] Errors in installing Matrix package Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From gnolffilc at gmail.com Mon Jan 3 07:03:15 2011 From: gnolffilc at gmail.com (Clifford Long) Date: Mon, 3 Jan 2011 00:03:15 -0600 Subject: [R] using "plot" with time series object - "axes = FALSE" option does not appear to work Message-ID: Dear R-help, I am attempting to plot data using standard R plot utilities. The data was retrieved from FRED (St. Louis Federal Reserve) using the package quantmod. My question is NOT about quantmod. While I retrieve data using quantmod, I am not using its charting utility. I have been having success using the standard R "plot" utilities to this point with this type of data. Eventually I want to put two series on the same plot but with the y-axis for one series on the right side, and also inverted (min at the top, max at the bottom). I believe that I see how to do this by using par(new = TRUE) with a second plot statement with "axes = FALSE", followed by the command "axis(side = 4, ylim = c(max(seriesname), min(seriesname))". 
Here is what I believe should be a smaller reproducible example of my issue:

#-----------------------------------------------------------------

library(quantmod)

getSymbols('PCECTPI', src='FRED')
is.xts(PCECTPI)   # check the type of object - response is 'TRUE'

plot(PCECTPI)   # This works fine.

plot(PCECTPI, axes = FALSE)   # This works in that it gives me a plot,
                              # but I get the axes regardless of the use of "axes = FALSE".

#-----------------------------------------------------------------

I did find that using par(yaxt = "n") seems to work to suppress the
y-axis. But it seems to me that the "axes = FALSE" command should
also work, and I believe that it would be easier to use in the larger
context of my goal.

I have spent time with the R help pages and Nabble searches of the R
help archives but I still seem to be missing something.

Does the "axes = FALSE" option not work when using plot with this type
of data object?
Is there some other fundamental thing that I have overlooked?
Or should this work?

My apologies if the answer is obvious and I've just missed it. Thank
you in advance for any help that can be provided.

Cliff Long
gnolffilc at gmail.com

####################################################################

My system:
HP Pavilion
Windows 7
The computer/OS is 64-bit. I am running the precompiled 32-bit
version of R (per sessionInfo).
Thus far, everything seems to have been working as expected.
> sessionInfo()
R version 2.12.1 (2010-12-16)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] quantmod_0.3-15 TTR_0.20-2      xts_0.7-5       zoo_1.6-4
[5] Defaults_1.1-1

loaded via a namespace (and not attached):
[1] grid_2.12.1     lattice_0.19-13 tools_2.12.1

> Sys.getlocale()
[1] "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252"

From lucam1968 at gmail.com Mon Jan 3 08:32:15 2011
From: lucam1968 at gmail.com (Luca Meyer)
Date: Mon, 3 Jan 2011 08:32:15 +0100
Subject: [R] error in calling source(): invalid multibyte character in parser
Message-ID: <25DFCF6A-7BEE-430E-B009-10183B43429D@gmail.com>

Being Italians, when writing comments/instructions we use accented letters - like ?, ?, ?, etc. When running R scripts using such characters I get an error saying:

invalid multibyte character in parser

I have been looking at the help and searched the r-help archives but I haven't found anything that I could intelligibly apply to my case.

Can anyone suggest a fix for this error?

Thanks,
Luca

Mr. Luca Meyer
www.lucameyer.com
IBM SPSS Statistics release 19.0.0
R version 2.12.1 (2010-12-16)
Mac OS X 10.6.5 (10H574) - kernel Darwin 10.5.0

From spector at stat.berkeley.edu Mon Jan 3 08:36:15 2011
From: spector at stat.berkeley.edu (Phil Spector)
Date: Sun, 2 Jan 2011 23:36:15 -0800 (PST)
Subject: [R] error in calling source(): invalid multibyte character in parser
In-Reply-To: <25DFCF6A-7BEE-430E-B009-10183B43429D@gmail.com>
References: <25DFCF6A-7BEE-430E-B009-10183B43429D@gmail.com>
Message-ID:

Luca -
   What happens when you type

Sys.setlocale('LC_ALL','C')

before issuing the source command?
- Phil Spector Statistical Computing Facility Department of Statistics UC Berkeley spector at stat.berkeley.edu On Mon, 3 Jan 2011, Luca Meyer wrote: > Being italians when writing comments/instructions we use accented letters - like ?, ?, ?, etc.... when running R scripts using such characters I get and error saying: > > invalid multibyte character in parser > > I have been looking at the help and searched the r-help archives but I haven't find anything that I could intelligibly apply to my case. > > Can anyone suggest a fix for this error? > > Thanks, > Luca > > Mr. Luca Meyer > www.lucameyer.com > IBM SPSS Statistics release 19.0.0 > R version 2.12.1 (2010-12-16) > Mac OS X 10.6.5 (10H574) - kernel Darwin 10.5.0 > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From dieter.menne at menne-biomed.de Mon Jan 3 08:44:33 2011 From: dieter.menne at menne-biomed.de (Dieter Menne) Date: Sun, 2 Jan 2011 23:44:33 -0800 (PST) Subject: [R] Should I use T-test or prop.test In-Reply-To: <1293997972335-3171191.post@n4.nabble.com> References: <1293997972335-3171191.post@n4.nabble.com> Message-ID: <1294040673787-3171644.post@n4.nabble.com> Az Ha wrote: > > I have a set of data where two groups of animals walk around in an > environment and I have calculated the percentages of the area covered by > them. Each group covers different area, for example the arithmetic mean is > 35% and 23% with n=40 and 29 respectively. > ... > Should i use unpaired T-test or prop.test? > .. > Distribution questions aside, you should also consider what your story is. >From the animal's viewpoint, a doubling from 1% to 2% is as good as one from 30% to 60%, so the proportion is the measure of success. 
If you are an ecologist, a change in population from 30% to 60% might be close to disaster for the environment, so percentage points (60-30) or (2-1) is the right choice. Dieter -- View this message in context: http://r.789695.n4.nabble.com/Should-I-use-T-test-or-prop-test-tp3171191p3171644.html Sent from the R help mailing list archive at Nabble.com. From pdalgd at gmail.com Mon Jan 3 09:24:21 2011 From: pdalgd at gmail.com (peter dalgaard) Date: Mon, 3 Jan 2011 09:24:21 +0100 Subject: [R] error in calling source(): invalid multibyte character in parser In-Reply-To: <25DFCF6A-7BEE-430E-B009-10183B43429D@gmail.com> References: <25DFCF6A-7BEE-430E-B009-10183B43429D@gmail.com> Message-ID: On Jan 3, 2011, at 08:32 , Luca Meyer wrote: > Being italians when writing comments/instructions we use accented letters - like ?, ?, ?, etc.... when running R scripts using such characters I get and error saying: > > invalid multibyte character in parser > > I have been looking at the help and searched the r-help archives but I haven't find anything that I could intelligibly apply to my case. > > Can anyone suggest a fix for this error? The most likely cause is that your scripts are written in an "8 bit ASCII" encoding (Latin-1 or -9, most likely), while R is running in a UTF8 locale. If that is the cause, the fix is to standardize things to use the same locale. You can convert the encoding of your source file using the iconv utility (in a Terminal window). -pd > > Thanks, > Luca > > Mr. Luca Meyer > www.lucameyer.com > IBM SPSS Statistics release 19.0.0 > R version 2.12.1 (2010-12-16) > Mac OS X 10.6.5 (10H574) - kernel Darwin 10.5.0 > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
--
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com

From djmuser at gmail.com Mon Jan 3 09:26:22 2011
From: djmuser at gmail.com (Dennis Murphy)
Date: Mon, 3 Jan 2011 00:26:22 -0800
Subject: [R] Errors in installing Matrix package
In-Reply-To:
References:
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From wayne.rochester at csiro.au Mon Jan 3 09:27:44 2011
From: wayne.rochester at csiro.au (Wayne Rochester)
Date: Mon, 3 Jan 2011 18:27:44 +1000
Subject: [R] Keeping lattice plot axes in the same place when legend sizes vary
Message-ID: <4D218880.1030103@csiro.au>

Dear all,

I am generating a number of lattice plots with common axes but varying
legends. There is one plot in the window at a time. I would like the
plot axes to always appear in the same place within the window (and
saved images) so the plots do not move around when they are combined in
a video or browsed in something like an HTML document.

The default behaviour of the lattice package is to centre the whole
plot, including the legend, in the window. If a legend is on the right
and gets bigger from one plot to the next (e.g. due to bigger numbers),
then the graph axes shift left a little to keep the centre of the whole
plot in the same place.

To prevent this, is it possible to explicitly set the position of the
axes? Alternatively, being able to align the plot with the left of the
window rather than the centre would also provide a solution (when the
legend is on the right).

The following example demonstrates the default behaviour:

library(lattice)
x <- -10:10
y <- -10:10
grid <- expand.grid(x=x, y=y)
grid$z <- grid$x * grid$y
x11(width=5, height=2.5)
print(levelplot(z ~ x * y, data=grid, aspect=1))

The example plots a graph with a legend on the right.
A wide window is used to provide horizontal space to enable the legend
to grow or shrink without affecting the size of the graph axes. The
whole plot, including the legend, is centred horizontally. If we run the
same code, but with grid$z multiplied by 10, then the legend is a little
wider, and the graph axes are shifted a little to the left.

I had a play with the grid viewport functions, but was only able to
change the size and position of the viewport the graph went into rather
than the position of the graph within that viewport.

Many thanks,

Wayne Rochester
CSIRO, Brisbane, Australia

From tabieg at yahoo.com Mon Jan 3 09:41:48 2011
From: tabieg at yahoo.com (taby gathoni)
Date: Mon, 3 Jan 2011 00:41:48 -0800 (PST)
Subject: [R] personal details appearing on google such after I pose a question in the R-help forum
Message-ID: <365505.13403.qm@web112612.mail.gq1.yahoo.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From ripley at stats.ox.ac.uk Mon Jan 3 09:48:09 2011
From: ripley at stats.ox.ac.uk (Prof Brian Ripley)
Date: Mon, 3 Jan 2011 08:48:09 +0000 (GMT)
Subject: [R] error in calling source(): invalid multibyte character in parser
In-Reply-To:
References: <25DFCF6A-7BEE-430E-B009-10183B43429D@gmail.com>
Message-ID:

On Mon, 3 Jan 2011, peter dalgaard wrote:

>
> On Jan 3, 2011, at 08:32 , Luca Meyer wrote:
>
>> Being Italians, when writing comments/instructions we use accented letters - like ?, ?, ?, etc. When running R scripts using such characters I get an error saying:
>>
>> invalid multibyte character in parser
>>
>> I have been looking at the help and searched the r-help archives but I haven't found anything that I could intelligibly apply to my case.
>>
>> Can anyone suggest a fix for this error?
>
> The most likely cause is that your scripts are written in an "8 bit
> ASCII" encoding (Latin-1 or -9, most likely), while R is running in
> a UTF8 locale.
If that is the cause, the fix is to standardize
> things to use the same locale. You can convert the encoding of your
> source file using the iconv utility (in a Terminal window).

Or use the 'encoding' argument of source() to tell R what the encoding
is, e.g. encoding="latin1" or "latin-9" (the inconsistency being in the
iconv used on Macs, not in R).

>
> -pd
>
>>
>> Thanks,
>> Luca
>>
>> Mr. Luca Meyer
>> www.lucameyer.com
>> IBM SPSS Statistics release 19.0.0
>> R version 2.12.1 (2010-12-16)
>> Mac OS X 10.6.5 (10H574) - kernel Darwin 10.5.0
>
> --
> Peter Dalgaard
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com

--
Brian D.
Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 From bjoner at combitel.no Mon Jan 3 10:06:40 2011 From: bjoner at combitel.no (Siri Bjoner) Date: Mon, 03 Jan 2011 10:06:40 +0100 Subject: [R] personal details appearing on google such after I pose a question in the R-help forum In-Reply-To: <365505.13403.qm@web112612.mail.gq1.yahoo.com> References: <365505.13403.qm@web112612.mail.gq1.yahoo.com> Message-ID: <20110103100640.56893zxma85b6oow@webmail.combitel.no> Not use a sig when posting to the forum? It is an open forum, and as such, all details that are sent to the list will also appear in the searchable archives. I don't think it is possible to reserve oneself against appearing in the archives... Siri. Siterer "taby gathoni" : > Hi all, > > Out of curiosity I googled my name yesterday to just see what new > information?? in the web there is associated with me. To?? my > surprise, I found in addition to the?? questions i have posted on > this forum, my email address and my signature details (name, > address, telephone number) seem to appear everytime i pose a > question. How can i conceal my personal details from the access of > anyone when using this forum? > > cheers > Taby > > > > > > > > Kind regards, > > An idea not coupled with action will never get any bigger than the > brain cell it occupied. > Arnold Glasgow > ...... > "Attempt something large enough that failure is guaranteed???unless > God steps in!" 
> > > > > [[alternative HTML version deleted]] > > From mdsumner at gmail.com Mon Jan 3 11:06:49 2011 From: mdsumner at gmail.com (Michael Sumner) Date: Mon, 3 Jan 2011 21:06:49 +1100 Subject: [R] personal details appearing on google such after I pose a question in the R-help forum In-Reply-To: <20110103100640.56893zxma85b6oow@webmail.combitel.no> References: <365505.13403.qm@web112612.mail.gq1.yahoo.com> <20110103100640.56893zxma85b6oow@webmail.combitel.no> Message-ID: library(fortunes);fortune() John Miller: How do I prevent google search to post my questions asked here?? Martin Maechler: you don't: R-help is famous and celebrity can't be gotten rid of ;-) -- John Miller and Martin Maechler R-help (June 2004) On Mon, Jan 3, 2011 at 8:06 PM, Siri Bjoner wrote: > Not use a sig when posting to the forum? > > It is an open forum, and as such, all details that are sent to the list will > also appear in the searchable archives. I don't think it is possible to > reserve oneself against appearing in the archives... > > Siri. > > Siterer "taby gathoni" : > >> Hi all, >> >> Out of curiosity I googled my name yesterday to just see what new >> information?? in the web there is associated with me. To?? my surprise, I >> found in addition to the?? questions i have posted on this forum, my email >> address and my signature details (name, address, telephone number) seem to >> appear everytime i pose a question. How can i conceal my personal details >> from the access of anyone when using this forum? >> >> cheers >> Taby >> >> >> >> >> >> >> >> Kind regards, >> >> An idea not coupled with action will never get any bigger than the brain >> cell it occupied. >> Arnold Glasgow >> ...... >> "Attempt something large enough that failure is guaranteed???unless God >> steps in!" >> >> >> >> >> ? ? ? 
?[[alternative HTML version deleted]] >> >> > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Michael Sumner Institute for Marine and Antarctic Studies, University of Tasmania Hobart, Australia e-mail: mdsumner at gmail.com From lamprianou at yahoo.com Mon Jan 3 12:18:37 2011 From: lamprianou at yahoo.com (Iasonas Lamprianou) Date: Mon, 3 Jan 2011 03:18:37 -0800 (PST) Subject: [R] factor names Message-ID: <343707.26654.qm@web120602.mail.ne1.yahoo.com> Dear all I have a factor variable which holds values like "Engineer", "Doctor", "Teacher" etc. I would like to collapse those categories so that Teachers and Sociologists form one category named "Teach & Soc" etc. However, I do not know how I can do it. Recoding does not seem to work. Thank you Dr. Iasonas Lamprianou Assistant Professor (Educational Research and Evaluation) Department of Education Sciences European University-Cyprus P.O. Box 22006 1516 Nicosia Cyprus Tel.: +357-22-713178 Fax: +357-22-590539 Honorary Research Fellow Department of Education The University of Manchester Oxford Road, Manchester M13 9PL, UK Tel. 0044 161 275 3485 iasonas.lamprianou at manchester.ac.uk From wwwhsd at gmail.com Mon Jan 3 12:26:43 2011 From: wwwhsd at gmail.com (Henrique Dallazuanna) Date: Mon, 3 Jan 2011 09:26:43 -0200 Subject: [R] factor names In-Reply-To: <343707.26654.qm@web120602.mail.ne1.yahoo.com> References: <343707.26654.qm@web120602.mail.ne1.yahoo.com> Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From lamprianou at yahoo.com Mon Jan 3 12:41:55 2011 From: lamprianou at yahoo.com (Iasonas Lamprianou) Date: Mon, 3 Jan 2011 03:41:55 -0800 (PST) Subject: [R] factor names In-Reply-To: References: <343707.26654.qm@web120602.mail.ne1.yahoo.com> Message-ID: <829626.94684.qm@web120611.mail.ne1.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From vseabra at uol.com.br Mon Jan 3 14:09:12 2011 From: vseabra at uol.com.br (Victor F Seabra) Date: Mon, 3 Jan 2011 11:09:12 -0200 Subject: [R] using "plot" with time series object - "axes = FALSE" option does not appear to work In-Reply-To: References: Message-ID: <4d21ca789b4cd_1f76bf4ee70141@weasel14.tmail> I guess you should specify x and y variables: plot(x=PCECTPI$var1, y=PCECTPI$var2, axes = FALSE) From marchywka at hotmail.com Mon Jan 3 12:32:20 2011 From: marchywka at hotmail.com (Mike Marchywka) Date: Mon, 3 Jan 2011 06:32:20 -0500 Subject: [R] using "plot" with time series object - "/hijack thread for quetion re R data capture In-Reply-To: References: Message-ID: > Date: Mon, 3 Jan 2011 00:03:15 -0600 > From: gnolffilc at gmail.com > To: r-help at r-project.org > Subject: [R] using "plot" with time series object - "axes = FALSE" option does not appear to work > > Dear R-help, > > I am attempting to plot data using standard R plot utilities. The > data was retrieved from FRED (St. Louis Federal Reserve) using the > package quantmod. My question is NOT about quantmod. While I > retrieve data using quantmod, I am not using its charting utility. I [...] > > Here is what I believe should be a smaller reproducible example of my issue: > > #----------------------------------------------------------------- > > library(quantmod) > > getSymbols('PCECTPI', src='FRED') > is.xts(PCECTPI) # check the type of object - response is 'TRUE' > > plot(PCECTPI) # This works fine. 
> I just tried the example and got a graph of something, presumably using data downloaded from the data provider FRED. I have a lot of bash scripts to download data from various FRB web pages but I seem to recall these are based on scraping html and maybe their ftp site ( I'm too lazy to look but I do remember they had csv data files for download). Often with sites like this, there is no API or even stable web pages from which to scrape data. Generally how stable are the R data download things when no agreement with data provider is in place? I'm impressed that someone is willing to maintain such a helpful facility as this is the most annoying kinds of stuff to write but also note that sometimes agencies do solicit public input on related topics and it may be helpful for R users to response and comment on the importance of computer readable data. From BChiquoine at tiff.org Mon Jan 3 14:12:42 2011 From: BChiquoine at tiff.org (Chiquoine, Ben) Date: Mon, 3 Jan 2011 13:12:42 +0000 Subject: [R] Probably with default library tree In-Reply-To: References: <60E01F7C81AAAE468B66CD1A361EFDB4AB1B30@VSW8EXCH2.tiff.local> Message-ID: <60E01F7C81AAAE468B66CD1A361EFDB4AB1B9C@VSW8EXCH2.tiff.local> Josh and Gabor thanks for those two extremely useful responses. I played around with those environment variables (R_LIBS and R_LIBS_USER) for a bit without success but I like the rationale hat you've given for the alternate path and will probably just stick with it. Thanks, Ben -----Original Message----- From: Gabor Grothendieck [mailto:ggrothendieck at gmail.com] Sent: Sunday, January 02, 2011 5:31 PM To: Chiquoine, Ben Cc: r-help at R-project.org Subject: Re: [R] Probably with default library tree On Sun, Jan 2, 2011 at 2:35 PM, Chiquoine, Ben wrote: > Hi, > > > > I just installed R on a new windows 7 machine and am having a probelm with the default libraries. 
The default libraries are not what I want them to be so when i say install.packages("XXX") the packages don't install where I want them to. ?Ideally everything would install to the same location as the base packages. When I look at my library paths I get the following. > > > >> .libPaths() > [1] "C:\\Users\\Ben\\Documents/R/win-library/2.12" > [2] "C:/PROGRA~1/R/R-212~1.1/library" > > Its normally better to have your library in your own user area (which is what its doing for you) so that you don't need Administrative privileges to install packages. Unless there is some good reason not to want the default set up you should think about just going with it. -- Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com ___________________________________________ This message and any attached documents contain information which may be confidential, subject to privilege or exempt from disclosure under applicable law. These materials are solely for the use of the intended recipient. If you are not the intended recipient of this transmission, you are hereby notified that any distribution, disclosure, printing, copying, storage, modification or the taking of any action in reliance upon this transmission is strictly prohibited. Delivery of this message to any person other than the intended recipient shall not compromise or waive such confidentiality, privilege or exemption from disclosure as to this communication. If you have received this communication in error, please notify the sender immediately and delete this message from your system. 
From gnolffilc at gmail.com Mon Jan 3 14:31:07 2011 From: gnolffilc at gmail.com (Clifford Long) Date: Mon, 3 Jan 2011 07:31:07 -0600 Subject: [R] using "plot" with time series object - "axes = FALSE" option does not appear to work In-Reply-To: References: Message-ID: Dear R-help, I am told by Professor Ripley that my question (quote) wastes the time of (and has insulted on-list) the R developers by falsely claiming there are problems with their code. I am writing, as he instructed, to publicly apologize to the R developers for any slight that my question might have generated. When posing my question, I genuinely thought that by invoking "plot" that it was a Base R graphics package that I was using, and not that offered by Mr. Ryan. As I am told that I owe an apology, I would like to do as I was instructed and publicly apologize on the list to the R developers, but not for any malicious intent, but rather for my lack of expertise and any unintentional insult that resulted. (My apologies also to Mr. Ryan for my ignorance.) Signed "bruised, but maybe a little less ignorant about R". Cliff Long gnolffilc at gmail.com On Mon, Jan 3, 2011 at 12:03 AM, Clifford Long wrote: > Dear R-help, > > I am attempting to plot data using standard R plot utilities. ?The > data was retrieved from FRED (St. Louis Federal Reserve) using the > package quantmod. ?My question is NOT about quantmod. ?While I > retrieve data using quantmod, I am not using its charting utility. ?I > have been having success using the standard R "plot" utilities to this > point with this type of data. > > Eventually I want to put two series on the same plot but with the > y-axis for one series on the right side, and also inverted (min at the > top, max at the bottom). ?I believe that I see how to do this by using > par(new = TRUE) with a second plot statement with "axes = FALSE", > followed by the command "axis(side = 4, ylim = c(max(seriesname), > min(seriesname))". 
> > Here is what I believe should be a smaller reproducible example of my issue: > > #----------------------------------------------------------------- > > library(quantmod) > > getSymbols('PCECTPI', src='FRED') > is.xts(PCECTPI) ? ? ?# check the type of object - response is 'TRUE' > > plot(PCECTPI) ?# This works fine. > > plot(PCECTPI, axes = FALSE) ?# This works in that it gives me a plot, > but I get the axes regardless of the use of "axes = FALSE". > > #----------------------------------------------------------------- > > I did find that using par(yaxt = "n") seems to work to suppress the > y-axis. ?But it seems to me that the "axes = FALSE" command should > also work, and I believe that it would be easier to use in the larger > context of my goal. > > > I have spent time with the R help pages and Nabble searches of the R > help archives but I still seem to be missing something. > > Does the "axes = FALSE" option not work when using plot with this type > of data object? > Is there some other fundamental thing that I have overlooked? > Or should this work? > > My apologies if the answer is obvious and I've just missed it. ?Thank > you in advance for any help that can be provided. > > Cliff Long > gnolffilc at gmail.com > > > #################################################################### > > My system: > HP Pavilion > Windows 7 > The computer/OS is 64-bit. ?I am running the precompiled 32-bit > version of R (per sessionInfo). > Thus far, everything seems to have been working as expected. > > >> sessionInfo() > R version 2.12.1 (2010-12-16) > Platform: i386-pc-mingw32/i386 (32-bit) > > locale: > [1] LC_COLLATE=English_United States.1252 > [2] LC_CTYPE=English_United States.1252 > [3] LC_MONETARY=English_United States.1252 > [4] LC_NUMERIC=C > [5] LC_TIME=English_United States.1252 > > attached base packages: > [1] stats ? ? graphics ?grDevices utils ? ? datasets ?methods ? base > > other attached packages: > [1] quantmod_0.3-15 TTR_0.20-2 ? ? ?xts_0.7-5 ? ? 
? zoo_1.6-4 > [5] Defaults_1.1-1 > > loaded via a namespace (and not attached): > [1] grid_2.12.1 ? ? lattice_0.19-13 tools_2.12.1 > > >> Sys.getlocale() > [1] "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United > States.1252;LC_MONETARY=English_United > States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252" > From muhammad.rahiz at ouce.ox.ac.uk Mon Jan 3 15:52:33 2011 From: muhammad.rahiz at ouce.ox.ac.uk (Muhammad Rahiz) Date: Mon, 3 Jan 2011 14:52:33 +0000 (GMT) Subject: [R] optimize Message-ID: Hi all, I'm trying to get the value of y when x=203 by using the intersect of three curves. The horizontal curve does not meet with the other two. How can I rectify the code below? Thanks Muhammad ts <- 1:10 dd <- 10:1 ts <- seq(200,209,1) dd <- c(NA,NA,NA,NA,1.87,1.83,1.86,NA,1.95,1.96) plot(ts,dd,ylim=c(1.5,2)) abline(lm(dd~ts),col="blue",lty=2) abline(v=203,col="blue",lty=2) xy <- lm(dd~ts) fc <- function(x) coef(xy)[1] + x*coef(xy)[2] val <- optimize(f=function(x) abs(fc(x)-203),c(1.5,2)) abline(h=val,col="blue",lty=2) From bbolker at gmail.com Mon Jan 3 16:10:08 2011 From: bbolker at gmail.com (Ben Bolker) Date: Mon, 3 Jan 2011 15:10:08 +0000 (UTC) Subject: [R] using " plot" with time series object - " axes = FALSE" option does not appear to work References: Message-ID: Clifford Long gmail.com> writes: > > Dear R-help, > > I am told by Professor Ripley that my question (quote) wastes the time > of (and has insulted on-list) the R developers by falsely claiming > there are problems with their code. I am writing, as he instructed, > to publicly apologize to the R developers for any slight that my > question might have generated. > > When posing my question, I genuinely thought that by invoking "plot" > that it was a Base R graphics package that I was using, and not that > offered by Mr. Ryan. 
>
> As I am told that I owe an apology, I would like to do as I was
> instructed and publicly apologize on the list to the R developers, but
> not for any malicious intent, but rather for my lack of expertise and
> any unintentional insult that resulted. (My apologies also to Mr.
> Ryan for my ignorance.)
>

Your post did not look insulting to me, just mildly (and
understandably) misinformed.

It looks to me like this is a bug in plot.xts, which has the code

    dots <- list(...)
    if ("axes" %in% names(dots)) {
        if (!dots$axes)
            axes <- FALSE
    } else axes <- TRUE

This would make sense if axes were *not* in the explicit list
of arguments to the function, and the default was supposed to be TRUE.
Because axes *is* in the explicit list of arguments, this overrides it.

I would contact the maintainer [maintainer("xts")] and ask if this
is a bug.

cheers
Ben Bolker

From jwiley.psych at gmail.com Mon Jan 3 16:24:35 2011
From: jwiley.psych at gmail.com (Joshua Wiley)
Date: Mon, 3 Jan 2011 07:24:35 -0800
Subject: [R] optimize
In-Reply-To:
References:
Message-ID:

Hi Muhammad,

On Mon, Jan 3, 2011 at 6:52 AM, Muhammad Rahiz wrote:
> Hi all,
>
> I'm trying to get the value of y when x=203 by using the intersect of three
> curves. The horizontal curve does not meet with the other two. How can I
> rectify the code below?

What is your code supposed to do? This is one way to get three
intersecting curves, but I am not sure it is doing what you were
hoping to do with the sample code you sent.
ts <- seq(200, 209, 1)
dd <- c(NA,NA,NA,NA,1.87,1.83,1.86,NA,1.95,1.96)
xy <- lm(dd ~ ts)
plot(x = ts, y = dd, ylim = c(1.5,2))
abline(xy, col = "blue", lty = 2)
abline(v = 203, h = predict(xy, data.frame(ts = 203)), col = "blue", lty = 2)

Best regards,

Josh

>
> Thanks
>
> Muhammad
>
> ts <- 1:10
> dd <- 10:1
>
> ts <- seq(200,209,1)
> dd <- c(NA,NA,NA,NA,1.87,1.83,1.86,NA,1.95,1.96)
>
> plot(ts,dd,ylim=c(1.5,2))
> abline(lm(dd~ts),col="blue",lty=2)
> abline(v=203,col="blue",lty=2)
>
> xy <- lm(dd~ts)
> fc <- function(x) coef(xy)[1] + x*coef(xy)[2]
> val <- optimize(f=function(x) abs(fc(x)-203),c(1.5,2))
> abline(h=val,col="blue",lty=2)
>

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

From dwinsemius at comcast.net Mon Jan 3 16:26:23 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Mon, 3 Jan 2011 10:26:23 -0500
Subject: [R] optimize
In-Reply-To:
References:
Message-ID: <6D68F27A-A1F4-4C5A-8272-536CC5D7EC75@comcast.net>

On Jan 3, 2011, at 9:52 AM, Muhammad Rahiz wrote:

> Hi all,
>
> I'm trying to get the value of y when x=203 by using the intersect
> of three curves. The horizontal curve does not meet with the other
> two. How can I rectify the code below?

Extend the appropriate axes. xlim and ylim are the arguments to use.
Looks too much like a homework problem for me to do the work for you,
though.
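A minimal sketch of the predict() route suggested above, using the data from the original post. The names y203, y203_manual and x_hat are only illustrative, and the search interval c(200, 209) is an assumption (the observed range of ts). The last step shows one way optimize() could be used with the interval given on the x scale rather than the y scale.

```r
# Data from the original post; lm() drops the NA rows by default.
ts <- seq(200, 209, 1)
dd <- c(NA, NA, NA, NA, 1.87, 1.83, 1.86, NA, 1.95, 1.96)
xy <- lm(dd ~ ts)

# Fitted y at x = 203, read directly off the model.
y203 <- unname(predict(xy, newdata = data.frame(ts = 203)))

# The same value computed by hand from the coefficients.
y203_manual <- unname(coef(xy)[1] + 203 * coef(xy)[2])

# Inverse question: which x on the fitted line gives the value y203?
# The objective compares y with y, and the interval is in x units.
fc <- function(x) coef(xy)[1] + x * coef(xy)[2]
x_hat <- optimize(function(x) (fc(x) - y203)^2,
                  interval = c(200, 209))$minimum
```

For the plot in the question, abline(h = y203) would then draw the horizontal line through the intersection.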
> > Thanks > > Muhammad > > ts <- 1:10 > dd <- 10:1 > > ts <- seq(200,209,1) > dd <- c(NA,NA,NA,NA,1.87,1.83,1.86,NA,1.95,1.96) > > plot(ts,dd,ylim=c(1.5,2)) > abline(lm(dd~ts),col="blue",lty=2) > abline(v=203,col="blue",lty=2) > > xy <- lm(dd~ts) > fc <- function(x) coef(xy)[1] + x*coef(xy)[2] > val <- optimize(f=function(x) abs(fc(x)-203),c(1.5,2)) > abline(h=val,col="blue",lty=2) > David Winsemius, MD West Hartford, CT From msamtani at gmail.com Mon Jan 3 16:44:08 2011 From: msamtani at gmail.com (mahesh samtani) Date: Mon, 3 Jan 2011 10:44:08 -0500 Subject: [R] How to compute the density of a variable that follows a proportional error distribution In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dwinsemius at comcast.net Mon Jan 3 16:56:16 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Mon, 3 Jan 2011 10:56:16 -0500 Subject: [R] optimize In-Reply-To: References: Message-ID: <095F4493-77AE-47E3-B926-38CB1712CCC4@comcast.net> On Jan 3, 2011, at 9:52 AM, Muhammad Rahiz wrote: > Hi all, > > I'm trying to get the value of y when x=203 by using the intersect > of three curves. The horizontal curve does not meet with the other > two. How can I rectify the code below? > > Thanks > > Muhammad > > ts <- 1:10 > dd <- 10:1 > > ts <- seq(200,209,1) > dd <- c(NA,NA,NA,NA,1.87,1.83,1.86,NA,1.95,1.96) > > plot(ts,dd,ylim=c(1.5,2)) > abline(lm(dd~ts),col="blue",lty=2) > abline(v=203,col="blue",lty=2) > > xy <- lm(dd~ts) > fc <- function(x) coef(xy)[1] + x*coef(xy)[2] > val <- optimize(f=function(x) abs(fc(x)-203), c(1.5,2)) ^???^ Joshua Wiley's answer made me realize that your question was different than I thought. At this point maybe you should have made your objective function match on the coordinate difference to be minimized. You are at this point taking the absolute difference between an x- value, 203. and a y-value. fc(x). That makes little sense. 
You can use predict() as Joshua showed you or you can change your
objective function so it is more meaningful. Furthermore, I have had
better success with squared differences than with abs(differences). I'm
not sure whether that is due to the availability of a more informative
derivative. (Again, looks like homework and I refrain from posting
completed solutions.)

> abline(h=val,col="blue",lty=2)

--
David Winsemius, MD
West Hartford, CT

From muhammad.rahiz at ouce.ox.ac.uk Mon Jan 3 17:09:04 2011
From: muhammad.rahiz at ouce.ox.ac.uk (Muhammad Rahiz)
Date: Mon, 3 Jan 2011 16:09:04 +0000 (GMT)
Subject: [R] optimize
In-Reply-To: <095F4493-77AE-47E3-B926-38CB1712CCC4@comcast.net>
References: <095F4493-77AE-47E3-B926-38CB1712CCC4@comcast.net>
Message-ID:

Josh's recommendation to use predict works. At the same time, I'll work
on your suggestions, David.

Thanks.

Muhammad Rahiz
Researcher & DPhil Candidate (Climate Systems & Policy)
School of Geography & the Environment
University of Oxford

On Mon, 3 Jan 2011, David Winsemius wrote:

>
> On Jan 3, 2011, at 9:52 AM, Muhammad Rahiz wrote:
>
>> Hi all,
>>
>> I'm trying to get the value of y when x=203 by using the intersect
>> of three curves. The horizontal curve does not meet with the other
>> two. How can I rectify the code below?
>>
>> Thanks
>>
>> Muhammad
>>
>> ts <- 1:10
>> dd <- 10:1
>>
>> ts <- seq(200,209,1)
>> dd <- c(NA,NA,NA,NA,1.87,1.83,1.86,NA,1.95,1.96)
>>
>> plot(ts,dd,ylim=c(1.5,2))
>> abline(lm(dd~ts),col="blue",lty=2)
>> abline(v=203,col="blue",lty=2)
>>
>> xy <- lm(dd~ts)
>> fc <- function(x) coef(xy)[1] + x*coef(xy)[2]
>> val <- optimize(f=function(x) abs(fc(x)-203), c(1.5,2))
> ^???^
>
> Joshua Wiley's answer made me realize that your question was different
> than I thought. At this point maybe you should have made your
> objective function match on the coordinate difference to be minimized.
> You are at this point taking the absolute difference between an x-
> value, 203. and a y-value. fc(x).
That makes little sense. You can use > predict() as Joshua showed you or you can change your objective > functions so it is more meaningful. Furthermore, I have had better > success with squared differences that with abs(differences). I'm not > sure whether that is due to the availability of a more informative > derivative. (Again, looks like homework and I refrain from posting > completed solutions.) > >> abline(h=val,col="blue",lty=2) > > -- > David Winsemius, MD > West Hartford, CT > > From dwinsemius at comcast.net Mon Jan 3 17:19:41 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Mon, 3 Jan 2011 11:19:41 -0500 Subject: [R] optimize In-Reply-To: References: <095F4493-77AE-47E3-B926-38CB1712CCC4@comcast.net> Message-ID: <2793C2EE-DFCC-4E7C-BA83-A0D92D752C60@comcast.net> On Jan 3, 2011, at 11:09 AM, Muhammad Rahiz wrote: > Josh's recommendation to use predict works. At the same time, I'll > work on your suggestions, David. > > Thanks. > > Muhammad Rahiz > Researcher & DPhil Candidate (Climate Systems & Policy) > School of Geography & the Environment > University of Oxford > > On Mon, 3 Jan 2011, David Winsemius wrote: > >> >> On Jan 3, 2011, at 9:52 AM, Muhammad Rahiz wrote: >> >>> Hi all, >>> >>> I'm trying to get the value of y when x=203 by using the intersect >>> of three curves. The horizontal curve does not meet with the other >>> two. How can I rectify the code below? >>> >>> Thanks >>> >>> Muhammad >>> >>> ts <- 1:10 >>> dd <- 10:1 >>> >>> ts <- seq(200,209,1) >>> dd <- c(NA,NA,NA,NA,1.87,1.83,1.86,NA,1.95,1.96) >>> >>> plot(ts,dd,ylim=c(1.5,2)) >>> abline(lm(dd~ts),col="blue",lty=2) >>> abline(v=203,col="blue",lty=2) >>> >>> xy <- lm(dd~ts) >>> fc <- function(x) coef(xy)[1] + x*coef(xy)[2] >>> val <- optimize(f=function(x) abs(fc(x)-203), c(1.5,2)) >> ^???^ And a further observation: You should pay close attention to the limits of optimization. 
If your answers keep coming out at the extremes of the ranges then perhaps you have confused the x and y coordinates there, as well as in your objective function. -- David. >> >> Joshua Wiley's answer made me realize that your question was >> different >> than I thought. At this point maybe you should have made your >> objective function match on the coordinate difference to be >> minimized. >> You are at this point taking the absolute difference between an x- >> value, 203. and a y-value. fc(x). That makes little sense. You can >> use >> predict() as Joshua showed you or you can change your objective >> functions so it is more meaningful. Furthermore, I have had better >> success with squared differences that with abs(differences). I'm not >> sure whether that is due to the availability of a more informative >> derivative. (Again, looks like homework and I refrain from posting >> completed solutions.) >> >>> abline(h=val,col="blue",lty=2) >> >> -- >> David Winsemius, MD >> West Hartford, CT >> >> David Winsemius, MD West Hartford, CT From statmailinglists at googlemail.com Mon Jan 3 17:27:54 2011 From: statmailinglists at googlemail.com (Paolo Rossi) Date: Mon, 3 Jan 2011 16:27:54 +0000 Subject: [R] ARIMA simulation including a constant Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From amwootte at ncsu.edu Mon Jan 3 18:18:07 2011 From: amwootte at ncsu.edu (Adrienne Wootten) Date: Mon, 3 Jan 2011 12:18:07 -0500 Subject: [R] parLapply - Error in do.call("fun", lapply(args, enquote)) : could not find function "fun" In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From eduardo.oliveirahorta at gmail.com Mon Jan 3 18:25:23 2011 From: eduardo.oliveirahorta at gmail.com (Eduardo de Oliveira Horta) Date: Mon, 3 Jan 2011 15:25:23 -0200 Subject: [R] Saving objects inside a list Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From wwwhsd at gmail.com Mon Jan 3 18:39:09 2011 From: wwwhsd at gmail.com (Henrique Dallazuanna) Date: Mon, 3 Jan 2011 15:39:09 -0200 Subject: [R] Saving objects inside a list In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From eduardo.oliveirahorta at gmail.com Mon Jan 3 18:51:41 2011 From: eduardo.oliveirahorta at gmail.com (Eduardo de Oliveira Horta) Date: Mon, 3 Jan 2011 15:51:41 -0200 Subject: [R] Saving objects inside a list In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From ggrothendieck at gmail.com Mon Jan 3 18:54:01 2011 From: ggrothendieck at gmail.com (Gabor Grothendieck) Date: Mon, 3 Jan 2011 12:54:01 -0500 Subject: [R] Saving objects inside a list In-Reply-To: References: Message-ID: On Mon, Jan 3, 2011 at 12:25 PM, Eduardo de Oliveira Horta wrote: > Hello there, > > any ideas on how to save all the objects on my workspace inside a list > object? > > For example, say my workspace is as follows > ls() > [1] "x" "y" "z" > > and suppose I want to put these objects inside a list object, say > > object.list <- list() > > without having to explicitly write down their names as in > > object.list$x = x > object.list$y = y > object.list$z = z > Try this: eapply(.GlobalEnv, identity) -- Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com From wwwhsd at gmail.com Mon Jan 3 18:54:46 2011 From: wwwhsd at gmail.com (Henrique Dallazuanna) Date: Mon, 3 Jan 2011 15:54:46 -0200 Subject: [R] Saving objects inside a list In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From lorenz at usgs.gov Mon Jan 3 18:55:33 2011 From: lorenz at usgs.gov (David L Lorenz) Date: Mon, 3 Jan 2011 11:55:33 -0600 Subject: [R] Saving objects inside a list In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From wdunlap at tibco.com Mon Jan 3 18:55:28 2011 From: wdunlap at tibco.com (William Dunlap) Date: Mon, 3 Jan 2011 09:55:28 -0800 Subject: [R] Saving objects inside a list In-Reply-To: References: Message-ID: <77EB52C6DD32BA4D87471DCD70C8D70003C2DEEA@NA-PA-VBE03.na.tibco.com> > -----Original Message----- > From: r-help-bounces at r-project.org > [mailto:r-help-bounces at r-project.org] On Behalf Of Eduardo de > Oliveira Horta > Sent: Monday, January 03, 2011 9:25 AM > To: r-help > Subject: [R] Saving objects inside a list > > Hello there, > > any ideas on how to save all the objects on my workspace inside a list > object? Does as.list(.GlobalEnv) do what you want? Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com > > For example, say my workspace is as follows > ls() > [1] "x" "y" "z" > > and suppose I want to put these objects inside a list object, say > > object.list <- list() > > without having to explicitly write down their names as in > > object.list$x = x > object.list$y = y > object.list$z = z > > Is this possible? > > Thanks in advance, > > Eduardo Horta > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
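A small sketch of the as.list() idea, assuming a throwaway environment named ws (hypothetical) in place of the real workspace so that the example does not depend on what is currently loaded. With .GlobalEnv in place of ws the same calls should apply to the workspace itself.

```r
# A stand-in for the workspace: an environment holding x, y and z.
ws <- new.env()
assign("x", 1:3, envir = ws)
assign("y", "a", envir = ws)
assign("z", function(v) v + 1, envir = ws)
assign(".hidden", 0, envir = ws)   # a dot-name, skipped by default

# Every visible object becomes a named list element.
object.list <- as.list(ws)

# all.names = TRUE also picks up objects whose names start with a dot.
object.list.all <- as.list(ws, all.names = TRUE)

# Element order is not guaranteed, so sort the names before comparing.
sort(names(object.list))
```

The list elements keep their original types, so object.list$z is still a function.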
> From jorgeivanvelez at gmail.com Mon Jan 3 18:56:22 2011 From: jorgeivanvelez at gmail.com (Jorge Ivan Velez) Date: Mon, 3 Jan 2011 12:56:22 -0500 Subject: [R] Saving objects inside a list In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From ethan.a.arenson at gmail.com Mon Jan 3 19:15:45 2011 From: ethan.a.arenson at gmail.com (Ethan Arenson) Date: Mon, 3 Jan 2011 12:15:45 -0600 Subject: [R] Proof for computing sums of squares Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From bates at stat.wisc.edu Mon Jan 3 19:32:52 2011 From: bates at stat.wisc.edu (Douglas Bates) Date: Mon, 3 Jan 2011 12:32:52 -0600 Subject: [R] Proof for computing sums of squares In-Reply-To: References: Message-ID: On Mon, Jan 3, 2011 at 12:15 PM, Ethan Arenson wrote: > Hi. > > I know that R computes sums of squares based on the diagonal of > > t(Q) %*% y %*% t(y) %*% Q, > > where Q comes from the QR-decomposition of the model matrix. So, how do you know that? When I look at the code for summary.lm I see the residual sum of squares being computed as r <- z$residuals rss <- sum(r^2) Also, your formula is not correct. I think you mean the trace of that matrix which is the squared length of Q'y and that's the same as the squared length of y. The matrix Q in a QR decomposition is orthogonal and one of the properties of orthogonal matrices is that they preserve lengths. So the length of Qy is the same as the length of Q'y is the same as the length of y. > Does anyone know where I can find a proof for this result? > > All Best and Happy New Year, > Ethan > > ? ? ? 
?[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From ata.sonu at gmail.com Mon Jan 3 16:14:08 2011 From: ata.sonu at gmail.com (ATANU) Date: Mon, 3 Jan 2011 07:14:08 -0800 (PST) Subject: [R] changing method of estimation in GLM In-Reply-To: <4D209141.9090403@statistik.tu-dortmund.de> References: <1293967955520-3170836.post@n4.nabble.com> <4D209141.9090403@statistik.tu-dortmund.de> Message-ID: <1294067648438-3172088.post@n4.nabble.com> can you please explain me this with an example(preferably NEWTON RAPHSON METHOD)? -- View this message in context: http://r.789695.n4.nabble.com/changing-method-of-estimation-in-GLM-tp3170836p3172088.html Sent from the R help mailing list archive at Nabble.com. From elciclopeturnio at gmail.com Mon Jan 3 14:32:16 2011 From: elciclopeturnio at gmail.com (elciclopeturnio) Date: Mon, 3 Jan 2011 10:32:16 -0300 Subject: [R] Greetings. I have a question with mixed beta regression model in nlme. Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From elciclopeturnio at gmail.com Mon Jan 3 14:40:10 2011 From: elciclopeturnio at gmail.com (elciclopeturnio) Date: Mon, 3 Jan 2011 10:40:10 -0300 Subject: [R] Greetings. I have a question with mixed beta regression model in nlme (corrected version). Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From r.aly at ewi.utwente.nl Mon Jan 3 15:02:21 2011 From: r.aly at ewi.utwente.nl (Robin Aly) Date: Mon, 03 Jan 2011 15:02:21 +0100 Subject: [R] Logistic Regression Fitting with EM-Algorithm Message-ID: <4D21D6ED.6030807@ewi.utwente.nl> Hi all, is there any package which can do an EM algorithm fitting of logistic regression coefficients given only the explanatory variables? I tried to realize this using the Design package, but I didn't find a way. Thanks a lot & Kind regards Robin Aly From ray.laymon064 at gmail.com Mon Jan 3 16:38:29 2011 From: ray.laymon064 at gmail.com (Ray Laymon) Date: Mon, 3 Jan 2011 10:38:29 -0500 Subject: [R] Formatted output with alternating format at different rows Message-ID: Dear all, I have a simple question. I couldn't find a solution in the forums/R-user manual; I have also asked to my friends who use R, but couldn't get any answer from them either. I would appreciate any solutions. I want to write formatted text file like in Fortran. More specifically with the format choice of mine for any given line (more specifics are given below). The R function read.fortran() read files like in Fortran, but I couldn't find any function to write files in a similar way.. Since I want different formats for different lines, write(), write.table(), write.fwf() functions didn't work for me. 
Thanks for your time Ray ########## ########## In Fortran ########## real dummy3(3) real dummy4(4) real dummy2(2) dummy3 = (/1.1, 2.2, 3.3/) dummy4 = (/4.4, 5.5, 6.6, 7.7/) dummy2 = (/8.8, 9.9/) OPEN(UNIT=320,FILE='output.dat',FORM='FORMATTED') write(320,501) dummy3 write(320,502) dummy4 write(320,503) dummy2 close(320) 501 format(F5.1,F6.2,F7.3) 502 format(F5.2,F6.3,F7.4,F8.5) 503 format(F5.3,F6.4) ########## ########## This should produce an output something like below (I used * instead of the spaces) **1.1**2.20***3.300 *4.40*5.500**6.6000*7.70000 8.8009.9000 And a function in [R] that will enable me to write something like above is what I am looking for.. ########## ########## ########## From vseabra at uol.com.br Mon Jan 3 14:25:30 2011 From: vseabra at uol.com.br (Victor F Seabra) Date: Mon, 3 Jan 2011 11:25:30 -0200 Subject: [R] Please, need help with a plot In-Reply-To: <5630327E-CC9A-4CF9-AFDE-E12360AB6CB6@comcast.net> References: <4d20b7104796e_63e055fae74160@weasel21.tmail> <5630327E-CC9A-4CF9-AFDE-E12360AB6CB6@comcast.net> Message-ID: <4d21ce4accc47_318bf4ee701af@weasel14.tmail> although the code somehow didn't work on my Vista / R 2.8, it did work perfectly on a XP machine / R 2.10 I've been trying to fix this for days, Thank you very much for your help! ______________________________________________________________________ 02/01/2011 19:30, David Winsemius < dwinsemius at comcast.net > wrote: On Jan 2, 2011, at 1:15 PM, Ben Bolker wrote: > This is a little bit more 'magic' than I would like, but seems > to work. Perhaps someone else can suggest a cleaner solution. 
Here's the best I could come up with but will admit that there were many
failed attempts before success:

expr.vec <- as.expression(parse(text=table1$var1))
plot(x=table1$var2 ,y=1:11, xlim=c(0,20), pch=20)
text(x=table1$var2, y=1:11, labels=expr.vec, pos=4)
title(x=15, y=5, expression("Yet another way to process strings with operators like '<='"))

(The title expression works on my machine, but perhaps not on the OP's
machine, given differences in encoding that have so far been exhibited.)

> ages <- gsub("[^0-9]+","",table1$var1)
> rel <- gsub("age\\s*([=<>]+)\\s*[0-9]+","\\1",table1$var1,perl=TRUE)
>
> with(table1,plot(var2,1:11,xlim=c(0,20),pch=20))
> invisible(with(table1,
> mapply(function(x,y,a,r) {
> text(x=x,y=y,
> switch(r,
> `<=`=bquote(age <= .(a)),
> `<`=bquote(age < .(a)),
> `>=`=bquote(age >= .(a))),
> pos=4)},
> var2,1:11,ages,rel)))
>

David Winsemius, MD
West Hartford, CT

From zhaoxing731 at yahoo.com.cn Mon Jan 3 15:57:14 2011
From: zhaoxing731 at yahoo.com.cn (zhaoxing731)
Date: Mon, 3 Jan 2011 22:57:14 +0800
Subject: [R] matrices call a function element-wise
Message-ID: <201101032257104769317@yahoo.com.cn>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From jdaily at usgs.gov Mon Jan 3 20:09:15 2011
From: jdaily at usgs.gov (Jonathan P Daily)
Date: Mon, 3 Jan 2011 14:09:15 -0500
Subject: [R] matrices call a function element-wise
In-Reply-To: <201101032257104769317@yahoo.com.cn>
References: <201101032257104769317@yahoo.com.cn>
Message-ID:

IIRC, R is perfectly able to call matrices as vectors, so you might be
able to do this:

FT <- function(i) fisher.test(matrix(c(A[i],B[i],C[i],D[i]),2))
E <- sapply(1:1000000, FT)

Though I don't know how much time you will save.

--------------------------------------
Jonathan P. Daily
Technician - USGS Leetown Science Center
11649 Leetown Road
Kearneysville WV, 25430
(304) 724-4480
"Is the room still a room when its empty? Does the room,
the thing itself have purpose? Or do we, what's the word... imbue it."
- Jubal Early, Firefly

r-help-bounces at r-project.org wrote on 01/03/2011 09:57:14 AM:

> [R] matrices call a function element-wise
>
> zhaoxing731
>
> to:
>
> R-help
>
> 01/03/2011 01:46 PM
>
> Sent by:
>
> r-help-bounces at r-project.org
>
> Hello
>
> I have 4 1000*1000 matrix A,B,C,D. I want to use the corresponding
> element of the 4 matrices. Using the "for loop" as follow:
>
> E<-o
> for (i in 1:1000)
> {for (j in 1:1000)
> {
> E<-fisher.test(matrix(c(A[i][j],B[i][j],C[i][j],D[i][j]),
> 2))#call fisher.test for every element
> }
> }
>
> It is so time-consuming
> Need vectorization
>
> Yours sincerely
>
>
>
>
> ZhaoXing
> Department of Health Statistics
> West China School of Public Health
> Sichuan University
> No.17 Section 3, South Renmin Road
> Chengdu, Sichuan 610041
> P.R.China
>
> [[alternative HTML version deleted]]
>
> > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From peter.langfelder at gmail.com Mon Jan 3 20:29:52 2011 From: peter.langfelder at gmail.com (Peter Langfelder) Date: Mon, 3 Jan 2011 11:29:52 -0800 Subject: [R] Problem loading Tcl/Tk (?) Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From anthony at resolution.com Mon Jan 3 20:59:29 2011 From: anthony at resolution.com (apresley) Date: Mon, 3 Jan 2011 11:59:29 -0800 (PST) Subject: [R] randomForest speed improvements Message-ID: <1294084769056-3172523.post@n4.nabble.com> Hi there, We're trying to use randomForest to do some predictions. The test-harness for our code is pretty straightforward: library ('randomForest'); data202 <- read.csv ("random.csv", header=TRUE); x<- data202[1:50000,1:6]; y<- data202[1:50000,8]; y<- y[,drop=TRUE]; x2 <- data202[50001:60000,1:6]; y2 <- data202[50001:60000,8]; y2 <- y2[,drop=TRUE]; RFobject <- randomForest(x,y,na.action=na.roughfix); p <- predict (RFobject, x2); In this case, the CSV contains 10 columns, of which 1-6 are numeric in nature (day of week, week of month, etc...) and column 8 is the target (sales, a numeric number). randomForest does fine with the data, our issue is how long it takes. In this case, about 5,000 rows of data seems to take just a few seconds, but going to 50,000 rows doesn't take 5x the time, it takes perhaps 30 or 40 minutes. We've downloaded and tried RT-Rank, which is a multi-threaded version of RandomForest, and this seems to produce the same (or slightly better) predictions, but also gets done fairly quickly. What can we do to improve the speed of this data computation? The system we're on is a dual quad-core Intel CPU @ 2.33Ghz, and with 16GB of RAM ... 
we're using the "stock" R RPM for CentOS 5.5. Thanks! -- Anthony -- View this message in context: http://r.789695.n4.nabble.com/randomForest-speed-improvements-tp3172523p3172523.html Sent from the R help mailing list archive at Nabble.com. From Louisa_Lafrez at hotmail.com Mon Jan 3 21:03:49 2011 From: Louisa_Lafrez at hotmail.com (Louisa) Date: Mon, 3 Jan 2011 12:03:49 -0800 (PST) Subject: [R] Inverse Gaussian Distribution Message-ID: <1294085029867-3172533.post@n4.nabble.com> Dear, I want to fit an inverse gaussion distribution to a data set. The predictor variables are gender, area and agecategory. For each of these variables I've defined a baseline e.g. #agecat: baseline is 3 data<-transform(data, agecat=C(factor(agecat,ordered=TRUE), contr.treatment(n=6,base=3))) The variable 'area' goes from A to F (6 areas: A,B,C,D,E,F) How can i manipulate the data to set the baseline of area to C? R is producing errors when I'm trying to do so. I'll be very thankful for any help you can provide. Louisa -- View this message in context: http://r.789695.n4.nabble.com/Inverse-Gaussian-Distribution-tp3172533p3172533.html Sent from the R help mailing list archive at Nabble.com. From jdaily at usgs.gov Mon Jan 3 21:10:55 2011 From: jdaily at usgs.gov (Jonathan P Daily) Date: Mon, 3 Jan 2011 15:10:55 -0500 Subject: [R] randomForest speed improvements In-Reply-To: <1294084769056-3172523.post@n4.nabble.com> References: <1294084769056-3172523.post@n4.nabble.com> Message-ID: Have you tried adjusting: mtry - the number of parameters to try per tree ntree - the number of trees grown keep.forest - logical on whether to store tree Specifically, I found huge improvements in speed by switching keep.forest to FALSE in the past when I didn't actually need the forest post analysis. -------------------------------------- Jonathan P. Daily Technician - USGS Leetown Science Center 11649 Leetown Road Kearneysville WV, 25430 (304) 724-4480 "Is the room still a room when its empty? 
Does the room, the thing itself have purpose? Or do we, what's the word... imbue it." - Jubal Early, Firefly r-help-bounces at r-project.org wrote on 01/03/2011 02:59:29 PM: > [image removed] > > [R] randomForest speed improvements > > apresley > > to: > > r-help > > 01/03/2011 03:03 PM > > Sent by: > > r-help-bounces at r-project.org > > > Hi there, > > We're trying to use randomForest to do some predictions. The test-harness > for our code is pretty straightforward: > > library ('randomForest'); > data202 <- read.csv ("random.csv", header=TRUE); > x<- data202[1:50000,1:6]; > y<- data202[1:50000,8]; > y<- y[,drop=TRUE]; > > x2 <- data202[50001:60000,1:6]; > y2 <- data202[50001:60000,8]; > y2 <- y2[,drop=TRUE]; > > RFobject <- randomForest(x,y,na.action=na.roughfix); > p <- predict (RFobject, x2); > > In this case, the CSV contains 10 columns, of which 1-6 are numeric in > nature (day of week, week of month, etc...) and column 8 is the target > (sales, a numeric number). > > randomForest does fine with the data, our issue is how long it takes. In > this case, about 5,000 rows of data seems to take just a few seconds, but > going to 50,000 rows doesn't take 5x the time, it takes perhaps 30 or 40 > minutes. > > We've downloaded and tried RT-Rank, which is a multi-threaded version of > RandomForest, and this seems to produce the same (or slightly better) > predictions, but also gets done fairly quickly. > > What can we do to improve the speed of this data computation? The system > we're on is a dual quad-core Intel CPU @ 2.33Ghz, and with 16GB of RAM ... > we're using the "stock" R RPM for CentOS 5.5. > > Thanks! > > -- > Anthony > -- > View this message in context: http://r.789695.n4.nabble.com/ > randomForest-speed-improvements-tp3172523p3172523.html > Sent from the R help mailing list archive at Nabble.com. 
> > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From dwinsemius at comcast.net Mon Jan 3 21:19:04 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Mon, 3 Jan 2011 15:19:04 -0500 Subject: [R] Inverse Gaussian Distribution In-Reply-To: <1294085029867-3172533.post@n4.nabble.com> References: <1294085029867-3172533.post@n4.nabble.com> Message-ID: On Jan 3, 2011, at 3:03 PM, Louisa wrote: > > Dear, > > I want to fit an inverse gaussion distribution to a data set. > > The predictor variables are gender, area and agecategory. > For each of these variables I've defined a baseline > > e.g. > #agecat: baseline is 3 > data<-transform(data, agecat=C(factor(agecat,ordered=TRUE), > contr.treatment(n=6,base=3))) > > The variable 'area' goes from A to F (6 areas: > > How can i manipulate the data to set the baseline of area to C? > R is producing errors when I'm trying to do so. In all likelihood it's a factor. Try area <- factor(area, levels=c("C", "A","B","D","E","F") ) If not, then you need to provide more information. Read the Posting Guide. > > I'll be very thankful for any help you can provide. > > Louisa > -- > View this message in context: http://r.789695.n4.nabble.com/Inverse-Gaussian-Distribution-tp3172533p3172533.html > Sent from the R help mailing list archive at Nabble.com. 
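A small sketch of the releveling suggested above, using a made-up area vector (hypothetical data, not from the thread). relevel() from the stats package is not mentioned in the thread, but it should give the same level order as spelling the levels out by hand, and treatment contrasts then use the first level as the baseline.

```r
# Hypothetical stand-in for the 'area' column, with levels A to F.
area <- factor(c("A", "B", "C", "D", "E", "F", "C", "A"))

# Option 1: list the levels explicitly, baseline first.
area_manual <- factor(area, levels = c("C", "A", "B", "D", "E", "F"))

# Option 2: move a single level to the front with relevel().
area_c <- relevel(area, ref = "C")

# Both put "C" first; the data values themselves are unchanged.
levels(area_c)
```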
>

David Winsemius, MD
West Hartford, CT

From jholtman at gmail.com Mon Jan 3 21:40:25 2011
From: jholtman at gmail.com (jim holtman)
Date: Mon, 3 Jan 2011 15:40:25 -0500
Subject: [R] Formatted output with alternating format at different rows
In-Reply-To:
References:
Message-ID:

'sprintf' is your friend:

> dummy3 = c(1.1, 2.2, 3.3)
> dummy4 = c(4.4, 5.5, 6.6, 7.7)
> dummy2 = c(8.8, 9.9)
>
> cat(sprintf("%5.1f%6.2f%7.3f\n", dummy3[1], dummy3[2], dummy3[3]))
  1.1  2.20  3.300
> cat(sprintf("%5.2f%6.3f%7.4f%8.5f\n", dummy4[1], dummy4[2], dummy4[3], dummy4[4]))
 4.40 5.500 6.6000 7.70000
> cat(sprintf("%5.3f%6.4f\n", dummy2[1], dummy2[2]))
8.8009.9000
>

On Mon, Jan 3, 2011 at 10:38 AM, Ray Laymon wrote:
> Dear all,
>
> I have a simple question. I couldn't find a solution in the
> forums/R-user manual; I have also asked to my friends who use R, but
> couldn't get any answer from them either. I would appreciate any
> solutions.
>
> I want to write formatted text file like in Fortran. More specifically
> with the format choice of mine for any given line (more specifics are
> given below).
>
> The R function read.fortran() read files like in Fortran, but I
> couldn't find any function to write files in a similar way..
>
> Since I want different formats for different lines, write(),
> write.table(), write.fwf() functions didn't work for me.
> > Thanks for your time > > Ray > > > > > ########## > ########## > In Fortran > ########## > > real dummy3(3) > real dummy4(4) > real dummy2(2) > > dummy3 = (/1.1, 2.2, 3.3/) > dummy4 = (/4.4, 5.5, 6.6, 7.7/) > dummy2 = (/8.8, 9.9/) > > OPEN(UNIT=320,FILE='output.dat',FORM='FORMATTED') > write(320,501) dummy3 > write(320,502) dummy4 > write(320,503) dummy2 > close(320) > > 501 format(F5.1,F6.2,F7.3) > 502 format(F5.2,F6.3,F7.4,F8.5) > 503 format(F5.3,F6.4) > ########## > ########## > > This should produce an output something like below (I used * instead > of the spaces) > > **1.1**2.20***3.300 > *4.40*5.500**6.6000*7.70000 > 8.8009.9000 > > And a function in [R] that will enable me to write something like > above is what I am looking for.. > ########## > ########## > ########## > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? From eduardo.oliveirahorta at gmail.com Mon Jan 3 21:58:04 2011 From: eduardo.oliveirahorta at gmail.com (Eduardo de Oliveira Horta) Date: Mon, 3 Jan 2011 18:58:04 -0200 Subject: [R] Saving objects inside a list In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From ggrothendieck at gmail.com Mon Jan 3 22:02:41 2011 From: ggrothendieck at gmail.com (Gabor Grothendieck) Date: Mon, 3 Jan 2011 16:02:41 -0500 Subject: [R] Saving objects inside a list In-Reply-To: References: Message-ID: On Mon, Jan 3, 2011 at 3:58 PM, Eduardo de Oliveira Horta wrote: > sapply(ls(),get) works fine. Thanks. > > ps: the as.list and the eapply suggestions didn't work. > They work for me. 
Starting in a fresh session:

> x <- 1; f <- function(x) x; DF <- data.frame(a = 1:3)
> as.list(.GlobalEnv)
$DF
  a
1 1
2 2
3 3

$f
function (x)
x

$x
[1] 1

> eapply(.GlobalEnv, identity)
$DF
  a
1 1
2 2
3 3

$f
function (x)
x

$x
[1] 1


--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

From wdunlap at tibco.com Mon Jan 3 22:44:44 2011
From: wdunlap at tibco.com (William Dunlap)
Date: Mon, 3 Jan 2011 13:44:44 -0800
Subject: [R] Saving objects inside a list
In-Reply-To:
References:
Message-ID: <77EB52C6DD32BA4D87471DCD70C8D70003C2DEF9@NA-PA-VBE03.na.tibco.com>

> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Eduardo de
> Oliveira Horta
> Sent: Monday, January 03, 2011 12:58 PM
> To: r-help
> Subject: Re: [R] Saving objects inside a list
>
> sapply(ls(),get) works fine. Thanks.
>
> ps: the as.list and the eapply suggestions didn't work.

What version of R are you using? In 2.12.0 I get
  > x <- 1:7
  > y <- function(...)list(...)
  > z <- runif(3) # creates .Random.seed also
  > str(as.list(.GlobalEnv)) # add all.names=TRUE to get .Random.seed
  List of 3
   $ z: num [1:3] 0.698 0.475 0.354
   $ y:function (...)
    ..- attr(*, "source")= chr "function(...)list(...)"
   $ x: int [1:7] 1 2 3 4 5 6 7
  > str(eapply(.GlobalEnv, function(x)x)) # all.names=TRUE ok here also
  List of 3
   $ z: num [1:3] 0.698 0.475 0.354
   $ y:function (...)
    ..- attr(*, "source")= chr "function(...)list(...)"
   $ x: int [1:7] 1 2 3 4 5 6 7
Use environment() instead of .GlobalEnv if you want to
process the current environment.

In any version of R sapply() can give incorrect results,
as in (in a fresh vanilla R session):
  > x <- 1:4
  > y <- 11:14
  > z <- 21:24
  > str(sapply(ls(), get))
   int [1:4, 1:3] 1 2 3 4 11 12 13 14 21 22 ...
   - attr(*, "dimnames")=List of 2
    ..$ : NULL
    ..$ : chr [1:3] "x" "y" "z"
or, in another fresh session
  > FUN <- "a string named FUN"
  > X <- 17
  > sapply(ls(), get)
  $FUN
  function (x, pos = -1, envir = as.environment(pos), mode = "any",
      inherits = TRUE)
  .Internal(get(x, envir, mode, inherits))

  $X
  [1] "FUN" "X"

as.list(envir) and eapply(envir, function(x)x) do not
have problems in those cases.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

>
> On Mon, Jan 3, 2011 at 3:56 PM, Jorge Ivan Velez
> wrote:
>
> > Hi Eduardo,
> >
> > Try
> >
> > r <- ls()
> > result <- sapply(r, get)
> > result
> >
> > HTH,
> > Jorge
> >
> >
> > On Mon, Jan 3, 2011 at 12:25 PM, Eduardo de Oliveira Horta <> wrote:
> >
> >> Hello there,
> >>
> >> any ideas on how to save all the objects on my workspace
> inside a list
> >> object?
> >>
> >> For example, say my workspace is as follows
> >> ls()
> >> [1] "x" "y" "z"
> >>
> >> and suppose I want to put these objects inside a list object, say
> >>
> >> object.list <- list()
> >>
> >> without having to explicitly write down their names as in
> >>
> >> object.list$x = x
> >> object.list$y = y
> >> object.list$z = z
> >>
> >> Is this possible?
> >>
> >> Thanks in advance,
> >>
> >> Eduardo Horta
> >>
> >> [[alternative HTML version deleted]]
> >>
> >> ______________________________________________
> >> R-help at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
> >
> > [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
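
A minimal sketch of the pitfall discussed in this thread, written against a scratch environment so that it does not depend on what happens to be in the workspace. The name `e` and the objects placed in it are assumptions for illustration only; mget() is a further base R option that was not mentioned in the thread.

```r
# Hypothetical scratch environment standing in for the workspace.
e <- new.env()
assign("x", 1:4, envir = e)
assign("y", 11:14, envir = e)
assign("z", 21:24, envir = e)

# sapply() simplifies equal-length results into a matrix, not a list.
simplified <- sapply(ls(e), get, envir = e)
is.matrix(simplified)

# as.list() on the environment keeps one list element per object.
# Its element order is not guaranteed, so sort by name before comparing.
objs <- as.list(e)
objs <- objs[order(names(objs))]
str(objs)

# mget() always returns a named list, in the order of the names supplied.
objs2 <- mget(ls(e), envir = e)
names(objs2)
```

Passing `envir = e` explicitly to get() also sidesteps the FUN/X lookup problem shown above, since the lookup no longer starts inside sapply()'s own frame.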
> From f.harrell at vanderbilt.edu Mon Jan 3 22:45:45 2011 From: f.harrell at vanderbilt.edu (Frank Harrell) Date: Mon, 3 Jan 2011 13:45:45 -0800 (PST) Subject: [R] packagename:::functionname vs. importFrom Message-ID: <1294091145769-3172684.post@n4.nabble.com> In my rms package I use the packagename:::functionname construct in a number of places. If I instead use the importFrom declaration in the NAMESPACE file would that require the package to be available, and does it load the package when my package loads? If so I would keep using packagename::: to avoid up-front loading of other packages that are not always used. Thanks Frank ----- Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/packagename-functionname-vs-importFrom-tp3172684p3172684.html Sent from the R help mailing list archive at Nabble.com. From simone.gabbriellini at gmail.com Mon Jan 3 23:06:36 2011 From: simone.gabbriellini at gmail.com (Simone Gabbriellini) Date: Mon, 3 Jan 2011 23:06:36 +0100 Subject: [R] how to invert the axes in the wireframe() plot Message-ID: <328DCFAF-45E4-4150-B20F-81B8B593D365@gmail.com> Dear List, I am using the wireframe function in the lattice package, and I am wondering if it is possible to invert the default axes orientation for x and y axes... what parameter should I look for? Best regards, Simone Gabbriellini From statmailinglists at googlemail.com Mon Jan 3 23:12:35 2011 From: statmailinglists at googlemail.com (Paolo Rossi) Date: Mon, 3 Jan 2011 22:12:35 +0000 Subject: [R] ARIMA simulation including a constant In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From aquanyc at gmail.com Mon Jan 3 20:24:24 2011 From: aquanyc at gmail.com (rivercode) Date: Mon, 3 Jan 2011 11:24:24 -0800 (PST) Subject: [R] Regex to remove last character Message-ID: <1294082664889-3172466.post@n4.nabble.com> Hi, Have been having trouble trying to figure out the right regex parameters to remove the last "." in timestamp with the following format: Convert 09:30:00.377.853 to 09:30:00.377853 Thanks, Chris -- View this message in context: http://r.789695.n4.nabble.com/Regex-to-remove-last-character-tp3172466p3172466.html Sent from the R help mailing list archive at Nabble.com. From bbolker at gmail.com Mon Jan 3 21:01:33 2011 From: bbolker at gmail.com (Ben Bolker) Date: Mon, 3 Jan 2011 20:01:33 +0000 (UTC) Subject: [R] =?utf-8?q?Greetings=2E_I_have_a_question_with_mixed_beta_regr?= =?utf-8?q?ession_model=09in_nlme=2E?= References: Message-ID: elciclopeturnio gmail.com> writes: > > *Dear R-help: > > My name is Rodrigo and I have a question with nlme package > in R to fit a mixed beta regression model. The details of the model are: > [snip] > *The question is: > How can I use nlme package in R to fit this model? > If you want to know additional information, send me a mail, please. nlme only fits models with normally distributed residual error (and normally distributed random effects). lmer (in the lme4) package fits models with exponential-family individual effects (i.e. GLMMs), but that doesn't include beta I don't know offhand of any package in R proper that will fit this model, although you could try Jim Lindsey's 'repeated' package or one of the HGLM packages. There are R interfaces to WinBUGS and AD Model Builder (possibly among others). 
good luck Ben Bolker From gajahorvat at hotmail.com Mon Jan 3 21:26:39 2011 From: gajahorvat at hotmail.com (gaja) Date: Mon, 3 Jan 2011 12:26:39 -0800 (PST) Subject: [R] Problem with uploading library In-Reply-To: <1293968955038-3170840.post@n4.nabble.com> References: <1293917146033-3170455.post@n4.nabble.com> <4D1FFFC5.3030504@bitwrit.com.au> <1293968955038-3170840.post@n4.nabble.com> Message-ID: <1294086399024-3172572.post@n4.nabble.com> Woho. I did it... Thanx guys ;) -- View this message in context: http://r.789695.n4.nabble.com/Problem-with-uploading-library-tp3170455p3172572.html Sent from the R help mailing list archive at Nabble.com. From solymos at ualberta.ca Mon Jan 3 18:05:20 2011 From: solymos at ualberta.ca (Peter Solymos) Date: Mon, 3 Jan 2011 10:05:20 -0700 Subject: [R] [R-pkgs] dclone 1.3-0 Message-ID: Dear R Community, I am happy to introduce the latest version 1.3-0 of the 'dclone' R package. The package provides low level functions for implementing maximum likelihood estimating procedures for complex models using data cloning and Bayesian Markov chain Monte Carlo methods with support for JAGS, WinBUGS and OpenBUGS. Data cloning is a global optimization approach and a variant of simulated annealing which exploits Bayesian MCMC tools to get maximum likelihood point estimates and corresponding standard errors (see Lele et al. 2007, Ecology Letters, 10:551-563). The implementation used in the 'dclone' package is described in the recent paper: Solymos, P. 2010. dclone: Data Cloning in R. The R Journal, 2(2):29-37. URL: http://journal.r-project.org/archive/2010-2/RJournal_2010-2_Solymos.pdf The current release of 'dclone' supports parallel computations via the 'snow' package. 
Have fun, Peter Peter Solymos Alberta Biodiversity Monitoring Institute and Boreal Avian Modelling project Department of Biological Sciences CW 405, Biological Sciences Bldg University of Alberta Edmonton, Alberta, T6G 2E9, Canada Phone: 780.492.8534 Fax: 780.492.7635 email <- paste("solymos", "ualberta.ca", sep = "@") http://www.abmi.ca http://www.borealbirds.ca http://sites.google.com/site/psolymos -- Main functions in the 'dclone' package include: * dclone, dcdim: cloning R objects in various ways. * jags.fit, bugs.fit: conveniently fit BUGS models. (jags.parfit fits chains on parallel workers for JAGS.) * dc.fit: iterative model fitting by the data cloning algorithm. (dc.parfit is the parallelized version.) * dctable, dcdiag: helps evaluating data cloning convergence by descriptive statistics and diagnostic tools. (These are based on e.g. chisq.diag and lambdamax.diag.) * coef.mcmc.list, confint.mcmc.list.dc, dcsd.mcmc.list, quantile.mcmc.list, vcov.mcmc.list.dc, mcmcapply: convenient functions for mcmc.list objects. * write.jags.model, clean.jags.model, custommodel: convenient functions for handling BUGS models. _______________________________________________ R-packages mailing list R-packages at r-project.org https://stat.ethz.ch/mailman/listinfo/r-packages From davidD at qimr.edu.au Mon Jan 3 23:33:21 2011 From: davidD at qimr.edu.au (David Duffy) Date: Tue, 4 Jan 2011 08:33:21 +1000 (EST) Subject: [R] iPhone 3G App For R? In-Reply-To: References: Message-ID: > Roy Mendelssohn wrote: >> On Jan 2, 2011, at 6:28 PM, Shige Song wrote: >> Does iphone even support the GNU tool chain? > Check the archive for r-sig-mac > (https://stat.ethz.ch/pipermail/r-sig-mac/). There has been extensive > discussion about this. If memory serves (and it rarely does anymore :-) > ) the issue is mainly licensing, there already exists the ability to > compile to the ARM processor. It has been successfully packaged on Nokia ARM phones (maemo). 
Licensing would be the problem - I believe you could make your own private
version using the available gnu toolchain, though I can't see if gfortran
is available.

Cheers, David Duffy.

From dwinsemius at comcast.net Mon Jan 3 23:44:07 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Mon, 3 Jan 2011 17:44:07 -0500
Subject: [R] Regex to remove last character
In-Reply-To: <1294082664889-3172466.post@n4.nabble.com>
References: <1294082664889-3172466.post@n4.nabble.com>
Message-ID: <3E907E85-072B-4882-946B-ECCBB1940BB2@comcast.net>

On Jan 3, 2011, at 2:24 PM, rivercode wrote:

>
> Hi,
>
> Have been having trouble trying to figure out the right regex
> parameters to
> remove the last "." in timestamp with the following format:
>
> Convert 09:30:00.377.853 to 09:30:00.377853

gsub("()(\\.)(.{3}$)", "\\1\\3" , "09:30:00.377.853")
[1] "09:30:00.377853"

>
> --
> View this message in context: http://r.789695.n4.nabble.com/Regex-to-remove-last-character-tp3172466p3172466.html

==

David Winsemius, MD
West Hartford, CT

From dwinsemius at comcast.net Mon Jan 3 23:45:37 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Mon, 3 Jan 2011 17:45:37 -0500
Subject: [R] how to invert the axes in the wireframe() plot
In-Reply-To: <328DCFAF-45E4-4150-B20F-81B8B593D365@gmail.com>
References: <328DCFAF-45E4-4150-B20F-81B8B593D365@gmail.com>
Message-ID: <65B6F1AE-FA91-4004-A8E9-589E712DC0B5@comcast.net>

On Jan 3, 2011, at 5:06 PM, Simone Gabbriellini wrote:

> Dear List,
>
> I am using the wireframe function in the lattice package, and I am
> wondering if it is possible to invert the default axes orientation
> for x and y axes... what parameter should I look for?

Perhaps you should define "invert".
> -- David Winsemius, MD West Hartford, CT From anthony at resolution.com Tue Jan 4 00:28:19 2011 From: anthony at resolution.com (apresley) Date: Mon, 3 Jan 2011 15:28:19 -0800 (PST) Subject: [R] randomForest speed improvements In-Reply-To: References: <1294084769056-3172523.post@n4.nabble.com> Message-ID: <1294097299183-3172834.post@n4.nabble.com> I haven't tried changing the mtry or ntree at all ... though I suppose with only 6 variables, and tens-of-thousands of rows, we can probably do less than 500 tree's (the default?). Although tossing the forest does speed things up a bit, seems to be about 15 - 20% faster in some cases, I need to keep the forest to do the prediction, otherwise, it complains that there is no forest component in the object. -- Anthony -- View this message in context: http://r.789695.n4.nabble.com/randomForest-speed-improvements-tp3172523p3172834.html Sent from the R help mailing list archive at Nabble.com. From djmuser at gmail.com Tue Jan 4 00:31:46 2011 From: djmuser at gmail.com (Dennis Murphy) Date: Mon, 3 Jan 2011 15:31:46 -0800 Subject: [R] Inverse Gaussian Distribution In-Reply-To: <1294085029867-3172533.post@n4.nabble.com> References: <1294085029867-3172533.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From wwwhsd at gmail.com Tue Jan 4 00:35:00 2011 From: wwwhsd at gmail.com (Henrique Dallazuanna) Date: Mon, 3 Jan 2011 21:35:00 -0200 Subject: [R] Regex to remove last character In-Reply-To: <1294082664889-3172466.post@n4.nabble.com> References: <1294082664889-3172466.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From ted.harding at wlandres.net Tue Jan 4 00:36:59 2011 From: ted.harding at wlandres.net ( (Ted Harding)) Date: Mon, 03 Jan 2011 23:36:59 -0000 (GMT) Subject: [R] Logistic Regression Fitting with EM-Algorithm In-Reply-To: <4D21D6ED.6030807@ewi.utwente.nl> Message-ID: On 03-Jan-11 14:02:21, Robin Aly wrote: > Hi all, > is there any package which can do an EM algorithm fitting of > logistic regression coefficients given only the explanatory > variables? I tried to realize this using the Design package, > but I didn't find a way. > > Thanks a lot & Kind regards > Robin Aly As written, this is a strange question! You imply that you do not have data on the response (0/1) variable at all, only on the explanatory variables. In that case there is no possible estimate, because that would require data on at least some of the values of the response variable. I think you should explain more clearly and explicitly what the information is that you have for all the variables. Ted. -------------------------------------------------------------------- E-Mail: (Ted Harding) Fax-to-email: +44 (0)870 094 0861 Date: 03-Jan-11 Time: 23:36:56 ------------------------------ XFMail ------------------------------ From hadley at rice.edu Tue Jan 4 00:48:27 2011 From: hadley at rice.edu (Hadley Wickham) Date: Mon, 3 Jan 2011 23:48:27 +0000 Subject: [R] packagename:::functionname vs. importFrom In-Reply-To: <1294091145769-3172684.post@n4.nabble.com> References: <1294091145769-3172684.post@n4.nabble.com> Message-ID: Hi Frank, I think you mean packagename::functionname? The three colon form is for accessing non-exported objects. Otherwise, I think using :: vs importFrom is functionally identical - either approach delays package loading until necessary. Hadley On Mon, Jan 3, 2011 at 9:45 PM, Frank Harrell wrote: > > In my rms package I use the packagename:::functionname construct in a number > of places. 
If I instead use the importFrom declaration in the NAMESPACE
> file would that require the package to be available, and does it load the
> package when my package loads?  If so I would keep using packagename::: to
> avoid up-front loading of other packages that are not always used.
>
> Thanks
> Frank
>
>
> -----
> Frank Harrell
> Department of Biostatistics, Vanderbilt University
> --
> View this message in context: http://r.789695.n4.nabble.com/packagename-functionname-vs-importFrom-tp3172684p3172684.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

From benjamin.ward at bathspa.org Tue Jan 4 01:03:40 2011
From: benjamin.ward at bathspa.org (Ben Ward)
Date: Tue, 4 Jan 2011 00:03:40 +0000
Subject: [R] Resampling to find Confidence intervals
Message-ID:

Hi,

I'm doing some modelling (lm) for my 3rd year dissertation and I want to
do some resampling, especially as I'm working with microbes, getting them
to evolve resistance to antimicrobial compounds, and after each exposure
I'm measuring the minimum concentration required to kill them (which I'm
expecting to rise over time, or exposures). I have 5 lineages per cleaner,
and I'm using 2 cleaners (of different chemical origin, and it's these two
different origins I'm interested in, or rather, any differences in
concentration results between them). So the amount of data I get is small,
hence my desire to resample. But that's not so important.

I have used help from Kaplan's book, Statistical Modelling: A Fresh
Approach, to write the following code for my project:

samps = do(500)* coef(lm(MIC.
~ 1 + Challenge + Cleaner + Replicate, data=resample(ecoli)))
sd(samps)

But the "resample" and "do" operators are functions specific to a
workspace that comes with the book, not a normal R setup. So I was
thinking of ways I could achieve the same result, or sort of result
because the resample should be different each time. I think the following
would work to the same effect:

resampled_ecoli = sample(ecoli, 500, replace=T)
coefs = (coef(lm(MIC. ~ 1 + Challenge + Cleaner + Replicate,
data=resampled_ecoli)))
sd(coefs)

And then I can work out confidence intervals by multiplying the standard
errors by 2. Although I'm not used to doing this sort of operation in R so
I don't want to do the wrong thing. If anyone could tell me if that would
work or what I need to do instead I'd be eternally grateful.

Thanks,
Ben Ward.

From simone.gabbriellini at gmail.com Tue Jan 4 01:07:05 2011
From: simone.gabbriellini at gmail.com (Simone Gabbriellini)
Date: Tue, 4 Jan 2011 01:07:05 +0100
Subject: [R] how to invert the axes in the wireframe() plot
In-Reply-To: <65B6F1AE-FA91-4004-A8E9-589E712DC0B5@comcast.net>
References: <328DCFAF-45E4-4150-B20F-81B8B593D365@gmail.com> <65B6F1AE-FA91-4004-A8E9-589E712DC0B5@comcast.net>
Message-ID:

I am trying to reproduce this graph:

http://www.digitaldust.it/materiali/their.png

the default axes orientation of wireframe gives me this:

http://www.digitaldust.it/materiali/mine.pdf

I am trying to understand if I can reproduce the axes orientation of the
first figure.

many thanks,
simone

On 03/Jan/2011, at 23.45, David Winsemius wrote:

>
> On Jan 3, 2011, at 5:06 PM, Simone Gabbriellini wrote:
>
>> Dear List,
>>
>> I am using the wireframe function in the lattice package, and I am wondering if it is possible to invert the default axes orientation for x and y axes... what parameter should I look for?
>
> Perhaps you should define "invert".
>> > > > -- > > David Winsemius, MD > West Hartford, CT > From carl at witthoft.com Tue Jan 4 01:17:50 2011 From: carl at witthoft.com (Carl Witthoft) Date: Mon, 03 Jan 2011 19:17:50 -0500 Subject: [R] function masking and gmp questions Message-ID: <4D22672E.7020203@witthoft.com> Hi, Here's the problem I ran into: the gmp package has a method for apply() so it masks the base::apply function. With gmp installed, I tried to run the function turnpoints() from the pastecs package. It fails because it calls apply() internally, like this: apply(mymatrix,1,max,na.rm=TRUE) , but the code in the gmp package which sets up the operator overload for apply() strictly limits the arguments to the first three (a matrix, a dimension, and a function). I get, no surprise: Rgames> xs<-sin(seq(1,100)/10) Rgames> turnpoints(xs) Error in apply(ex, 1, max, na.rm = TRUE) : unused argument(s) (na.rm = TRUE) I'm assuming this is a bug in gmp code and will ask the owner of that package about it. But in the meantime, is there some way to force a function to search for functions in a different namespace, or at least to search with packages set in a different order? That is, in this example, to make turnpoints() look to package base before looking at gmp? Thanks for your help and corrections to any of my assumptions and conclusions here. 
Carl From dwinsemius at comcast.net Tue Jan 4 01:26:00 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Mon, 3 Jan 2011 19:26:00 -0500 Subject: [R] how to invert the axes in the wireframe() plot In-Reply-To: References: <328DCFAF-45E4-4150-B20F-81B8B593D365@gmail.com> <65B6F1AE-FA91-4004-A8E9-589E712DC0B5@comcast.net> Message-ID: <94DA6E78-3F6D-4FF9-AAD7-2E439D2F96D3@comcast.net> On Jan 3, 2011, at 7:07 PM, Simone Gabbriellini wrote: > I am trying to reproduce this graph: > > http://www.digitaldust.it/materiali/their.png > > the default axes orientation of wireframe gives me this: > > http://www.digitaldust.it/materiali/mine.pdf > > I am trying to understand if I can reproduce the axes orientation of > the first figure. Well, there is more to reproducing that figure than just rotating around the z axis by 180, which is what I now take your question to be: ?panel.3dwire Run the 2nd example in ?wireframe and then compare to this minor modification of the screen arguments: wireframe(z ~ x * y, data = g, groups = gr, scales = list(arrows = FALSE), drape = TRUE, colorkey = TRUE, screen = list(z = 30+180, x = -60), ) (Rotated 180 around the z-axis.) -- Best; David. > > many thanks, > simone > > Il giorno 03/gen/2011, alle ore 23.45, David Winsemius ha scritto: > >> >> On Jan 3, 2011, at 5:06 PM, Simone Gabbriellini wrote: >> >>> Dear List, >>> >>> I am using the wireframe function in the lattice package, and I am >>> wondering if it is possible to invert the default axes orientation >>> for x and y axes... what parameter should I look for? >> >> Perhaps you should define "invert". 
>>> >> >> >> -- >> >> David Winsemius, MD >> West Hartford, CT >> > David Winsemius, MD West Hartford, CT From dwinsemius at comcast.net Tue Jan 4 01:28:28 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Mon, 3 Jan 2011 19:28:28 -0500 Subject: [R] function masking and gmp questions In-Reply-To: <4D22672E.7020203@witthoft.com> References: <4D22672E.7020203@witthoft.com> Message-ID: <7AB57E7E-11C4-4378-AAA3-CF6ECB2A5528@comcast.net> On Jan 3, 2011, at 7:17 PM, Carl Witthoft wrote: > Hi, > Here's the problem I ran into: the gmp package has a method for > apply() so it masks the base::apply function. With gmp installed, I > tried to run the function turnpoints() from the pastecs package. It > fails because it calls apply() internally, like this: > > apply(mymatrix,1,max,na.rm=TRUE) > , > but the code in the gmp package which sets up the operator overload > for apply() strictly limits the arguments to the first three (a > matrix, a dimension, and a function). I get, no surprise: > > Rgames> xs<-sin(seq(1,100)/10) > Rgames> turnpoints(xs) > Error in apply(ex, 1, max, na.rm = TRUE) : > unused argument(s) (na.rm = TRUE) > > I'm assuming this is a bug in gmp code and will ask the owner of > that package about it. > > But in the meantime, is there some way to force a function to search > for functions in a different namespace, or at least to search with > packages set in a different order? That is, in this example, to > make turnpoints() look to package base before looking at gmp? In general the strategy is to use the "::" operator. Try : base::apply(mymatrix,1,max,na.rm=TRUE) (Untested in absence of example.) > > Thanks for your help and corrections to any of my assumptions and > conclusions here. > -- David Winsemius, MD West Hartford, CT From peter.langfelder at gmail.com Tue Jan 4 01:33:31 2011 From: peter.langfelder at gmail.com (Peter Langfelder) Date: Mon, 3 Jan 2011 16:33:31 -0800 Subject: [R] packagename:::functionname vs. 
importFrom
In-Reply-To:
References: <1294091145769-3172684.post@n4.nabble.com>
Message-ID:

On Mon, Jan 3, 2011 at 3:48 PM, Hadley Wickham wrote:
> Hi Frank,
>
> I think you mean packagename::functionname?  The three colon form is
> for accessing non-exported objects.

Normally two colons suffice, but within a package you need three to
access exported but un-imported objects :)

Peter

From Jens.Rogmann at uni-hamburg.de Mon Jan 3 19:18:22 2011
From: Jens.Rogmann at uni-hamburg.de (Jens Rogmann (ABK/SK-Zentrum EPB Uni HH))
Date: Mon, 03 Jan 2011 19:18:22 +0100
Subject: [R] [R-pkgs] orddom package for Ordinal Dominance Statistics
Message-ID: <4D2212EE.5070404@uni-hamburg.de>

Dear R-users,

a new R package has been released named orddom.

This package provides ordinal estimates as alternatives to independent
or paired group mean comparisons, especially for Cliff's delta
statistics. It provides basic parameters for various robust tests of
stochastic equality with ordinally scaled variables.

For two sets of data, ordinal comparison estimates are calculated such as
- Cliff's delta (for which a Cohen's d effect size estimate, along with
  CIs are also calculated), (cf. Cliff, 1993; 1996; Long, Feng, & Cliff,
  2003; Feng & Cliff, 2004; Feng, 2007);
- the Common Language CL effect size or Probability of Superiority (PS)
  (Grissom, 1994; Grissom & Kim, 2005) estimate; and
- estimates of Vargha and Delaney's A as stochastic superiority (Vargha
  & Delaney, 1998, 2000; Delaney & Vargha, 2002).

References and details can be found in the package help manual.

The package is available as of now via
http://cran.r-project.org/web/packages/orddom/index.html

Your comments and suggestions are appreciated; please send your email to
jens.rogmann _AT_ uni-hamburg.de

Thank you,
Kind regards
Jens J. Rogmann

---------------
Dr. J. J.
Rogmann
Universität Hamburg
Fakultaet EPB / ABK
Dept of Psychology
Von-Melle-Park 5/IV
D-20146 Hamburg
Germany

_______________________________________________
R-packages mailing list
R-packages at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages

From stvienna at gmail.com Tue Jan 4 01:44:03 2011
From: stvienna at gmail.com (stvienna wiener)
Date: Tue, 4 Jan 2011 01:44:03 +0100
Subject: [R] unicode variable and function names?
Message-ID:

Dear List,

Is it possible to have function names like ∀ (unicode universal quantifier)?
This question is inspired by agda source code, which supports this.

http://www.cs.nott.ac.uk/~nad/listings/lib-0.4/Algebra.html

It would be handy to use. My guess is, however, that it's not supported in R.

Regards,
Steve

From carl at witthoft.com Tue Jan 4 01:44:15 2011
From: carl at witthoft.com (Carl Witthoft)
Date: Mon, 03 Jan 2011 19:44:15 -0500
Subject: [R] function masking and gmp questions
In-Reply-To: <7AB57E7E-11C4-4378-AAA3-CF6ECB2A5528@comcast.net>
References: <4D22672E.7020203@witthoft.com> <7AB57E7E-11C4-4378-AAA3-CF6ECB2A5528@comcast.net>
Message-ID: <4D226D5F.9060407@witthoft.com>

The problem is that I can't 'insert' the :: operator inside an existing
function. (Well, I could, but that would mean rewriting the function)

I was hoping for a way to call the function, in this case turnpoints()
in some way that told turnpoints itself to look for base::apply

Carl


On 1/3/11 7:28 PM, David Winsemius wrote:
>
> On Jan 3, 2011, at 7:17 PM, Carl Witthoft wrote:
>
>> Hi,
>> Here's the problem I ran into: the gmp package has a method for
>> apply() so it masks the base::apply function. With gmp installed, I
>> tried to run the function turnpoints() from the pastecs package.
It
>> fails because it calls apply() internally, like this:
>>
>> apply(mymatrix,1,max,na.rm=TRUE)
>> ,
>> but the code in the gmp package which sets up the operator overload
>> for apply() strictly limits the arguments to the first three (a
>> matrix, a dimension, and a function). I get, no surprise:
>>
>> Rgames> xs<-sin(seq(1,100)/10)
>> Rgames> turnpoints(xs)
>> Error in apply(ex, 1, max, na.rm = TRUE) :
>> unused argument(s) (na.rm = TRUE)
>>
>> I'm assuming this is a bug in gmp code and will ask the owner of that
>> package about it.
>>
>> But in the meantime, is there some way to force a function to search
>> for functions in a different namespace, or at least to search with
>> packages set in a different order? That is, in this example, to make
>> turnpoints() look to package base before looking at gmp?
>
> In general the strategy is to use the "::" operator. Try :
>
> base::apply(mymatrix,1,max,na.rm=TRUE)
>
> (Untested in absence of example.)
>
>>
>> Thanks for your help and corrections to any of my assumptions and
>> conclusions here.
>>

From hadley at rice.edu Tue Jan 4 01:48:20 2011
From: hadley at rice.edu (Hadley Wickham)
Date: Tue, 4 Jan 2011 00:48:20 +0000
Subject: [R] packagename:::functionname vs. importFrom
In-Reply-To:
References: <1294091145769-3172684.post@n4.nabble.com>
Message-ID:

>> I think you mean packagename::functionname?  The three colon form is
>> for accessing non-exported objects.
>
> Normally two colons suffice, but within a package you need three to
> access exported but un-imported objects :)

Are you sure?

     Note that it is typically a design mistake to use ‘:::’ in your
     code since the corresponding object has probably been kept
     internal for a good reason.  Consider contacting the package
     maintainer if you feel the need to access the object for anything
     but mere inspection.
Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

From peter.langfelder at gmail.com Tue Jan 4 01:56:53 2011
From: peter.langfelder at gmail.com (Peter Langfelder)
Date: Mon, 3 Jan 2011 16:56:53 -0800
Subject: [R] packagename:::functionname vs. importFrom
In-Reply-To:
References: <1294091145769-3172684.post@n4.nabble.com>
Message-ID:

Well, I'm pretty sure that, inside package A, calling B::functionName
will not work if B has not been imported. That's why I use ::: (after
spending some time trying to figure out why :: didn't work). At least
that was the state of affairs as of R 2.9 or so, perhaps things have
changed since then.

Peter

On Mon, Jan 3, 2011 at 4:48 PM, Hadley Wickham wrote:
>>> I think you mean packagename::functionname?  The three colon form is
>>> for accessing non-exported objects.
>>
>> Normally two colons suffice, but within a package you need three to
>> access exported but un-imported objects :)
>
> Are you sure?
>
>      Note that it is typically a design mistake to use ‘:::’ in your
>      code since the corresponding object has probably been kept
>      internal for a good reason.  Consider contacting the package
>      maintainer if you feel the need to access the object for anything
>      but mere inspection.
>
> Hadley
>
> --
> Assistant Professor / Dobelman Family Junior Chair
> Department of Statistics / Rice University
> http://had.co.nz/
>

From murdoch.duncan at gmail.com Tue Jan 4 02:11:55 2011
From: murdoch.duncan at gmail.com (Duncan Murdoch)
Date: Mon, 03 Jan 2011 20:11:55 -0500
Subject: [R] function masking and gmp questions
In-Reply-To: <4D226D5F.9060407@witthoft.com>
References: <4D22672E.7020203@witthoft.com> <7AB57E7E-11C4-4378-AAA3-CF6ECB2A5528@comcast.net> <4D226D5F.9060407@witthoft.com>
Message-ID: <4D2273DB.4010202@gmail.com>

On 11-01-03 7:44 PM, Carl Witthoft wrote:
> The problem is that I can't 'insert' the :: operator inside an existing
> function.
(Well, I could, but that would mean rewriting the function) > > I was hoping for a way to call the function, in this case turnpoints() > in some way that told turnpoints itself to look for base::apply You could put a copy of base::apply earlier in the search list, e.g. at top level, apply <- base::apply But the pastecs package really needs a NAMESPACE defined to avoid this kind of thing. Then it would just naturally look in base before gmp. Since it is operating without one, this might not be the only surprise in the way it evaluates things. Duncan Murdoch > > Carl > > > On 1/3/11 7:28 PM, David Winsemius wrote: >> >> On Jan 3, 2011, at 7:17 PM, Carl Witthoft wrote: >> >>> Hi, >>> Here's the problem I ran into: the gmp package has a method for >>> apply() so it masks the base::apply function. With gmp installed, I >>> tried to run the function turnpoints() from the pastecs package. It >>> fails because it calls apply() internally, like this: >>> >>> apply(mymatrix,1,max,na.rm=TRUE) >>> , >>> but the code in the gmp package which sets up the operator overload >>> for apply() strictly limits the arguments to the first three (a >>> matrix, a dimension, and a function). I get, no surprise: >>> >>> Rgames> xs<-sin(seq(1,100)/10) >>> Rgames> turnpoints(xs) >>> Error in apply(ex, 1, max, na.rm = TRUE) : >>> unused argument(s) (na.rm = TRUE) >>> >>> I'm assuming this is a bug in gmp code and will ask the owner of that >>> package about it. >>> >>> But in the meantime, is there some way to force a function to search >>> for functions in a different namespace, or at least to search with >>> packages set in a different order? That is, in this example, to make >>> turnpoints() look to package base before looking at gmp? >> >> In general the strategy is to use the "::" operator. Try : >> >> base::apply(mymatrix,1,max,na.rm=TRUE) >> >> (Untested in absence of example.) >> >>> >>> Thanks for your help and corrections to any of my assumptions and >>> conclusions here. 
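A minimal sketch of the top-level workaround suggested above. The three-argument apply() defined here is only a hypothetical stand-in for a masking definition that rejects extra arguments (it is not gmp's actual code), and row_max() is an illustrative helper playing the role of a function such as turnpoints() that finds apply() by ordinary lookup rather than through a namespace.

```r
# Hypothetical stand-in for a masking apply() that accepts only three arguments
apply <- function(X, MARGIN, FUN) base::apply(X, MARGIN, FUN)

# Illustrative helper that looks up apply() the ordinary way,
# as code in a package without a NAMESPACE would
row_max <- function(m) apply(m, 1, max, na.rm = TRUE)

m <- matrix(c(1, NA, 3, 4), nrow = 2)

# With the masking definition in front, the extra argument is rejected
masked <- try(row_max(m), silent = TRUE)
masked_failed <- inherits(masked, "try-error")

# Put a copy of base::apply earlier in the lookup path, as suggested above
apply <- base::apply
fixed <- row_max(m)

# Remove the top-level copy again once it is no longer needed
rm(apply)
```

The copy only helps functions that resolve apply() from the top level; a package with its own namespace would presumably be unaffected either way.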
From dwinsemius at comcast.net Tue Jan 4 02:25:57 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Mon, 3 Jan 2011 20:25:57 -0500 Subject: [R] unicode variable and function names? In-Reply-To: References: Message-ID: On Jan 3, 2011, at 7:44 PM, stvienna wiener wrote: > Dear List, > > > Is it possible to have function names like ∀ (unicode universal > quantifier)? > This question is inspired by agda source code, which supports this. > > http://www.cs.nott.ac.uk/~nad/listings/lib-0.4/Algebra.html > > It would be handy to use. My guess is, however, that it's not > supported in R. If you are willing to continue to call it as `∀` rather than just ∀, then it seems to "work as expected". > `∀` <- function(x) {x+1} > > `∀`( c(1,2,3) ) [1] 2 3 4 Efforts to use the un-backquoted variants were failures for me: > ∀ <- function(x) {x+1} Error: unexpected input in "?" (Question: does your keyboard support such a character? I.e., why would this be "handy"?) -- David Winsemius, MD West Hartford, CT From murdoch.duncan at gmail.com Tue Jan 4 02:32:52 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Mon, 03 Jan 2011 20:32:52 -0500 Subject: [R] unicode variable and function names? In-Reply-To: References: Message-ID: <4D2278C4.9000502@gmail.com> On 11-01-03 7:44 PM, stvienna wiener wrote: > Dear List, > > > Is it possible to have function names like ∀ (unicode universal quantifier)? > This question is inspired by agda source code, which supports this. > > http://www.cs.nott.ac.uk/~nad/listings/lib-0.4/Algebra.html > > It would be handy to use. My guess is, however, that it's not supported in R. Why not just try it? I get this: > ∀ <- 1 Error: unexpected input in "?"
> `∀` <- 1 i.e. it is seen as a non-syntactic name, so it needs backquotes to be used, but then it's fine (except it's probably a bad idea, since non-UTF8 systems will have trouble with it). Looks like the error message needs fixing. I was surprised how hard it was to find this documented, but it is there in the Introduction to R manual in section 1.8, "R commands, case sensitivity, etc." Someday someone should add this to the Language Definition manual. The rule is that syntactic names need to start with . or a letter, and contain only letters, digits, . and _. The confusing part is that "letter" doesn't have a universal definition, so the manual suggests limiting them to ASCII letters. Duncan Murdoch From dzhonatan at gmail.com Tue Jan 4 03:27:12 2011 From: dzhonatan at gmail.com (Jonathan Christensen) Date: Mon, 3 Jan 2011 19:27:12 -0700 Subject: [R] matrices call a function element-wise In-Reply-To: <201101032257104769317@yahoo.com.cn> References: <201101032257104769317@yahoo.com.cn> Message-ID: Hi, I would recommend reformatting the data as a 2x2x1000 array and using apply. Jonathan On Mon, Jan 3, 2011 at 7:57 AM, zhaoxing731 wrote: > Hello > > I have 4 1000*1000 matrix A,B,C,D. I want to use the corresponding element of the 4 matrices. Using the "for loop" as follow: > > E<-o > for (i in 1:1000) > {for (j in 1:1000) > { > E<-fisher.test(matrix(c(A[i][j],B[i][j],C[i][j],D[i][j]),2))#call fisher.test for every element > } > } > > It is so time-consuming > Need vectorization > > Yours sincerely > > > > > ZhaoXing > Department of Health Statistics > West China School of Public Health > Sichuan University > No.17 Section 3, South Renmin Road > Chengdu, Sichuan 610041 > P.R.China > > [[alternative HTML version deleted]] >
> > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From f.harrell at vanderbilt.edu Tue Jan 4 03:50:27 2011 From: f.harrell at vanderbilt.edu (Frank Harrell) Date: Mon, 3 Jan 2011 18:50:27 -0800 (PST) Subject: [R] packagename:::functionname vs. importFrom In-Reply-To: References: <1294091145769-3172684.post@n4.nabble.com> Message-ID: <1294109427406-3172984.post@n4.nabble.com> Correct. I'm doing this because of non-exported functions in other packages, so I need ::: I'd still appreciate any insight about whether importFrom in NAMESPACE defers package loading so that if the package is not actually used (and is not installed) there will be no problem. Thanks Frank ----- Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/packagename-functionname-vs-importFrom-tp3172684p3172984.html Sent from the R help mailing list archive at Nabble.com. From krishna at primps.com.sg Tue Jan 4 05:21:25 2011 From: krishna at primps.com.sg (SNV Krishna) Date: Tue, 4 Jan 2011 12:21:25 +0800 Subject: [R] how to subset unique factor combinations from a data frame. Message-ID: <5863F0E77BB746C7834DB95DD25FBA20@primpsg> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From hadley at rice.edu Tue Jan 4 06:06:29 2011 From: hadley at rice.edu (Hadley Wickham) Date: Tue, 4 Jan 2011 05:06:29 +0000 Subject: [R] packagename:::functionname vs. importFrom In-Reply-To: <1294109427406-3172984.post@n4.nabble.com> References: <1294091145769-3172684.post@n4.nabble.com> <1294109427406-3172984.post@n4.nabble.com> Message-ID: > Correct. ?I'm doing this because of non-exported functions in other packages, > so I need ::: But you really really shouldn't be doing that. 
Is there a reason that the package authors won't export the functions? > I'd still appreciate any insight about whether importFrom in NAMESPACE > defers package loading so that if the package is not actually used (and is > not installed) there will be no problem. Imported packages need to be installed - but it's the import vs. suggests vs. depends statement in DESCRIPTION that controls this behaviour, not the namespace. Hadley -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/ From Bill.Venables at csiro.au Tue Jan 4 06:28:20 2011 From: Bill.Venables at csiro.au (Bill.Venables at csiro.au) Date: Tue, 4 Jan 2011 16:28:20 +1100 Subject: [R] packagename:::functionname vs. importFrom In-Reply-To: References: <1294091145769-3172684.post@n4.nabble.com> <1294109427406-3172984.post@n4.nabble.com> Message-ID: <1BDAE2969943D540934EE8B4EF68F95FB27A44FE8D@EXNSW-MBX03.nexus.csiro.au> If you use ::: to access non-exported functions, as Frank confesses he does, then you can't complain if in the next release of the package involved the non-exported objects are missing and things are being done another way entirely. That's the deal. On the other hand, sometimes package authors do not envisage all the ways their package will be used and neglecting to export some object is mostly because the author simply did not anticipate that anyone would ever need to use it. But sometimes they do. A common case is when you need to do some operations very efficiently and there are simplifications in the input of which you can take advantage to cut down on the overheads. In that case you usually need the cut-down (non-exported) workhorse rather than the (exported) show-pony front end. The documentation suggests that if you ever need to use ::: perhaps you should be contacting the package maintainer to have the article in question exported. 
This makes a lot of sense, but it can also create quite a bit of work for the maintainers, if they agree to do it. It's a very grey area, in my experience. Bill Venables. -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Hadley Wickham Sent: Tuesday, 4 January 2011 3:06 PM To: Frank Harrell Cc: r-help at r-project.org Subject: Re: [R] packagename:::functionname vs. importFrom > Correct. I'm doing this because of non-exported functions in other packages, > so I need ::: But you really really shouldn't be doing that. Is there a reason that the package authors won't export the functions? > I'd still appreciate any insight about whether importFrom in NAMESPACE > defers package loading so that if the package is not actually used (and is > not installed) there will be no problem. Imported packages need to be installed - but it's the import vs. suggests vs. depends statement in DESCRIPTION that controls this behaviour, not the namespace. Hadley -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/ From djmuser at gmail.com Tue Jan 4 06:41:05 2011 From: djmuser at gmail.com (Dennis Murphy) Date: Mon, 3 Jan 2011 21:41:05 -0800 Subject: [R] matrices call a function element-wise In-Reply-To: <201101032257104769317@yahoo.com.cn> References: <201101032257104769317@yahoo.com.cn> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From mtmorgan at fhcrc.org Tue Jan 4 07:59:27 2011 From: mtmorgan at fhcrc.org (Martin Morgan) Date: Mon, 03 Jan 2011 22:59:27 -0800 Subject: [R] packagename:::functionname vs.
importFrom In-Reply-To: References: <1294091145769-3172684.post@n4.nabble.com> <1294109427406-3172984.post@n4.nabble.com> Message-ID: <4D22C54F.9090803@fhcrc.org> On 01/03/2011 09:06 PM, Hadley Wickham wrote: >> Correct. I'm doing this because of non-exported functions in other packages, >> so I need ::: > > But you really really shouldn't be doing that. Is there a reason that > the package authors won't export the functions? > >> I'd still appreciate any insight about whether importFrom in NAMESPACE >> defers package loading so that if the package is not actually used (and is >> not installed) there will be no problem. I think that with importFrom(packagename, functionname) naming a non-exported function, the package will fail to INSTALL with the message "object 'functionname' is not exported by 'namespace:packagename'". If the function is exported from packagename, then R CMD check will complain that 'Namespace dependency not required: packagename', which is to say that Imports: packagename is needed in the DESCRIPTION file. Packages that are listed in the Imports field of DESCRIPTION must be available at install time, so this implies that the user has packagename installed. I think this is trying to corral you into good programming practice: use Imports: packagename in the DESCRIPTION, use importFrom(packagename, functionname) in the NAMESPACE, and only use functions that are exported from packagename. Martin > > Imported packages need to be installed - but it's the import vs. > suggests vs. depends statement in DESCRIPTION that controls this > behaviour, not the namespace. > > Hadley > > -- Computational Biology Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109 Location: M1-B861 Telephone: 206 667-2793 From diasandre at gmail.com Tue Jan 4 07:00:17 2011 From: diasandre at gmail.com (ADias) Date: Mon, 3 Jan 2011 22:00:17 -0800 (PST) Subject: [R] Help with "For" instruction Message-ID: <1294120817274-3173074.post@n4.nabble.com> Hi, I am having a problem in doing something similar to this example: Suppose I have this vector a, and from it I wish to create 5 other vector each one with less one value than what object a has So I have "a" a<-c(1,2,3,4,5) and I want a1 that shoud have (2,3,4,5) a2 that should have (1,3,4,5) a3 that should have (1,2,4,5) a4 that should have (1,2,3,5) a5 that should have (1,2,3,4) I have tried like this but with no luck For ( i in 1:5) { a<-c(1,2,3,4,5) a((i)<-a[-i] } Is there a way to do this? thank you A.Dias -- View this message in context: http://r.789695.n4.nabble.com/Help-with-For-instruction-tp3173074p3173074.html Sent from the R help mailing list archive at Nabble.com. From luke-tierney at uiowa.edu Tue Jan 4 04:58:23 2011 From: luke-tierney at uiowa.edu (luke-tierney at uiowa.edu) Date: Mon, 3 Jan 2011 21:58:23 -0600 Subject: [R] packagename:::functionname vs. importFrom In-Reply-To: <1294109427406-3172984.post@n4.nabble.com> References: <1294091145769-3172684.post@n4.nabble.com> <1294109427406-3172984.post@n4.nabble.com> Message-ID: On Mon, 3 Jan 2011, Frank Harrell wrote: > > Correct. I'm doing this because of non-exported functions in other packages, > so I need ::: > > I'd still appreciate any insight about whether importFrom in NAMESPACE > defers package loading so that if the package is not actually used (and is > not installed) there will be no problem. It does not -- the namespace from which you import is loaded when your package is. (Also you can't import unexported variables.) Best, luke > > Thanks > Frank > > > ----- > Frank Harrell > Department of Biostatistics, Vanderbilt University > -- Luke Tierney Statistics and Actuarial Science Ralph E. 
Wareham Professor of Mathematical Sciences University of Iowa Phone: 319-335-3386 Department of Statistics and Fax: 319-335-3017 Actuarial Science 241 Schaeffer Hall email: luke at stat.uiowa.edu Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu From seva.jurica at gmail.com Tue Jan 4 02:17:05 2011 From: seva.jurica at gmail.com (Jurica Seva) Date: Mon, 03 Jan 2011 20:17:05 -0500 Subject: [R] Print plot to pdf, jpg or any other format when using scatter3d error Message-ID: <4D227511.3040009@gmail.com> Hi, I have been trying to output my graphs to a file (jpeg, pdf, ps, it doesnt matter) but i cant seem to be able to get it to output. I tried a few things but none of them worked and am lost as what to do now. I am using the scatter3d function, and it prints out the graphs on tot he screen without any problems, but when it comes to writing them to a file i cant make it work. Is there any other way of producing 3dimensional graphs (they dont have to be rotatable/interactive after the print out)? 
The code is fairly simple and is listed down : #libraries library(RMySQL) library(rgl) library(scatterplot3d) library(Rcmdr) ############################################################################## #database connection mycon <- dbConnect(MySQL(), user='root',dbname='test',host='localhost',password='') #distinct sessions rsSessionsU01 <- dbSendQuery(mycon, "select distinct sessionID from actiontimes where userID = 'ID01'") sessionU01 <-fetch(rsSessionsU01) sessionU01[2,] #user01 data mycon <- dbConnect(MySQL(), user='root',dbname='test',host='localhost',password='') rsUser01 <- dbSendQuery(mycon, "select a.userID,a.sessionID,a.actionTaken,a.timelineMSEC,a.durationMSEC,b.X,b.Y,b.Rel__dist_,b.Total_dist_ from `actiontimes` as a , `ulogdata` as b where a.originalRECNO = b.RECNO and a.userID='ID01'") user01 <- fetch(rsUser01, n= -1) user01[1,1] #plot loop for (i in 1:10){ userSubset<-subset(user01,sessionID == sessionU01[i,],select=c(timelineMSEC,X,Y)) userSubset x<-as.numeric(userSubset$X) y<-as.numeric(userSubset$Y) scatter3d(x,y,userSubset$timeline,xlim = c(0,1280), ylim = c(0,1024), zlim=c(0,1800000),type="h",main=sessionU01[i,],sub=sessionU01[i,]) tmp6=rep(".ps") tmp7=paste(sessionU01[i,],tmp6,sep="") tmp7 rgl.postscript(tmp7,"ps",drawText=FALSE) #pdf(file=tmp7) #dev.print(file=tmp7, device=pdf, width=600) #dev.off(2) } From mtmorgan at fhcrc.org Tue Jan 4 08:05:20 2011 From: mtmorgan at fhcrc.org (Martin Morgan) Date: Mon, 03 Jan 2011 23:05:20 -0800 Subject: [R] Regex to remove last character In-Reply-To: <3E907E85-072B-4882-946B-ECCBB1940BB2@comcast.net> References: <1294082664889-3172466.post@n4.nabble.com> <3E907E85-072B-4882-946B-ECCBB1940BB2@comcast.net> Message-ID: <4D22C6B0.3060803@fhcrc.org> On 01/03/2011 02:44 PM, David Winsemius wrote: > > On Jan 3, 2011, at 2:24 PM, rivercode wrote: > >> >> Hi, >> >> Have been having trouble trying to figure out the right regex >> parameters to >> remove the last "." 
in timestamp with the following format: >> >> Convert 09:30:00.377.853 to 09:30:00.377853 > > gsub("()(\\.)(.{3}$)", "\\1\\3" , "09:30:00.377.853") > [1] "09:30:00.377853" The 'g' in 'gsub' says 'make multiple substitutions in a single character element' (compare gsub("0", "o", "09:30:00.377.853") with sub("0", "o", "09:30:00.377.853")) whereas here there's just a single substitution per character string ('.' with ''). Maybe sub("\\.([[:digit:]]+)$", "\\1", "09:30:00.377.853") where the pattern is 'a literal period \\. followed by 1 or more digits [[:digit:]]+ followed by an end-of-string $', with the 1 or more digits captured by the parentheses and remembered as \\1 for the replacement. The [:digit:] is explained in ?regex; a less thorough solution would use "\\.(\\d+)$". Martin > >> >> -- >> View this message in context: >> http://r.789695.n4.nabble.com/Regex-to-remove-last-character-tp3172466p3172466.html >> > > == > > David Winsemius, MD > West Hartford, CT > -- Computational Biology Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109 Location: M1-B861 Telephone: 206 667-2793 From djnordlund at frontier.com Tue Jan 4 08:28:18 2011 From: djnordlund at frontier.com (Daniel Nordlund) Date: Mon, 3 Jan 2011 23:28:18 -0800 Subject: [R] Help with "For" instruction In-Reply-To: <1294120817274-3173074.post@n4.nabble.com> References: <1294120817274-3173074.post@n4.nabble.com> Message-ID: <929370234F6A43EDB0971FE30AB25367@Aragorn> > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] > On Behalf Of ADias > Sent: Monday, January 03, 2011 10:00 PM > To: r-help at r-project.org > Subject: [R] Help with "For" instruction > > > Hi, > > I am having a problem in doing something similar to this example: > > Suppose I have this vector a, and from it I wish to create 5 other vector > each one with less one value than what object a has > > So I have "a" > a<-c(1,2,3,4,5) > > and I want > > a1 that shoud have (2,3,4,5) > a2 that should have (1,3,4,5) > a3 that should have (1,2,4,5) > a4 that should have (1,2,3,5) > a5 that should have (1,2,3,4) > > I have tried like this but with no luck > > > For ( i in 1:5) { > a<-c(1,2,3,4,5) > a((i)<-a[-i] > } > > Is there a way to do this? > > thank you > > A.Dias Does this do what you want? for(i in 1:length(a)) assign(paste('a', i, sep=''), a[-i]) Hope this is helpful, Dan Daniel Nordlund Bothell, WA USA From dwinsemius at comcast.net Tue Jan 4 08:37:39 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Tue, 4 Jan 2011 02:37:39 -0500 Subject: [R] Print plot to pdf, jpg or any other format when using scatter3d error In-Reply-To: <4D227511.3040009@gmail.com> References: <4D227511.3040009@gmail.com> Message-ID: <50EBE7D9-9465-4FB0-AC58-134552C2E091@comcast.net> On Jan 3, 2011, at 8:17 PM, Jurica Seva wrote: > Hi, > > I have been trying to output my graphs to a file (jpeg, pdf, ps, it > doesnt matter) but i cant seem to be able to get it to output. 
I > tried a few things but none of them worked and am lost as what to do > now. I am using the scatter3d function, and it prints out the graphs > on tot he screen without any problems, but when it comes to writing > them to a file i cant make it work. Is there any other way of > producing 3dimensional graphs (they dont have to be rotatable/ > interactive after the print out)? > > The code is fairly simple and is listed down : > > #libraries > library(RMySQL) > library(rgl) > library(scatterplot3d) > library(Rcmdr) > > ############################################################################## > #database connection > mycon <- dbConnect(MySQL(), > user='root',dbname='test',host='localhost',password='') > #distinct sessions > rsSessionsU01 <- dbSendQuery(mycon, "select distinct sessionID from > actiontimes where userID = 'ID01'") > sessionU01 <-fetch(rsSessionsU01) > sessionU01[2,] > > #user01 data > mycon <- dbConnect(MySQL(), > user='root',dbname='test',host='localhost',password='') > rsUser01 <- dbSendQuery(mycon, "select > a > .userID > ,a > .sessionID > ,a > .actionTaken > ,a.timelineMSEC,a.durationMSEC,b.X,b.Y,b.Rel__dist_,b.Total_dist_ > from `actiontimes` as a , `ulogdata` as b where a.originalRECNO = > b.RECNO and a.userID='ID01'") > user01 <- fetch(rsUser01, n= -1) > user01[1,1] > > #plot loop > > for (i in 1:10){ > > userSubset<-subset(user01,sessionID == > sessionU01[i,],select=c(timelineMSEC,X,Y)) > userSubset > x<-as.numeric(userSubset$X) > y<-as.numeric(userSubset$Y) > scatter3d( #??? I thought the function was scatterplot3d() > x,y,userSubset$timeline,xlim = c(0,1280), ylim = c(0,1024), > zlim=c(0,1800000),type="h",main=sessionU01[i,],sub=sessionU01[i,]) > tmp6=rep(".ps") Why name it ".ps" when you are using pdf.dev()? > tmp7=paste(sessionU01[i,],tmp6,sep="") > tmp7 > rgl.postscript(tmp7,"ps",drawText=FALSE) When you want to get output to the file device, you need to "surround" the plotting commands. 
The pdf call goes at the beginning of the loop and the dev.off at the end. It seems dangerous to specify the number to dev.off since there might be no or more than one device already open. If you just use dev.off() you will close the last device that was opened. > #pdf(file=tmp7) # Move before the scatterplot3d call > #dev.print(file=tmp7, device=pdf, width=600) # you should also make sure you created a valid file string. We cannot check since you have not offered a reproducible example. > #dev.off(2) > } > David Winsemius, MD West Hartford, CT From dwinsemius at comcast.net Tue Jan 4 08:47:28 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Tue, 4 Jan 2011 02:47:28 -0500 Subject: [R] Help with "For" instruction In-Reply-To: <1294120817274-3173074.post@n4.nabble.com> References: <1294120817274-3173074.post@n4.nabble.com> Message-ID: On Jan 4, 2011, at 1:00 AM, ADias wrote: > > Hi, > > I am having a problem in doing something similar to this example: > > Suppose I have this vector a, and from it I wish to create 5 other > vector > each one with less one value than what object a has > > So I have "a" > a<-c(1,2,3,4,5) > > and I want > > a1 that should have (2,3,4,5) > a2 that should have (1,3,4,5) > a3 that should have (1,2,4,5) > a4 that should have (1,2,3,5) > a5 that should have (1,2,3,4) > > I have tried like this but with no luck > > For ( i in 1:5) { > a<-c(1,2,3,4,5) > a((i)<-a[-i] > } > > Is there a way to do this?
Dan showed you a method using assign (since that is what is needed for what you asked for) but you would get a more flexible result if you used a structure that could be easily indexed such as a matrix or list: > A <- sapply(1:5, function(i) a[-i]) > colnames(A) <- paste("a", 1:5, sep="") > A a1 a2 a3 a4 a5 [1,] 2 1 1 1 1 [2,] 3 3 2 2 2 [3,] 4 4 4 3 3 [4,] 5 5 5 5 4 So: > A[ ,"a1"] [1] 2 3 4 5 > -- David Winsemius, MD West Hartford, CT From ligges at statistik.tu-dortmund.de Tue Jan 4 09:05:27 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Tue, 04 Jan 2011 09:05:27 +0100 Subject: [R] Print plot to pdf, jpg or any other format when using scatter3d error In-Reply-To: <50EBE7D9-9465-4FB0-AC58-134552C2E091@comcast.net> References: <4D227511.3040009@gmail.com> <50EBE7D9-9465-4FB0-AC58-134552C2E091@comcast.net> Message-ID: <4D22D4C7.4070809@statistik.tu-dortmund.de> On 04.01.2011 08:37, David Winsemius wrote: > > On Jan 3, 2011, at 8:17 PM, Jurica Seva wrote: > >> Hi, >> >> I have been trying to output my graphs to a file (jpeg, pdf, ps, it >> doesnt matter) but i cant seem to be able to get it to output. I tried >> a few things but none of them worked and am lost as what to do now. I >> am using the scatter3d function, and it prints out the graphs on tot >> he screen without any problems, but when it comes to writing them to a >> file i cant make it work. Is there any other way of producing >> 3dimensional graphs (they dont have to be rotatable/interactive after >> the print out)? 
>> >> The code is fairly simple and is listed down : >> >> #libraries >> library(RMySQL) >> library(rgl) >> library(scatterplot3d) >> library(Rcmdr) >> >> ############################################################################## >> >> #database connection >> mycon <- dbConnect(MySQL(), >> user='root',dbname='test',host='localhost',password='') >> #distinct sessions >> rsSessionsU01 <- dbSendQuery(mycon, "select distinct sessionID from >> actiontimes where userID = 'ID01'") >> sessionU01 <-fetch(rsSessionsU01) >> sessionU01[2,] >> >> #user01 data >> mycon <- dbConnect(MySQL(), >> user='root',dbname='test',host='localhost',password='') >> rsUser01 <- dbSendQuery(mycon, "select >> a.userID,a.sessionID,a.actionTaken,a.timelineMSEC,a.durationMSEC,b.X,b.Y,b.Rel__dist_,b.Total_dist_ >> from `actiontimes` as a , `ulogdata` as b where a.originalRECNO = >> b.RECNO and a.userID='ID01'") >> user01 <- fetch(rsUser01, n= -1) >> user01[1,1] >> >> #plot loop >> >> for (i in 1:10){ >> >> userSubset<-subset(user01,sessionID == >> sessionU01[i,],select=c(timelineMSEC,X,Y)) >> userSubset >> x<-as.numeric(userSubset$X) >> y<-as.numeric(userSubset$Y) >> scatter3d( > #??? I thought the function was scatterplot3d() scatter3d() is provided by Rcmdr and an interface to the rgl package. >> x,y,userSubset$timeline,xlim = c(0,1280), ylim = c(0,1024), >> zlim=c(0,1800000),type="h",main=sessionU01[i,],sub=sessionU01[i,]) >> tmp6=rep(".ps") > > Why name it ".ps" when you are using pdf.dev()? Actually he is not using it (at least it is in comments). >> tmp7=paste(sessionU01[i,],tmp6,sep="") >> tmp7 >> rgl.postscript(tmp7,"ps",drawText=FALSE) That should be correct code, according to the rgl documentation. > When you want to get output to the file device, you need to "surround" > the plotting commands. the pdf call goes at the beginning of the loop > and the dev.odd at the end. It seems dangerous to specify the number to > dev.off since there might be no or more than one device already open. 
If > you just use dev.off() you will close the last device that was opened. Using R standard devices will not work for rgl graphics. Best wishes, Uwe > >> #pdf(file=tmp7) # Move before the scatterplot3d call > >> #dev.print(file=tmp7, device=pdf, width=600) > > # you should also make sure you created a valid file string. We cannot > check since you have not offered a reproducible example. > > > >> #dev.off(2) >> } >> > > > David Winsemius, MD > West Hartford, CT > From petr.pikal at precheza.cz Tue Jan 4 09:06:22 2011 From: petr.pikal at precheza.cz (Petr PIKAL) Date: Tue, 4 Jan 2011 09:06:22 +0100 Subject: [R] Odp: how to subset unique factor combinations from a data frame. In-Reply-To: <5863F0E77BB746C7834DB95DD25FBA20@primpsg> References: <5863F0E77BB746C7834DB95DD25FBA20@primpsg> Message-ID: Hi r-help-bounces at r-project.org wrote on 04.01.2011 05:21:25: > Hi All > > I have these questions and request members expert view on this. > > a) I have a dataframe (df) with five factors (identity variables) and value > (measured value). The id variables are Year, Country, Commodity, Attribute, > Unit. Value is a value for each combination of this. > > I would like to get just the unique combination of Commodity, Attribute and > Unit. I just need the unique factor combination into a dataframe or a table. > I know aggregate and subset but don't know how to use them in this context. aggregate(Value, list(Commodity, Attribute, Unit), function) > > b) Is it possible to include non-aggregate columns with aggregate function > > say in the above case > aggregate(Value ~ Commodity + Attribute, data = df, > FUN = count).
The use of count(Value) is just a roundabout way to return the > combinations of Commodity & Attribute, and I would like to include 'Unit' > column in the returned data frame? Hm. Maybe xtabs? But without any example it is only a guess. > > c) Is it possible to subset based on unique combination, something like > this. > > > subset(df, unique(Commodity), select = c(Commodity, Attribute, Unit)). I > know this is not correct as it returns an error 'subset needs a logical > evaluation'. Trying various ways to accomplish the task. > Probably the sqldf package has tools for doing it, but I do not use it, so you will have to try it yourself. df[df$Commodity == something, c("Commodity", "Attribute", "Unit")] is another way. Anyway, your explanation is ambiguous. Let's say you have three rows with the same Commodity. Which row do you want to select? Regards Petr > will be grateful for any ideas and help > > Regards, > > SNVK > > [[alternative HTML version deleted]] > From ligges at statistik.tu-dortmund.de Tue Jan 4 09:07:34 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Tue, 04 Jan 2011 09:07:34 +0100 Subject: [R] Please, need help with a plot In-Reply-To: <4d21ce4accc47_318bf4ee701af@weasel14.tmail> References: <4d20b7104796e_63e055fae74160@weasel21.tmail> <5630327E-CC9A-4CF9-AFDE-E12360AB6CB6@comcast.net> <4d21ce4accc47_318bf4ee701af@weasel14.tmail> Message-ID: <4D22D546.8050108@statistik.tu-dortmund.de> On 03.01.2011 14:25, Victor F Seabra wrote: > > although the code somehow didn't work on my Vista / R 2.8, > it did work perfectly on a XP machine / R 2.10 And when you update both machines to a recent R-2.12.1 you will find how much more will work in the end!
Uwe Ligges > > I've been trying to fix this for days, > Thank you very much for your help! > > > ______________________________________________________________________ > > 02/01/2011 19:30, David Winsemius< dwinsemius at comcast.net> wrote: > > On Jan 2, 2011, at 1:15 PM, Ben Bolker wrote: > >> This is a little bit more 'magic' than I would like, but seems >> to work. Perhaps someone else can suggest a cleaner solution. > > Here's the best I could come up with but will admit that there were > many failed attempts before success: > > expr.vec<- as.expression(parse(text=table1$var1)) > plot(x=table1$var2 ,y=1:11, xlim=c(0,20), pch=20) > text(x=table1$var2, y=1:11, labels=expr.vec, pos=4) > title(x=15, y=5, expression("Yet another way to process strings with > operators like '<=' ) > > (The title expression works on my machine, but perhaps not on the OP's > machine, given differences in encoding that have so far been exhibited.) > > >> ages<- gsub("[^0-9]+","",table1$var1) >> rel<- gsub("age\\s*([=<>]+)\\s*[0-9]+","\\1",table1$var1,perl=TRUE) >> >> with(table1,plot(var2,1:11,xlim=c(0,20),pch=20)) >> invisible(with(table1, >> mapply(function(x,y,a,r) { >> text(x=x,y=y, >> switch(r, >> `<=`=bquote(age<= .(a)), >> `<`=bquote(age< .(a)), >> `>=`=bquote(age>= .(a))), >> pos=4)}, >> var2,1:11,ages,rel))) >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > David Winsemius, MD > West Hartford, CT > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
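[Editorial note] The parse(text = ...) approach discussed in the thread above can be sketched as follows. This is only a minimal illustration, not the posters' exact code: the data frame table1 below is a hypothetical stand-in, since the original data were never posted, and its three rows are invented for the example.

```r
# Hypothetical stand-in for the poster's table1 (the real data were not posted)
table1 <- data.frame(
  var1 = c("age <= 5", "age < 10", "age >= 15"),
  var2 = c(3, 7, 12),
  stringsAsFactors = FALSE
)

# parse() turns each string into an unevaluated expression, which the
# graphics functions then render with plotmath (so "<=" is drawn as a
# proper less-than-or-equal sign)
labels <- parse(text = table1$var1)

# One point per row, with the parsed expression drawn to its right
rows <- seq_len(nrow(table1))
plot(table1$var2, rows, xlim = c(0, 20), pch = 20,
     xlab = "var2", ylab = "row")
text(table1$var2, rows, labels = labels, pos = 4)
```

Because the labels are built from the strings themselves, this avoids the switch() over operators at the cost of requiring every string to be syntactically valid R.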
From wewolski at gmail.com Tue Jan 4 10:35:23 2011 From: wewolski at gmail.com (W Eryk Wolski) Date: Tue, 4 Jan 2011 10:35:23 +0100 Subject: [R] Default Working directory on windows 7? Message-ID: Hi, Just installed R on a new Windows 7 machine (as admin). I feel quite uncomfortable knowing that the default WD when starting R is: > getwd() [1] "C:/Windows/system32" I guess I did something wrong when installing R... How to change R's default working directory? regards Eryk From mdsumner at gmail.com Tue Jan 4 10:46:12 2011 From: mdsumner at gmail.com (Michael Sumner) Date: Tue, 4 Jan 2011 20:46:12 +1100 Subject: [R] Default Working directory on windows 7? In-Reply-To: References: Message-ID: Change the "Start in:" property on the shortcut you used to start R with. Right-click on the R icon and click Properties (you may need to right click once, then again if you are using the "pinned" start menu icon in Windows 7). Then when you start R again with that shortcut it will use the directory you choose. I'm not sure if you can control this during the install, but mine is set to use the "Documents" folder in my user account. Cheers, Mike. On Tue, Jan 4, 2011 at 8:35 PM, W Eryk Wolski wrote: > Hi, > > Just installed R on a new Windows 7 machine (as admin). > > > I feel quite uncomfortable knowing that the default WD when starting R is: > >> getwd() > [1] "C:/Windows/system32" > > > > I guess I did something wrong when installing R... How to change R's > default working directory? > > regards > > Eryk > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
> -- Michael Sumner Institute for Marine and Antarctic Studies, University of Tasmania Hobart, Australia e-mail: mdsumner at gmail.com From milano_amy at yahoo.com Tue Jan 4 10:52:07 2011 From: milano_amy at yahoo.com (Amy Milano) Date: Tue, 4 Jan 2011 01:52:07 -0800 (PST) Subject: [R] Fw: Re: Default Working directory on windows 7? Message-ID: <736838.74832.qm@web114402.mail.gq1.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From m_hofert at web.de Tue Jan 4 10:53:33 2011 From: m_hofert at web.de (Marius Hofert) Date: Tue, 4 Jan 2011 10:53:33 +0100 Subject: [R] lattice: how to "center" a title? Message-ID: <8D33361D-F6CC-4485-91C5-81BC16107E6C@web.de> Dear expeRts, As you can see from this example... trellis.device("pdf", width = 5, height = 5) print(xyplot(0 ~ 0, main = "This title is not 'centered' for the human's eye", scales = list(alternating = c(1,1), tck = c(1,0)))) dev.off() ... the title does not seem to be "centered" for the human's eye [although it is centered when the plot (width) is considered with the y-axis label]. I could not easily (via trellis.par.get()) find a way to adjust this. Of course one could include more spaces in the title string, but is there a more elegant way to slightly shift the title to the right? Cheers, Marius From petr.pikal at precheza.cz Tue Jan 4 11:09:34 2011 From: petr.pikal at precheza.cz (Petr PIKAL) Date: Tue, 4 Jan 2011 11:09:34 +0100 Subject: [R] Odp: Default Working directory on windows 7? In-Reply-To: References: Message-ID: Hi r-help-bounces at r-project.org napsal dne 04.01.2011 10:35:23: > Hi, > > Just installed R on a new Windows 7 machine (as admin). > > > I feel quite uncomfortable knowing that the default WD when starting R is: > > > getwd() > [1] "C:/Windows/system32" Something was set incorrectly. I believe that with default setting R shall be somewhere in Program Files directory (but you can install it to any directory without problems). 
Try to remove it and reinstall if it is in some inconvenient place. There is one file .RData which is used as starting workspace. So make any directory, copy this file to it, start R by double click on it and now the new directory is working directory. Quite useful when working on several projects. Regards Petr > > > > I guess I did something wrong when installing R... How to change R's > default working directory? > > regards > > Eryk > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From krishna at primps.com.sg Tue Jan 4 11:19:02 2011 From: krishna at primps.com.sg (SNV Krishna) Date: Tue, 4 Jan 2011 18:19:02 +0800 Subject: [R] how to subset unique factor combinations from a data frame. In-Reply-To: References: <5863F0E77BB746C7834DB95DD25FBA20@primpsg> Message-ID: <3C14C468291E41F2986AD3B30DBDA276@primpsg> Hi, Sorry that my example is not clear. I will give an example of what each variable holds. I hope this clearly explains the case. Names of the dataframe (df) and description Year :- Year is calendar year, from 1980 to 2010 Country :- is the country name, total no. (levels) of countries is ~ 190 Commodity :- Crude oil, Sugar, Rubber, Coffee .... No. (levels) of commodities is 20 Attribute: - Production, Consumption, Stock, Import, Export... Levels ~ 20 Unit :- this is actually not a factor. It describes the unit of Attribute. Say the unit for Coffee (commodity) - Production (attribute) is 60 kgs. While the unit for Crude oil - Production is 1000 barrels Value :- value > tail(df, n = 10) // example data// Year Country Commodity Attribute Unit Value 1991 United Kingdom Wheat, Durum Total Supply (1000 MT) 70 1991 United Kingdom Wheat, Durum TY Exports (1000 MT) 0 1991 United Kingdom Wheat, Durum TY Imp. 
from U (1000 MT) 0 1991 United Kingdom Wheat, Durum TY Imports (1000 MT) 60 1991 United Kingdom Wheat, Durum Yield (MT/HA) 5 Wish this is clear. Any suggestion Regards, SNVK -----Original Message----- From: Petr PIKAL [mailto:petr.pikal at precheza.cz] Sent: Tuesday, January 04, 2011 4:06 PM To: SNV Krishna Cc: r-help at r-project.org Subject: Odp: [R] how to subset unique factor combinations from a data frame. Hi r-help-bounces at r-project.org napsal dne 04.01.2011 05:21:25: > Hi All > > I have these questions and request members expert view on this. > > a) I have a dataframe (df) with five factors (identity variables) and value > (measured value). The id variables are Year, Country, Commodity, Attribute, > Unit. Value is a value for each combination of this. > > I would like to get just the unique combination of Commodity, > Attribute and > Unit. I just need the unique factor combination into a dataframe or a table. > I know aggregate and subset but dont how to use them in this context. aggregate(Value, list(Comoditiy, Atribute, Unit), function) > > b) Is it possible to inclue non- aggregate columns with aggregate function > > say in the above case > aggregate(Value ~ Commodity + Attribute, data > = df, > FUN = count). The use of count(Value) is just a round about to return the > combinations of Commodity & Attribute, and I would like to include 'Unit' > column in the returned data frame? Hm. Maybe xtabs? But without any example it is only a guess. > > c) Is it possible to subset based on unique combination, some thing > like this. > > > subset(df, unique(Commodity), select = c(Commodity, Attribute, Unit)). I > know this is not correct as it returns an error 'subset needs a > logical evaluation'. Trying various ways to accomplish the task. > Probably sqldf package has tools for doing it but I do not use it so you have to try yourself. df[Comodity==something, c("Commodity", "Attribute", "Unit")] can be other way. Anyway your explanation is ambiguous. 
Let say you have three rows with the same Commodity. Which row do you want to select? Regards Petr > will be grateful for any ideas and help > > Regards, > > SNVK > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From benjamin.ward at bathspa.org Tue Jan 4 11:36:25 2011 From: benjamin.ward at bathspa.org (Ben Ward) Date: Tue, 4 Jan 2011 10:36:25 +0000 Subject: [R] Writing do and resample functions Message-ID: Hi, I'm trying to take a function from a workspace download provided in a stats textbook book, so I have it in my workspace to use all the time. I opened the workspace and typed the names of the two functions to get the code that makes them up: ------------------------------------------------------------------------------------- > resample function(d, size, replace=TRUE,prob=NULL,within=NULL) { if (!is.null(within)) return( resample.within(d, within,replace=replace) ) if (is.null(dim(d))) { # it's just a vector if (missing(size)) size=length(d) return( d[ sample(1:length(d),size, replace=replace, prob=prob)]) } else { if (missing(size)) size = dim(d)[1]; inds = sample(1:(dim(d))[1], size, replace=replace, prob=prob) if (is.data.frame(d) | is.matrix(d)) { return(d[inds,]); } else { return(d[inds]); } } } ------------------------------------------------------------------------------------- > do function(n=10){ as.repeater(n) } ------------------------------------------------------------------------------------- > as.repeater function(n=5){ foo = list(n=n) class(foo) = 'repeater' return(foo) } Then I made the functions in my workspace by choosing the name (same name), and then "=" and then copied and pasted the function, beginning with function( and ending with the final }'s. 
But when I try to do the following in my workspace afterwards: samps = do(500)* coef(lm(MIC.~1+Challenge+Cleaner+Replicate, data=resample(ecoli))) sd(samps) I get an error: Error in do(500) * coef(lm(MIC. ~ 1 + Challenge + Cleaner + Replicate, : non-numeric argument to binary operator. But in the workspace that comes with that book, I get a decent output: sd(samps) (Intercept) Challenge CleanerGarlic ReplicateFirst ReplicateFourth 3.9455401 0.7178385 1.6830641 5.4564926 5.4320998 ReplicateSecond ReplicateThird 5.3895562 5.5422622 Is there anybody out there who knows a lot more about programming functions in R than I do, and who might know why this is giving me the error? I don't understand why one workspace would accept the model formula when the other gives me the "non-numeric argument to binary operator" error. The only variable that's not numeric is Replicate, but I don't think that's what the error is talking about. Thanks, Ben Ward. From milano_amy at yahoo.com Tue Jan 4 11:43:36 2011 From: milano_amy at yahoo.com (Amy Milano) Date: Tue, 4 Jan 2011 02:43:36 -0800 (PST) Subject: [R] RSQLite to input dataframe Message-ID: <816120.47104.qm@web114412.mail.gq1.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From simone.gabbriellini at gmail.com Tue Jan 4 11:53:27 2011 From: simone.gabbriellini at gmail.com (Simone Gabbriellini) Date: Tue, 4 Jan 2011 11:53:27 +0100 Subject: [R] how to invert the axes in the wireframe() plot In-Reply-To: <94DA6E78-3F6D-4FF9-AAD7-2E439D2F96D3@comcast.net> References: <328DCFAF-45E4-4150-B20F-81B8B593D365@gmail.com> <65B6F1AE-FA91-4004-A8E9-589E712DC0B5@comcast.net> <94DA6E78-3F6D-4FF9-AAD7-2E439D2F96D3@comcast.net> Message-ID: <182477E4-60AA-437C-80CC-8A8FF9E6842C@gmail.com> Thank you very much, I am now closer to the result I would like to achieve.
Still, I have to figure out how to "invert" the n axes, see figure: http://www.digitaldust.it/materiali/mine2.pdf Is panel.3dwire the function I have to study for this last step? thanks, simone On 04 Jan 2011, at 01:26, David Winsemius wrote: > > On Jan 3, 2011, at 7:07 PM, Simone Gabbriellini wrote: > >> I am trying to reproduce this graph: >> >> http://www.digitaldust.it/materiali/their.png >> >> the default axes orientation of wireframe gives me this: >> >> http://www.digitaldust.it/materiali/mine.pdf >> >> I am trying to understand if I can reproduce the axes orientation of the first figure. > > Well, there is more to reproducing that figure than just rotating around the z axis by 180, which is what I now take your question to be: > > ?panel.3dwire > > Run the 2nd example in ?wireframe and then compare to this minor modification of the screen arguments: > > wireframe(z ~ x * y, data = g, groups = gr, > scales = list(arrows = FALSE), > drape = TRUE, colorkey = TRUE, > screen = list(z = 30+180, x = -60)) > > (Rotated 180 around the z-axis.) > > -- > Best; > David. > >> >> many thanks, >> simone >> >> On 03 Jan 2011, at 23:45, David Winsemius wrote: >> >>> >>> On Jan 3, 2011, at 5:06 PM, Simone Gabbriellini wrote: >>> >>>> Dear List, >>>> >>>> I am using the wireframe function in the lattice package, and I am wondering if it is possible to invert the default axes orientation for x and y axes... what parameter should I look for? >>> >>> Perhaps you should define "invert". >>>> >>> >>> >>> -- >>> >>> David Winsemius, MD >>> West Hartford, CT >>> >> > > David Winsemius, MD > West Hartford, CT > From m_hofert at web.de Tue Jan 4 11:57:26 2011 From: m_hofert at web.de (Marius Hofert) Date: Tue, 4 Jan 2011 11:57:26 +0100 Subject: [R] lattice: par.settings with standard.theme() + additional arguments?
Message-ID: <75668119-4D35-4A72-A23F-1B624F2BA6AA@web.de> Dear expeRts, I usually use par.settings = standard.theme(color = FALSE) to create lattice graphics without colors, so something like library(lattice) x <- runif(10) xyplot(x ~ 1:10, type = "l", par.settings = standard.theme(color = FALSE)) Now I would like to use an additional component in par.settings. I tried several things like xyplot(x ~ 1:10, type = "l", par.settings = c(standard.theme(color = FALSE), list(par.xlab.text = list(cex = 5, col = "blue")))) but it doesn't work. I know I could use lattice.options() but is there a way to get it right ("locally") with par.settings? Cheers, Marius From petr.pikal at precheza.cz Tue Jan 4 12:05:39 2011 From: petr.pikal at precheza.cz (Petr PIKAL) Date: Tue, 4 Jan 2011 12:05:39 +0100 Subject: [R] how to subset unique factor combinations from a data frame. In-Reply-To: <3C14C468291E41F2986AD3B30DBDA276@primpsg> References: <5863F0E77BB746C7834DB95DD25FBA20@primpsg> <3C14C468291E41F2986AD3B30DBDA276@primpsg> Message-ID: Hi r-help-bounces at r-project.org napsal dne 04.01.2011 11:19:02: > Hi, > > Sorry that my example is not clear. I will give an example of what each > variable holds. I hope this clearly explains the case. > > Names of the dataframe (df) and description > > Year :- Year is calendar year, from 1980 to 2010 > > Country :- is the country name, total no. (levels) of countries is ~ 190 > > Commodity :- Crude oil, Sugar, Rubber, Coffee .... No. (levels) of > commodities is 20 > > Attribute: - Production, Consumption, Stock, Import, Export... Levels ~ 20 > > Unit :- this is actually not a factor. It describes the unit of Attribute. > Say the unit for Coffee (commodity) - Production (attribute) is 60 kgs. 
> While the unit for Crude oil - Production is 1000 barrels > > Value :- value > > > tail(df, n = 10) // example data// > > Year Country Commodity Attribute Unit > Value > 1991 United Kingdom Wheat, Durum Total Supply (1000 MT) 70 > 1991 United Kingdom Wheat, Durum TY Exports (1000 MT) 0 > 1991 United Kingdom Wheat, Durum TY Imp. from U (1000 MT) 0 > 1991 United Kingdom Wheat, Durum TY Imports (1000 MT) 60 > 1991 United Kingdom Wheat, Durum Yield (MT/HA) 5 > > Wish this is clear. Any suggestion suggestion is still the same, use aggregate on any other similar function maybe from plyr package. No matter how exactly you will describe your data if you fail to show any code you used and how this code failed in delivering desired result you will get only vague responses. Regards Petr > > Regards, > > SNVK > > -----Original Message----- > From: Petr PIKAL [mailto:petr.pikal at precheza.cz] > Sent: Tuesday, January 04, 2011 4:06 PM > To: SNV Krishna > Cc: r-help at r-project.org > Subject: Odp: [R] how to subset unique factor combinations from a data > frame. > > Hi > > r-help-bounces at r-project.org napsal dne 04.01.2011 05:21:25: > > > Hi All > > > > I have these questions and request members expert view on this. > > > > a) I have a dataframe (df) with five factors (identity variables) and > value > > (measured value). The id variables are Year, Country, Commodity, > Attribute, > > Unit. Value is a value for each combination of this. > > > > I would like to get just the unique combination of Commodity, > > Attribute > and > > Unit. I just need the unique factor combination into a dataframe or a > table. > > I know aggregate and subset but dont how to use them in this context. > > aggregate(Value, list(Comoditiy, Atribute, Unit), function) > > > > > b) Is it possible to inclue non- aggregate columns with aggregate > function > > > > say in the above case > aggregate(Value ~ Commodity + Attribute, data > > = > df, > > FUN = count). 
The use of count(Value) is just a round about to return > the > > combinations of Commodity & Attribute, and I would like to include > 'Unit' > > column in the returned data frame? > > Hm. Maybe xtabs? But without any example it is only a guess. > > > > > c) Is it possible to subset based on unique combination, some thing > > like this. > > > > > subset(df, unique(Commodity), select = c(Commodity, Attribute, Unit)). > I > > know this is not correct as it returns an error 'subset needs a > > logical evaluation'. Trying various ways to accomplish the task. > > > > Probably sqldf package has tools for doing it but I do not use it so you > have to try yourself. > > df[Comodity==something, c("Commodity", "Attribute", "Unit")] > > can be other way. > > Anyway your explanation is ambiguous. Let say you have three rows with the > same Commodity. Which row do you want to select? > > Regards > Petr > > > > will be grateful for any ideas and help > > > > Regards, > > > > SNVK > > > > [[alternative HTML version deleted]] > > > > ______________________________________________ > > R-help at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From pmakananisa at sars.gov.za Tue Jan 4 11:06:56 2011 From: pmakananisa at sars.gov.za (Mangalani Peter Makananisa) Date: Tue, 4 Jan 2011 12:06:56 +0200 Subject: [R] 95% CI of the "Predicted values" from the GLS Model Message-ID: <9D4E27B0394D874FB2D2AB36981CF29F02445DFE@ptabrmsg02.sars.gov.za> An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From michael.bedward at gmail.com Tue Jan 4 12:32:53 2011 From: michael.bedward at gmail.com (Michael Bedward) Date: Tue, 4 Jan 2011 22:32:53 +1100 Subject: [R] RSQLite to input dataframe In-Reply-To: <816120.47104.qm@web114412.mail.gq1.yahoo.com> References: <816120.47104.qm@web114412.mail.gq1.yahoo.com> Message-ID: Hi Amy, I'm not sure if I understand your question correctly so let me know if the following is off track. Starting with your example, here is how to create a data.frame and write it to a new table in a new database file... my.data = data.frame(X = c("US", "UK", "Canada", "Australia", "Newzealand"), Y = c(52, 36, 74, 10, 98)) drv <- dbDriver("SQLite") con <- dbConnect(drv, "myfilename.db") dbWriteTable(con, "sometablename", my.data) To verify that the table is now in the file... dbListTables(con) To check the fields in the table (should match the colnames in your data.frame)... dbListFields(con, "sometablename") To read the whole table into the workspace as a new data.frame my.data.copy <- dbReadTable(con, "sometablename") If you have data in a CSV file, and the contents are small enough to read in one go, you would use the read.csv function to read the contents of the file into a data.frame and then use dbWriteTable to transfer this to your database. Hope this helps, Michael On 4 January 2011 21:43, Amy Milano wrote: > Dear r helpers, > > At first, I apologize for raising a query which seems to be a stupid interpretation on my part. I am trying to learn SQLite. > > > > Following is an example given in the RSQLite.zip file (Page # 4) > > drv <- dbDriver("SQLite") > tfile <- tempfile() > con <- dbConnect(drv, dbname = tfile) > data(USArrests) > dbWriteTable(con, "arrests", USArrests) > > > On the similar line I am trying to read my data. > > Suppose I have a dataframe as given below. 
> > DF = data.frame(X = c("US", "UK", "Canada", "Australia", "Newzealand"), Y = c(52, 36, 74, 10, 98)) > > drv <- dbDriver("SQLite") > tfile <- tempfile() > con <- dbConnect(drv, dbname = tfile) > data(DF) > dbWriteTable(con, ......., .......) # Didn't know what to write here. > > I understand I have raised a query in a stupid manner. I need to understand is there any way I can use SQLite to read > ?dataframe or for that matter any csv file say e.g. 'DF.csv'. > > Please enlighten me. > > Amy > > > > ? ? ? ?[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From olivier.eterradossi at mines-ales.fr Tue Jan 4 12:41:47 2011 From: olivier.eterradossi at mines-ales.fr (Olivier ETERRADOSSI) Date: Tue, 04 Jan 2011 12:41:47 +0100 Subject: [R] Saving objects inside a list Message-ID: <4D23077B.20807@mines-ales.fr> > Message: 42 > Date: Mon, 3 Jan 2011 18:58:04 -0200 > From: Eduardo de Oliveira Horta > To: r-help > Subject: Re: [R] Saving objects inside a list > Message-ID: > > Content-Type: text/plain > > sapply(ls(),get) works fine. Thanks. > > ps: the as.list and the eapply suggestions didn't work. Hi Eduardo (and all the best for this new year), are you sure the as.list and eapply solutions didn't work ? On my machine they produce a list but in "reverse order" compared to the result of ls(),...maybe it's the same with you : names(as.list(.GlobalEnv))[6] is the name of the 6th variable FROM THE END of ls(). Regards. 
Olivier -- Olivier ETERRADOSSI Maître-Assistant animateur du groupe Sensomines (Institut Carnot M.I.N.E.S) ------------------------------------------------------------- CMGD Pôle "Matériaux Polymères Avancés" axe "Propriétés Psycho-Sensorielles des Matériaux" ------------------------------------------------------------- Ecole des Mines d'Alès Hélioparc, 2 av. P. Angot, F-64053 PAU CEDEX 9 tel std: +33 (0)5.59.30.54.25 tel direct: +33 (0)5.59.30.90.35 fax: +33 (0)5.59.30.63.68 http://www.mines-ales.fr e-mail : olivier.eterradossi at mines-ales.fr From pomchip at free.fr Tue Jan 4 13:13:54 2011 From: pomchip at free.fr (Sébastien Bihorel) Date: Tue, 4 Jan 2011 07:13:54 -0500 Subject: [R] Listing of available functions Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From erich.neuwirth at univie.ac.at Tue Jan 4 13:30:07 2011 From: erich.neuwirth at univie.ac.at (Erich Neuwirth) Date: Tue, 04 Jan 2011 13:30:07 +0100 Subject: [R] Writing do and resample functions In-Reply-To: References: Message-ID: <4D2312CF.5030202@univie.ac.at> It seems that the textbook workspace you are using has a method definition for * for class repeater. Load the workspace and try methods(`*`) If you get something like '*.repeater', this suspicion is confirmed. On 1/4/2011 11:36 AM, Ben Ward wrote: > Hi, > > I'm trying to take a function from a workspace download provided in a > stats textbook book, so I have it in my workspace to use all the time.
> > I opened the workspace and typed the names of the two functions to get > the code that makes them up: > ------------------------------------------------------------------------------------- > >> resample > function(d, size, replace=TRUE,prob=NULL,within=NULL) { > if (!is.null(within)) return( resample.within(d, > within,replace=replace) ) > if (is.null(dim(d))) { > # it's just a vector > if (missing(size)) size=length(d) > return( d[ sample(1:length(d),size, replace=replace, prob=prob)]) > } > else { > if (missing(size)) size = dim(d)[1]; > inds = sample(1:(dim(d))[1], size, replace=replace, prob=prob) > if (is.data.frame(d) | is.matrix(d)) { > return(d[inds,]); > } else { > return(d[inds]); > } > } > } > ------------------------------------------------------------------------------------- > >> do > function(n=10){ > as.repeater(n) > } > ------------------------------------------------------------------------------------- > >> as.repeater > function(n=5){ > foo = list(n=n) > class(foo) = 'repeater' > return(foo) > } > > Then I made the functions in my workspace by choosing the name (same > name), and then "=" and then copied and pasted the function, beginning > with function( and ending with the final }'s. > > But when I try to do the following in my workspace afterwards: > > samps = do(500)* coef(lm(MIC.~1+Challenge+Cleaner+Replicate, > data=resample(ecoli))) > sd(samps) > > I get an error: > Error in do(500) * coef(lm(MIC. ~ 1 + Challenge + Cleaner + Replicate, : > non-numeric argument to binary operator. > > But in the workspace that comes with that book, I get a decent output: > > sd(samps) > (Intercept) Challenge CleanerGarlic ReplicateFirst > ReplicateFourth > 3.9455401 0.7178385 1.6830641 5.4564926 > 5.4320998 > ReplicateSecond ReplicateThird > 5.3895562 5.5422622 > > Is there anybody out there who know a lot more about programming > functions in R than I do, that might know why this is giving me the > error? 
I don't understand why one workspace would accept the model > formula, when the other give me the non-numeric argument to binary > vector, the only vector that's not numerical is Replicate, but I don't > that's what the error is talking about. > > Thanks, > Ben Ward. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From andreas.borg at unimedizin-mainz.de Tue Jan 4 13:37:29 2011 From: andreas.borg at unimedizin-mainz.de (Andreas Borg) Date: Tue, 04 Jan 2011 13:37:29 +0100 Subject: [R] RSQLite to input dataframe In-Reply-To: <816120.47104.qm@web114412.mail.gq1.yahoo.com> References: <816120.47104.qm@web114412.mail.gq1.yahoo.com> Message-ID: <4D231489.9010507@unimedizin-mainz.de> Hi Amy, > Suppose I have a dataframe as given below. > > DF = data.frame(X = c("US", "UK", "Canada", "Australia", "Newzealand"), Y = c(52, 36, 74, 10, 98)) > > drv <- dbDriver("SQLite") > tfile <- tempfile() > con <- dbConnect(drv, dbname = tfile) > data(DF) > dbWriteTable(con, ......., .......) # Didn't know what to write here. > data() loads data sets which ship with R or an extension package. Your data frame DF is already in the working environment, hence it is neither possible nor necessary to "load" it, the expression data(DF) is an error. The command to write DF to a database would be something like dbWriteTable(con, "DF", DF) where the second argument is the name of the database table to write to, the third argument the data frame you want to write (please have a look at the documentation). > I understand I have raised a query in a stupid manner. I need to understand is there any way I can use SQLite to read > dataframe or for that matter any csv file say e.g. 'DF.csv'. 
To copy a CSV file to SQLite, read it via read.table() or read.csv() first, then copy the result with dbWriteTable. There might be a way to read files directly from SQLite, but I don't know about that. Best regards, Andreas > > -- Andreas Borg Medizinische Informatik UNIVERSITÄTSMEDIZIN der Johannes Gutenberg-Universität Institut für Medizinische Biometrie, Epidemiologie und Informatik Obere Zahlbacher Straße 69, 55131 Mainz www.imbei.uni-mainz.de Telefon +49 (0) 6131 175062 E-Mail: borg at imbei.uni-mainz.de This e-mail contains confidential and/or legally protected information. If you are not the intended recipient or have received this e-mail in error, please notify the sender immediately and delete this e-mail. Unauthorized copying and unauthorized distribution of this e-mail and the information it contains are not permitted. From mdowle at mdowle.plus.com Tue Jan 4 13:41:41 2011 From: mdowle at mdowle.plus.com (Matthew Dowle) Date: Tue, 4 Jan 2011 12:41:41 -0000 Subject: [R] Listing of available functions References: Message-ID: Try : objects("package:base") Also, as it happens, a new package called unknownR is in development on R-Forge. Its description says : Do you know how many functions there are in base R? How many of them do you know you don't know? Run unk() to discover your unknown unknowns. It's fast and it's fun ! It's not ready to try yet (and may not live up to its promises) but hopefully should be ready soon. Matthew "Sébastien Bihorel" wrote in message news:AANLkTinfpMThB2OsGjckEO3jWsqHW+-ZDyd0xtdMK8ix at mail.gmail.com... > Dear R-users, > > Is there an easy way to access a complete listing of available functions > from an R session? The help.start() and ? functions are great, but I feel > like they require the user to know the answer in advance (especially with > respect to function names)...
I could not find an easy way to simply browse > through a list of functions and randomly pick one function to see what it > does. > > Is there such a possibility in R? > > Thanks > > PS: I apologize if this question appears trivial. > > [[alternative HTML version deleted]] > From wwwhsd at gmail.com Tue Jan 4 13:42:07 2011 From: wwwhsd at gmail.com (Henrique Dallazuanna) Date: Tue, 4 Jan 2011 10:42:07 -0200 Subject: [R] Listing of available functions In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From benjamin.ward at bathspa.org Tue Jan 4 13:51:17 2011 From: benjamin.ward at bathspa.org (Ben Ward) Date: Tue, 4 Jan 2011 12:51:17 +0000 Subject: [R] Writing do and resample functions In-Reply-To: <4D2312CF.5030202@univie.ac.at> References: <4D2312CF.5030202@univie.ac.at> Message-ID: I've just loaded the workspace and done > methods('*') And got the output [1] *.difftime *.repeater I'm not entirely sure what this repeater and as.repeater are. From my reading, when one wants to repeat stuff for, say, bootstrapping, you have to use repeat as part of a function and put in some sort of qualification that ends the repetitions. Coding functions is a bit beyond what is taught on my BSc Biology course. However, I do know that if I do the same in a new R session: > methods('*') [1] *.difftime *.repeater is missing. So to clarify my understanding, the * is the problem because it hasn't got this .repeater value attributed to it? Thanks, Ben Ward. On 04/01/2011 12:30, Erich Neuwirth wrote: > It seems that the textbook workspace you are using has a method > definition for * for class repeater. Load the workspace and try > methods(`*`) > If you get something like '*.repeater' this suspicion is confirmed > > > On 1/4/2011 11:36 AM, Ben Ward wrote: >> Hi, >> >> I'm trying to take a function from a workspace download provided in a >> stats textbook book, so I have it in my workspace to use all the time.
>> >> I opened the workspace and typed the names of the two functions to get >> the code that makes them up: >> ------------------------------------------------------------------------------------- >> >>> resample >> function(d, size, replace=TRUE,prob=NULL,within=NULL) { >> if (!is.null(within)) return( resample.within(d, >> within,replace=replace) ) >> if (is.null(dim(d))) { >> # it's just a vector >> if (missing(size)) size=length(d) >> return( d[ sample(1:length(d),size, replace=replace, prob=prob)]) >> } >> else { >> if (missing(size)) size = dim(d)[1]; >> inds = sample(1:(dim(d))[1], size, replace=replace, prob=prob) >> if (is.data.frame(d) | is.matrix(d)) { >> return(d[inds,]); >> } else { >> return(d[inds]); >> } >> } >> } >> ------------------------------------------------------------------------------------- >> >>> do >> function(n=10){ >> as.repeater(n) >> } >> ------------------------------------------------------------------------------------- >> >>> as.repeater >> function(n=5){ >> foo = list(n=n) >> class(foo) = 'repeater' >> return(foo) >> } >> >> Then I made the functions in my workspace by choosing the name (same >> name), and then "=" and then copied and pasted the function, beginning >> with function( and ending with the final }'s. >> >> But when I try to do the following in my workspace afterwards: >> >> samps = do(500)* coef(lm(MIC.~1+Challenge+Cleaner+Replicate, >> data=resample(ecoli))) >> sd(samps) >> >> I get an error: >> Error in do(500) * coef(lm(MIC. ~ 1 + Challenge + Cleaner + Replicate, : >> non-numeric argument to binary operator. 
>> >> But in the workspace that comes with that book, I get a decent output: >> >> sd(samps) >> (Intercept) Challenge CleanerGarlic ReplicateFirst >> ReplicateFourth >> 3.9455401 0.7178385 1.6830641 5.4564926 >> 5.4320998 >> ReplicateSecond ReplicateThird >> 5.3895562 5.5422622 >> >> Is there anybody out there who know a lot more about programming >> functions in R than I do, that might know why this is giving me the >> error? I don't understand why one workspace would accept the model >> formula, when the other give me the non-numeric argument to binary >> vector, the only vector that's not numerical is Replicate, but I don't >> that's what the error is talking about. >> >> Thanks, >> Ben Ward. >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > From diasandre at gmail.com Tue Jan 4 12:31:07 2011 From: diasandre at gmail.com (ADias) Date: Tue, 4 Jan 2011 03:31:07 -0800 (PST) Subject: [R] Help with "For" instruction In-Reply-To: <929370234F6A43EDB0971FE30AB25367@Aragorn> References: <1294120817274-3173074.post@n4.nabble.com> <929370234F6A43EDB0971FE30AB25367@Aragorn> Message-ID: <1294140667706-3173386.post@n4.nabble.com> Hi thank you all. I think I have what I need to solve my problem. Regards, A.Dias -- View this message in context: http://r.789695.n4.nabble.com/Help-with-For-instruction-tp3173074p3173386.html Sent from the R help mailing list archive at Nabble.com. From kossi at cs.tu-berlin.de Tue Jan 4 13:07:31 2011 From: kossi at cs.tu-berlin.de (Daniel Kosztyla) Date: Tue, 4 Jan 2011 13:07:31 +0100 (CET) Subject: [R] S4 plot function (Package)... Message-ID: Hello, I have a problem programming 2 plot methods for 2 classes. 1. I have 2 classes. Each class has its own plot method. !!! 
How the plot,xxx-method.R -files have to look like, so that the plot method for class 1 und plot methods for class 2 can act for each belonging class. Prob My example for "plot_class1.R" for class labeled class1 (class2 equivalent): if (!isGeneric("plot")) { setGeneric("plot", function(x="class1", y="ANY", ...) standardGeneric("plot"), package="the_package_name"); } setMethod("plot", signature(x="class1", y="ANY"), function(x, y, type="xy", method = names(x at misclass), anno="symbol", ...) { ... #code } } #DOCS \name{plot,class1-method} \docType{methods} \alias{plot} \alias{plot,class1-method} ... Looks it correct? thx so far. Dan From Louisa_Lafrez at hotmail.com Tue Jan 4 13:50:04 2011 From: Louisa_Lafrez at hotmail.com (Louisa) Date: Tue, 4 Jan 2011 04:50:04 -0800 (PST) Subject: [R] Inverse Gaussian Distribution In-Reply-To: <1294085029867-3172533.post@n4.nabble.com> References: <1294085029867-3172533.post@n4.nabble.com> Message-ID: <1294145404452-3173468.post@n4.nabble.com> Thank you! But i'm wondering: if you run area <- factor(area, levels=c("C", "A","B","D","E","F") ) then you are transforming only 'area', aren't you? isn't it possible to transform the whole data like i did for agecat but now for area and area C as baseline, or are you doing so when you run > area <- factor(area, levels=c("C", "A","B","D","E","F") ) > attach(data) and then run the model with area as predictorvariable: > model <- glm(Y~ agecat+gender+area,...) My question is if i can run it as follows and still have a right solution : > data <-transform(data, area=(factor(area, levels=c("C", > "A","B","D","E","F") ) ) I'll be very grateful for any help you can provide! Kind regards, Louisa -- View this message in context: http://r.789695.n4.nabble.com/Inverse-Gaussian-Distribution-tp3172533p3173468.html Sent from the R help mailing list archive at Nabble.com. 
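A minimal sketch of the transform() approach asked about above, on a made-up data frame (the name `dat` and its values are assumptions for illustration, not the poster's data). It checks that "C" ends up as the first factor level, which is the level a model formula treats as the baseline under R's default treatment contrasts:

```r
# Made-up stand-in for the real data; only the coding of 'area' matters here.
dat <- data.frame(
  area = c("A", "B", "C", "D", "E", "F", "C", "A"),
  Y    = c(1.2, 0.8, 1.5, 2.1, 0.9, 1.7, 1.1, 1.3),
  stringsAsFactors = FALSE
)

# Recode 'area' inside the data frame, listing the baseline level first.
dat <- transform(dat,
                 area = factor(area, levels = c("C", "A", "B", "D", "E", "F")))
levels(dat$area)

# relevel() reaches the same level order without spelling out every level.
area2 <- relevel(factor(c("A", "B", "C", "D", "E", "F")), ref = "C")
levels(area2)

# The design matrix has no 'areaC' column: level C is absorbed into the
# intercept, i.e. it acts as the reference category.
mm <- model.matrix(~ area, data = dat)
colnames(mm)
```

Because the recoded column lives in the data frame itself, a model call can take `data = dat` and there is no need for attach().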
From milano_amy at yahoo.com Tue Jan 4 13:32:39 2011
From: milano_amy at yahoo.com (Amy Milano)
Date: Tue, 4 Jan 2011 04:32:39 -0800 (PST)
Subject: [R] RSQLite to input dataframe
In-Reply-To:
Message-ID: <885706.21976.qm@web114405.mail.gq1.yahoo.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From ripley at stats.ox.ac.uk Tue Jan 4 14:05:06 2011
From: ripley at stats.ox.ac.uk (Prof Brian Ripley)
Date: Tue, 4 Jan 2011 13:05:06 +0000 (GMT)
Subject: [R] ARIMA simulation including a constant
In-Reply-To:
References:
Message-ID:

That output is not from arima(), which is the function paired with arima.sim(). Nor is it from arima0(), so I don't believe it is 'output from R' (perhaps from a contributed package you have not mentioned?).

With arima(), the intercept is 'm' in the notation on the help page and not 'a' in your personal re-definition. You can easily go from one to the other by some trivial algebra ( a = m*(1-sum(ar)) ), but you do need to be sure what the unstated function you use is using.

On Mon, 3 Jan 2011, Paolo Rossi wrote:

> Sorry I am not really sure I have been taht clear.
>
> I meant ARMA which is not bound to have zero mean. More precisely, suppose I
> estimate y(t) = a + by(t-1) + e(t) + ce(t-1) , i.e. and ARMA(1,1). My
> question is how do I simulate values for yt given the values for a, b and c?
> My problem with arima.sim is that I cannot find a way to pass the value for
> the constant a.
>
> Output from R is:
>
> Coefficient(s):
>            Estimate  Std. Error  t value Pr(>|t|)
> ar1         0.82978     0.01033   80.297  < 2e-16 ***
> ma1         0.46347     0.01548   29.942  < 2e-16 ***
> intercept  -0.02666     0.01012   -2.635  0.00841 **
> ---
> Intercept is significant and I suppose it should be used if I want simulate
> values from this ARMA(1,1)
>
> Thanks and Apologies for not being clear
>
> Paolo
>
> On 3 January 2011 16:46, Prof Brian Ripley wrote:
> On Mon, 3 Jan 2011, Paolo Rossi wrote:
>
> Hi,
>
> I have been looking at arima.sim to simulate the output from an ARMA model
> fed with a normal and uncorrelated input series but I cannot find a way to
> pass an intercept / constant into the model. In other words, the model input
> in the function allows only for the AR and MA components but I need to pass
> a constant.
>
> Can anyone help?
>
>
> Well, an ARIMA model by definition has zero mean (as the link on the
> help page for arima.sim to the exact definition tells you). Perhaps
> you mean that (X-m) = Z follows an ARIMA model, in which case simulate
> Z and add m. For a differenced ARIMA model it is not clear if you
> meant that you wanted an intercept for the original or differenced
> series: for the latter simply simulate the differenced series, add the
> intercept and use diffinv().
>
> Thanks
>
> Paolo
>
> [[alternative HTML version deleted]]
>
>
> Please do as we asked in the posting guide and not send HTML.
>
> --
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

From djmuser at gmail.com Tue Jan 4 14:07:03 2011
From: djmuser at gmail.com (Dennis Murphy)
Date: Tue, 4 Jan 2011 05:07:03 -0800
Subject: [R] how to subset unique factor combinations from a data frame.
In-Reply-To: <3C14C468291E41F2986AD3B30DBDA276@primpsg> References: <5863F0E77BB746C7834DB95DD25FBA20@primpsg> <3C14C468291E41F2986AD3B30DBDA276@primpsg> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From murdoch.duncan at gmail.com Tue Jan 4 14:31:15 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Tue, 04 Jan 2011 08:31:15 -0500 Subject: [R] Print plot to pdf, jpg or any other format when using scatter3d error In-Reply-To: <4D227511.3040009@gmail.com> References: <4D227511.3040009@gmail.com> Message-ID: <4D232123.8010908@gmail.com> On 03/01/2011 8:17 PM, Jurica Seva wrote: > Hi, > > I have been trying to output my graphs to a file (jpeg, pdf, ps, it > doesnt matter) but i cant seem to be able to get it to output. As Uwe said, you are using rgl graphics, not base graphics. So none of the standard devices work, you need to use the tools built into rgl. Attach that package, and then read ?rgl.postscript (for graphics in various vector formats, not just Postscript) and ?rgl.snapshot (for bitmapped graphics). Some notes: - For a while rgl.snapshot wasn't working in the Windows builds with R 2.12.1; that is now fixed, so you should update rgl before getting frustrated. - rgl.snapshot just takes a copy of the graphics buffer that is showing on screen, so it is limited to the size you can display - rgl.postscript does a better job for the parts of an image that it can handle, but it is not a perfect OpenGL emulator, so it doesn't always include all components of a graph properly. Duncan Murdoch > I tried a > few things but none of them worked and am lost as what to do now. I am > using the scatter3d function, and it prints out the graphs on tot he > screen without any problems, but when it comes to writing them to a file > i cant make it work. Is there any other way of producing 3dimensional > graphs (they dont have to be rotatable/interactive after the print out)? 
> > The code is fairly simple and is listed down : > > #libraries > library(RMySQL) > library(rgl) > library(scatterplot3d) > library(Rcmdr) > > ############################################################################## > #database connection > mycon<- dbConnect(MySQL(), > user='root',dbname='test',host='localhost',password='') > #distinct sessions > rsSessionsU01<- dbSendQuery(mycon, "select distinct sessionID from > actiontimes where userID = 'ID01'") > sessionU01<-fetch(rsSessionsU01) > sessionU01[2,] > > #user01 data > mycon<- dbConnect(MySQL(), > user='root',dbname='test',host='localhost',password='') > rsUser01<- dbSendQuery(mycon, "select > a.userID,a.sessionID,a.actionTaken,a.timelineMSEC,a.durationMSEC,b.X,b.Y,b.Rel__dist_,b.Total_dist_ > from `actiontimes` as a , `ulogdata` as b where a.originalRECNO = > b.RECNO and a.userID='ID01'") > user01<- fetch(rsUser01, n= -1) > user01[1,1] > > #plot loop > > for (i in 1:10){ > > userSubset<-subset(user01,sessionID == > sessionU01[i,],select=c(timelineMSEC,X,Y)) > userSubset > x<-as.numeric(userSubset$X) > y<-as.numeric(userSubset$Y) > scatter3d(x,y,userSubset$timeline,xlim = c(0,1280), ylim = > c(0,1024), > zlim=c(0,1800000),type="h",main=sessionU01[i,],sub=sessionU01[i,]) > tmp6=rep(".ps") > tmp7=paste(sessionU01[i,],tmp6,sep="") > tmp7 > rgl.postscript(tmp7,"ps",drawText=FALSE) > #pdf(file=tmp7) > #dev.print(file=tmp7, device=pdf, width=600) > #dev.off(2) > } > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
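The advice above about rgl's own export tools could be sketched roughly as follows. This is only an illustration under stated assumptions: the session IDs and the random points are invented stand-ins for the database query results, and the drawing part is guarded so that it runs only in an interactive session with rgl installed, since rgl needs a display. plot3d(), rgl.postscript() and rgl.snapshot() are rgl functions; their help pages list the supported formats.

```r
# Hypothetical session IDs standing in for sessionU01[i, ] in the posted loop.
session_ids <- c("s01", "s02", "s03")

# Build one output file name per session up front.
pdf_files <- paste0(session_ids, ".pdf")
png_files <- paste0(session_ids, ".png")

# rgl draws to an OpenGL window, so skip the plotting in batch runs or
# when the package is not available.
if (interactive() && requireNamespace("rgl", quietly = TRUE)) {
  rgl::open3d()
  for (i in seq_along(session_ids)) {
    # Invented coordinates; the real code would subset the query result here.
    x <- runif(50, 0, 1280)
    y <- runif(50, 0, 1024)
    z <- runif(50, 0, 1800000)

    # plot3d() replaces the current scene by default, so one window is reused.
    rgl::plot3d(x, y, z, type = "h",
                xlab = "X", ylab = "Y", zlab = "timelineMSEC")

    # Vector output (may not reproduce every component of the scene).
    rgl::rgl.postscript(pdf_files[i], fmt = "pdf")

    # Bitmap copy of what is currently shown on screen.
    rgl::rgl.snapshot(png_files[i], fmt = "png")
  }
}
```

The base graphics devices (pdf(), dev.print(), dev.off()) are left out on purpose, since they do not capture rgl windows.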
From nevil.amos at gmail.com Tue Jan 4 14:31:28 2011 From: nevil.amos at gmail.com (Nevil Amos) Date: Wed, 05 Jan 2011 00:31:28 +1100 Subject: [R] how to keep keep matching column in output of merge Message-ID: <4D232130.1020801@sci.monash.edu.au> How do I keep the linking column[s] in a merge()? I need to use the values again in a further merge. thanks Nevil Amos From dwinsemius at comcast.net Tue Jan 4 14:32:41 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Tue, 4 Jan 2011 08:32:41 -0500 Subject: [R] lattice: par.settings with standard.theme() + additional arguments? In-Reply-To: <75668119-4D35-4A72-A23F-1B624F2BA6AA@web.de> References: <75668119-4D35-4A72-A23F-1B624F2BA6AA@web.de> Message-ID: On Jan 4, 2011, at 5:57 AM, Marius Hofert wrote: > Dear expeRts, > > I usually use par.settings = standard.theme(color = FALSE) to create > lattice graphics > without colors, so something like > > library(lattice) > x <- runif(10) > xyplot(x ~ 1:10, type = "l", par.settings = standard.theme(color = > FALSE)) > > Now I would like to use an additional component in par.settings. I > tried several things > like > > xyplot(x ~ 1:10, type = "l", par.settings = c(standard.theme(color = > FALSE), list(par.xlab.text = list(cex = 5, col = "blue")))) > > but it doesn't work. I know I could use lattice.options() but is > there a way to get it > right ("locally") with par.settings? 
Add it as a list element: xyplot(x ~ 1:10, type = "l", par.settings = list(standard.theme(color = FALSE), par.xlab.text = list(cex = 5, col = "blue"))) -- David Winsemius, MD West Hartford, CT From murdoch.duncan at gmail.com Tue Jan 4 14:37:20 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Tue, 04 Jan 2011 08:37:20 -0500 Subject: [R] Listing of available functions In-Reply-To: References: Message-ID: <4D232290.8000502@gmail.com> On 04/01/2011 7:13 AM, S?bastien Bihorel wrote: > Dear R-users, > > Is there a easy way to access to a complete listing of available functions > from a R session? The help.start() and ? functions are great, but I feel > like they require the user to know the answer in advance (especially with > respect to function names)... I could not find a easy way to simply browse > through a list of functions and randomly pick one function to see what is > does. > > Is there such a possibility in R? > > Thanks > > PS: I apologize if this question appears trivial. This requires a two level search, but is probably more useful than just a list of function names: Run help.start() to get into the main help page. Click on "Packages". Now you'll see a list of all installed packages and their titles. Choose an interesting package, and click on its name. Now you'll see a list of most help aliases (generally all functions, plus some other things) and the title of the associated help page. You don't see all help aliases: S4 method documentation tends to have a lot of aliases, and some of those are suppressed. But you should see all help pages at least once. Duncan Murdoch From dwinsemius at comcast.net Tue Jan 4 14:38:05 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Tue, 4 Jan 2011 08:38:05 -0500 Subject: [R] Listing of available functions In-Reply-To: References: Message-ID: <9BD89916-C371-44C3-8D7A-C47D3F093250@comcast.net> You can also search for a "cheatsheet". 
There are several out there searching on "cheatsheet r" This one at Oregon State is presented as a web page" http://www.science.oregonstate.edu/~shenr/Rhelp/00cheat.htm Others are available as pdf's. On Jan 4, 2011, at 7:13 AM, S?bastien Bihorel wrote: > Dear R-users, > > Is there a easy way to access to a complete listing of available > functions > from a R session? The help.start() and ? functions are great, but I > feel > like they require the user to know the answer in advance (especially > with > respect to function names)... I could not find a easy way to simply > browse > through a list of functions and randomly pick one function to see > what is > does. > > Is there such a possibility in R? > > Thanks > > PS: I apologize if this question appears trivial. > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT From m_hofert at web.de Tue Jan 4 14:39:17 2011 From: m_hofert at web.de (Marius Hofert) Date: Tue, 4 Jan 2011 14:39:17 +0100 Subject: [R] lattice: par.settings with standard.theme() + additional arguments? In-Reply-To: References: <75668119-4D35-4A72-A23F-1B624F2BA6AA@web.de> Message-ID: Dear David, this I already tried. But as you can see, the plot itself *is* colored. However, I want to have color = FALSE, so, unfortunately, this approach does not work... 
Cheers, Marius On 2011-01-04, at 14:32 , David Winsemius wrote: > > On Jan 4, 2011, at 5:57 AM, Marius Hofert wrote: > >> Dear expeRts, >> >> I usually use par.settings = standard.theme(color = FALSE) to create lattice graphics >> without colors, so something like >> >> library(lattice) >> x <- runif(10) >> xyplot(x ~ 1:10, type = "l", par.settings = standard.theme(color = FALSE)) >> >> Now I would like to use an additional component in par.settings. I tried several things >> like >> >> xyplot(x ~ 1:10, type = "l", par.settings = c(standard.theme(color = FALSE), list(par.xlab.text = list(cex = 5, col = "blue")))) >> >> but it doesn't work. I know I could use lattice.options() but is there a way to get it >> right ("locally") with par.settings? > > Add it as a list element: > > xyplot(x ~ 1:10, type = "l", par.settings = list(standard.theme(color = FALSE), par.xlab.text = list(cex = 5, col = "blue"))) > > -- > > David Winsemius, MD > West Hartford, CT > From dwinsemius at comcast.net Tue Jan 4 14:52:46 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Tue, 4 Jan 2011 08:52:46 -0500 Subject: [R] Inverse Gaussian Distribution In-Reply-To: <1294145404452-3173468.post@n4.nabble.com> References: <1294085029867-3172533.post@n4.nabble.com> <1294145404452-3173468.post@n4.nabble.com> Message-ID: On Jan 4, 2011, at 7:50 AM, Louisa wrote: > > Thank you! > > But i'm wondering: > > if you run > > area <- factor(area, levels=c("C", "A","B","D","E","F") ) > > then you are transforming only 'area', aren't you? > > isn't it possible to transform the whole data like i did for agecat > but now for area and area C as baseline, Of course, but may I point out that you did no earlier say that you may have attached a dataframe containing "area". There are many confusing situations that arise when using attached dataframes and this may be one of them. Many experienced R users avoid attaching data objects like the plague. 
>
> or are you doing so when you run
>
>> area <- factor(area, levels=c("C", "A","B","D","E","F") )
>> attach(data)
>
> and then run the model with area as predictorvariable:
>
>> model <- glm(Y~ agecat+gender+area,...)
>
> My question is if i can run it as follows and still have a right
> solution :
>
>> data <-transform(data, area=(factor(area, levels=c("C",
>> "A","B","D","E","F") ) )

Is there any reason that you suspect this will not work? Why not try it ... now assuming, that is, that you failed to tell us that area was a column in a dataframe. (One set of parentheses appears extraneous, and the closing parenthesis of transform() is missing.)

data <- transform(data, area = factor(area, levels=c("C", "A","B","D","E","F") ) )

(Untested in the absence of a reproducible example ... but it looks as though it might work.)

>
> I'll be very grateful for any help you can provide!
>
> Kind regards,
>
> Louisa
>
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Inverse-Gaussian-Distribution-tp3172533p3173468.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT

From Timmie.Smet at econ.kuleuven.be Tue Jan 4 14:00:55 2011
From: Timmie.Smet at econ.kuleuven.be (F1Tim)
Date: Tue, 4 Jan 2011 05:00:55 -0800 (PST)
Subject: [R] lasso/lars error
In-Reply-To:
References:
Message-ID: <1294146055715-3173480.post@n4.nabble.com>

I'm getting the same error too, though it's very rare (especially when using datasets with many variables, say 25). A somewhat odd, but seemingly effective, way to work around it is to write try(lars(x,y),silent = T). Used this way, try() will not print an error on the console, and therefore any running programs are not aborted.
When there is no error, you simply get the LARS object. When there is an error, you get a character string (can be checked by 'is.character()'). I suggest to perturb the data a little to get it to work... Best Regards, Timmie Smet B.A.P. Operations Research and Business Statistics University of Leuven, Belgium -- View this message in context: http://r.789695.n4.nabble.com/lasso-lars-error-tp831400p3173480.html Sent from the R help mailing list archive at Nabble.com. From bbolker at gmail.com Tue Jan 4 15:20:41 2011 From: bbolker at gmail.com (Ben Bolker) Date: Tue, 4 Jan 2011 14:20:41 +0000 (UTC) Subject: [R] how to keep keep matching column in output of merge References: <4D232130.1020801@sci.monash.edu.au> Message-ID: Nevil Amos gmail.com> writes: > How do I keep the linking column[s] in a merge()? > I need to use the values again in a further merge. simple reproducible example please? (e.g. make up a couple of 4-row datasets that show what you want, what you are trying, and what you are getting instead). From eduardo.oliveirahorta at gmail.com Tue Jan 4 15:24:30 2011 From: eduardo.oliveirahorta at gmail.com (Eduardo de Oliveira Horta) Date: Tue, 4 Jan 2011 12:24:30 -0200 Subject: [R] Saving objects inside a list In-Reply-To: <4D23077B.20807@mines-ales.fr> References: <4D23077B.20807@mines-ales.fr> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From sarah.goslee at gmail.com Tue Jan 4 15:26:54 2011 From: sarah.goslee at gmail.com (Sarah Goslee) Date: Tue, 4 Jan 2011 09:26:54 -0500 Subject: [R] how to keep keep matching column in output of merge In-Reply-To: <4D232130.1020801@sci.monash.edu.au> References: <4D232130.1020801@sci.monash.edu.au> Message-ID: Hi Nevil, We really need an example here of what you're doing, since merge() does keep the id column by default. 
> x <- data.frame(id = c("a", "b", "c", "d"), x=c(1,2,3,4))
> y <- data.frame(id = c("b", "a", "d", "c"), y=c(101, 102, 103, 104))
> merge(x, y)
  id x   y
1  a 1 102
2  b 2 101
3  c 3 104
4  d 4 103

Sarah

On Tue, Jan 4, 2011 at 8:31 AM, Nevil Amos wrote:
> How do I keep the linking column[s] in a merge()?
> I need to use the values again in a further merge.
>
> thanks
>
> Nevil Amos
>

--
Sarah Goslee
http://www.functionaldiversity.org

From diegopujoni at gmail.com Tue Jan 4 15:31:03 2011
From: diegopujoni at gmail.com (Diego Pujoni)
Date: Tue, 4 Jan 2011 12:31:03 -0200
Subject: [R] How to make a Cluster of Clusters
Message-ID:

Dear R-help,

In my Master's thesis I measured 10 variables from 18 lakes. These measurements were taken 4 times a year at 3 depths, so I have 12 samples from each lake. I know that the 12 samples cannot be treated as replicates, since they don't correspond to the same environmental characteristics and are not statistically independent, but I want to use them as an estimate of the annual range of the 10 variables in each of the 18 lakes.

I want to make a cluster analysis of the 18 lakes, and the possibilities I know of are:

1- Average the 12 samples from each lake and cluster the averages (using Ward's method);
2- Use all 216 samples (18*12) to make the cluster (which yields a mess).

But I thought I could start the cluster algorithm with 18 clusters (lakes), each already containing 12 individuals (samples), and then proceed with the calculations as usual (using Ward's method). That way I would obtain a clustering of the 18 lakes that still uses all 12 samples.

I got the cluster Fortran algorithm and I'm trying to translate it to the R language to see how it works and maybe implement this kind of cluster-of-clusters analysis. Does anyone know if there is an algorithm that does this? Actually I did it by hand and got very good and meaningful results, but I want to implement it so I can try other merging criteria.
Thanks Diego Pujoni Zooplankton Ecology Laboratory Biological Sciences Institute Federal University of Minas Gerais Brazil From cs.loonstar at gmail.com Tue Jan 4 15:02:23 2011 From: cs.loonstar at gmail.com (LOON88) Date: Tue, 4 Jan 2011 06:02:23 -0800 (PST) Subject: [R] Calendar in R-program Message-ID: <1294149743785-3173566.post@n4.nabble.com> Hey. I have to do calendar in program R. I was looking for examples on this forum but havent found it. Can someone help me in this thing ? I would be really appreciate for that. Calendar should be the same as we have in Windows but I really dont know how to begin it. Hope u can show me the best way to do it. Cheers -- View this message in context: http://r.789695.n4.nabble.com/Calendar-in-R-program-tp3173566p3173566.html Sent from the R help mailing list archive at Nabble.com. From hadley at rice.edu Tue Jan 4 15:14:50 2011 From: hadley at rice.edu (Hadley Wickham) Date: Tue, 4 Jan 2011 14:14:50 +0000 Subject: [R] [R-pkgs] plyr 1.4 Message-ID: # plyr plyr is a set of tools for a common set of problems: you need to __split__ up a big data structure into homogeneous pieces, __apply__ a function to each piece and then __combine__ all the results back together. 
For example, you might want to: * fit the same model each patient subsets of a data frame * quickly calculate summary statistics for each group * perform group-wise transformations like scaling or standardising It's already possible to do this with base R functions (like split and the apply family of functions), but plyr makes it all a bit easier with: * totally consistent names, arguments and outputs * convenient parallelisation through the foreach package * input from and output to data.frames, matrices and lists * progress bars to keep track of long running operations * built-in error recovery, and informative error messages * labels that are maintained across all transformations Considerable effort has been put into making plyr fast and memory efficient, and in many cases plyr is as fast as, or faster than, the built-in functions. You can find out more at http://had.co.nz/plyr/, including a 20 page introductory guide, http://had.co.nz/plyr/plyr-intro.pdf. You can ask questions about plyr (and data-manipulation in general) on the plyr mailing list. Sign up at http://groups.google.com/group/manipulatr Version 1.4 (2011-01-03) ------------------------------------------------------------------------------ * `count` now takes an additional parameter `wt_var` which allows you to compute weighted sums. This is as fast, or faster than, `tapply` or `xtabs`. * Really fix bug in `names.quoted` * `.` now captures the environment in which it was evaluated. This should fix an esoteric class of bugs which no-one probably ever encountered, but will form the basis for an improved version of `ggplot2::aes`. 
Version 1.3.1 (2010-12-30) ------------------------------------------------------------------------------ * Fix bug in `names.quoted` that interfered with ggplot2 Version 1.3 (2010-12-28) ------------------------------------------------------------------------------ NEW FEATURES * new function `mutate` that works like transform to add new columns or overwrite existing columns, but computes new columns iteratively so later transformations can use columns created by earlier transformations. (It's also about 10x faster) (Fixes #21) BUG FIXES * split column names are no longer coerced to valid R names. * `quickdf` now adds names if missing * `summarise` preserves variable names if explicit names not provided (Fixes #17) * `arrays` with names should be sorted correctly once again (also fixed a bug in the test case that prevented me from catching this automatically) * `m_ply` no longer possesses .parallel argument (mistakenly added) * `ldply` (and hence `adply` and `ddply`) now correctly passes on .parallel argument (Fixes #16) * `id` uses a better strategy for converting to integers, making it possible to use for cases with larger potential numbers of combinations -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/ _______________________________________________ R-packages mailing list R-packages at r-project.org https://stat.ethz.ch/mailman/listinfo/r-packages From hadley at rice.edu Tue Jan 4 15:16:02 2011 From: hadley at rice.edu (Hadley Wickham) Date: Tue, 4 Jan 2011 14:16:02 +0000 Subject: [R] [R-pkgs] reshape2 1.1 Message-ID: Reshape2 is a reboot of the reshape package. It's been over five years since the first release of the package, and in that time I've learned a tremendous amount about R programming, and how to work with data in R. Reshape2 uses that knowledge to make a new package for reshaping data that is much more focussed and much much faster. 
This version improves speed at the cost of functionality, so I have renamed it to `reshape2` to avoid causing problems for existing users. Based on user feedback I may reintroduce some of these features. What's new in `reshape2`: * considerably faster and more memory efficient thanks to a much better underlying algorithm that uses the power and speed of subsetting to the fullest extent, in most cases only making a single copy of the data. * cast is replaced by two functions depending on the output type: `dcast` produces data frames, and `acast` produces matrices/arrays. * multidimensional margins are now possible: `grand_row` and `grand_col` have been dropped: now the name of the margin refers to the variable that has its value set to (all). * some features have been removed such as the `|` cast operator, and the ability to return multiple values from an aggregation function. I'm reasonably sure both these operations are better performed by plyr. * a new cast syntax which allows you to reshape based on functions of variables (based on the same underlying syntax as plyr): * better development practices like namespaces and tests. Initial benchmarking has shown `melt` to be up to 10x faster, pure reshaping `cast` up to 100x faster, and aggregating `cast()` up to 10x faster. This work has been generously supported by BD (Becton Dickinson). 
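As a rough illustration of the melt / dcast / acast split described above: the toy data frame and the object names below are invented for this sketch, and the reshape2 calls are guarded so they only run if the package is installed.

```r
# Toy wide data: one row per subject, one column per measurement.
wide_in <- data.frame(id = c("a", "b"), x = c(1, 2), y = c(3, 4),
                      stringsAsFactors = FALSE)

if (requireNamespace("reshape2", quietly = TRUE)) {
  # melt: wide -> long, one row per (id, variable) pair.
  long <- reshape2::melt(wide_in, id.vars = "id")

  # dcast: long -> data frame, one column per level of 'variable'.
  wide_df <- reshape2::dcast(long, id ~ variable, value.var = "value")

  # acast: long -> matrix, with the ids as row names.
  wide_mat <- reshape2::acast(long, id ~ variable, value.var = "value")
}
```

Naming `value.var` explicitly avoids relying on the package's guess about which column holds the values.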
Version 1.1 ----------- * `melt.data.frame` no longer turns characters into factors * All melt methods gain a `na.rm` and `value.name` arguments - these previously were only possessed by `melt.data.frame` (Fixes #5) -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/ _______________________________________________ R-packages mailing list R-packages at r-project.org https://stat.ethz.ch/mailman/listinfo/r-packages From Louisa_Lafrez at hotmail.com Tue Jan 4 15:37:47 2011 From: Louisa_Lafrez at hotmail.com (Louisa) Date: Tue, 4 Jan 2011 06:37:47 -0800 (PST) Subject: [R] Inverse Gaussian Distribution In-Reply-To: References: <1294085029867-3172533.post@n4.nabble.com> <1294145404452-3173468.post@n4.nabble.com> Message-ID: <1294151867396-3173579.post@n4.nabble.com> Thank you again David! I did not try it yet, cause neither the dataset nor R is on this computer. I'll try it in a few hours, as soon as possible, when I'm on my personal computer. I'll let you know if it works. I'm really curious! Thank you for your time! Best Wishes, Louisa -- View this message in context: http://r.789695.n4.nabble.com/Inverse-Gaussian-Distribution-tp3172533p3173579.html Sent from the R help mailing list archive at Nabble.com. From jsorkin at grecc.umaryland.edu Tue Jan 4 15:48:37 2011 From: jsorkin at grecc.umaryland.edu (John Sorkin) Date: Tue, 04 Jan 2011 09:48:37 -0500 Subject: [R] Page eject and clearing the console In-Reply-To: <4D2312CF.5030202@univie.ac.at> References: <4D2312CF.5030202@univie.ac.at> Message-ID: <4D22ECF5020000CB0007D05D@medicine.umaryland.edu> (1) I know that \n when used in cat, e.g. cat("\n") produces a line feed (i.e. skips to the next line). Is there any escape sequence that will go to the top of the next page? (2) I know that control L will clear the console. Is there an equivalent function or other means that can be used in R code to clear the console? Thanks, John John David Sorkin M.D., Ph.D. 
Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) Confidentiality Statement: This email message, including any attachments, is for th...{{dropped:6}} From alejandro.rodriguez.cuevas at gmail.com Tue Jan 4 16:15:46 2011 From: alejandro.rodriguez.cuevas at gmail.com (CALEF ALEJANDRO RODRIGUEZ CUEVAS) Date: Tue, 4 Jan 2011 09:15:46 -0600 Subject: [R] uroot Package and R 2.12.1 Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From ligges at statistik.tu-dortmund.de Tue Jan 4 16:23:44 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Tue, 04 Jan 2011 16:23:44 +0100 Subject: [R] uroot Package and R 2.12.1 In-Reply-To: References: Message-ID: <4D233B80.30602@statistik.tu-dortmund.de> On 04.01.2011 16:15, CALEF ALEJANDRO RODRIGUEZ CUEVAS wrote: > Hello friends. > > I'm wondering what happened to package uroot. I worked quite well with > older versions of R, however with 2.12.1 version it simply doesn't work. > > The worst thing is that I look for it in the contributed packages and it > simply doesn't appear. It was removed from the main repository since it did not work and was not fixed by its maintainer. Given the compatible license: Feel free to get it from the archives, fix it, and re-upload a new version to CRAN. Best, Uwe Ligges > I want to develop ADF test with seasonal (centered) dummies, is there any > other possible package that contains this test? > > Thanks a lot. 
> > Regards > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From ying.zhang at struq.com Tue Jan 4 16:52:12 2011 From: ying.zhang at struq.com (ying zhang) Date: Tue, 4 Jan 2011 15:52:12 -0000 Subject: [R] an error about JRI Message-ID: <01ac01cbac27$5bbb61a0$133224e0$@struq.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dieter.menne at menne-biomed.de Tue Jan 4 17:02:16 2011 From: dieter.menne at menne-biomed.de (Dieter Menne) Date: Tue, 4 Jan 2011 08:02:16 -0800 (PST) Subject: [R] lattice: how to "center" a title? In-Reply-To: <8D33361D-F6CC-4485-91C5-81BC16107E6C@web.de> References: <8D33361D-F6CC-4485-91C5-81BC16107E6C@web.de> Message-ID: <1294156936042-3173800.post@n4.nabble.com> mhofert wrote: > > trellis.device("pdf", width = 5, height = 5) > print(xyplot(0 ~ 0, main = "This title is not 'centered' for the human's > eye", scales = list(alternating = c(1,1), tck = c(1,0)))) > dev.off() > > ... the title does not seem to be "centered" for the human's eye [although > it is centered when the plot (width) is considered with the y-axis label]. > This is because there is a y label, and centering is on the page (as you noted). One way around this would be to add a similar padding at the right side. See the example below, where I have exaggerated the effect. Try a padding of 5 instead. 
Dieter

library(lattice)
trellis.device("pdf", width = 5, height = 5)
trellis.par.set(layout.widths = list(right.padding = 10))
print(xyplot(0 ~ 0, main = "This title is not 'centered' for the human's eye",
             scales = list(alternating = c(1,1), tck = c(1,0))))
dev.off()

--
View this message in context: http://r.789695.n4.nabble.com/lattice-how-to-center-a-title-tp3173271p3173800.html
Sent from the R help mailing list archive at Nabble.com.

From dwinsemius at comcast.net Tue Jan 4 17:08:55 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Tue, 4 Jan 2011 11:08:55 -0500
Subject: [R] lattice: par.settings with standard.theme() + additional arguments?
In-Reply-To:
References: <75668119-4D35-4A72-A23F-1B624F2BA6AA@web.de>
Message-ID:

On Jan 4, 2011, at 8:39 AM, Marius Hofert wrote:

> Dear David,
>
> this I already tried. But as you can see, the plot itself *is*
> colored. However, I want to have color = FALSE, so, unfortunately,
> this approach does not work...

Quite right. I didn't see that until you pointed it out. I had many failed attempts at a solution and then searched for prior postings using par.settings and standard.theme and found one by Ehlers earlier this year that seems to work when modified to your ends:

xyplot(x ~ 1:10, type = "l", par.settings = modifyList(standard.theme(color = FALSE), list(par.xlab.text = list(cex = 5) )))

I had tried various constructions that I thought would be equivalent, but I think I was getting the levels of the list structure wrong. This also works:

xyplot(x ~ 1:10, type = "l", par.setting=c(list(par.xlab.text = list(cex = 5), standard.theme(color = FALSE)) )

Notice the use of c() rather than list() to bind them together. When I looked at the output of standard.theme(color=FALSE), there was no par.xlab.text element so it seemed as though there should be no conflict and in fact there wasn't (except for the asynchrony between my brain and the R interpreter.)

best;
David.
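For readers following the list()/c()/modifyList() discussion, the structural difference between the three can be seen with plain nested lists, no lattice required. This is only a sketch: `theme` below is a made-up stand-in for whatever `standard.theme(color = FALSE)` returns, not its actual contents, and the thread itself does not establish how lattice treats each shape internally.

```r
# Hypothetical stand-in for a theme: a named list of named lists.
theme <- list(
  plot.line     = list(col = "black", lwd = 1),
  par.xlab.text = list(cex = 1, col = "black")
)
extra <- list(par.xlab.text = list(cex = 5))

# list(): the whole theme becomes ONE unnamed element,
# so its settings end up one level too deep.
nested <- list(theme, par.xlab.text = list(cex = 5))
names(nested)

# c(): the two lists are concatenated; a shared name simply appears twice.
joined <- c(theme, extra)
names(joined)

# modifyList(): elements with matching names are merged recursively,
# and everything not mentioned in 'extra' is kept as it was.
merged <- modifyList(theme, extra)
str(merged)
```

In `merged`, `par.xlab.text` carries the new `cex` alongside its original `col`, and `plot.line` is unchanged, which is the shape a single settings list is expected to have.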
>
> Cheers,
>
> Marius
>
> On 2011-01-04, at 14:32 , David Winsemius wrote:
>
>>
>> On Jan 4, 2011, at 5:57 AM, Marius Hofert wrote:
>>
>>> Dear expeRts,
>>>
>>> I usually use par.settings = standard.theme(color = FALSE) to
>>> create lattice graphics
>>> without colors, so something like
>>>
>>> library(lattice)
>>> x <- runif(10)
>>> xyplot(x ~ 1:10, type = "l", par.settings = standard.theme(color =
>>> FALSE))
>>>
>>> Now I would like to use an additional component in par.settings. I
>>> tried several things
>>> like
>>>
>>> xyplot(x ~ 1:10, type = "l", par.settings = c(standard.theme(color
>>> = FALSE), list(par.xlab.text = list(cex = 5, col = "blue"))))
>>>
>>> but it doesn't work. I know I could use lattice.options() but is
>>> there a way to get it
>>> right ("locally") with par.settings?
>>
>> Add it as a list element:
>>
>> xyplot(x ~ 1:10, type = "l", par.settings =
>> list(standard.theme(color = FALSE), par.xlab.text = list(cex = 5,
>> col = "blue")))
>>
>> --
>>
>> David Winsemius, MD
>> West Hartford, CT
>>
>

David Winsemius, MD
West Hartford, CT

From dieter.menne at menne-biomed.de Tue Jan 4 17:24:02 2011
From: dieter.menne at menne-biomed.de (Dieter Menne)
Date: Tue, 4 Jan 2011 08:24:02 -0800 (PST)
Subject: [R] Resampling to find Confidence intervals
In-Reply-To:
References:
Message-ID: <1294158242553-3173846.post@n4.nabble.com>

Axolotl9250 wrote:
>
> ...
> resampled_ecoli = sample(ecoli, 500, replace=T)
> coefs = (coef(lm(MIC. ~ 1 + Challenge + Cleaner + Replicate,
> data=resampled_ecoli)))
> sd(coefs)
>
> ...
>

Below is a simplified and self-consistent version of your code, with some changes.

Dieter

# resample
d = data.frame(x=rnorm(10))
d$y = d$x*3+rnorm(10,0.01)

# if you do this, you only get ONE bootstrap sample
d1 = d[sample(1:nrow(d),10,TRUE),]
d1.coef = coef(lm(y~x,data=d1))
d1.coef
# No error below, because you compute the sd of (Intercept) and slope
# but result is wrong!
sd(d1.coef)

# We have to do this over and over
# Check ?replicate for a more R-ish approach....
nsamples = 1000
allboot = NULL
for (i in 1:nsamples) {
  d1 = d[sample(1:nrow(d),10,TRUE),]
  d1.coef = coef(lm(y~x,data=d1))
  allboot = rbind(allboot,d1.coef) # Not very efficient, preallocate!
}
head(allboot) # display first of nsamples lines
apply(allboot,2,mean) # Compute mean
apply(allboot,2,sd) # compute sd
# After you are sure you understood the above, you might try package boot.

--
View this message in context: http://r.789695.n4.nabble.com/Resampling-to-find-Confidence-intervals-tp3172867p3173846.html
Sent from the R help mailing list archive at Nabble.com.

From andy_liaw at merck.com Tue Jan 4 17:24:03 2011
From: andy_liaw at merck.com (Liaw, Andy)
Date: Tue, 4 Jan 2011 11:24:03 -0500
Subject: [R] randomForest speed improvements
In-Reply-To: <1294097299183-3172834.post@n4.nabble.com>
References: <1294084769056-3172523.post@n4.nabble.com> <1294097299183-3172834.post@n4.nabble.com>
Message-ID:

If you have multiple cores, one "poor man's solution" is to run separate forests in different R sessions, save the RF objects, load them into the same session and combine() them. You can do this less clumsily if you use things like Rmpi or other distributed computing packages.

Another consideration is to increase nodesize (which reduces the sizes of trees). The problem with numeric predictors for tree-based algorithms is that the number of computations to find the best splitting point increases by that much _at each node_. Some algorithms try to save on this by using only certain quantiles. The current RF code doesn't do this.

Andy

> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of apresley
> Sent: Monday, January 03, 2011 6:28 PM
> To: r-help at r-project.org
> Subject: Re: [R] randomForest speed improvements
>
>
> I haven't tried changing the mtry or ntree at all ...
though > I suppose with > only 6 variables, and tens-of-thousands of rows, we can > probably do less > than 500 tree's (the default?). > > Although tossing the forest does speed things up a bit, seems > to be about 15 > - 20% faster in some cases, I need to keep the forest to do > the prediction, > otherwise, it complains that there is no forest component in > the object. > > -- > Anthony > -- > View this message in context: > http://r.789695.n4.nabble.com/randomForest-speed-improvements- > tp3172523p3172834.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > Notice: This e-mail message, together with any attachme...{{dropped:11}} From myotistwo at gmail.com Tue Jan 4 17:27:49 2011 From: myotistwo at gmail.com (Graham Smith) Date: Tue, 4 Jan 2011 16:27:49 +0000 Subject: [R] Cost-benefit/value for money analysis Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From ripley at stats.ox.ac.uk Tue Jan 4 17:35:12 2011 From: ripley at stats.ox.ac.uk (Prof Brian Ripley) Date: Tue, 4 Jan 2011 16:35:12 +0000 (GMT) Subject: [R] an error about JRI In-Reply-To: <01ac01cbac27$5bbb61a0$133224e0$@struq.com> References: <01ac01cbac27$5bbb61a0$133224e0$@struq.com> Message-ID: AFAIK JRI is part of rJava (and you installed rJava), which has its own mailing list: plasea use it. (http://rosuda.org/lists.shtml, I believe). There are tricky things with JRI on multi-architecture platforms, so you do need to be sure you are using the right versions (and rJava 0.8-8 is needed for 64-bit Windows R, for example). On Tue, 4 Jan 2011, ying zhang wrote: > Hi everyone, I try to run my R script in Java, thus I installed JRI. 
and run > the example, I am using Eclipse on 64 bits windows 7. part of the example > code is as follows: > > > > public static void main(String[] args) { > > System.out.println("Creating Rengine (with arguments)"); > > Rengine re=new Rengine(args, false, null); > > System.out.println("Rengine created, waiting for R"); > > if (!re.waitForR()) { > > System.out.println("Cannot load R"); > > return; > > } > > However, everytime I run it. it teminated after print out "Creating Rengine > (with arguments)" never successfully print out "Rengine created, waiting for > R" > > > > I do not know what is right argument to input, I have tried to add > "--no-save" under the Program arguments of eclipse run configuration, but > still does not help. > > > > any suggestions? Many thanks > > > > > > Ying > > > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 From dwinsemius at comcast.net Tue Jan 4 17:37:24 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Tue, 4 Jan 2011 11:37:24 -0500 Subject: [R] lattice: par.settings with standard.theme() + additional arguments? In-Reply-To: References: <75668119-4D35-4A72-A23F-1B624F2BA6AA@web.de> Message-ID: <7160ED63-9B65-43AA-9B85-E8AF3782EABD@comcast.net> On Jan 4, 2011, at 11:08 AM, David Winsemius wrote: > > On Jan 4, 2011, at 8:39 AM, Marius Hofert wrote: > >> Dear David, >> >> this I already tried. But as you can see, the plot itself *is* >> colored. 
However, I want to have color = FALSE, so, unfortunately, >> this approach does not work... > > Quite right. I didn't see that until you pointed it out. I had many > failed attempts at a solution and then searched for prior positngs > using par.settings and standard.theme and found one by Ehlers > earlier this year that seems to work when modified to your ends: > > xyplot(x ~ 1:10, type = "l", par.settings = > modifyList(standard.theme(color = FALSE), > list(par.xlab.text = list(cex = 5) ))) > > I had tried various constructions that I thought would be > equivalent, but I think I was getting the levels of the list > structure wrong. This also works: ### NO, it doesn't. I failed to notice that the command was incomplete so was looking at my prior plot. > > xyplot(x ~ 1:10, type = "l", par.setting=c(list(par.xlab.text = > list(cex = 5), standard.theme(color = FALSE)) ) > Proving that I still do not really understand why the modifyList approach works and what seemed to be equivalents do not. (Next paragraph is wrong.) > Notice the use of c() rather than list() to bind them together. When > I looked at the output of standard.theme(color=FALSE), there was no > par.xlab.text element so it seemed as though there should be no > conflict and in fact there wasn't (except for the asynchrony between > my brain and the R interpreter.) > > best; > David. >> >> Cheers, >> >> Marius >> >> On 2011-01-04, at 14:32 , David Winsemius wrote: >> >>> >>> On Jan 4, 2011, at 5:57 AM, Marius Hofert wrote: >>> >>>> Dear expeRts, >>>> >>>> I usually use par.settings = standard.theme(color = FALSE) to >>>> create lattice graphics >>>> without colors, so something like >>>> >>>> library(lattice) >>>> x <- runif(10) >>>> xyplot(x ~ 1:10, type = "l", par.settings = standard.theme(color >>>> = FALSE)) >>>> >>>> Now I would like to use an additional component in par.settings. 
>>>> I tried several things >>>> like >>>> >>>> xyplot(x ~ 1:10, type = "l", par.settings = >>>> c(standard.theme(color = FALSE), list(par.xlab.text = list(cex = >>>> 5, col = "blue")))) >>>> >>>> but it doesn't work. I know I could use lattice.options() but is >>>> there a way to get it >>>> right ("locally") with par.settings? >>> >>> Add it as a list element: >>> >>> xyplot(x ~ 1:10, type = "l", par.settings = >>> list(standard.theme(color = FALSE), par.xlab.text = list(cex = 5, >>> col = "blue"))) >>> >>> -- >>> >>> David Winsemius, MD >>> West Hartford, CT >>> >> > > David Winsemius, MD > West Hartford, CT > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT From joonair at gmail.com Tue Jan 4 17:46:16 2011 From: joonair at gmail.com (joonR) Date: Tue, 4 Jan 2011 08:46:16 -0800 (PST) Subject: [R] plot without points overlap Message-ID: <1294159576846-3173894.post@n4.nabble.com> Hi, I'm trying to plot a grid of points of different dimensions using the simple plot() function. I want to plot the points such that they DO NOT overlap, I guess there should be a way to set a maximum distance between the points, but I cannot find it. Can you help? Thanks a lot! g PS: Is it possible to produce device regions of different dimensions? (i.e. a rectangular one with height > width) -- View this message in context: http://r.789695.n4.nabble.com/plot-without-points-overlap-tp3173894p3173894.html Sent from the R help mailing list archive at Nabble.com. From m_hofert at web.de Tue Jan 4 17:48:38 2011 From: m_hofert at web.de (Marius Hofert) Date: Tue, 4 Jan 2011 17:48:38 +0100 Subject: [R] lattice: par.settings with standard.theme() + additional arguments? 
In-Reply-To: <7160ED63-9B65-43AA-9B85-E8AF3782EABD@comcast.net> References: <75668119-4D35-4A72-A23F-1B624F2BA6AA@web.de> <7160ED63-9B65-43AA-9B85-E8AF3782EABD@comcast.net> Message-ID: <897F54BC-DDBF-41EF-B778-3DE7C6D3F004@web.de> Dear David, that's funny to read. I guess we did pretty much the same. I also thought I got the list structure wrong and I also tried c()... (since I recently learned from Gabor that a list is only a vector of mode list).... and I also searched for posts with exactly the same words [but wasn't as successful as you :-)]. Thanks for the modifyList()-trick! Cheers, Marius On 2011-01-04, at 17:37 , David Winsemius wrote: > > On Jan 4, 2011, at 11:08 AM, David Winsemius wrote: > >> >> On Jan 4, 2011, at 8:39 AM, Marius Hofert wrote: >> >>> Dear David, >>> >>> this I already tried. But as you can see, the plot itself *is* colored. However, I want to have color = FALSE, so, unfortunately, this approach does not work... >> >> Quite right. I didn't see that until you pointed it out. I had many failed attempts at a solution and then searched for prior positngs using par.settings and standard.theme and found one by Ehlers earlier this year that seems to work when modified to your ends: >> >> xyplot(x ~ 1:10, type = "l", par.settings = modifyList(standard.theme(color = FALSE), >> list(par.xlab.text = list(cex = 5) ))) >> >> I had tried various constructions that I thought would be equivalent, but I think I was getting the levels of the list structure wrong. This also works: > ### NO, it doesn't. I failed to notice that the command was incomplete so was looking at my prior plot. >> >> xyplot(x ~ 1:10, type = "l", par.setting=c(list(par.xlab.text = list(cex = 5), standard.theme(color = FALSE)) ) >> > Proving that I still do not really understand why the modifyList approach works and what seemed to be equivalents do not. (Next paragraph is wrong.) > >> Notice the use of c() rather than list() to bind them together. 
When I looked at the output of standard.theme(color=FALSE), there was no par.xlab.text element so it seemed as though there should be no conflict and in fact there wasn't (except for the asynchrony between my brain and the R interpreter.) >> >> best; >> David. >>> >>> Cheers, >>> >>> Marius >>> >>> On 2011-01-04, at 14:32 , David Winsemius wrote: >>> >>>> >>>> On Jan 4, 2011, at 5:57 AM, Marius Hofert wrote: >>>> >>>>> Dear expeRts, >>>>> >>>>> I usually use par.settings = standard.theme(color = FALSE) to create lattice graphics >>>>> without colors, so something like >>>>> >>>>> library(lattice) >>>>> x <- runif(10) >>>>> xyplot(x ~ 1:10, type = "l", par.settings = standard.theme(color = FALSE)) >>>>> >>>>> Now I would like to use an additional component in par.settings. I tried several things >>>>> like >>>>> >>>>> xyplot(x ~ 1:10, type = "l", par.settings = c(standard.theme(color = FALSE), list(par.xlab.text = list(cex = 5, col = "blue")))) >>>>> >>>>> but it doesn't work. I know I could use lattice.options() but is there a way to get it >>>>> right ("locally") with par.settings? >>>> >>>> Add it as a list element: >>>> >>>> xyplot(x ~ 1:10, type = "l", par.settings = list(standard.theme(color = FALSE), par.xlab.text = list(cex = 5, col = "blue"))) >>>> >>>> -- >>>> >>>> David Winsemius, MD >>>> West Hartford, CT >>>> >>> >> >> David Winsemius, MD >> West Hartford, CT >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. 
> > David Winsemius, MD > West Hartford, CT > From jsorkin at grecc.umaryland.edu Tue Jan 4 17:54:28 2011 From: jsorkin at grecc.umaryland.edu (John Sorkin) Date: Tue, 04 Jan 2011 11:54:28 -0500 Subject: [R] Page eject and clearing the console Message-ID: <4D230A74020000CB0007D0B6@medicine.umaryland.edu> (1) I know that \n when used in cat, e.g. cat("\n") produces a line feed (i.e. skips to the next line). Is there any escape sequence that will go to the top of the next page? (2) I know that control L will clear the console. Is there an equivalent function or other means that can be used in R code to clear the console? Thanks, John John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) Confidentiality Statement: This email message, including any attachments, is for th...{{dropped:6}} From wwwhsd at gmail.com Tue Jan 4 17:58:57 2011 From: wwwhsd at gmail.com (Henrique Dallazuanna) Date: Tue, 4 Jan 2011 14:58:57 -0200 Subject: [R] Page eject and clearing the console In-Reply-To: <4D230A74020000CB0007D0B6@medicine.umaryland.edu> References: <4D230A74020000CB0007D0B6@medicine.umaryland.edu> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From sarah.goslee at gmail.com Tue Jan 4 18:00:58 2011 From: sarah.goslee at gmail.com (Sarah Goslee) Date: Tue, 4 Jan 2011 12:00:58 -0500 Subject: [R] Page eject and clearing the console In-Reply-To: <4D230A74020000CB0007D0B6@medicine.umaryland.edu> References: <4D230A74020000CB0007D0B6@medicine.umaryland.edu> Message-ID: Hi John, I don't understand what you mean by "top of the next page", or rather, how that differs from clearing the screen. 
And for the latter, that is dependent on OS, and on GUI/console usage, and has been discussed several times on the list. The easiest solution is to invoke the system command if running in a console. For Linux, that would be:

system("clear")

A Google search using the exact words from your question found several more options, including a discussion of doing this in Windows.

http://www.google.com/search?q=R+code+to+clear+the+console

I can't test them for you, but since I don't know if you're using Windows or not it may not matter.

Sarah

On Tue, Jan 4, 2011 at 11:54 AM, John Sorkin wrote:
> (1) I know that \n when used in cat, e.g. cat("\n") produces a line feed (i.e. skips to the next line). Is there any escape sequence that will go to the top of the next page?
> (2) I know that control L will clear the console. Is there an equivalent function or other means that can be used in R code to clear the console?
>
> Thanks,
> John
>

--
Sarah Goslee
http://www.functionaldiversity.org

From diasandre at gmail.com Tue Jan 4 17:58:31 2011
From: diasandre at gmail.com (ADias)
Date: Tue, 4 Jan 2011 08:58:31 -0800 (PST)
Subject: [R] Help with "For" instruction
In-Reply-To: <1294140667706-3173386.post@n4.nabble.com>
References: <1294120817274-3173074.post@n4.nabble.com> <929370234F6A43EDB0971FE30AB25367@Aragorn> <1294140667706-3173386.post@n4.nabble.com>
Message-ID: <1294160311199-3173914.post@n4.nabble.com>

Hi,

Still with the above problem:

But for instance, I have a database with 30 variables and I created a set of objects, each with one variable missing:

DataBase - has 30 variables
DataBase1 has 29 variables with the 1st variable gone
DataBase2 has 29 variables with the 2nd variable gone

for(i in 1:length(database))
assign(paste("database",i,sep=""),database[-i])

Now, I wish to create the 30 distance matrices:

for (i in 1:length(database))
assign(paste("distancematrix",i,sep=""), dist(database[i]))

But doing it like this - database[i] - I am just referring to the 1st value on the object
database and not to the entire database i. How do I do this? thanks Regards, A.Dias -- View this message in context: http://r.789695.n4.nabble.com/Help-with-For-instruction-tp3173074p3173914.html Sent from the R help mailing list archive at Nabble.com. From eduardo.oliveirahorta at gmail.com Tue Jan 4 18:05:22 2011 From: eduardo.oliveirahorta at gmail.com (Eduardo de Oliveira Horta) Date: Tue, 4 Jan 2011 15:05:22 -0200 Subject: [R] Saving objects inside a list In-Reply-To: <77EB52C6DD32BA4D87471DCD70C8D70003C2DF80@NA-PA-VBE03.na.tibco.com> References: <4D23077B.20807@mines-ales.fr> <77EB52C6DD32BA4D87471DCD70C8D70003C2DF80@NA-PA-VBE03.na.tibco.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From ggrothendieck at gmail.com Tue Jan 4 18:05:53 2011 From: ggrothendieck at gmail.com (Gabor Grothendieck) Date: Tue, 4 Jan 2011 12:05:53 -0500 Subject: [R] Page eject and clearing the console In-Reply-To: <4D22ECF5020000CB0007D05D@medicine.umaryland.edu> References: <4D2312CF.5030202@univie.ac.at> <4D22ECF5020000CB0007D05D@medicine.umaryland.edu> Message-ID: On Tue, Jan 4, 2011 at 9:48 AM, John Sorkin wrote: > (1) I know that \n when used in cat, e.g. cat("\n") produces a line feed (i.e. skips to the next line). Is there any escape sequence that will go to the top of the next page? > (2) I know that control L will clear the console. Is there an equivalent function or other means that can be used in R code to clear the console? > Its always a good idea to look through the archives before posting. I had posted this 5 years ago. It uses RDCOMClient: http://www.mail-archive.com/r-help at stat.math.ethz.ch/msg57802.html and also posted a second version using rcom in the same thread. -- Statistics & Software Consulting GKX Group, GKX Associates Inc. 
tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com From jwiley.psych at gmail.com Tue Jan 4 18:08:26 2011 From: jwiley.psych at gmail.com (Joshua Wiley) Date: Tue, 4 Jan 2011 09:08:26 -0800 Subject: [R] plot without points overlap In-Reply-To: <1294159576846-3173894.post@n4.nabble.com> References: <1294159576846-3173894.post@n4.nabble.com> Message-ID: Hi, You can set the device region in inches using the "pin" argument (see ?par maybe halfway down or so). You can also set the aspect ratio in plot(), but I am not sure that is really what you want (see ?plot.window for that). Two Examples ####### par(pin = c(2, 4)) plot(1:10) dev.off() plot(1:10, asp = 2) ####### Hope that helps, Josh On Tue, Jan 4, 2011 at 8:46 AM, joonR wrote: > > Hi, > > I'm trying to plot a grid of points of different dimensions using the simple > plot() function. > I want to plot the points such that they DO NOT overlap, > I guess there should be a way to set a maximum distance between the points, > but I cannot find it. > > Can you help? > Thanks a lot! > > g > > PS: Is it possible to produce device regions of different dimensions? > (i.e. a rectangular one with height > width) > -- > View this message in context: http://r.789695.n4.nabble.com/plot-without-points-overlap-tp3173894p3173894.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ From m_hofert at web.de Tue Jan 4 18:09:04 2011 From: m_hofert at web.de (mhofert) Date: Tue, 4 Jan 2011 09:09:04 -0800 (PST) Subject: [R] lattice: how to "center" a title? 
In-Reply-To: <1294156936042-3173800.post@n4.nabble.com> References: <8D33361D-F6CC-4485-91C5-81BC16107E6C@web.de> <1294156936042-3173800.post@n4.nabble.com> Message-ID: <1294160944219-3173931.post@n4.nabble.com> Dear Dieter, many thanks, exactly what I was looking for. Cheers, Marius -- View this message in context: http://r.789695.n4.nabble.com/lattice-how-to-center-a-title-tp3173271p3173931.html Sent from the R help mailing list archive at Nabble.com. From msharp at sfbr.org Tue Jan 4 18:32:04 2011 From: msharp at sfbr.org (Mark Sharp) Date: Tue, 4 Jan 2011 11:32:04 -0600 Subject: [R] Calendar in R-program In-Reply-To: <1294149743785-3173566.post@n4.nabble.com> References: <1294149743785-3173566.post@n4.nabble.com> Message-ID: <3C1E0CC5-D58B-4883-A6F6-BB09F40F27AA@sfbr.org> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From f.harrell at vanderbilt.edu Tue Jan 4 18:39:26 2011 From: f.harrell at vanderbilt.edu (Frank Harrell) Date: Tue, 4 Jan 2011 09:39:26 -0800 (PST) Subject: [R] packagename:::functionname vs. importFrom In-Reply-To: References: <1294091145769-3172684.post@n4.nabble.com> <1294109427406-3172984.post@n4.nabble.com> Message-ID: <1294162766395-3174003.post@n4.nabble.com> Thanks Luke. By "the namespace from which you import is loaded when your package is" I take it that you are saying that all such referenced packages are loaded up front, which is not what I hoped. And it's too bad you can't import unexported objects, as that rather defeats the purpose of importFrom. Frank ----- Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/packagename-functionname-vs-importFrom-tp3172684p3174003.html Sent from the R help mailing list archive at Nabble.com. From f.harrell at vanderbilt.edu Tue Jan 4 18:43:39 2011 From: f.harrell at vanderbilt.edu (Frank Harrell) Date: Tue, 4 Jan 2011 09:43:39 -0800 (PST) Subject: [R] packagename:::functionname vs. 
importFrom In-Reply-To: <4D22C54F.9090803@fhcrc.org> References: <1294091145769-3172684.post@n4.nabble.com> <1294109427406-3172984.post@n4.nabble.com> <4D22C54F.9090803@fhcrc.org> Message-ID: <1294163019569-3174011.post@n4.nabble.com> Thanks Hadley, Luke, Martin, and Bill. Bill captured the essence of my reasons for needing an unexported function. Other reasons include occasional overrides of dispatching rules, and providing parallel versions of some functions that exist in other packages to make them easier to use in some sense. I could ask the package creator to export those functions but it would not make sense to him to do so. Frank ----- Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/packagename-functionname-vs-importFrom-tp3172684p3174011.html Sent from the R help mailing list archive at Nabble.com. From diasandre at gmail.com Tue Jan 4 18:55:26 2011 From: diasandre at gmail.com (=?ISO-8859-1?Q?Andr=E9_Dias?=) Date: Tue, 4 Jan 2011 17:55:26 +0000 Subject: [R] Help with "For" instruction In-Reply-To: References: <1294120817274-3173074.post@n4.nabble.com> <929370234F6A43EDB0971FE30AB25367@Aragorn> <1294140667706-3173386.post@n4.nabble.com> <1294160311199-3173914.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From benjamin.ward at bathspa.org Tue Jan 4 18:56:35 2011 From: benjamin.ward at bathspa.org (Ben Ward) Date: Tue, 4 Jan 2011 17:56:35 +0000 Subject: [R] Resampling to find Confidence intervals In-Reply-To: <1294158242553-3173846.post@n4.nabble.com> References: <1294158242553-3173846.post@n4.nabble.com> Message-ID: Ok I'll check I understand: So it's using sample, to resample d once, 10 values, because the rnorm has 10 values, with replacement (I assume thats the TRUE part). Then a for loop has this to resample the data - in the loop's case its 1000 times. 
Then it does an lm to get the coefficients and stores them in d1.coef. I'm guessing that the allboot bit with rbind, which is NULL at the start of the loop, is the collection of d1.coef values, as I think that without it, on every cycle of the loop the d1.coef from the previous cycle would be gone?

On 04/01/2011 16:24, Dieter Menne wrote:
> Axolotl9250 wrote:
>> ...
>> resampled_ecoli = sample(ecoli, 500, replace=T)
>> coefs = (coef(lm(MIC. ~ 1 + Challenge + Cleaner + Replicate,
>> data=resampled_ecoli)))
>> sd(coefs)
>>
>> ...
>>
> Below a simplified and self-consistent version of your code, and some
> changes
>
> Dieter
>
> # resample
> d = data.frame(x=rnorm(10))
> d$y = d$x*3+rnorm(10,0.01)
>
> # if you do this, you only get ONE bootstrap sample
> d1 = d[sample(1:nrow(d),10,TRUE),]
> d1.coef = coef(lm(y~x,data=d1))
> d1.coef
> # No error below, because you compute the sd of (Intercept) and slope
> # but result is wrong!
> sd(d1.coef)
>
> # We have to do this over and over
> # Check ?replicate for a more R-ish approach....
> nsamples = 1000
> allboot = NULL
> for (i in 1:1000) {
>   d1 = d[sample(1:nrow(d),10,TRUE),]
>   d1.coef = coef(lm(y~x,data=d1))
>   allboot = rbind(allboot,d1.coef) # Not very efficient, preallocate!
> }
> head(allboot) # display first of nsamples lines
> apply(allboot,2,mean) # Compute mean
> apply(allboot,2,sd) # compute sd
> # After you are sure you understood the above, you might try package boot.

From sarah.goslee at gmail.com Tue Jan 4 18:58:39 2011
From: sarah.goslee at gmail.com (Sarah Goslee)
Date: Tue, 4 Jan 2011 12:58:39 -0500
Subject: [R] Help with "For" instruction
In-Reply-To:
References: <1294120817274-3173074.post@n4.nabble.com> <929370234F6A43EDB0971FE30AB25367@Aragorn> <1294140667706-3173386.post@n4.nabble.com> <1294160311199-3173914.post@n4.nabble.com>
Message-ID:

2011/1/4 André Dias :
> hi
>
> how do I exactly use the get(). I am reading the help for get() but the way
> I am using it causes an error.
>

So how are you using it? It's so much easier to explain what you're doing wrong if I know what you're doing.

Without a reproducible example I can't show you exactly, but something like:

for (i in 1:length(database))
  assign(paste("distancematrix",i,sep=""),
         dist(get(paste("database", i, sep=""))))

get() is the counterpart of assign(), though there are better (more R-ish) ways of doing what you want.

Sarah

> thanks
> ADias
>
> 2011/1/4 Sarah Goslee
>>
>> With get().
>>
>> On Tue, Jan 4, 2011 at 11:58 AM, ADias wrote:
>> >
>> > Hi,
>> >
>> > Still with the above problem:
>> >
>> > But for instance, I have a data base with 30 variables and I created an
>> > object for each with one variable missing:
>> >
>> > DataBase - has 30 variables
>> > DataBase1 has 29 variables with the 1st variable gone
>> > DataBase2 has 29 variables with the 2nd variable gone
>> >
>> > for(i in 1:length(database))
>> > assign(paste("database",i,sep=""),database[-i])
>> >
>> > Now, I wish to create the 30 distance matrices:
>> >
>> > for (i in 1:length(database))
>> > assign(paste("distancematrix",i,sep=""),
>> > dist(database[i]))
>> >
>> > But doing it like this - database[i] - I am just referring to the 1st value
>> > on the object database and not to the entire database i.
>> >
>> > How do I do this?
>> >
>> > thanks
>> > Regards,
>> > A.Dias

--
Sarah Goslee
http://www.functionaldiversity.org

From Louisa_Lafrez at hotmail.com Tue Jan 4 18:47:33 2011
From: Louisa_Lafrez at hotmail.com (Louisa)
Date: Tue, 4 Jan 2011 09:47:33 -0800 (PST)
Subject: [R] Inverse Gaussian Distribution
In-Reply-To: <1294151867396-3173579.post@n4.nabble.com>
References: <1294085029867-3172533.post@n4.nabble.com> <1294145404452-3173468.post@n4.nabble.com> <1294151867396-3173579.post@n4.nabble.com>
Message-ID: <1294163253253-3174015.post@n4.nabble.com>

Dear David,

It works! Thank you so much for your help!
Louisa
--
View this message in context: http://r.789695.n4.nabble.com/Inverse-Gaussian-Distribution-tp3172533p3174015.html
Sent from the R help mailing list archive at Nabble.com.

From Greg.Snow at imail.org Tue Jan 4 19:03:41 2011
From: Greg.Snow at imail.org (Greg Snow)
Date: Tue, 4 Jan 2011 11:03:41 -0700
Subject: [R] Help with "For" instruction
In-Reply-To:
References: <1294120817274-3173074.post@n4.nabble.com> <929370234F6A43EDB0971FE30AB25367@Aragorn> <1294140667706-3173386.post@n4.nabble.com> <1294160311199-3173914.post@n4.nabble.com>
Message-ID:

If you had followed David's advice and put everything into a list or other structure instead of using the assign function (see fortune(236)) then you could just access the list element instead of needing get.

In the long run (or even medium and short run) life will be much easier for you if you learn to use proper data structures and not programmatically create global variables.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111

From diasandre at gmail.com Tue Jan 4 19:05:35 2011
From: diasandre at gmail.com (André Dias)
Date: Tue, 4 Jan 2011 18:05:35 +0000
Subject: [R] Help with "For" instruction
In-Reply-To:
References: <1294120817274-3173074.post@n4.nabble.com> <929370234F6A43EDB0971FE30AB25367@Aragorn> <1294140667706-3173386.post@n4.nabble.com> <1294160311199-3173914.post@n4.nabble.com>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From sarah.goslee at gmail.com Tue Jan 4 19:12:13 2011
From: sarah.goslee at gmail.com (Sarah Goslee)
Date: Tue, 4 Jan 2011 13:12:13 -0500
Subject: [R] Help with "For" instruction
In-Reply-To:
References: <1294120817274-3173074.post@n4.nabble.com> <929370234F6A43EDB0971FE30AB25367@Aragorn> <1294140667706-3173386.post@n4.nabble.com> <1294160311199-3173914.post@n4.nabble.com>
Message-ID:

You need to swap the get and paste commands: paste() creates the string holding the object's name, and get() then looks up the object with that name.

As already explained, using a list is much nicer.

Sarah

2011/1/4 André Dias :
> Hi
>
> I was doing
>
> for (i in 1:length(database))
> assign(paste("distancematrix",i,sep=""), dist(paste(get("database", i,
> sep="")))))
>
> but I really did not know what I was doing. I will try your way. But I still
> don't understand how the get function works.
>
> What would be more R-ish than get()?
>
> thanks
> ADias
>

--
Sarah Goslee
http://www.functionaldiversity.org

From spencer.graves at structuremonitoring.com Tue Jan 4 19:14:18 2011
From: spencer.graves at structuremonitoring.com (Spencer Graves)
Date: Tue, 04 Jan 2011 10:14:18 -0800
Subject: [R] Calendar in R-program
In-Reply-To: <3C1E0CC5-D58B-4883-A6F6-BB09F40F27AA@sfbr.org>
References: <1294149743785-3173566.post@n4.nabble.com> <3C1E0CC5-D58B-4883-A6F6-BB09F40F27AA@sfbr.org>
Message-ID: <4D23637A.7030408@structuremonitoring.com>

If all you want is a calendar "Date", the simplest may be the Date class in the base package. Try help('as.Date').

Dates and times can be extremely difficult to use for many reasons. For example, some months have 30 days, others have 31, and February usually has 28 days, but every 4th year, it has 29, except if the year is a century year, except ... . Times involve arithmetic to multiple bases. In addition, there are occasional leap seconds introduced by irregularities in the rotation of the earth.
Then don't forget time zones and holidays, which vary between countries and sometimes between regions within a country. Holidays further depend on the occupation. Financial markets have their own rules.

This complexity is met by a wide range of alternative systems in R for coding dates and times. A great overview of the options appeared in R News a few years ago. To find it, go to "r-project.org" -> "The R Journal" -> Archive -> "Table of Contents (all issues)" -> Search for "date". The second match should be "Gabor Grothendieck and Thomas Petzoldt. R Help Desk: Date and time classes in R. R News, 4(1):29-32, June 2004".

There have been many additions since 2004. Probably the quickest way to find other options is to use the sos package.

library(sos)
(cndr <- ???calendar)
# found 238 matches; retrieving 12 pages
# This will open a table in a web browser showing the 238 matches
# sorted by package.
# A function from the "lubridate" package is match #229 on this list.

The "sos" package also has a vignette, which provides examples of how to combine searches, write results to an Excel file with the first sheet being a summary by package, etc.

Hope this helps.
Spencer

On 1/4/2011 9:32 AM, Mark Sharp wrote:
> Look at the lubridate package from Hadley Wickham for great basic routines for handling date objects.
>
> Mark
> R. Mark Sharp, Ph.D.
> msharp at sfbr.org
>
> On Jan 4, 2011, at 8:02 AM, LOON88 wrote:
>
> Hey.
> I have to do a calendar in R. I was looking for examples on this forum
> but haven't found any. Can someone help me with this? I would really
> appreciate that. The calendar should be the same as the one we have in Windows, but I
> really don't know how to begin. Hope you can show me the best way to do it.
> Cheers
> --
> View this message in context: http://r.789695.n4.nabble.com/Calendar-in-R-program-tp3173566p3173566.html
> Sent from the R help mailing list archive at Nabble.com.

--
Spencer Graves, PE, PhD
President and Chief Operating Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph: 408-655-4567

From jsorkin at grecc.umaryland.edu Tue Jan 4 19:19:35 2011
From: jsorkin at grecc.umaryland.edu (John Sorkin)
Date: Tue, 04 Jan 2011 13:19:35 -0500
Subject: [R] Page eject and clearing the console
In-Reply-To:
References: <4D230A74020000CB0007D0B6@medicine.umaryland.edu>
Message-ID: <4D231E67020000CB0007D0D8@medicine.umaryland.edu>

I have received help on one of my questions (thank you Henrique Jorge and ), viz. how I can clear the console from an R program.
I have not yet received help on how I can skip to the top of the next page, i.e. cat("\n") skips to the next line; is there an equivalent way to skip to the top of the next page?
Thanks,
John

John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

>>> Henrique Dallazuanna 1/4/2011 11:58 AM >>>
Take a look on:
http://tolstoy.newcastle.edu.au/R/e11/help/10/09/8463.html

On Tue, Jan 4, 2011 at 2:54 PM, John Sorkin wrote:
> (1) I know that \n when used in cat, e.g. cat("\n") produces a line feed
> (i.e. skips to the next line). Is there any escape sequence that will go to
> the top of the next page?
> (2) I know that control L will clear the console. Is there an equivalent
> function or other means that can be used in R code to clear the console?
>
> Thanks,
> John

From dwinsemius at comcast.net Tue Jan 4 19:27:37 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Tue, 4 Jan 2011 13:27:37 -0500
Subject: [R] Page eject and clearing the console
In-Reply-To: <4D231E67020000CB0007D0D8@medicine.umaryland.edu>
References: <4D230A74020000CB0007D0B6@medicine.umaryland.edu> <4D231E67020000CB0007D0D8@medicine.umaryland.edu>
Message-ID:

On Jan 4, 2011, at 1:19 PM, John Sorkin wrote:
> I have received help on one of my questions (thank you Henrique Jorge
> and ), viz. how I can clear the console from an R program.
> I have not yet received help on how I can skip to the top of the next
> page, i.e. cat("\n") skips to the next line, is there an equivalent
> way to skip to the top of the next page?

"\n" does NOT "skip to the next line". It is a character, and it is interpreted as a line feed by some sort of program, say a plotting program or a word-processor. You need to specify what sort of program you intend to do this "skipping-to-next-page" action and also provide the character sequence that that program uses to signal that action. (There are not any pages in R except perhaps multi-page plots, but you seem to be in character mode at the moment.)

--
David.
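
A small sketch of the point above, in base R: "\n" and the ASCII form feed "\f" are each a single character that R merely emits. Whether anything "ejects a page" on seeing "\f" is an assumption about the destination program (a line printer, a pager, a word-processor); R itself has no notion of pages. The temporary file below only stands in for such a destination.

```r
# "\n" and "\f" are each ONE character; R only writes them out.
nchar("\n")        # one character
utf8ToInt("\n")    # code point 10, ASCII line feed
utf8ToInt("\f")    # code point 12, ASCII form feed

# Write two "pages" separated by a form feed to a temporary file
# (the file is a stand-in for a printer or viewer).
out <- tempfile(fileext = ".txt")
cat("page one\n", "\f", "page two\n", sep = "", file = out)

# Reading the raw bytes back shows the form feed byte (0c) in the file;
# what a downstream program does with it is up to that program.
bytes <- readBin(out, what = "raw", n = file.size(out))
has_form_feed <- any(bytes == as.raw(12))
has_form_feed
```

If the output is sent to a device or program that honours form feeds, cat("\f") would be the analogue of cat("\n") for pages; on an ordinary scrolling console it may do nothing visible.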
David Winsemius, MD
West Hartford, CT

From dwinsemius at comcast.net Tue Jan 4 19:55:27 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Tue, 4 Jan 2011 13:55:27 -0500
Subject: [R] Page eject and clearing the console
In-Reply-To:
References: <4D230A74020000CB0007D0B6@medicine.umaryland.edu> <4D231E67020000CB0007D0D8@medicine.umaryland.edu>
Message-ID: <4829DE90-4B64-475A-BDB4-895A8097D85B@comcast.net>

It has occurred to me that you may be asking for something that will give the illusion of "clearing the screen" but will in fact be just "printing" a page of blank space on a console display; scrolling would have been the name I would have given it. In which case:

# sep="" keeps cat() from putting a space between the newlines
scroll <- function(lines=40) cat(rep("\n", lines), sep="")
scroll()

--
David.
David Winsemius, MD
West Hartford, CT

From gunter.berton at gene.com Tue Jan 4 20:05:52 2011
From: gunter.berton at gene.com (Bert Gunter)
Date: Tue, 4 Jan 2011 11:05:52 -0800
Subject: [R] Page eject and clearing the console
In-Reply-To: <4829DE90-4B64-475A-BDB4-895A8097D85B@comcast.net>
References: <4D230A74020000CB0007D0B6@medicine.umaryland.edu> <4D231E67020000CB0007D0D8@medicine.umaryland.edu> <4829DE90-4B64-475A-BDB4-895A8097D85B@comcast.net>
Message-ID:

Perhaps merely rephrasing David's comments, "page" is not a meaningful physical entity -- it depends on font size, line spacing, etc. and the physical "size" of the output surface, which has no meaning for an "infinitely" (or at least up to the screen buffer's limit) scrollable screen viewing area.

But maybe David's "scroll" function is what you had in mind.

-- Bert

--
Bert Gunter
Genentech Nonclinical Biostatistics

From aquanyc at gmail.com Tue Jan 4 19:42:45 2011
From: aquanyc at gmail.com (rivercode)
Date: Tue, 4 Jan 2011 10:42:45 -0800 (PST)
Subject: [R] XTS : merge.xts seems to have problem with character vectors
Message-ID: <1294166565334-3174125.post@n4.nabble.com>

Hi,

Please can you tell me what I am doing wrong. When trying to merge two xts objects, one of which has multiple character vectors for columns...I am just getting NAs.
> str(t)
 POSIXct[1:1], format: "2011-01-04 11:45:37"

> y2 = xts(matrix(c(letters[1:10]),5), order.by=as.POSIXct(c(t + 1:5)))
> names(y2) = c(1,2)
> y2
                    1   2
2011-01-04 11:45:38 "a" "f"
2011-01-04 11:45:39 "b" "g"
2011-01-04 11:45:40 "c" "h"
2011-01-04 11:45:41 "d" "i"
2011-01-04 11:45:42 "e" "j"

> y1 = xts(c(1:5), order.by=as.POSIXct(c(t + 1:5)))
> y1
                    [,1]
2011-01-04 11:45:38    1
2011-01-04 11:45:39    2
2011-01-04 11:45:40    3
2011-01-04 11:45:41    4
2011-01-04 11:45:42    5

> merge(y1, y2)
                    y1 X1 X2
2011-01-04 11:45:38  1 NA NA
2011-01-04 11:45:39  2 NA NA
2011-01-04 11:45:40  3 NA NA
2011-01-04 11:45:41  4 NA NA
2011-01-04 11:45:42  5 NA NA

Warning message:
In merge.xts(y1, y2) : NAs introduced by coercion

Why do I lose the character columns?

Cheers,
Chris
--
View this message in context: http://r.789695.n4.nabble.com/XTS-merge-xts-seems-to-have-problem-with-character-vectors-tp3174125p3174125.html
Sent from the R help mailing list archive at Nabble.com.

From egregory2007 at yahoo.com Tue Jan 4 19:54:19 2011
From: egregory2007 at yahoo.com (Erik Gregory)
Date: Tue, 4 Jan 2011 10:54:19 -0800 (PST)
Subject: [R] Navigating web pages using R
Message-ID: <512422.96077.qm@web37407.mail.mud.yahoo.com>

R-Help,

I'm trying to obtain some data from a webpage which masks the URL from the user, so an explicit URL will not work. For example, when one navigates to the web page the URL looks something like:
http://137.113.141.205/rpt34s.php?flags=1 (changed for privacy, but I'm not sure you could access it anyway since it's internal to the agency I work for).

The site has three drop-down menus for "Site", "Month", and "Year". When a combination of these is selected, the resulting URL is always http://137.113.141.205/rpt34s (nothing changes, except that "flags=1" is dropped), so what I need to be able to do is write something that will navigate to the original URL, then select some combination of "Site", "Month", and "Year", and then submit the query to the site to navigate to the page with the data.
Is this a capability that R has as a language? Unfortunately, I'm unfamiliar with HTML or PHP programming, so if this question belongs in a forum on that I apologize. I'm trying to centralize all of my code for my analysis in R!

Thank you,
-Erik Gregory
Student Assistant, California EPA
CSU Sacramento, Mathematics

From josh.m.ulrich at gmail.com Tue Jan 4 20:51:30 2011
From: josh.m.ulrich at gmail.com (Joshua Ulrich)
Date: Tue, 4 Jan 2011 13:51:30 -0600
Subject: [R] XTS : merge.xts seems to have problem with character vectors
In-Reply-To: <1294166565334-3174125.post@n4.nabble.com>
References: <1294166565334-3174125.post@n4.nabble.com>
Message-ID:

On Tue, Jan 4, 2011 at 12:42 PM, rivercode wrote:
>
> Please can you tell me what I am doing wrong. When trying to merge two xts
> objects, one of which has multiple character vectors for columns...I am just
> getting NAs.
>
> Why do I lose the character columns ?
>

Because zoo / xts objects are matrices with an ordered index (a time index in the case of xts). You can't mix types in a matrix.
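
The coercion described above can be seen with base R alone; the following is a minimal sketch, not the xts internals. The data.frame workaround at the end is one possible alternative that is not from this thread, and the names t0, d1, d2 and both are made up for illustration.

```r
# Base R only; no xts/zoo needed to see the coercion.
num <- 1:5
chr <- letters[1:5]

# cbind() on mixed types yields ONE character matrix: the numbers become strings.
m <- cbind(num, chr)
typeof(m)

# Forcing the strings to numeric instead is what yields NAs
# (normally with the "NAs introduced by coercion" warning).
forced <- suppressWarnings(as.numeric(chr))
all(is.na(forced))

# Possible workaround (an assumption, not from the thread): keep mixed
# columns in data.frames keyed by time and merge those.
t0 <- as.POSIXct("2011-01-04 11:45:37", tz = "UTC")  # illustrative timestamp
d1 <- data.frame(time = t0 + 1:5, y1 = num)
d2 <- data.frame(time = t0 + 1:5, a = chr, stringsAsFactors = FALSE)
both <- merge(d1, d2, by = "time")
str(both)
```

A data.frame gives up the matrix-based speed of xts, so this only makes sense when the character columns really have to travel with the numeric ones.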
--
Joshua Ulrich | FOSS Trading: www.fosstrading.com

From jsorkin at grecc.umaryland.edu Tue Jan 4 20:52:18 2011
From: jsorkin at grecc.umaryland.edu (John Sorkin)
Date: Tue, 04 Jan 2011 14:52:18 -0500
Subject: [R] Page eject and clearing the console
In-Reply-To:
References: <4D230A74020000CB0007D0B6@medicine.umaryland.edu> <4D231E67020000CB0007D0D8@medicine.umaryland.edu> <4829DE90-4B64-475A-BDB4-895A8097D85B@comcast.net>
Message-ID: <4D233422020000CB0007D113@medicine.umaryland.edu>

I am looking for something which will emulate the following:
Back in the old days, when printing to a line printer, the first two characters in a line controlled printing. For example, a line starting with
1h1 starts printing at the top of the next page.
1h+ indicates overprinting
1h0 results in double spacing

In R "\n" skips to the next line. Is there some escape sequence that will start printing at the top of the next page?

Thanks,
John

John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

From sbigelow at fs.fed.us Tue Jan 4 21:02:18 2011
From: sbigelow at fs.fed.us (Seth W Bigelow)
Date: Tue, 4 Jan 2011 12:02:18 -0800
Subject: [R] R implementation of S-distribution
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From Eric.Hu at gilead.com Tue Jan 4 21:10:58 2011
From: Eric.Hu at gilead.com (Eric Hu)
Date: Tue, 4 Jan 2011 12:10:58 -0800
Subject: [R] Non-uniformly distributed plot
In-Reply-To:
References: <8565720D-1A0D-425C-A593-131FEE00DEF6@comcast.net>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available URL: From Sebastien.Bihorel at cognigencorp.com Tue Jan 4 21:21:00 2011 From: Sebastien.Bihorel at cognigencorp.com (Sebastien Bihorel) Date: Tue, 04 Jan 2011 15:21:00 -0500 Subject: [R] R command execution from shell Message-ID: <4D23812C.609@cognigencorp.com> Dear R-users, Is there a way I can ask R to execute the "write("hello world",file="hello.txt")" command directly from the UNIX shell, instead of having to save this command to a .R file and execute this file with R CMD BATCH? Thank you Sebastien From pomchip at free.fr Tue Jan 4 21:22:06 2011 From: pomchip at free.fr (=?ISO-8859-1?Q?S=E9bastien_Bihorel?=) Date: Tue, 4 Jan 2011 15:22:06 -0500 Subject: [R] Listing of available functions In-Reply-To: <9BD89916-C371-44C3-8D7A-C47D3F093250@comcast.net> References: <9BD89916-C371-44C3-8D7A-C47D3F093250@comcast.net> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From Greg.Snow at imail.org Tue Jan 4 21:33:20 2011 From: Greg.Snow at imail.org (Greg Snow) Date: Tue, 4 Jan 2011 13:33:20 -0700 Subject: [R] Page eject and clearing the console In-Reply-To: <4D233422020000CB0007D113@medicine.umaryland.edu> References: <4D230A74020000CB0007D0B6@medicine.umaryland.edu> <4D231E67020000CB0007D0D8@medicine.umaryland.edu> <4829DE90-4B64-475A-BDB4-895A8097D85B@comcast.net> <4D233422020000CB0007D113@medicine.umaryland.edu> Message-ID: Well the things most like cat('\n') for starting a new page would be cat('\013') or cat('\014') (vertical tab and form feed), however on all the terminals I tried they don't do anything since page is not a concept on a terminal. However if you outputted one of those into a file and interpreted the file in a context with pages, then they may do what you want. The next closest idea to what you are saying that I can find is the wdPageBreak function in the R2wd package. 
This will insert a page break into a word document that is being created from R, there pages make sense and this will start the next part on a new page. To be any more help we need to know what context you are using this in and what you mean by "page". Are you preparing stuff to be printed on paper (or at least to a pdf doc where pages make sense?) or is there some behavior in the terminal that you want to see? For the last we need to know OS and possibly how you are running R within that OS (gui vs. terminal, etc.). -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.snow at imail.org 801.408.8111
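As a rough illustration of the form-feed idea in this thread, the sketch below writes a form feed character ("\f", octal 014) between two blocks of text in a file. The file name and the report text are placeholders, not anything from the original posts, and whether a new page actually appears depends entirely on the printer or program that later reads the file; on an ordinary terminal it will usually do nothing visible.

```r
# Minimal sketch: separate two blocks of output with a form feed.
# 'out' is a temporary placeholder file.
out <- tempfile(fileext = ".txt")
con <- file(out, open = "w")
cat("Report, page 1\n", file = con)
cat("\f", file = con)              # form feed, same character as '\014'
cat("Report, page 2\n", file = con)
close(con)

# Read the file back to confirm the control character was written.
txt <- readChar(out, file.info(out)$size)
grepl("\f", txt, fixed = TRUE)
```

Sending such a file to a line printer, or opening it in a viewer that honours form feeds, is the step that would turn the character into a page break.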
From murdoch.duncan at gmail.com Tue Jan 4 21:36:36 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Tue, 04 Jan 2011 15:36:36 -0500 Subject: [R] R command execution from shell In-Reply-To: <4D23812C.609@cognigencorp.com> References: <4D23812C.609@cognigencorp.com> Message-ID: <4D2384D4.50401@gmail.com> On 04/01/2011 3:21 PM, Sebastien Bihorel wrote: > Dear R-users, > > Is there a way I can ask R to execute the "write("hello > world",file="hello.txt")" command directly from the UNIX shell, instead > of having to save this command to a .R file and execute this file with R > CMD BATCH? Yes. Some versions of R support the -e option on the command line to execute a particular command. It's not always easy to work out the escapes so your shell passes all the quotes through... An alternative is to echo the command into R, e.g. echo 'cat("hello")' | R --slave (where the outer ' ' are just for bash). Duncan Murdoch From benjamin.ward at bathspa.org Tue Jan 4 21:44:50 2011 From: benjamin.ward at bathspa.org (Ben Ward) Date: Tue, 4 Jan 2011 20:44:50 +0000 Subject: [R] Resampling to find Confidence intervals In-Reply-To: References: <1294158242553-3173846.post@n4.nabble.com> Message-ID: You mentioned the boot package, I've just stumbled across a package called simpleboot, with a function lm.boot. Would this be suitable - it says I can sample cases from the original dataset, as well as from the residuals of a model. I don't understand all the options, but I assume the defaults might be suitable for what I'm doing? On 04/01/2011 17:56, Ben Ward wrote: > Ok I'll check I understand: > So it's using sample, to resample d once, 10 values, because the rnorm > has 10 values, with replacement (I assume that's the TRUE part). > Then a for loop has this to resample the data - in the loop's case it's > 1000 times. Then it does a lm to get the coefficients and add them to > d1.coef.
I'm guessing that the allboot bit with rbind, which is null > at the start of the loop, is the collection of d1.coef values, as I > think that without it, every cycle of the loop the d1.coef from the > previous cycle round the loop would be gone? > > On 04/01/2011 16:24, Dieter Menne wrote: > > Axolotl9250 wrote: > >>> ... >>> resampled_ecoli = sample(ecoli, 500, replace=T) >>> coefs = (coef(lm(MIC. ~ 1 + Challenge + Cleaner + Replicate, >>> data=resampled_ecoli))) >>> sd(coefs) >>> >>> ... >>> >> Below a simplified and self-consistent version of your code, and some >> changes >> >> Dieter >> >> # resample >> d = data.frame(x=rnorm(10)) >> d$y = d$x*3+rnorm(10,0.01) >> >> # if you do this, you only get ONE bootstrap sample >> d1 = d[sample(1:nrow(d),10,TRUE),] >> d1.coef = coef(lm(y~x,data=d1)) >> d1.coef >> # No error below, because you compute the sd of (Intercept) and slope >> # but result is wrong! >> sd(d1.coef) >> >> # We have to do this over and over >> # Check ?replicate for a more R-ish approach.... >> nsamples = 1000 >> allboot = NULL >> for (i in 1:1000) { >> d1 = d[sample(1:nrow(d),10,TRUE),] >> d1.coef = coef(lm(y~x,data=d1)) >> allboot = rbind(allboot,d1.coef) # Not very efficient, preallocate! >> } >> head(allboot) # display first of nsamples lines >> apply(allboot,2,mean) # Compute mean >> apply(allboot,2,sd) # compute sd >> # After you are sure you understood the above, you might try package >> boot. >> >> >> >> >> > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
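Dieter's quoted code points to ?replicate as a more R-ish alternative to the explicit loop. A minimal sketch of that variant follows, using the same simulated data as the quoted example; the seed, the number of resamples and the 95% percentile limits are illustrative assumptions rather than anything prescribed in the thread.

```r
# Case-resampling bootstrap of lm() coefficients with replicate().
set.seed(1)                        # arbitrary seed, only for repeatability
d <- data.frame(x = rnorm(10))
d$y <- d$x * 3 + rnorm(10, 0.01)

nsamples <- 1000
allboot <- t(replicate(nsamples, {
  d1 <- d[sample(nrow(d), replace = TRUE), ]   # one bootstrap sample of rows
  coef(lm(y ~ x, data = d1))                   # intercept and slope
}))

colMeans(allboot)                  # bootstrap mean of each coefficient
apply(allboot, 2, sd)              # bootstrap standard error
apply(allboot, 2, quantile, probs = c(0.025, 0.975))  # percentile intervals
```

Because replicate() collects the results itself, there is no need to grow allboot with rbind() inside a loop. The boot package offers more refined interval types once this basic version is understood.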
> > From lucam1968 at gmail.com Tue Jan 4 19:08:33 2011 From: lucam1968 at gmail.com (Luca Meyer) Date: Tue, 4 Jan 2011 19:08:33 +0100 Subject: [R] error in calling source(): invalid multibyte character in parser In-Reply-To: References: <25DFCF6A-7BEE-430E-B009-10183B43429D@gmail.com> Message-ID: It works fine, thanks. I was just wondering if there is any way to automatically include the command you suggest as a default when I open R. Thanks, Luca On 3 Jan 2011, at 08:36, Phil Spector wrote: > Luca - > What happens when you type > > Sys.setlocale('LC_ALL','C') > > before issuing the source command? > > - Phil Spector > Statistical Computing Facility > Department of Statistics > UC Berkeley > spector at stat.berkeley.edu > > > On Mon, 3 Jan 2011, Luca Meyer wrote: > >> Being Italians, when writing comments/instructions we use accented letters - like ?, ?, ?, etc.... when running R scripts using such characters I get an error saying: >> >> invalid multibyte character in parser >> >> I have been looking at the help and searched the r-help archives but I haven't found anything that I could intelligibly apply to my case. >> >> Can anyone suggest a fix for this error? >> >> Thanks, >> Luca >> >> Mr. Luca Meyer >> www.lucameyer.com >> IBM SPSS Statistics release 19.0.0 >> R version 2.12.1 (2010-12-16) >> Mac OS X 10.6.5 (10H574) - kernel Darwin 10.5.0 >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code.
>> From Sebastien.Bihorel at cognigencorp.com Tue Jan 4 22:00:06 2011 From: Sebastien.Bihorel at cognigencorp.com (Sebastien Bihorel) Date: Tue, 04 Jan 2011 16:00:06 -0500 Subject: [R] R command execution from shell In-Reply-To: <4D2384D4.50401@gmail.com> References: <4D23812C.609@cognigencorp.com> <4D2384D4.50401@gmail.com> Message-ID: <4D238A56.9010608@cognigencorp.com> Thank you. That is exactly what I was looking for. Sebastien Duncan Murdoch wrote: > On 04/01/2011 3:21 PM, Sebastien Bihorel wrote: >> Dear R-users, >> >> Is there a way I can ask R to execute the "write("hello >> world",file="hello.txt")" command directly from the UNIX shell, instead >> of having to save this command to a .R file and execute this file with R >> CMD BATCH? > > Yes. Some versions of R support the -e option on the command line to > execute a particular command. It's not always easy to work out the > escapes so your shell passes all the quotes through... An alternative > is to echo the command into the shell, e.g. > > echo 'cat("hello")' | R --slave > > (where the outer ' ' are just for bash). > > Duncan Murdoch From ripley at stats.ox.ac.uk Tue Jan 4 23:04:28 2011 From: ripley at stats.ox.ac.uk (Prof Brian Ripley) Date: Tue, 4 Jan 2011 22:04:28 +0000 (GMT) Subject: [R] error in calling source(): invalid multibyte character in parser In-Reply-To: <09AA819A-0A64-4862-B3D6-7CDDAA40DBC4@gmail.com> References: <25DFCF6A-7BEE-430E-B009-10183B43429D@gmail.com> <09AA819A-0A64-4862-B3D6-7CDDAA40DBC4@gmail.com> Message-ID: On Tue, 4 Jan 2011, Luca Meyer wrote: > How would I go about doing that? I have tried with: > > source("file.R", encoding="it_IT.UTF-8") > > But I get > > Error in file(file, "r", encoding = encoding) : > unsupported conversion from 'it_IT.UTF-8' to '' Well, that is not the value I suggested -- so why not simply follow what you were asked to try?
> > Thanks, > Luca > > PS: "it_IT.UTF-8" is what I get under locale when I run sessionInfo() > > On 3 Jan 2011, at 09:48, Prof Brian Ripley wrote: > >> On Mon, 3 Jan 2011, peter dalgaard wrote: >> >>> >>> On Jan 3, 2011, at 08:32 , Luca Meyer wrote: >>> >>>> Being Italians, when writing comments/instructions we use accented letters - like ?, ?, ?, etc.... when running R scripts using such characters I get an error saying: >>>> >>>> invalid multibyte character in parser >>>> >>>> I have been looking at the help and searched the r-help archives but I haven't found anything that I could intelligibly apply to my case. >>>> >>>> Can anyone suggest a fix for this error? >>> >>> The most likely cause is that your scripts are written in an "8 bit ASCII" encoding (Latin-1 or -9, most likely), while R is running in a UTF8 locale. If that is the cause, the fix is to standardize things to use the same locale. You can convert the encoding of your source file using the iconv utility (in a Terminal window). >> >> Or use the 'encoding' argument of source() to tell R what the >> encoding is, e.g. encoding="latin1" or "latin-9" (the inconsistency >> being in the iconv used on Macs, not in R). >> >>> >>> -pd >>> >>>> >>>> Thanks, >>>> Luca >>>> >>>> Mr. Luca Meyer >>>> www.lucameyer.com >>>> IBM SPSS Statistics release 19.0.0 >>>> R version 2.12.1 (2010-12-16) >>>> Mac OS X 10.6.5 (10H574) - kernel Darwin 10.5.0 >>>> >>>> ______________________________________________ >>>> R-help at r-project.org mailing list >>>> https://stat.ethz.ch/mailman/listinfo/r-help >>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >>>> and provide commented, minimal, self-contained, reproducible code.
>>> >>> -- >>> Peter Dalgaard >>> Center for Statistics, Copenhagen Business School >>> Solbjerg Plads 3, 2000 Frederiksberg, Denmark >>> Phone: (+45)38153501 >>> Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com >>> >>> ______________________________________________ >>> R-help at r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >>> >> >> -- >> Brian D. Ripley, ripley at stats.ox.ac.uk >> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ >> University of Oxford, Tel: +44 1865 272861 (self) >> 1 South Parks Road, +44 1865 272866 (PA) >> Oxford OX1 3TG, UK Fax: +44 1865 272595 > > -- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 From ripley at stats.ox.ac.uk Tue Jan 4 23:21:46 2011 From: ripley at stats.ox.ac.uk (Prof Brian Ripley) Date: Tue, 4 Jan 2011 22:21:46 +0000 (GMT) Subject: [R] R command execution from shell In-Reply-To: <4D2384D4.50401@gmail.com> References: <4D23812C.609@cognigencorp.com> <4D2384D4.50401@gmail.com> Message-ID: On Tue, 4 Jan 2011, Duncan Murdoch wrote: > On 04/01/2011 3:21 PM, Sebastien Bihorel wrote: >> Dear R-users, >> >> Is there a way I can ask R to execute the "write("hello >> world",file="hello.txt")" command directly from the UNIX shell, instead >> of having to save this command to a .R file and execute this file with R >> CMD BATCH? > > Yes. Some versions of R support the -e option on the command line to execute > a particular command. It's not always easy to work out the escapes so your > shell passes all the quotes through... An alternative is to echo the command > into the shell, e.g. 
> > echo 'cat("hello")' | R --slave > > (where the outer ' ' are just for bash). It is marginally preferable to use Rscript in place of 'R --slave'. I think in all known shells Rscript -e "write('hello world', file = 'hello.txt')" will work. (If not, shQuote() will not work for that shell, but this does work in sh+clones, csh+clones, zsh and Windows' cmd.exe.) -- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 From erich.neuwirth at univie.ac.at Tue Jan 4 15:30:20 2011 From: erich.neuwirth at univie.ac.at (Erich Neuwirth) Date: Tue, 04 Jan 2011 15:30:20 +0100 Subject: [R] [R-pkgs] ENmisc_1.0 Message-ID: <4D232EFC.1040602@univie.ac.at> ENmisc contains two utility function mtapply is a hybrid of mapply and tapply. It evaluates summary function for each cell of data defined by a fixed set of factor values. The reason I wrote it was to be able to compute weighted means (using wtd.mean from Hmisc) to groups of data defined by factors, but it accepts any multivariate function as the function argument wtd.boxplot does what the name makes you expect, it computes and draws boxplots for weighted data. It behaves like boxplot, but accepts an additional argument, weights. _______________________________________________ R-packages mailing list R-packages at r-project.org https://stat.ethz.ch/mailman/listinfo/r-packages From xie at yihui.name Tue Jan 4 16:55:20 2011 From: xie at yihui.name (Yihui Xie) Date: Tue, 4 Jan 2011 09:55:20 -0600 Subject: [R] [R-pkgs] Package animation update (v2.0-0) Message-ID: Hi, The animation package 2.0-0 is on CRAN now. This version is a milestone of the animation package. It includes a new function saveHTML() which uses a much more elegant interface and is consistent in syntax with other save*() functions such as saveMovie(), saveSWF() and saveLatex(). 
Lots of demos have been added to demonstrate the flexibility of this package, e.g. now we can get the snapshots of rgl 3D plots and insert them into LaTeX with a single call to saveLatex(), or even into Sweave documents. There are some funny demos in this version too, e.g. demo('fireworks') to set fireworks, or demo('CLEvsLAL') which is a "replay" of an NBA game between Cavaliers and Lakers on 2009. Have fun!

CHANGES IN animation VERSION 2.0-0

NEW FEATURES

o a new demo 'Xmas2': Merry Christmas with snowflakes (see demo('Xmas2', 'animation'); thanks, Jing Jiao)

o a new function saveHTML() to insert animations into HTML pages (this was designed to replace the old ani.start() and ani.stop(); the output is much more appealing; the JavaScript is based on the SciAnimator library and jQuery)

o ani.options() gained a new option 'autoplay' to indicate whether to autoplay the animation in the HTML page created by saveHTML()

o in fact ani.options() was rewritten, but this should not have any influence on users; the usage is the same

o ani.options() gained a new option 'use.dev' to decide whether to use the graphics device provided in ani.options('ani.dev') when calling saveHTML(), saveLatex(), saveMovie() and saveSWF()

o ani.options() has a couple of hidden options ('convert', 'swftools', 'img.fmt') which can be useful too; see ?ani.options for details

o a new function ani.pause(): it is a wrapper to Sys.sleep(interval) but it will not pause when called in a non-interactive graphics device (usually the off-screen devices); this is the recommended way to specify the pause in the animation now -- all the functions in this package have been adjusted to use ani.pause()

o a new demo('pollen') to show the hidden 'structure' in a large data (requires the rgl package)

o a new demo('CLEvsLAL') to `replay' the NBA game between CLE and LAL on 2009 Christmas (with a new dataset 'CLELAL09')

o a new demo('fireworks') to set fireworks using R (thanks, Weicheng Zhu)

o saveLatex() can work with the rgl package to produce 3D animations in a PDF document now; see demo('rgl_animation')

o a new demo('rgl_animation') to demonstrate how to insert rgl 3D animations into a LaTeX document and compile to PDF

o a new demo('use_Cairo') to show how to use the Cairo device in this package to obtain high-quality output

o a new demo('Sweave_animation') to show how to insert animations into Sweave documents

o a new demo('game_of_life') to demonstrate the (amusing) Game of Life (thanks, Linlin Yan)

SIGNIFICANT CHANGES

o the documentation of this package has been tremendously revised; hopefully it is more clear to read now

o several arguments in saveMovie(), im.convert(), saveSWF() and saveLatex() were removed, because they can be specified by ani.options(); this can simplify the usage of these functions

MINOR CHANGES

o the argument 'para' in saveMovie() was removed; the argument 'ani.first' was also removed from all the save*() functions, because this can be written in 'expr' and there is no need to provide an additional argument

o the path of the output in im.convert() and gm.convert() will be quoted, because sometimes users might supply a path containing spaces (thanks, Phalkun Chheng)

o the option 'filename' in ani.options() was renamed to 'htmlfile' so that the meaning of this option is more clear; 'footer' was renamed to 'verbose' too

o ani.options() can accept any arguments now

o im.convert() and gm.convert() will no longer stop() when the convert utility cannot be found; instead, they only issue warnings; a hidden option ani.options('convert') can be used to specify the location of convert.exe in ImageMagick

o saveMovie(), saveHTML(), saveSWF() and saveLatex() will try to open the output if ani.options('autobrowse') is TRUE; and they will keep the current working directory untouched when evaluating 'expr' (i.e. 'expr' will be evaluated under getwd())

Regards, Yihui -- Yihui Xie Phone: 515-294-2465 Web: http://yihui.name Department of Statistics, Iowa State University 2215 Snedecor Hall, Ames, IA _______________________________________________ R-packages mailing list R-packages at r-project.org https://stat.ethz.ch/mailman/listinfo/r-packages
From anthony at resolution.com Wed Jan 5 00:30:23 2011 From: anthony at resolution.com (apresley) Date: Tue, 4 Jan 2011 15:30:23 -0800 (PST) Subject: [R] randomForest speed improvements In-Reply-To: References: <1294084769056-3172523.post@n4.nabble.com> <1294097299183-3172834.post@n4.nabble.com> Message-ID: <1294183823780-3174621.post@n4.nabble.com> Andy, Thanks for the reply. I had no idea I could combine them back ... that actually will work pretty well. We can have several "worker threads" load up the RF's on different machines and/or cores, and then re-assemble them. RMPI might be an option down the road, but would be a bit of overhead for us now. Using the method of combine() ... I was able to drastically reduce the amount of time to build randomForest objects. IE, using about 25,000 rows (6 columns), it takes maybe 5 minutes on my laptop. Using 5 randomForest objects (each with 5k rows), and then combining them, takes < 1 minute. -- Anthony -- View this message in context: http://r.789695.n4.nabble.com/randomForest-speed-improvements-tp3172523p3174621.html Sent from the R help mailing list archive at Nabble.com.
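A minimal sketch of growing several smaller forests and merging them with combine() from the randomForest package is shown below. It assumes the package is installed and uses the built-in iris data with arbitrary tree counts purely for illustration. Note that this variant grows every sub-forest on the same full data set, which is not identical to the row-splitting described in the message above: forests grown on different 5k-row subsets each see only part of the data, so the merged result is a different model from one forest fitted to all rows.

```r
library(randomForest)   # assumed to be installed

set.seed(42)            # arbitrary seed, only for repeatability
data(iris)

# Three small forests; in practice each call could run on a separate
# core or machine. norm.votes = FALSE keeps raw vote counts, which is
# the documented setting for forests that will be combined.
rf1 <- randomForest(Species ~ ., data = iris, ntree = 50, norm.votes = FALSE)
rf2 <- randomForest(Species ~ ., data = iris, ntree = 50, norm.votes = FALSE)
rf3 <- randomForest(Species ~ ., data = iris, ntree = 50, norm.votes = FALSE)

rf.all <- combine(rf1, rf2, rf3)
rf.all$ntree            # total number of trees in the merged forest

# The merged object can be used for prediction like any other forest.
pred <- predict(rf.all, newdata = iris)
```

The merged object does not carry over the out-of-bag error estimates of its parts, so accuracy is best checked on held-out data.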
From nevil.amos at gmail.com Wed Jan 5 00:50:19 2011 From: nevil.amos at gmail.com (Nevil Amos) Date: Wed, 05 Jan 2011 10:50:19 +1100 Subject: [R] how to keep keep matching column in output of merge In-Reply-To: References: <4D232130.1020801@sci.monash.edu.au> Message-ID: <4D23B23B.6000309@sci.monash.edu.au> Apologies, it is there! On 5/01/2011 1:26 AM, Sarah Goslee wrote: > Hi Nevil, > > We really need an example here of what you're doing, since > merge() does keep the id column by default. > > >> x<- data.frame(id = c("a", "b", "c", NA), x=c(1,2,3,4)) >> y<- data.frame(id1 = c(NA, "a", "d", "c"), y=c(101, 102, 103, 104)) >> merge(x, y) > id x y > 1 a 1 102 > 2 b 2 101 > 3 c 3 104 > 4 d 4 103 > > Sarah > > On Tue, Jan 4, 2011 at 8:31 AM, Nevil Amos wrote: >> How do I keep the linking column[s] in a merge()? >> I need to use the values again in a further merge. >> >> thanks >> >> Nevil Amos >> From arrayprofile at yahoo.com Wed Jan 5 01:19:30 2011 From: arrayprofile at yahoo.com (array chip) Date: Tue, 4 Jan 2011 16:19:30 -0800 (PST) Subject: [R] RData size Message-ID: <898741.41363.qm@web56301.mail.re3.yahoo.com> Hi, I noticed a Rdata size issue that's puzzling to me. Attached find 2 example datasets in text file. Both are 100x5, so the sizes for both text file are the same. However, when I read them into R, the sizes are much different: tt<-as.matrix(read.table("tt.txt",header=T,row.names=1)) save(tt,file='tt.RData') tt.big<-as.matrix(read.table("tt.big.txt",header=T,row.names=1)) save(tt.big,file='tt.big.RData') "tt.RData" is 2KB while "tt.big.RData" is 5KB. This is not a big deal with the example datasets, but my real datasets are much larger, the difference is 35MB vs. 1MB for the RData objects. The difference between the 2 datasets above is that "tt.big" is a smoothed version of "tt", so there are a lot less unique values in tt.big than tt, I guess this is the reason for the difference in sizes of RData objects, can anyone confirm? 
Thanks John -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: tt.txt URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: tt.big.txt URL: From jholtman at gmail.com Wed Jan 5 02:36:46 2011 From: jholtman at gmail.com (jim holtman) Date: Tue, 4 Jan 2011 20:36:46 -0500 Subject: [R] RData size In-Reply-To: <898741.41363.qm@web56301.mail.re3.yahoo.com> References: <898741.41363.qm@web56301.mail.re3.yahoo.com> Message-ID: You can confirm it very easy by adding 'compress = FALSE' to the save function. You will see that the .RData files are the same size since you are not compressing them. On Tue, Jan 4, 2011 at 7:19 PM, array chip wrote: > Hi, I noticed a Rdata size issue that's puzzling to me. Attached find 2 example > datasets in text file. Both are 100x5, so the sizes for both text file are the > same. However, when I read them into R, the sizes are much different: > > tt<-as.matrix(read.table("tt.txt",header=T,row.names=1)) > save(tt,file='tt.RData') > > tt.big<-as.matrix(read.table("tt.big.txt",header=T,row.names=1)) > save(tt.big,file='tt.big.RData') > > "tt.RData" is 2KB while "tt.big.RData" is 5KB. This is not a big deal with the > example datasets, but my real datasets are much larger, the difference is 35MB > vs. 1MB for the RData objects. > > > The difference between the 2 datasets above is that "tt.big" is a smoothed > version of "tt", so there are a lot less unique values in tt.big than tt, I > guess this is the reason for the difference in sizes of RData objects, can > anyone confirm? > > > Thanks > > John > > > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
> > -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? From bbolker at gmail.com Wed Jan 5 03:03:35 2011 From: bbolker at gmail.com (Ben Bolker) Date: Wed, 5 Jan 2011 02:03:35 +0000 (UTC) Subject: [R] Cost-benefit/value for money analysis References: Message-ID: Graham Smith gmail.com> writes: > > I assume this has a "proper" name, but I don't know what it is and wondered > if anyone knew of a package that might do the following, or something > similar. > > As an example, assume I have borrowed and read 10 books on R , and I have > subjectively given each of them a "value" score in terms of how useful I > think they are. I also know how much each costs in terms of money. > > What I would like to do is to calculate the costs of every possible > combination of the 10 books, and plot the total monetary value for each of > these possible combination with their associated subjective value totals, > to help decide which combination of books represents the best value for > money. > > I know that some specialist decision analysis software does this sort of > thing, but was hoping R might have an appropriate package. Perhaps you can specify your question more precisely, or differently. The way I interpret it, if there are no interactions in price (e.g. you get a discount for buying more than one book at a time) or in value (e.g. you learn more from one book having read another), then you get the best value/price ratio by taking only the book with the highest value/price. (If you take no books at all, your value/price ratio is undefined.) The algebra below shows that combining a lower value/price book with a higher one always lowers your overall value/price ratio. If you redefine your problem, you might find the combn() or expand.grid() functions, along with various versions of apply(), to be useful. If you have too large a search space you might take a look at the simulated annealing (SANN) option of optim(). 
=================== if a1/b1 > a2/b2 (1) and a1, b1, a2, b2 > 0 show a1/b1 > (a1+a2)/(b1+b2) (2) i.e. a1/b1 - (a1+a2)/(b1+b2) > 0 or (a1(b1+b2)-(a1+a2)b1)/(b1+b2) = (a1*b2-a2*b1)/(b1+b2) > 0 the numerator is (a1*b2-a2*b1): (1) implies that a1*b2>a2*b1 so the numerator is positive qed From dwinsemius at comcast.net Wed Jan 5 03:18:41 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Tue, 4 Jan 2011 21:18:41 -0500 Subject: [R] Cost-benefit/value for money analysis In-Reply-To: References: Message-ID: <7DFDB350-42F8-4968-BCD6-48E9CFF7EF3B@comcast.net> On Jan 4, 2011, at 9:03 PM, Ben Bolker wrote: > Graham Smith gmail.com> writes: > >> >> I assume this has a "proper" name, but I don't know what it is and >> wondered >> if anyone knew of a package that might do the following, or something >> similar. >> >> As an example, assume I have borrowed and read 10 books on R , and >> I have >> subjectively given each of them a "value" score in terms of how >> useful I >> think they are. I also know how much each costs in terms of money. >> >> What I would like to do is to calculate the costs of every possible >> combination of the 10 books, and plot the total monetary value for >> each of >> these possible combination with their associated subjective value >> totals, >> to help decide which combination of books represents the best value >> for >> money. >> >> I know that some specialist decision analysis software does this >> sort of >> thing, but was hoping R might have an appropriate package. > > Perhaps you can specify your question more precisely, or differently. > The way I interpret it, if there are no interactions in price > (e.g. you get a discount for buying more than one book at a time) > or in value (e.g. you learn more from one book having read another), > then you get the best value/price ratio by taking only the book with > the highest value/price. (If you take no books at all, your value/ > price > ratio is undefined.) 
The algebra below shows that combining a lower > value/price book with a higher one always lowers your overall value/ > price > ratio. I think a similar argument "at the margins" would show that even if the task were specified as maximal value with a budget, simply ordering by the value/price and buying until the cumsum of the price was greater than budget would solve the alternate statement of the problem. I suppose there might be situations where there were marginal choices of buying two books whose value/price was less than marginally maximal because two other marginally maximal choices would break the budget. This sounds like a homework problem and I don't see any student effort yet. Search terms include: "decision analysis" , "cost- benefit analysis", or "utility theory". -- David. > > If you redefine your problem, you might find the combn() or > expand.grid() functions, along with various versions of apply(), to > be useful. If you have too large a search space you might take a look > at the simulated annealing (SANN) option of optim(). > > =================== > if a1/b1 > a2/b2 (1) > > and a1, b1, a2, b2 > 0 > > show > > a1/b1 > (a1+a2)/(b1+b2) (2) > > i.e. > > a1/b1 - (a1+a2)/(b1+b2) > 0 > > or > (a1(b1+b2)-(a1+a2)b1)/(b1+b2) = > (a1*b2-a2*b1)/(b1+b2) > 0 > > the numerator is (a1*b2-a2*b1): > > (1) implies that a1*b2>a2*b1 > so the numerator is positive > > qed > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT From eduardo.oliveirahorta at gmail.com Wed Jan 5 03:56:28 2011 From: eduardo.oliveirahorta at gmail.com (Eduardo de Oliveira Horta) Date: Wed, 5 Jan 2011 00:56:28 -0200 Subject: [R] Adding lines in ggplot2 Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From jwiley.psych at gmail.com Wed Jan 5 04:15:14 2011 From: jwiley.psych at gmail.com (Joshua Wiley) Date: Tue, 4 Jan 2011 19:15:14 -0800 Subject: [R] Adding lines in ggplot2 In-Reply-To: References: Message-ID: Hi Eduardo, To shamelessly borrow from the Princess Bride: "Why do you use a data frame? Are you touched in the head?" "Oh no. It's just they're terribly practical. I think everyone will be using them in the future." Using data frames is what Hadley intended, and once you get used to it, it is not nearly as restrictive as you might think. This does what you want, I believe. Rather than creating extraneous variables, I simply perform various transformations on 'x' within the plotting code. require(ggplot2) dfm <- data.frame(x = 1:10) qplot(x = x, y = sqrt(x), data = dfm, geom = "line", colour = I("darkgreen")) + geom_line(aes(x = x, y = log(x)), colour = "red") Cheers, Josh On Tue, Jan 4, 2011 at 6:56 PM, Eduardo de Oliveira Horta wrote: > Hello, > > this is probably a recurrent question, but I couldn't find any answers that > didn't involve the expression "data frame"... so perhaps I'm looking for > something new here. > > I wanted to find a code equivalent to > >> x=sqrt(1:10) >> y=log(1:10) >> plot(1:10, x, type="lines", col="darkgreen") >> lines(1:10, y, col="red") > > to use with ggplot2. I've tried > >> x=sqrt(1:10) >> y=log(1:10) >> qplot(1:10, x, geom="line", colour=I("darkgreen")) ##### note you would also need a + after the qplot() code to add geom_line() ##### >> geom_line(1:10, y, colour="red") > Error: ggplot2 doesn't know how to deal with data of class numeric > > but it seems that the "data frame restriction" is really very restrictive > here. Any solutions that don't imply using as.data.frame to my data? > > Thanks in advance, and best regards! > > Eduardo Horta > > 
[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ From djmuser at gmail.com Wed Jan 5 04:28:04 2011 From: djmuser at gmail.com (Dennis Murphy) Date: Tue, 4 Jan 2011 19:28:04 -0800 Subject: [R] Adding lines in ggplot2 In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From gunter.berton at gene.com Wed Jan 5 05:39:43 2011 From: gunter.berton at gene.com (Bert Gunter) Date: Tue, 4 Jan 2011 20:39:43 -0800 Subject: [R] Adding lines in ggplot2 In-Reply-To: References: Message-ID: Dennis: Can't speak to ggplot2, but your comments regarding lattice are not quite correct. Many if not all of lattice's basic plot functions are generic, which means that one has essentially complete latitude to define plotting methods for arbitrary data structures. For example, there is an xyplot.ts method for time series -- class ts -- data. Of course, for most lattice methods, the data do naturally come in a data frame, and a standard lattice argument is to give a frame from which to pull the data. But this is not required. -- Bert > > Please explain to me how > > df <- data.frame(x, y, index = 1:10) > qplot(index, x, geom = 'line', ...) > > is 'very restrictive'. Lattice and ggplot2 are *structured* graphics systems > - to get the gains that they provide, there are some costs.
I don't perceive > organization of data into a data frame as being restrictive - in fact, if > you learn how to construct data for input into ggplot2 to simplify the code > for labeling variables and legends, the data frame requirement is actually a > benefit rather than a restriction. Moreover, one can use the plyr and > reshape(2) packages to reshape or condense data frames to provide even more > flexibility and freedom to produce ggplot2 and lattice graphics. In > addition, the documentation for ggplot2 is quite explicit about requiring > data frames for input, so it is behaving as documented. The complexity (and > interaction) of the graphics code probably has something to do with that. > > Since Josh left you a quote, I'll supply another, from Prof. Steve Vardeman > in a class I took with him a long time ago: > "There is no free lunch in statistics: in order to get something, you've got > to give something up." > > In this case, if you want the nice infrastructure provided by ggplot2, you > have to create a data frame for input. > > Dennis > >> >> Thanks in advance, and best regards! >> >> Eduardo Horta >> >> [[alternative HTML version deleted]] >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.
> -- Bert Gunter Genentech Nonclinical Biostatistics From aaditya.nanduri at gmail.com Wed Jan 5 06:39:40 2011 From: aaditya.nanduri at gmail.com (Aaditya Nanduri) Date: Tue, 4 Jan 2011 23:39:40 -0600 Subject: [R] R not recognized in command line Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From diasandre at gmail.com Wed Jan 5 01:26:42 2011 From: diasandre at gmail.com (ADias) Date: Tue, 4 Jan 2011 16:26:42 -0800 (PST) Subject: [R] List to a summary table Message-ID: <1294187202044-3174698.post@n4.nabble.com>

Hi

Suppose you have the code below. The result I get from the cat function is from the avgs object. Now, I have 30 different objects like this and I wish to make a summary table, something like:

Avgs1                     Avgs2                     Avgs3
i= 2 average= 0.515983    i= 2 average= 0.746983    i= 2 average= 0.2665983
i= 3 average= 0.5135953   i= 3 average= 0.7345953   i= 3 average= 0.23455953
i= 4 average= 0.4998128   i= 4 average= 0.7233128   i= 4 average= 0.21398128

> library(cluster)
> d<-hclust(dist(iris[,-5]))
>
> avgs<-sapply(1:20,function(x)
+ summary(silhouette(cutree(d,x),
+ dist(iris[,-5]))))
> # str(avgs)
>
> # print out the average widths
> for (i in 2:length(avgs)){ # ignore first item
+ cat('i=', i, 'average=', avgs[[i]]$avg.width, '\n')
+ }
i= 2 average= 0.515983
i= 3 average= 0.5135953
i= 4 average= 0.4998128
i= 5 average= 0.346174
i= 6 average= 0.3382031
i= 7 average= 0.3297649
i= 8 average= 0.324025
i= 9 average= 0.3191681
i= 10 average= 0.3028503
i= 11 average= 0.3072648
i= 12 average= 0.2834498
i= 13 average= 0.2776717
i= 14 average= 0.2855396
i= 15 average= 0.2745142
i= 16 average= 0.2578903
i= 17 average= 0.2531909
i= 18 average= 0.2473504
i= 19 average= 0.2484205
i= 20 average= 0.2545357

thanks
A.Dias

-- View this message in context: http://r.789695.n4.nabble.com/List-to-a-summary-table-tp3174698p3174698.html Sent from the R help mailing list archive at Nabble.com.
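[Editorial sketch, not part of the original thread.] One possible way to build the kind of summary table asked for above is to keep the summaries in a named list and extract avg.width column by column. This is only a sketch under assumptions: the poster's 30 objects are not shown, so three hierarchical clusterings of iris with different linkage methods stand in for them, and the names methods, avgs_list and tab are hypothetical.

```r
library(cluster)  # provides silhouette(); a recommended package in standard R installs

dmat <- dist(iris[, -5])
ks <- 2:20  # silhouette widths are only defined for two or more clusters

# Hypothetical stand-ins for the poster's objects: one list of silhouette
# summaries per linkage method, named after the desired table columns.
methods <- c(Avgs1 = "complete", Avgs2 = "average", Avgs3 = "single")
avgs_list <- lapply(methods, function(m) {
  d <- hclust(dmat, method = m)
  lapply(ks, function(k) summary(silhouette(cutree(d, k), dmat)))
})

# Pull the average width out of every summary and bind the results into
# one data frame: a row per number of clusters, a column per object.
tab <- data.frame(
  i = ks,
  lapply(avgs_list, function(a) sapply(a, function(s) s$avg.width))
)

print(head(tab, 3))
```

With 30 real objects, the same lapply() over a named list of them should give a 30-column table without writing a separate loop for each object.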
From eric.linder at navteq.com Tue Jan 4 22:06:55 2011 From: eric.linder at navteq.com (Linder, Eric) Date: Tue, 4 Jan 2011 15:06:55 -0600 Subject: [R] update.views("Spatial") does not seem to be able to find RPyGeo package Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From frodo.jedi at yahoo.com Tue Jan 4 23:37:07 2011 From: frodo.jedi at yahoo.com (Frodo Jedi) Date: Tue, 4 Jan 2011 14:37:07 -0800 (PST) Subject: [R] t-test or ANOVA...who wins? Help please! Message-ID: <183012.52698.qm@web57908.mail.re3.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From lucam1968 at gmail.com Tue Jan 4 19:45:42 2011 From: lucam1968 at gmail.com (Luca Meyer) Date: Tue, 4 Jan 2011 19:45:42 +0100 Subject: [R] error in calling source(): invalid multibyte character in parser In-Reply-To: References: <25DFCF6A-7BEE-430E-B009-10183B43429D@gmail.com> Message-ID: <09AA819A-0A64-4862-B3D6-7CDDAA40DBC4@gmail.com> How would I go by doing that? I have tried with: source("file.R", encoding="it_IT.UTF-8") But I get Error in file(file, "r", encoding = encoding) : unsupported conversion from 'it_IT.UTF-8' to '' Thanks, Luca PS: "it_IT.UTF-8" is what I get under locale when I run sessionInfo() Il giorno 03/gen/2011, alle ore 09.48, Prof Brian Ripley ha scritto: > On Mon, 3 Jan 2011, peter dalgaard wrote: > >> >> On Jan 3, 2011, at 08:32 , Luca Meyer wrote: >> >>> Being italians when writing comments/instructions we use accented letters - like ?, ?, ?, etc.... when running R scripts using such characters I get and error saying: >>> >>> invalid multibyte character in parser >>> >>> I have been looking at the help and searched the r-help archives but I haven't find anything that I could intelligibly apply to my case. >>> >>> Can anyone suggest a fix for this error? 
>> >> The most likely cause is that your scripts are written in an "8 bit ASCII" encoding (Latin-1 or -9, most likely), while R is running in a UTF8 locale. If that is the cause, the fix is to standardize things to use the same locale. You can convert the encoding of your source file using the iconv utility (in a Terminal window). > > Or use the 'encoding' argument of source() to tell R what the encoding is, e.g. encoding="latin1" or "latin-9" (the inconsistency being in the iconv used on Macs, not in R). > >> >> -pd >> >>> >>> Thanks, >>> Luca >>> >>> Mr. Luca Meyer >>> www.lucameyer.com >>> IBM SPSS Statistics release 19.0.0 >>> R version 2.12.1 (2010-12-16) >>> Mac OS X 10.6.5 (10H574) - kernel Darwin 10.5.0 >>> >>> ______________________________________________ >>> R-help at r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >> >> -- >> Peter Dalgaard >> Center for Statistics, Copenhagen Business School >> Solbjerg Plads 3, 2000 Frederiksberg, Denmark >> Phone: (+45)38153501 >> Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > -- > Brian D. Ripley, ripley at stats.ox.ac.uk > Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ > University of Oxford, Tel: +44 1865 272861 (self) > 1 South Parks Road, +44 1865 272866 (PA) > Oxford OX1 3TG, UK Fax: +44 1865 272595 From luke-tierney at uiowa.edu Tue Jan 4 22:14:39 2011 From: luke-tierney at uiowa.edu (luke-tierney at uiowa.edu) Date: Tue, 4 Jan 2011 15:14:39 -0600 Subject: [R] packagename:::functionname vs. 
importFrom In-Reply-To: <1294162766395-3174003.post@n4.nabble.com> References: <1294091145769-3172684.post@n4.nabble.com> <1294109427406-3172984.post@n4.nabble.com> <1294162766395-3174003.post@n4.nabble.com> Message-ID: On Tue, 4 Jan 2011, Frank Harrell wrote: > > Thanks Luke. By "the namespace from which you import is loaded when your > package is" I take it that you are saying that all such referenced packages > are loaded up front, which is not what I hoped. That is what happens. You can use conditional imports in the NAMESPACE but that isn't appropriate for all settings. Using :: is reasonable if you want to use a function only if its package is available. Using ::: in code is a really bad idea for reasons already explained in this thread. > And it's too bad you can't > import unexported objects, as that rather defeats the purpose of importFrom. The purpose of importFrom is to avoid a full import. So in no sense does this "defeat the purpose of importFrom". Best, luke > > Frank > > > ----- > Frank Harrell > Department of Biostatistics, Vanderbilt University > -- Luke Tierney Statistics and Actuarial Science Ralph E. Wareham Professor of Mathematical Sciences University of Iowa Phone: 319-335-3386 Department of Statistics and Fax: 319-335-3017 Actuarial Science 241 Schaeffer Hall email: luke at stat.uiowa.edu Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu From maj at stats.waikato.ac.nz Wed Jan 5 01:02:46 2011 From: maj at stats.waikato.ac.nz (Murray Jorgensen) Date: Wed, 05 Jan 2011 13:02:46 +1300 Subject: [R] Converting Fortran or C++ etc to R Message-ID: <4D23B526.2020906@stats.waikato.ac.nz> I'm going to try my hand at converting some Fortran programs to R. Does anyone know of any good articles giving hints at such tasks? I will post a selective summary of my gleanings. 
Cheers, Murray -- Dr Murray Jorgensen http://www.stats.waikato.ac.nz/Staff/maj.html Department of Statistics, University of Waikato, Hamilton, New Zealand Email: maj at waikato.ac.nz Fax 7 838 4155 Phone +64 7 838 4773 wk Home +64 7 825 0441 Mobile 021 0200 8350 From marchywka at hotmail.com Tue Jan 4 21:10:55 2011 From: marchywka at hotmail.com (Mike Marchywka) Date: Tue, 4 Jan 2011 15:10:55 -0500 Subject: [R] Navigating web pages using R In-Reply-To: <512422.96077.qm@web37407.mail.mud.yahoo.com> References: <512422.96077.qm@web37407.mail.mud.yahoo.com> Message-ID: > Date: Tue, 4 Jan 2011 10:54:19 -0800 > From: egregory2007 at yahoo.com > To: r-help at r-project.org > Subject: [R] Navigating web pages using R > > R-Help, > > I'm trying to obtain some data from a webpage which masks the URL from the user, > so an explicit URL will not work. For example, when one navigates to the web > page the URL looks something like: > http://137.113.141.205/rpt34s.php?flags=1 (changed for privacy, but i'm not sure > you could access it anyways since it's internal to the agency I work for). LOL, presuming you are not a disgruntled employee, it is always amusing to see some entity with a fancy cryptic web design drink their own Koolaid :) This is the most annoying kind of code to write, especially when there is no reason such as revenue model to make it hard to get. I've posted in other forums about the general need for an API if you are providing data to others in a non-hostile setting. > The site has three drop-down menus for "Site", "Month," and "Year". When a > combination is selected of these, the resulting URL is > always http://137.113.141.205/rpt34s (nothing changes, except "flags=1" is > dropped, so what I need to be able to do is write something that will navigate > to the original URL, then select some combination of "Site", "Month", and > "Year," and then submit the query to the site to navigate to the page with the > data. 
> Is this a capability that R has as a language? Unfortunately, I'm unfamiliar > with html or php programming, so if this question belongs in a forum on that I > apologize. I'm trying to centralize all of my code for my analysis in R! I'm sure that ultimately you can code this in R but for digging out what you need there may be better approaches. First I would try to contact the page author or determine if there is a better way to get the same data. Failing that, you may be able to find a "form" section in the html and copy that. Firefox is supposed to have something called "firebug" to let you see what the page does but I've never actually used that. Generally I use linux or cygwin command line tools to diagnose this junk; R may support some of these features but this is a common issue outside of R too and so it may be worthwhile learning the other tools. If all else fails, downloading a local copy of the page etc, you may be able to do a packet capture and just see what it does by brute force. From what I have seen, the R tools are pretty much named after the linux tools, curl for example. > > Thank you, > -Erik Gregory > Student Assistant, California EPA > CSU Sacramento, Mathematics > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From seva.jurica at gmail.com Tue Jan 4 23:36:53 2011 From: seva.jurica at gmail.com (Jurica Seva) Date: Tue, 4 Jan 2011 17:36:53 -0500 Subject: [R] Print plot to pdf, jpg or any other format when using scatter3d error In-Reply-To: <4D232123.8010908@gmail.com> References: <4D227511.3040009@gmail.com> <4D232123.8010908@gmail.com> Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From seva.jurica at gmail.com Tue Jan 4 23:43:36 2011 From: seva.jurica at gmail.com (Jurica Seva) Date: Tue, 04 Jan 2011 17:43:36 -0500 Subject: [R] Error in M[, 1] : incorrect number of dimensions when trying to plot hexbin Message-ID: <4D23A298.1070208@foi.hr> Hello again, I am trying to plot out activity regions the user did on the screen via plotting a hexbin. I have 20 users whose information i want to plot out and it stops in the user 16 with the message Error in M[, 1] : incorrect number of dimensions. Any advice would be appreciated as i dont see why this error occurs on the data for the 16th user. All the other users get plotted out without a problem. I am using the R 2.12.1 on W7 if needed. The user data is in the csv file attached to this mail. The R code is: #libraries library(RMySQL) library(hexbin) #user data mycon <- dbConnect(MySQL(), user='root',dbname='test',host='localhost',password='') rsUser01 <- dbSendQuery(mycon, "select a.userID,a.sessionID,a.actionTaken,a.timelineMSEC,a.durationMSEC,b.X,b.Y,b.Rel__dist_,b.Total_dist_ from `actiontimes` as a , `ulogdata` as b where a.originalRECNO = b.RECNO") user01 <- fetch(rsUser01, n= -1) user01[1,1] #list of users mycon <- dbConnect(MySQL(), user='root',dbname='test',host='localhost',password='') listUser <- dbSendQuery(mycon, "select distinct userID from ulogdata") userList <- fetch(listUser, n= -1) par(mfrow=c(5,3)) tmp6=rep("UserScatterActionRegion.pdf") tmp7=paste(tmp6,sep="") tmp7 #pdf(tmp7) jpeg(file="UserScatterActionRegion.jpeg") for (i in 1:20){ x<-subset(user01 ,userID == userList[i,],select=c(X,Y)) x bin<-hexbin(x$X, x$Y, xbins=100) part1=rep("Action Regions") nameFull=paste(userList[i,],part1,sep="") nameFull plot(bin, main=nameFull) } dev.off() From singhi at cs.ucr.edu Wed Jan 5 06:08:07 2011 From: singhi at cs.ucr.edu (Indrajeet Singh) Date: Tue, 4 Jan 2011 21:08:07 -0800 Subject: [R] unique limited to 536870912 Message-ID: Hi I am using R with igraph to analyze 
an edgelist that is greater than the said amount. Does anyone know a way around this? Thanks Inder From smriti.sebastuan at gmail.com Wed Jan 5 06:16:44 2011 From: smriti.sebastuan at gmail.com (smriti Sebastian) Date: Wed, 5 Jan 2011 10:46:44 +0530 Subject: [R] multipanel plots Message-ID: Hi, I have attached a doc file. Can this graph be plotted using R? Please help. regards, smriti From deepayan.sarkar at gmail.com Wed Jan 5 07:36:54 2011 From: deepayan.sarkar at gmail.com (Deepayan Sarkar) Date: Wed, 5 Jan 2011 12:06:54 +0530 Subject: [R] lattice: how to "center" a title? In-Reply-To: <1294156936042-3173800.post@n4.nabble.com> References: <8D33361D-F6CC-4485-91C5-81BC16107E6C@web.de> <1294156936042-3173800.post@n4.nabble.com> Message-ID: On Tue, Jan 4, 2011 at 9:32 PM, Dieter Menne wrote: > > > mhofert wrote: >> >> trellis.device("pdf", width = 5, height = 5) >> print(xyplot(0 ~ 0, main = "This title is not 'centered' for the human's >> eye", scales = list(alternating = c(1,1), tck = c(1,0)))) >> dev.off() >> >> ... the title does not seem to be "centered" for the human's eye [although >> it is centered when the plot (width) is considered with the y-axis label]. >> > > This is because there is a y label, and centering is on the page (as you > noted). One way around this would be to add a similar padding at the right > side. See the example below, where I have exaggerated the effect. Try a > padding of 5 instead. You can also use 'xlab.top' instead of 'main' with the version of lattice available at r-forge (to be uploaded to CRAN soon-ish). -Deepayan From m_hofert at web.de Wed Jan 5 08:17:20 2011 From: m_hofert at web.de (Marius Hofert) Date: Wed, 5 Jan 2011 08:17:20 +0100 Subject: [R] lattice: how to "center" a title? In-Reply-To: References: <8D33361D-F6CC-4485-91C5-81BC16107E6C@web.de> <1294156936042-3173800.post@n4.nabble.com> Message-ID: <1FAD0930-C25D-46A4-BA17-B8EEEC40353F@web.de> Very nice, thanks a lot. 
Cheers, Marius From lamprianou at yahoo.com Wed Jan 5 08:25:38 2011 From: lamprianou at yahoo.com (Iasonas Lamprianou) Date: Tue, 4 Jan 2011 23:25:38 -0800 (PST) Subject: [R] variance inflation factors Message-ID: <828838.50819.qm@web120603.mail.ne1.yahoo.com> Dear all I run a regression model with three predictors. When I try the Variance Inflation Factors command from Rcmdr menue, I get the message vif(LinearModel.4) ERROR: attempt to set an attribute on NULL and get no results. I know that there is high multicolinearity, but why does it not work? Maybe a bug? I use Windows Vista and everything else seems to work OK. Thank you Dr. Iasonas Lamprianou Assistant Professor (Educational Research and Evaluation) Department of Education Sciences European University-Cyprus P.O. Box 22006 1516 Nicosia Cyprus Tel.: +357-22-713178 Fax: +357-22-590539 Honorary Research Fellow Department of Education The University of Manchester Oxford Road, Manchester M13 9PL, UK Tel. 0044 161 275 3485 iasonas.lamprianou at manchester.ac.uk From krishna at primps.com.sg Wed Jan 5 08:46:21 2011 From: krishna at primps.com.sg (SNV Krishna) Date: Wed, 5 Jan 2011 15:46:21 +0800 Subject: [R] how to subset unique factor combinations from a data frame. In-Reply-To: References: <5863F0E77BB746C7834DB95DD25FBA20@primpsg><3C14C468291E41F2986AD3B30DBDA276@primpsg> Message-ID: <43696A80876F43BAACDDD6F635D5B7ED@primpsg> An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From deepayan.sarkar at gmail.com Wed Jan 5 09:54:45 2011 From: deepayan.sarkar at gmail.com (Deepayan Sarkar) Date: Wed, 5 Jan 2011 14:24:45 +0530 Subject: [R] lattice: strip panel function question In-Reply-To: <1291639978.2042.12.camel@5-hkg-09-0007> References: <1291624349.2042.9.camel@5-hkg-09-0007> <3CBFCFB1FEFFA841BA83ADF2F2A9C6FAF7E0F5@mango-data1.Mango.local> <1291639978.2042.12.camel@5-hkg-09-0007> Message-ID: On Mon, Dec 6, 2010 at 6:22 PM, Maarten van Iterson wrote: > Thanks Chris Campbell, > > I didn't though about that. > > Cheers, > Maarten > > On Mon, 2010-12-06 at 10:08 +0000, Chris Campbell wrote: >> data$subjectID <- paste(data$groups, data$subjects) # create a >> character >> label >> >> xyplot(responses~time|subjectID, groups = groups, data = data, >> aspect="xy") Another option is xyplot(responses~time | groups:subjects, data = data, aspect="xy") -Deepayan From tal.galili at gmail.com Wed Jan 5 10:15:41 2011 From: tal.galili at gmail.com (Tal Galili) Date: Wed, 5 Jan 2011 11:15:41 +0200 Subject: [R] t-test or ANOVA...who wins? Help please! In-Reply-To: <183012.52698.qm@web57908.mail.re3.yahoo.com> References: <183012.52698.qm@web57908.mail.re3.yahoo.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From myotistwo at gmail.com Wed Jan 5 10:30:29 2011 From: myotistwo at gmail.com (Graham Smith) Date: Wed, 5 Jan 2011 09:30:29 +0000 Subject: [R] Cost-benefit/value for money analysis In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From allane at cybaea.com Wed Jan 5 10:46:28 2011 From: allane at cybaea.com (Allan Engelhardt) Date: Wed, 05 Jan 2011 09:46:28 +0000 Subject: [R] openNLP package error In-Reply-To: <4CF7D9C1.7090402@cognition.uni-freiburg.de> References: <4CF7D9C1.7090402@cognition.uni-freiburg.de> Message-ID: <4D243DF4.6020207@cybaea.com> Apologies that I am late on this thread. 
On 02/12/10 17:39, Sascha Wolfer wrote:
> I seem to have a problem with the openNLP package, I'm actually stuck
> in the very beginning. Here's what I did:
>
> > install.packages("openNLP")
> > install.packages("openNLPmodels.de", repos =
> "http://datacube.wu.ac.at/", type = "source")
>
> > library(openNLPmodels.de)
> > library(openNLP)
>
> So I installed the main package as well as the supplementary german
> model. Now, I try to use the "sentDetect" function:
>
> > s <- c("Das hier ist ein Satz. Und hier ist noch einer - sogar mit
> Gedankenstrich. Ist das nicht toll?")
> > sentDetect(s, language = "de", model = "openNLPmodels.de")
>
> I get the following error message which I can't make any sense of:
>
> Fehler in .jnew("opennlp/maxent/io/SuffixSensitiveGISModelReader",
> .jnew("java.io.File", :
> java.io.FileNotFoundException: openNLPmodels.de (No such file or
> directory)

The correct syntax seems to be

sentDetect(s, model = system.file("models", "de-sent.bin",
                                  package = "openNLPmodels.de"))

but unfortunately I get

Error in .jcall(.jnew("opennlp/maxent/io/SuffixSensitiveGISModelReader", :
  java.io.UTFDataFormatException: malformed input around byte 48

YMMV. But you get the idea on the syntax of the model= argument.

This "works":

sentDetect(s, model = system.file("models", "sentdetect", "EnglishSD.bin.gz",
                                  package = "openNLPmodels.en"))
# [1] "Das hier ist ein Satz. "
# [2] "Und hier ist noch einer - sogar mit Gedankenstrich. "
# [3] "Ist das nicht toll?"

Hope this helps you a little.

Allan

From ana.carbonell at ba.ieo.es Wed Jan 5 09:20:35 2011
From: ana.carbonell at ba.ieo.es (Aina Carbonell)
Date: Wed, 5 Jan 2011 09:20:35 +0100
Subject: [R] bwplot
Message-ID: <0838E01493845742A4D4039EA34EB1C1C1BBEA@ieopalma2.ba.ieo.es>

An embedded and charset-unspecified text was scrubbed...
Name: not available URL: From hlochner at oldmutual.com Wed Jan 5 09:17:54 2011 From: hlochner at oldmutual.com (Hein) Date: Wed, 5 Jan 2011 00:17:54 -0800 (PST) Subject: [R] How to use S-Plus functions in R Message-ID: <1294215474639-3174963.post@n4.nabble.com> Hi I am very new to R. I used to work in S-Plus a lot but that was years ago. I wrote a large number of functions that I now want to view and edit in R. I know I have to tell R where the functions are but I have no idea how. The functions are stored on my laptop's c-drive. I tried everything I could find e.g. library(myfilepath), source(myfilepath) etc. but nothing seems to work. Hein -- View this message in context: http://r.789695.n4.nabble.com/How-to-use-S-Plus-functions-in-R-tp3174963p3174963.html Sent from the R help mailing list archive at Nabble.com. From lcn918 at gmail.com Wed Jan 5 08:26:01 2011 From: lcn918 at gmail.com (lcn) Date: Wed, 5 Jan 2011 15:26:01 +0800 Subject: [R] Navigating web pages using R In-Reply-To: References: <512422.96077.qm@web37407.mail.mud.yahoo.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From thomas.carrie at bnpparibas.com Wed Jan 5 09:55:58 2011 From: thomas.carrie at bnpparibas.com (thomas.carrie at bnpparibas.com) Date: Wed, 5 Jan 2011 09:55:58 +0100 Subject: [R] What are the necessary Oracle software to install and run ROracle ? Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From lcn918 at gmail.com Wed Jan 5 08:33:06 2011 From: lcn918 at gmail.com (lcn) Date: Wed, 5 Jan 2011 15:33:06 +0800 Subject: [R] Converting Fortran or C++ etc to R In-Reply-To: <4D23B526.2020906@stats.waikato.ac.nz> References: <4D23B526.2020906@stats.waikato.ac.nz> Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From henri.leblond at fr.thalesgroup.com Wed Jan 5 08:31:24 2011 From: henri.leblond at fr.thalesgroup.com (Henri Leblond) Date: Wed, 05 Jan 2011 08:31:24 +0100 Subject: [R] R(D) Com under R1070 Message-ID: <18656_1294212685_4D241E4D_18656_96717_1_4D241E4C.2040103@fr.thalesgroup.com> I get the same trouble Please finally did you succeed fixing this trouble ? Henri From M.van_iterson.HG at lumc.nl Wed Jan 5 10:57:20 2011 From: M.van_iterson.HG at lumc.nl (Maarten van Iterson) Date: Wed, 5 Jan 2011 10:57:20 +0100 Subject: [R] lattice: strip panel function question In-Reply-To: References: <1291624349.2042.9.camel@5-hkg-09-0007> <3CBFCFB1FEFFA841BA83ADF2F2A9C6FAF7E0F5@mango-data1.Mango.local> <1291639978.2042.12.camel@5-hkg-09-0007> Message-ID: <1294221440.2017.46.camel@5-hkg-09-0007> Thanks, Deepayan, that solution is even more elegant! Maarten On Wed, 2011-01-05 at 14:24 +0530, Deepayan Sarkar wrote: > On Mon, Dec 6, 2010 at 6:22 PM, Maarten van Iterson > wrote: > > Thanks Chris Campbell, > > > > I didn't though about that. 
> > > > Cheers, > > Maarten > > > > On Mon, 2010-12-06 at 10:08 +0000, Chris Campbell wrote: > >> data$subjectID <- paste(data$groups, data$subjects) # create a > >> character > >> label > >> > >> xyplot(responses~time|subjectID, groups = groups, data = data, > >> aspect="xy") > > > Another option is > > xyplot(responses~time | groups:subjects, data = data, aspect="xy") > > -Deepayan -- Maarten van Iterson Center for Human and Clinical Genetics Leiden University Medical Center (LUMC) Research Building, Einthovenweg 20 Room S-04-038 Phone: 071-526 9439 E-mail: M.van_iterson.HG at lumc.nl --------------- Postal address: Postzone S-04-P Postbus 9600 2300 RC Leiden The Netherlands From mdsumner at gmail.com Wed Jan 5 12:01:14 2011 From: mdsumner at gmail.com (Michael Sumner) Date: Wed, 5 Jan 2011 22:01:14 +1100 Subject: [R] [R-downunder] Converting Fortran or C++ etc to R In-Reply-To: <4D23B526.2020906@stats.waikato.ac.nz> References: <4D23B526.2020906@stats.waikato.ac.nz> Message-ID: Hi Murray, at first I thought you meant compiling existing Fortran or C++ for use in R with .Fortran() and so on, but do you mean literal conversion from Fortran to just pure R code? I'm assuming pure R code for the rest of this: I've tried with some fairly simple C++ and C code, and that's been fairly easy - there are a lot of details you can ignore and just try to figure out the algorithm. It's nice if you have running software so you can compare outputs, but I did once eventually figure out some Pascal code from an old text book - it had enough actual example data printed in the book that allowed me eventually to figure it out. There were people around me who had once compiled Pascal, but it didn't sound like it was going to be much fun. Sometimes C and C++ chunks can be copied over directly and used with very few changes, but it will just depend. Good luck, and I would just jump in the deep end and send in questions if you get stuck. Cheers, Mike. 
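The "compare outputs" advice above can be sketched in a few lines of R. This is only an illustration under stated assumptions: the function names running_sum_loop and running_sum_vec are hypothetical and do not come from the thread, and the loop stands in for whatever a literal transcription of a Fortran or C routine would look like.

```r
# Hypothetical example: a running sum written as a literal, loop-by-loop
# transcription of Fortran/C style code.
running_sum_loop <- function(x) {
  out <- numeric(length(x))
  total <- 0
  for (k in seq_along(x)) {
    total <- total + x[k]
    out[k] <- total
  }
  out
}

# A candidate rewrite in more idiomatic R, using the built-in cumsum().
running_sum_vec <- function(x) cumsum(x)

# Check that both versions agree on the same input before relying on the
# rewrite; all.equal() allows for small floating-point differences.
x <- c(1.5, 2, -0.5, 4)
stopifnot(isTRUE(all.equal(running_sum_loop(x), running_sum_vec(x))))
```

When the original program can still be run, its printed output for a known input could take the place of running_sum_loop as the reference.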
On Wed, Jan 5, 2011 at 11:02 AM, Murray Jorgensen wrote:
> I'm going to try my hand at converting some Fortran programs to R. Does
> anyone know of any good articles giving hints at such tasks? I will post a
> selective summary of my gleanings.
>
> Cheers,  Murray
> --
> Dr Murray Jorgensen      http://www.stats.waikato.ac.nz/Staff/maj.html
> Department of Statistics, University of Waikato, Hamilton, New Zealand
> Email: maj at waikato.ac.nz                                Fax 7 838 4155
> Phone  +64 7 838 4773 wk    Home +64 7 825 0441   Mobile 021 0200 8350
> --
> R-downunder at stat.auckland.ac.nz
> http://www.stat.auckland.ac.nz/r-downunder
>
> To unsubscribe send an email to R-downunder-unsubscribe at stat.auckland.ac.nz
>

--
Michael Sumner
Institute for Marine and Antarctic Studies, University of Tasmania
Hobart, Australia
e-mail: mdsumner at gmail.com

From djmuser at gmail.com Wed Jan 5 12:02:52 2011
From: djmuser at gmail.com (Dennis Murphy)
Date: Wed, 5 Jan 2011 03:02:52 -0800
Subject: [R] Adding lines in ggplot2
In-Reply-To: 
References: 
Message-ID: 

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: 

From petr.pikal at precheza.cz Wed Jan 5 12:04:30 2011
From: petr.pikal at precheza.cz (Petr PIKAL)
Date: Wed, 5 Jan 2011 12:04:30 +0100
Subject: [R] how to subset unique factor combinations from a data frame.
In-Reply-To: <43696A80876F43BAACDDD6F635D5B7ED@primpsg>
References: <5863F0E77BB746C7834DB95DD25FBA20@primpsg><3C14C468291E41F2986AD3B30DBDA276@primpsg> <43696A80876F43BAACDDD6F635D5B7ED@primpsg>
Message-ID: 

Hi

You probably did not notice xtabs I mentioned before.

as.data.frame(xtabs(~x+xx))

> u <- as.data.frame(table(x, xx))
> head(u)
  x xx Freq
1 A  a   18
2 B  a   27
3 C  a   30
4 D  a   30
5 E  a   27
6 F  a   18
>
> v<-as.data.frame(xtabs(~x+xx))
> head(v)
  x xx Freq
1 A  a   18
2 B  a   27
3 C  a   30
4 D  a   30
5 E  a   27
6 F  a   18

Regards
Petr

r-help-bounces at r-project.org wrote on 05.01.2011 08:46:21:

> Hi Dennis,
>
> It worked!
this is what I am looking for. Many thanks. > > Rgds, > > SNVK > _____ > > From: Dennis Murphy [mailto:djmuser at gmail.com] > Sent: Tuesday, January 04, 2011 9:07 PM > To: SNV Krishna > Cc: r-help at r-project.org > Subject: Re: [R] how to subset unique factor combinations from a data frame. > > > Hi: > > Did you try something like > > summdf <- as.data.frame(with(df, table(Commodity, Attribute, Unit))) > > > ? > The rows of the table should represent the unique combinations of the three > variables.... > > Here's a simple toy example to illustrate: > > x <- sample(LETTERS[1:6], 1000, replace = TRUE) > > xx <- sample(letters[1:6], 1000, replace = TRUE) > > u <- as.data.frame(table(x, xx)) > > dim(u) > [1] 36 3 > > head(u) > x xx Freq > 1 A a 26 > 2 B a 29 > 3 C a 25 > 4 D a 25 > 5 E a 27 > 6 F a 29 > > HTH, > Dennis > > > On Tue, Jan 4, 2011 at 2:19 AM, SNV Krishna wrote: > > > Hi, > > Sorry that my example is not clear. I will give an example of what each > variable holds. I hope this clearly explains the case. > > Names of the dataframe (df) and description > > Year :- Year is calendar year, from 1980 to 2010 > > Country :- is the country name, total no. (levels) of countries is ~ 190 > > Commodity :- Crude oil, Sugar, Rubber, Coffee .... No. (levels) of > commodities is 20 > > Attribute: - Production, Consumption, Stock, Import, Export... Levels ~ 20 > > Unit :- this is actually not a factor. It describes the unit of Attribute. > Say the unit for Coffee (commodity) - Production (attribute) is 60 kgs. > While the unit for Crude oil - Production is 1000 barrels > > Value :- value > > > tail(df, n = 10) // example data// > > Year Country Commodity Attribute Unit > Value > 1991 United Kingdom Wheat, Durum Total Supply (1000 MT) 70 > 1991 United Kingdom Wheat, Durum TY Exports (1000 MT) 0 > 1991 United Kingdom Wheat, Durum TY Imp. 
from U (1000 MT) 0 > 1991 United Kingdom Wheat, Durum TY Imports (1000 MT) 60 > 1991 United Kingdom Wheat, Durum Yield (MT/HA) 5 > > Wish this is clear. Any suggestion > > Regards, > > SNVK > > -----Original Message----- > From: Petr PIKAL [mailto:petr.pikal at precheza.cz] > Sent: Tuesday, January 04, 2011 4:06 PM > To: SNV Krishna > Cc: r-help at r-project.org > Subject: Odp: [R] how to subset unique factor combinations from a data > frame. > > Hi > > r-help-bounces at r-project.org napsal dne 04.01.2011 05:21:25: > > > Hi All > > > > I have these questions and request members expert view on this. > > > > a) I have a dataframe (df) with five factors (identity variables) and > value > > (measured value). The id variables are Year, Country, Commodity, > Attribute, > > Unit. Value is a value for each combination of this. > > > > I would like to get just the unique combination of Commodity, > > Attribute > and > > Unit. I just need the unique factor combination into a dataframe or a > table. > > I know aggregate and subset but dont how to use them in this context. > > aggregate(Value, list(Comoditiy, Atribute, Unit), function) > > > > > b) Is it possible to inclue non- aggregate columns with aggregate > function > > > > say in the above case > aggregate(Value ~ Commodity + Attribute, data > > = > df, > > FUN = count). The use of count(Value) is just a round about to return > the > > combinations of Commodity & Attribute, and I would like to include > 'Unit' > > column in the returned data frame? > > Hm. Maybe xtabs? But without any example it is only a guess. > > > > > c) Is it possible to subset based on unique combination, some thing > > like this. > > > > > subset(df, unique(Commodity), select = c(Commodity, Attribute, Unit)). > I > > know this is not correct as it returns an error 'subset needs a > > logical evaluation'. Trying various ways to accomplish the task. 
> > > > Probably sqldf package has tools for doing it but I do not use it so you > have to try yourself. > > df[Comodity==something, c("Commodity", "Attribute", "Unit")] > > can be other way. > > Anyway your explanation is ambiguous. Let say you have three rows with the > same Commodity. Which row do you want to select? > > Regards > Petr > > > > will be grateful for any ideas and help > > > > Regards, > > > > SNVK > > > > [[alternative HTML version deleted]] > > > > ______________________________________________ > > R-help at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From petr.pikal at precheza.cz Wed Jan 5 12:09:39 2011 From: petr.pikal at precheza.cz (Petr PIKAL) Date: Wed, 5 Jan 2011 12:09:39 +0100 Subject: [R] Odp: bwplot In-Reply-To: <0838E01493845742A4D4039EA34EB1C1C1BBEA@ieopalma2.ba.ieo.es> References: <0838E01493845742A4D4039EA34EB1C1C1BBEA@ieopalma2.ba.ieo.es> Message-ID: Hi r-help-bounces at r-project.org napsal dne 05.01.2011 09:20:35: > I'm trying use the function bwplot, but I receive a message that the > function is not found. I charged the lattice, sm, and Hmrsc, package but Can you please explain how one can **charge** packages? I never did it. 
Besides did you start with library(lattice) before trying to issue bwplot(anything...) Regards Petr > without success. That I trying to do is an unique box-plot with in the > x-axes two levels Season and Area, and in the y axis abundance. > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From petr.pikal at precheza.cz Wed Jan 5 12:18:16 2011 From: petr.pikal at precheza.cz (Petr PIKAL) Date: Wed, 5 Jan 2011 12:18:16 +0100 Subject: [R] Odp: List to a summary table In-Reply-To: <1294187202044-3174698.post@n4.nabble.com> References: <1294187202044-3174698.post@n4.nabble.com> Message-ID: Hi r-help-bounces at r-project.org napsal dne 05.01.2011 01:26:42: > > Hi > > Suppose you have the code below. The result I get from the cat function is > from the avgs object. Now, I have 30 diferent objects like this and I wish I would stick to list and do not make 30 objects a.g. 
avg.width from avgs object can be extracted by

as.numeric(unlist(sapply(avgs, function(x) x[4])))[-1]

Regards
Petr

> to make a summary table, something like:
>
> Avgs1                       Avgs2                       Avgs3
>
> i= 2 average= 0.515983i     i= 2 average= 0.746983      i= 2 average= 0.2665983
> i= 3 average= 0.5135953     i= 3 average= 0.7345953     i= 3 average= 0.23455953
> i= 4 average= 0.4998128     i= 4 average= 0.7233128     i= 4 average= 0.21398128
>
>
> > library(cluster)
> > d<-hclust(dist(iris[,-5]))
> >
> > avgs<-sapply(1:20,function(x)
> + summary(silhouette(cutree(d,x),
> + dist(iris[,-5]))))
> > # str(avgs)
> >
> >
> > # print out the average widths
> > for (i in 2:length(avgs)){ # ignore first item
> + cat('i=', i, 'average=', avgs[[i]]$avg.width, '\n')
> + }
> i= 2 average= 0.515983
> i= 3 average= 0.5135953
> i= 4 average= 0.4998128
> i= 5 average= 0.346174
> i= 6 average= 0.3382031
> i= 7 average= 0.3297649
> i= 8 average= 0.324025
> i= 9 average= 0.3191681
> i= 10 average= 0.3028503
> i= 11 average= 0.3072648
> i= 12 average= 0.2834498
> i= 13 average= 0.2776717
> i= 14 average= 0.2855396
> i= 15 average= 0.2745142
> i= 16 average= 0.2578903
> i= 17 average= 0.2531909
> i= 18 average= 0.2473504
> i= 19 average= 0.2484205
> i= 20 average= 0.2545357
>
>
> thanks
> A.Dias
> --
> View this message in context: http://r.789695.n4.nabble.com/List-to-a-summary-table-tp3174698p3174698.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

From djmuser at gmail.com Wed Jan 5 12:27:31 2011
From: djmuser at gmail.com (Dennis Murphy)
Date: Wed, 5 Jan 2011 03:27:31 -0800
Subject: [R] Cost-benefit/value for money analysis
In-Reply-To: 
References: 
Message-ID: 

An embedded and charset-unspecified text was scrubbed...
Name: not available URL: From myotistwo at gmail.com Wed Jan 5 12:29:41 2011 From: myotistwo at gmail.com (Graham Smith) Date: Wed, 5 Jan 2011 11:29:41 +0000 Subject: [R] Cost-benefit/value for money analysis In-Reply-To: <7DFDB350-42F8-4968-BCD6-48E9CFF7EF3B@comcast.net> References: <7DFDB350-42F8-4968-BCD6-48E9CFF7EF3B@comcast.net> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From b.rowlingson at lancaster.ac.uk Wed Jan 5 12:32:26 2011 From: b.rowlingson at lancaster.ac.uk (Barry Rowlingson) Date: Wed, 5 Jan 2011 11:32:26 +0000 Subject: [R] Converting Fortran or C++ etc to R In-Reply-To: References: <4D23B526.2020906@stats.waikato.ac.nz> Message-ID: On Wed, Jan 5, 2011 at 7:33 AM, lcn wrote: > As for your actual requirement to do the "convertion", I guess there'd not > exist any quick ways. You have to be both familiar with R and the other > language to make the rewrite work. To make the rewrite work _well_ is the bigger problem! The easiest way to big performance wins is going to be spotting vectorisation possibilities in the Fortran code. Any time you see a DO K=1,N loop then look to see if its just a single vector operation in R. Another way to big wins is to write test code, so you can check if your R code gives the same results as the Fortran (C/C++) code at every stage of the rewrite. Don't just write it all in one go and then hope it works! Small steps.... Barry From myotistwo at gmail.com Wed Jan 5 12:52:27 2011 From: myotistwo at gmail.com (Graham Smith) Date: Wed, 5 Jan 2011 11:52:27 +0000 Subject: [R] Cost-benefit/value for money analysis In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From Mihai.Mirauta at bafin.de Wed Jan 5 12:58:42 2011 From: Mihai.Mirauta at bafin.de (Mihai.Mirauta at bafin.de) Date: Wed, 5 Jan 2011 12:58:42 +0100 Subject: [R] How to save graphs out of ACF ? 
Message-ID: <16C7D76555968E4EB7A3E3E4B29008A7257F592F0C@BABVMX03.office.dir> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From Achim.Zeileis at uibk.ac.at Wed Jan 5 12:59:36 2011 From: Achim.Zeileis at uibk.ac.at (Achim Zeileis) Date: Wed, 5 Jan 2011 12:59:36 +0100 (CET) Subject: [R] update.views("Spatial") does not seem to be able to find RPyGeo package In-Reply-To: References: Message-ID: On Tue, 4 Jan 2011, Linder, Eric wrote: > I have this problem with loading RPyGeo package when using update.views. > How can I fix this. Only by changing the operating system. You are using Linux but the RPyGeo package require Windows, see http://CRAN.R-project.org/package=RPyGeo update.views() (or actually the underlying call to install.packages()) just informs you about this through a warning. hth, Z > I have tried to use other CRAN mirrors with the same result. > Below is a copy of my session. > ---------------------session----------------------- > R version 2.12.1 (2010-12-16) > Copyright (C) 2010 The R Foundation for Statistical Computing > ISBN 3-900051-07-0 > Platform: i486-pc-linux-gnu (32-bit) > > R is free software and comes with ABSOLUTELY NO WARRANTY. > You are welcome to redistribute it under certain conditions. > Type 'license()' or 'licence()' for distribution details. > > R is a collaborative project with many contributors. > Type 'contributors()' for more information and > 'citation()' on how to cite R or R packages in publications. > > Type 'demo()' for some demos, 'help()' for on-line help, or > 'help.start()' for an HTML browser interface to help. > Type 'q()' to quit R. > > [Previously saved workspace restored] > >> library(ctv) >> update.views('Spatial') > --- Please select a CRAN mirror for use in this session --- > Loading Tcl/Tk interface ... 
done > Warning message: > In update.views("Spatial") : > The following packages are not available: RPyGeo >> > ---------------------session----------------------- > > > > > > > > The information contained in this communication may be C...{{dropped:11}} > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From murdoch.duncan at gmail.com Wed Jan 5 13:49:46 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Wed, 05 Jan 2011 07:49:46 -0500 Subject: [R] How to use S-Plus functions in R In-Reply-To: <1294215474639-3174963.post@n4.nabble.com> References: <1294215474639-3174963.post@n4.nabble.com> Message-ID: <4D2468EA.9090105@gmail.com> On 11-01-05 3:17 AM, Hein wrote: > > Hi > > I am very new to R. I used to work in S-Plus a lot but that was years ago. > I wrote a large number of functions that I now want to view and edit in R. > I know I have to tell R where the functions are but I have no idea how. The > functions are stored on my laptop's c-drive. I tried everything I could > find e.g. library(myfilepath), source(myfilepath) etc. but nothing seems to > work. > > Hein Save their source as text, and source that. R can't read the binary S-Plus objects for recent S-Plus versions. Since R and S-Plus are not identical, you may need some modifications to the functions to get them to work in R: so test carefully. 
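A minimal sketch of the "save as text, then source" step on the R side. The file contents and the function name my_mean are hypothetical stand-ins for functions exported from S-Plus as plain text, and a temporary file is used only so that the example is self-contained.

```r
# Hypothetical file standing in for function definitions saved as text.
src <- tempfile(fileext = ".R")
writeLines(c(
  "my_mean <- function(x) {",
  "  sum(x) / length(x)",
  "}"
), src)

# source() evaluates the file, which defines my_mean in the workspace.
source(src)

# Test the sourced function before relying on it.
stopifnot(exists("my_mean"))
stopifnot(isTRUE(all.equal(my_mean(c(1, 2, 3)), 2)))
```

For a real file on a Windows drive, note that a backslash is an escape character inside an R string, so a path would be written with forward slashes or doubled backslashes; the actual location of the files is not given in the thread.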
Duncan Murdoch From murdoch.duncan at gmail.com Wed Jan 5 13:54:31 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Wed, 05 Jan 2011 07:54:31 -0500 Subject: [R] Print plot to pdf, jpg or any other format when using scatter3d error In-Reply-To: References: <4D227511.3040009@gmail.com> <4D232123.8010908@gmail.com> Message-ID: <4D246A07.1080400@gmail.com> On 11-01-04 5:36 PM, Jurica Seva wrote: > Thank you, Duncan, it works now with rgl.snapshot (i did have to upgrade > to 2.12.1). Is there any way to manipulate the size of the created > image? The created plots are a bit small (256*256) Sure, they're the size of the window: it's just a snapshot. Just make it bigger (by mouse, or using par3d(windowRect= ...)) before taking the snapshot. Duncan Murdoch > > Thank you for your help once again :) > > Best, > Jurica > > On Tue, Jan 4, 2011 at 8:31 AM, Duncan Murdoch > wrote: > > On 03/01/2011 8:17 PM, Jurica Seva wrote: > > Hi, > > I have been trying to output my graphs to a file (jpeg, pdf, ps, it > doesnt matter) but i cant seem to be able to get it to output. > > > > As Uwe said, you are using rgl graphics, not base graphics. So none > of the standard devices work, you need to use the tools built into > rgl. Attach that package, and then read ?rgl.postscript (for > graphics in various vector formats, not just Postscript) and > ?rgl.snapshot (for bitmapped graphics). > > Some notes: > - For a while rgl.snapshot wasn't working in the Windows builds > with R 2.12.1; that is now fixed, so you should update rgl before > getting frustrated. > - rgl.snapshot just takes a copy of the graphics buffer that is > showing on screen, so it is limited to the size you can display > - rgl.postscript does a better job for the parts of an image that > it can handle, but it is not a perfect OpenGL emulator, so it > doesn't always include all components of a graph properly. > > Duncan Murdoch > > I tried a > few things but none of them worked and am lost as what to do > now. 
I am > using the scatter3d function, and it prints out the graphs on tot he > screen without any problems, but when it comes to writing them > to a file > i cant make it work. Is there any other way of producing > 3dimensional > graphs (they dont have to be rotatable/interactive after the > print out)? > > The code is fairly simple and is listed down : > > #libraries > library(RMySQL) > library(rgl) > library(scatterplot3d) > library(Rcmdr) > > ############################################################################## > #database connection > mycon<- dbConnect(MySQL(), > user='root',dbname='test',host='localhost',password='') > #distinct sessions > rsSessionsU01<- dbSendQuery(mycon, "select distinct sessionID from > actiontimes where userID = 'ID01'") > sessionU01<-fetch(rsSessionsU01) > sessionU01[2,] > > #user01 data > mycon<- dbConnect(MySQL(), > user='root',dbname='test',host='localhost',password='') > rsUser01<- dbSendQuery(mycon, "select > a.userID,a.sessionID,a.actionTaken,a.timelineMSEC,a.durationMSEC,b.X,b.Y,b.Rel__dist_,b.Total_dist_ > from `actiontimes` as a , `ulogdata` as b where a.originalRECNO = > b.RECNO and a.userID='ID01'") > user01<- fetch(rsUser01, n= -1) > user01[1,1] > > #plot loop > > for (i in 1:10){ > > userSubset<-subset(user01,sessionID == > sessionU01[i,],select=c(timelineMSEC,X,Y)) > userSubset > x<-as.numeric(userSubset$X) > y<-as.numeric(userSubset$Y) > scatter3d(x,y,userSubset$timeline,xlim = c(0,1280), ylim = > c(0,1024), > zlim=c(0,1800000),type="h",main=sessionU01[i,],sub=sessionU01[i,]) > tmp6=rep(".ps") > tmp7=paste(sessionU01[i,],tmp6,sep="") > tmp7 > rgl.postscript(tmp7,"ps",drawText=FALSE) > #pdf(file=tmp7) > #dev.print(file=tmp7, device=pdf, width=600) > #dev.off(2) > } > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, 
self-contained, reproducible code.
>
>
>

From djmuser at gmail.com Wed Jan 5 14:01:36 2011
From: djmuser at gmail.com (Dennis Murphy)
Date: Wed, 5 Jan 2011 05:01:36 -0800
Subject: [R] Adding lines in ggplot2
In-Reply-To: 
References: 
Message-ID: 

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: 

From Thierry.ONKELINX at inbo.be Wed Jan 5 14:00:56 2011
From: Thierry.ONKELINX at inbo.be (ONKELINX, Thierry)
Date: Wed, 5 Jan 2011 14:00:56 +0100
Subject: [R] Adding lines in ggplot2
In-Reply-To: 
References: 
Message-ID: <3DB16098F738284D8DBEB2FC369916385BE820@inboexch.inbo.be>

Dear Eduardo,

This is a solution that you seem to want

library(ggplot2)
n <- 1:10
x <- sqrt(n)
y <- log(n)
# I() sets a fixed colour instead of mapping the string to an aesthetic
qplot(n, x, geom="line", colour=I("darkgreen")) +
  geom_line(data = data.frame(n , x = y), colour="red")

But please compare it with the solution (code + result) below.
Formatting the data.frame might be a bit more work, but formatting your
graph is much easier.

n <- 1:10
dataset <- rbind(
  data.frame(Number = n, Function = "sqrt", Result = sqrt(n)),
  data.frame(Number = n, Function = "log", Result = log(n))
)
#Using the default colours
ggplot(dataset, aes(x = Number, y = Result, colour = Function)) +
  geom_line()
#Using user-specified colours
ggplot(dataset, aes(x = Number, y = Result, colour = Function)) +
  geom_line() +
  scale_colour_manual(values = c(sqrt = "darkgreen", log = "red"))

Think about the gain when you want to display much more than 2 lines...

dataset <- expand.grid(Number = n, Power = seq(0, 2, length = 21))
dataset$Result <- dataset$Number ^ dataset$Power
ggplot(dataset, aes(x = Number, y = Result, colour = factor(Power))) +
  geom_line()

HTH,

Thierry

----------------------------------------------------------------------------
ir.
Thierry Onkelinx Instituut voor natuur- en bosonderzoek team Biometrie & Kwaliteitszorg Gaverstraat 4 9500 Geraardsbergen Belgium Research Institute for Nature and Forest team Biometrics & Quality Assurance Gaverstraat 4 9500 Geraardsbergen Belgium tel. + 32 54/436 185 Thierry.Onkelinx at inbo.be www.inbo.be To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey > -----Oorspronkelijk bericht----- > Van: r-help-bounces at r-project.org > [mailto:r-help-bounces at r-project.org] Namens Eduardo de Oliveira Horta > Verzonden: woensdag 5 januari 2011 3:56 > Aan: r-help > Onderwerp: [R] Adding lines in ggplot2 > > Hello, > > this is probably a recurrent question, but I couldn't find > any answers that didn't involve the expression "data > frame"... so perhaps I'm looking for something new here. > > I wanted to find a code equivalent to > > > x=sqrt(1:10) > > y=log(1:10) > > plot(1:10, x, type="lines", col="darkgreen") lines(1:10, y, > col="red") > > to use with ggplot2. I've tried > > > x=sqrt(1:10) > > y=log(1:10) > > qplot(1:10, x, geom="line", colour=I("darkgreen")) > geom_line(1:10, y, > > colour="red") > Error: ggplot2 doesn't know how to deal with data of class numeric > > but it seems that the "data frame restriction" is really very > restrictive here. Any solutions that don't imply using > as.data.frame to my data? > > Thanks in advance, and best regards! 
>
> Eduardo Horta
>
> 	[[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

From inpost at gmail.com Wed Jan 5 14:11:52 2011
From: inpost at gmail.com (e-letter)
Date: Wed, 5 Jan 2011 13:11:52 +0000
Subject: [R] dotchart for matrix data
Message-ID: 

Readers,

The following commands were applied, to create a dot chart with black
dots and blue squares for data:

> library(lattice)
> testdot
  category values
1        b     44
2        c     51
3        d     65
4        a     10
5        b     64
6        c     71
7        d     49
8        a     27

dotplot(category~values,
        col=c("black","black","black","black","blue","blue","blue","blue"),
        bg=c("black","black","black","black","blue","blue","blue","blue"),
        pch=c(21,21,21,21,22,22,22,22),
        xlab=NULL, data=testdot)

The resultant graph shows correctly coloured points, but not filled,
only the border is coloured. The documentation for the command 'pch'
(?pch) indicates that the commands shown above should show
appropriately coloured solid symbols. What is causing this error
please?

From jholtman at gmail.com Wed Jan 5 14:15:19 2011
From: jholtman at gmail.com (jim holtman)
Date: Wed, 5 Jan 2011 08:15:19 -0500
Subject: [R] unique limited to 536870912
In-Reply-To: 
References: 
Message-ID: 

Could it be that you are running on a 32-bit version of R?
536870912 * 4 = 2GB if those were integers which would use up all of
memory. You never did show what your error message was or what system
you were using.

On Wed, Jan 5, 2011 at 12:08 AM, Indrajeet Singh wrote:
> Hi
> I am using R with igraph to analyze an edgelist that is greater than the said amount. Does anyone know a way around this?
> > Thanks > Inder > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? From bbolker at gmail.com Wed Jan 5 14:28:50 2011 From: bbolker at gmail.com (Ben Bolker) Date: Wed, 5 Jan 2011 13:28:50 +0000 (UTC) Subject: [R] How to save graphs out of ACF ? References: <16C7D76555968E4EB7A3E3E4B29008A7257F592F0C@BABVMX03.office.dir> Message-ID: bafin.de> writes: > > Hi, > > I want to save the autocorrelation plots resulting out of ACF (acf(ts)), not just by using the "Save as" > command in the R Gui but using some sort of code, which allows me to chose the format and the path. > Thank you, > > Mihai for example: a <- acf(runif(10)) pdf("acf.pdf") plot(a) dev.off() From marc_schwartz at me.com Wed Jan 5 14:31:09 2011 From: marc_schwartz at me.com (Marc Schwartz) Date: Wed, 05 Jan 2011 07:31:09 -0600 Subject: [R] What are the necessary Oracle software to install and run ROracle ? In-Reply-To: References: Message-ID: On Jan 5, 2011, at 2:55 AM, thomas.carrie at bnpparibas.com wrote: > Hello, > > I am running Linux, I have downloaded > > instantclient-basiclite-linux32-11.2.0.2.0.zip > instantclient-sqlplus-linux32-11.2.0.2.0.zip > instantclient-sdk-linux32-11.2.0.2.0.zip > instantclient-precomp-linux32-11.2.0.2.0.zip > > All these tarballs are unzipped in /usr/local/lib/instantclient, I have > added this path in the library path of the host. > > I can run sqlplus and proc, they do not complain about missing symbol. > > Then I install ROracle : install.packages("ROracle") > > Compilation step is OK > But when the test step tries to load the ROracle.so library, it fails : > > ** testing if installed package can be loaded > Error in dyn.load(file, DLLpath = DLLpath, ...) 
: > unable to load shared library > '/opt/R-2.11.1/lib/R/library/ROracle/libs/ROracle.so': > /opt/R-2.11.1/lib/R/library/ROracle/libs/ROracle.so: undefined symbol: > sqlprc > > Here is my list of lib in instantclient directory : > $ find -name "*.*o" -o -name "*.a" > ./libsqlplusic.so > ./sdk/demo/procobdemo.pco > ./cobsqlintf.o > ./libociicus.so > ./libnnz11.so > ./libocijdbc11.so > ./libsqlplus.so > > Do I need so more lib ? From which Oracle tarball ? > > Thanks for help If you have not, read through the INSTALL file for the package: http://cran.r-project.org/web/packages/ROracle/INSTALL Past postings with similar issues regarding the inability to load shared libs would suggest that compiling and installing the package outside of R from the CLI using 'R CMD INSTALL ...' rather than from within R using install.packages("ROracle"), may resolve the issue. Also, be sure you are running all of this as root, since installation to default locations will require root privileges. Two more things to consider: 1. R 2.12.1 is the current version of R. If you can, I would recommend updating from 2.11.1. 2. Be sure that you don't have a conflict between 32 and 64 bit versions of R and the Oracle tool chain. All components need to be one or the other. You seem to be using 32 bit versions of the Oracle components above. Check: .Machine$sizeof.pointer in R to see if you are running 32 or 64 bit R. If the former, the above will return 4, if the latter, 8. Another alternative would be to consider using Prof. Ripley's RODBC package and connecting to Oracle via ODBC. If you need further assistance, I would suggest subscribing and posting to r-sig-db or contacting the package author directly. 
More info on the list is here: https://stat.ethz.ch/mailman/listinfo/r-sig-db HTH, Marc Schwartz From jwiley.psych at gmail.com Wed Jan 5 14:51:30 2011 From: jwiley.psych at gmail.com (Joshua Wiley) Date: Wed, 5 Jan 2011 05:51:30 -0800 Subject: [R] R not recognized in command line In-Reply-To: References: Message-ID: Hi Aaditya, I assume you are running some variant of Windows and by the "prompt in DOS" you are using cmd.exe. Perhaps you are already, but from your examples it looks like either A) you are not in the same directory as R or B) are not adding the path to R in the command. For example, on Windows I always install R under C:\R\ so for me inside cmd.exe: C:\directory> C:\R\R-devel\bin\x64\R [[[R starts here]]] alternately you could switch directories over and then just type "R" at the console: C:\directory> cd C:\R\R-devel\bin\x64\ C:\R\R-devel\bin\x64> R [[[R starts here]]] or since you have set the environment variables: C:\directory> %R_HOME%\bin\x64\R [[[R starts here]]] Alternately, edit the PATH environment variable in Windows and add the path to R (i.e., R_HOME\bin\i386\ or whatever it is for you), and you should be able to just enter "R" at the command prompt and have it start. Cheers, Josh On Tue, Jan 4, 2011 at 9:39 PM, Aaditya Nanduri wrote: > Hello all, > > I recently installed rpy2 so that I could use R through Python. > > However, R was not recognized in the command line. > > So I decided to add it to the PATH variables. But it just doesnt work.... > And what I mean by it doesnt work is : No matter what I type at the prompt > in DOS- be it R, Rcmd, R CMD, Rscript- it is not recognized as a command. > > Path variables used : > 1. %R_HOME% --> C:\Program Files\R\R 2.12.1\ > 2. %R_HOME%\bin > 3. %R_HOME%\bin\i386 > 4. 
Some Batchscripts I found online that recognize the R.exe in \bin\i386 > but only if I run the batch file...its not natively recognized (if I were to > type 'R' at the prompt in DOS, its not recognized) > > I would appreciate any help in this matter. > Or should I do something else so that I can try rpy2? > > Python version 2.6.6 > R 2.12.1 > rpy2 2.0.8 > > > -- > Aaditya Nanduri > aaditya.nanduri at gmail.com > -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ From f.harrell at vanderbilt.edu Wed Jan 5 15:02:47 2011 From: f.harrell at vanderbilt.edu (Frank Harrell) Date: Wed, 5 Jan 2011 06:02:47 -0800 (PST) Subject: [R] packagename:::functionname vs. importFrom In-Reply-To: References: <1294091145769-3172684.post@n4.nabble.com> <1294109427406-3172984.post@n4.nabble.com> <1294162766395-3174003.post@n4.nabble.com> Message-ID: <1294236167859-3175567.post@n4.nabble.com> Thanks very much Luke for clarifying. Frank ----- Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/packagename-functionname-vs-importFrom-tp3172684p3175567.html Sent from the R help mailing list archive at Nabble.com.
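For the "R not recognized in command line" thread above, a quick check from inside a working R session (for example one started from the Start menu) can show which directory would need to be on PATH. This is only a sketch: `r_bin_on_path` is a made-up helper name, and it inspects the PATH inherited by the current R session, which may differ from what a newly opened cmd.exe window sees.

```r
# Sketch: is this R installation's bin directory on the PATH that the
# current R session inherited? (r_bin_on_path is a hypothetical helper.)
r_bin_on_path <- function() {
  # Directory holding the R executables of the running installation.
  r_bin <- normalizePath(R.home("bin"), mustWork = FALSE)
  # PATH entries, split on the platform separator (";" on Windows).
  dirs <- strsplit(Sys.getenv("PATH"), .Platform$path.sep, fixed = TRUE)[[1]]
  dirs <- normalizePath(dirs[nzchar(dirs)], mustWork = FALSE)
  # Windows paths are case-insensitive, so compare in lower case there.
  if (.Platform$OS.type == "windows") {
    r_bin <- tolower(r_bin)
    dirs <- tolower(dirs)
  }
  r_bin %in% dirs
}

print(R.home("bin"))    # the directory that would need to be on PATH
print(r_bin_on_path())  # TRUE or FALSE for this session
```

If the second value is FALSE, the directory printed first is the one to add in the PATH settings discussed in the thread.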
From Sebastien.Bihorel at cognigencorp.com Wed Jan 5 15:23:40 2011 From: Sebastien.Bihorel at cognigencorp.com (Sebastien Bihorel) Date: Wed, 05 Jan 2011 09:23:40 -0500 Subject: [R] Stop and call objects Message-ID: <4D247EEC.4030004@cognigencorp.com> Dear R-users, Let's consider the following snippet: f <- function(x) tryCatch(sum(x),error=function(e) stop(e)) f('a') As expected, the last call returns an error message: Error in sum(x) : invalid 'type' (character) of argument My questions are the following: 1- can I easily ask the stop function to reference the "f" function in addition to "sum(x)" in the error message? 2- If not, I guess I would have to extract the call and message objects from e, coerce the call as a character object, build a custom string, and pass it to the stop function using call.=F. How can I coerce a call object to a character and maintain the "aspect" of the printed call (i.e. "sum(x)" instead of the character vector "sum" "x" returned by as.character(e$call))? Thank you Sebastien From eduardo.oliveirahorta at gmail.com Wed Jan 5 15:29:06 2011 From: eduardo.oliveirahorta at gmail.com (Eduardo de Oliveira Horta) Date: Wed, 5 Jan 2011 12:29:06 -0200 Subject: [R] Adding lines in ggplot2 In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From Sebastien.Bihorel at cognigencorp.com Wed Jan 5 15:37:09 2011 From: Sebastien.Bihorel at cognigencorp.com (Sebastien Bihorel) Date: Wed, 05 Jan 2011 09:37:09 -0500 Subject: [R] R command execution from shell In-Reply-To: References: <4D23812C.609@cognigencorp.com> <4D2384D4.50401@gmail.com> Message-ID: <4D248215.20607@cognigencorp.com> Thank you for this alternative. Both seem to work on my systems. 
Sebastien Prof Brian Ripley wrote: > On Tue, 4 Jan 2011, Duncan Murdoch wrote: > >> On 04/01/2011 3:21 PM, Sebastien Bihorel wrote: >>> Dear R-users, >>> >>> Is there a way I can ask R to execute the "write("hello >>> world",file="hello.txt")" command directly from the UNIX shell, instead >>> of having to save this command to a .R file and execute this file >>> with R >>> CMD BATCH? >> >> Yes. Some versions of R support the -e option on the command line to >> execute a particular command. It's not always easy to work out the >> escapes so your shell passes all the quotes through... An >> alternative is to echo the command into the shell, e.g. >> >> echo 'cat("hello")' | R --slave >> >> (where the outer ' ' are just for bash). > > It is marginally preferable to use Rscript in place of 'R --slave'. > I think in all known shells > > Rscript -e "write('hello world', file = 'hello.txt')" > > will work. (If not, shQuote() will not work for that shell, but this > does work in sh+clones, csh+clones, zsh and Windows' cmd.exe.) > From dwinsemius at comcast.net Wed Jan 5 15:46:10 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Wed, 5 Jan 2011 09:46:10 -0500 Subject: [R] dotchart for matrix data In-Reply-To: References: Message-ID: <624B15A2-417D-40F1-A4FD-CF1FE24E4727@comcast.net> On Jan 5, 2011, at 8:11 AM, e-letter wrote: > Readers, > > The following commands were applied, to create a dot chart with black > dots and blue squares for data: > >> library(lattice) >> testdot > category values > 1 b 44 > 2 c 51 > 3 d 65 > 4 a 10 > 5 b 64 > 6 c 71 > 7 d 49 > 8 a 27 > > dotplot(category~values,col=c("black","black","black","black","blue","blue","blue","blue"),bg=c("black","black","black","black","blue","blue","blue","blue"),pch=c(21,21,21,21,22,22,22,22),xlab=NULL, data=testdot) > > The resultant graph shows correctly coloured points, but not filled, > only the border is coloured.
The documentation for the command 'pch' > (?pch) indicates that the commands shown above should show > appropriately coloured solid symbols. What is causing this error > please? There is no pch command. It is a graphical parameter. If you are looking at the "points" help page then you are not looking at documentation that necessarily applies to a lattice function like dotplot. After first looking at ?dotplot, then ?panel.dotplot, and then because it says the points are done with panel.xyplot, my guess is that you need to add a fill =TRUE or a fill= option. -- David Winsemius, MD West Hartford, CT From andy_liaw at merck.com Wed Jan 5 16:20:40 2011 From: andy_liaw at merck.com (Liaw, Andy) Date: Wed, 5 Jan 2011 10:20:40 -0500 Subject: [R] randomForest speed improvements In-Reply-To: <1294183823780-3174621.post@n4.nabble.com> References: <1294084769056-3172523.post@n4.nabble.com><1294097299183-3172834.post@n4.nabble.com> <1294183823780-3174621.post@n4.nabble.com> Message-ID: Note that that isn't exactly what I recommended. If you look at the example in the help page for combine(), you'll see that it is combining RF objects trained on the same data; i.e., instead of having one RF with 500 trees, you can combine five RFs trained on the same data with 100 trees each into one 500-tree RF. The way you are using combine() is basically using sample size to limit tree size, which you can do by playing with the nodesize argument in randomForest() as I suggested previously. Either way is fine as long as you don't see prediction performance degrading. Andy > -----Original Message----- > From: r-help-bounces at r-project.org > [mailto:r-help-bounces at r-project.org] On Behalf Of apresley > Sent: Tuesday, January 04, 2011 6:30 PM > To: r-help at r-project.org > Subject: Re: [R] randomForest speed improvements > > > Andy, > > Thanks for the reply. I had no idea I could combine them > back ... that > actually will work pretty well. 
We can have several "worker > threads" load > up the RF's on different machines and/or cores, and then > re-assemble them. > RMPI might be an option down the road, but would be a bit of > overhead for us > now. > > Using the method of combine() ... I was able to drastically reduce the > amount of time to build randomForest objects. IE, using > about 25,000 rows > (6 columns), it takes maybe 5 minutes on my laptop. Using 5 > randomForest > objects (each with 5k rows), and then combining them, takes < > 1 minute. > > -- > Anthony > -- > View this message in context: > http://r.789695.n4.nabble.com/randomForest-speed-improvements- > tp3172523p3174621.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > Notice: This e-mail message, together with any attachme...{{dropped:11}} From pcmckann at gmail.com Wed Jan 5 16:25:46 2011 From: pcmckann at gmail.com (Patrick McKann) Date: Wed, 5 Jan 2011 09:25:46 -0600 Subject: [R] get() within a command, specifically lmer Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From murdoch.duncan at gmail.com Wed Jan 5 16:41:20 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Wed, 05 Jan 2011 10:41:20 -0500 Subject: [R] R not recognized in command line In-Reply-To: References: Message-ID: <4D249120.2000108@gmail.com> On 11-01-05 8:51 AM, Joshua Wiley wrote: > Hi Aaditya, > > I assume you are running some variant of Windows and by the "prompt in > DOS" you are using cmd.exe. > > Perhaps you are already, but from your examples it looks like either > A) you are not in the same directory as R or B) are not adding the > path to R in the command. 
For example, on Windows I always install R > under C:\R\ so for me inside cmd.exe: > > C:\directory> C:\R\R-devel\bin\x64\R > > [[[R starts here]]] > > alternately you could switch directories over and then just type "R" > at the console: > > C:\directory> cd C:\R\R-devel\bin\x64\ > C:\R\R-devel\bin\x64> R > > [[[R starts here]]] > > or since you have set the environment variables: > > C:\directory> %R_HOME%\bin\x64\R > > [[[R starts here]]] > > Alternately, edit the PATH environment variable in Windows and add the > path to R (i.e., R_HOME\bin\i386\ or whatever it is for you), and you > should be able to just enter "R" at the command prompt and have it > start. Editing the PATH is probably the best approach, but a lot of people get it wrong because of misunderstanding how it works: - If you change PATH in one process the changes won't propagate anywhere else, and will be lost as soon as you close that process. That could be a cmd window, or an R session, or just about any other process that lets you change environment variables. - If you want to make global changes to the PATH, you need to do it in the control panel "System|Advanced|Environment variables" entries. - Often it is good enough to use a more Unix-like approach, and only make the change at startup of the cmd processor. You use the /k option when starting cmd if you want to run something on startup. Duncan Murdoch > > Cheers, > > Josh > > On Tue, Jan 4, 2011 at 9:39 PM, Aaditya Nanduri > wrote: >> Hello all, >> >> I recently installed rpy2 so that I could use R through Python. >> >> However, R was not recognized in the command line. >> >> So I decided to add it to the PATH variables. But it just doesnt work.... >> And what I mean by it doesnt work is : No matter what I type at the prompt >> in DOS- be it R, Rcmd, R CMD, Rscript- it is not recognized as a command. >> >> Path variables used : >> 1. %R_HOME% --> C:\Program Files\R\R 2.12.1\ >> 2. %R_HOME%\bin >> 3. %R_HOME%\bin\i386 >> 4. 
Some Batchscripts I found online that recognize the R.exe in \bin\i386 >> but only if I run the batch file...its not natively recognized (if I were to >> type 'R' at the prompt in DOS, its not recognized) >> >> I would appreciate any help in this matter. >> Or should I do something else so that I can try rpy2? >> >> Python version 2.6.6 >> R 2.12.1 >> rpy2 2.0.8 >> >> >> -- >> Aaditya Nanduri >> aaditya.nanduri at gmail.com >> > > > From benjamin.ward at bathspa.org Wed Jan 5 16:48:46 2011 From: benjamin.ward at bathspa.org (Ben Ward) Date: Wed, 5 Jan 2011 15:48:46 +0000 Subject: [R] Simulation - Natrual Selection Message-ID: Hi, I've been modelling some data from my work over the past few days: repeatedly challenging microbes with a certain concentration of cleaner until the concentration required to inhibit or kill them increases, at which point they are challenged with a slightly higher concentration each day. I'm doing this for two different cleaners, and I'm collecting the concentration required to kill them as a percentage, the challenge number, the cleaner as a two-level variable, and the lineage they're in, because I have several different lineages. I'm expecting the values to rise for one cleaner but not the other, as they acquire resistance to one but not the other.
This has happened, but I have wide variation because one lineage acquired a very dramatic change which has made it immune to 50%, whereas the others have exhibited a much more gradual increase, and so I have very weak p-values for the cleaner variable, because it is secondary to the challenge vector, which has the most explanatory power: without time and these challenges, the selection would not happen. I was using two bacterial species, but one kept giving highly erratic results and insisted on becoming cross-contaminated, BUT if I include its data, it shoves cleaner over the p = 0.05 threshold, so I may just be having a problem with lack of data. So I've been asking about bootstrapping, which I plan to do to my cases, and then fit a model to see what the confidence is like then. I assume that if I bootstrap, it will re-select whole cases and not jumble everything up; otherwise a microbe (to take the most extreme value as an example) with 50% concentration tolerance at the beginning would make no sense at all. I'm also planning on doing models lineage by lineage, rather than putting them into one whole, just to have a look at what happens. But what I really wanted to know from this email was whether there's a package or function for natural selection simulation I could make use of, to see if I can simulate the experiment. I want to start with a distribution of concentration tolerance values, taken from the inhibitory concentration values from my first lot of microbes, back when term began. Draw 3000 from this. Then values in that draw that fall below the exposure concentration I used in my experiment are removed, or have a high chance of being removed.
Then, from what is left, a draw is made again - or perhaps a copy operation (rather than a random draw) - until I have 3000 again. Rather than leaving them all at exactly the same concentration, a value can then be added to some of them that increases their concentration tolerance slightly, but not by a great deal, except in a few individuals, where it may be increased dramatically (some sort of exponential distribution perhaps). Then, when the distribution of this simulated population of microbes has reached the next concentration (possibly judged by the mean or mode of the distribution; I have a series of 1 in 2 dilutions, so 100%, 50%, 25% and so on), they move on to the next concentration. I know it's probably quite a heavy thing; it was just a thought that came to me. If anybody has any experience in this area of R or knows of something that allows this to be done, please let me know. Thanks, Ben. From ying.zhang at struq.com Wed Jan 5 16:51:27 2011 From: ying.zhang at struq.com (ying zhang) Date: Wed, 5 Jan 2011 15:51:27 -0000 Subject: [R] rShowMessage "Fatal error: unable to open the base package Message-ID: <01d701cbacf0$6ac22cf0$404686d0$@struq.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From ggrothendieck at gmail.com Wed Jan 5 16:59:00 2011 From: ggrothendieck at gmail.com (Gabor Grothendieck) Date: Wed, 5 Jan 2011 10:59:00 -0500 Subject: [R] R not recognized in command line In-Reply-To: <4D249120.2000108@gmail.com> References: <4D249120.2000108@gmail.com> Message-ID: On Wed, Jan 5, 2011 at 10:41 AM, Duncan Murdoch wrote: > On 11-01-05 8:51 AM, Joshua Wiley wrote: >> >> Hi Aaditya, >> >> I assume you are running some variant of Windows and by the "prompt in >> DOS" you are using cmd.exe. >> >> Perhaps you are already, but from your examples it looks like either >> A) you are not in the same directory as R or B) are not adding the >> path to R in the command.
For example, on Windows I always install R >> under C:\R\ so for me inside cmd.exe: >> >> C:\directory> C:\R\R-devel\bin\x64\R >> >> [[[R starts here]]] >> >> alternately you could switch directories over and then just type "R" >> at the console: >> >> C:\directory> cd C:\R\R-devel\bin\x64\ >> C:\R\R-devel\bin\x64> R >> >> [[[R starts here]]] >> >> or since you have set the environment variables: >> >> C:\directory> %R_HOME%\bin\x64\R >> >> [[[R starts here]]] >> >> Alternately, edit the PATH environment variable in Windows and add the >> path to R (i.e., R_HOME\bin\i386\ or whatever it is for you), and you >> should be able to just enter "R" at the command prompt and have it >> start. > > Editing the PATH is probably the best approach, but a lot of people get it > wrong because of misunderstanding how it works: > > - If you change PATH in one process the changes won't propagate anywhere > else, and will be lost as soon as you close that process. That could be a > cmd window, or an R session, or just about any other process that lets you > change environment variables. > > - If you want to make global changes to the PATH, you need to do it in the > control panel "System|Advanced|Environment variables" entries. > > - Often it is good enough to use a more Unix-like approach, and only make > the change at startup of the cmd processor. You use the /k option when > starting cmd if you want to run something on startup. > You can also use Rcmd.bat, R.bat, Rgui.bat, etc. found at http://batchfiles.googlecode.com Just put any you wish to use anywhere on your path and it will work on all cmd instances and will also work when you install a new version of R since it looks up R's location in the registry. -- Statistics & Software Consulting GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com From jwiley.psych at gmail.com Wed Jan 5 16:59:26 2011 From: jwiley.psych at gmail.com (Joshua Wiley) Date: Wed, 5 Jan 2011 07:59:26 -0800 Subject: [R] R not recognized in command line In-Reply-To: <4D249120.2000108@gmail.com> References: <4D249120.2000108@gmail.com> Message-ID: On Wed, Jan 5, 2011 at 7:41 AM, Duncan Murdoch wrote: > Editing the PATH is probably the best approach, but a lot of people get it > wrong because of misunderstanding how it works: > > - If you change PATH in one process the changes won't propagate anywhere > else, and will be lost as soon as you close that process. That could be a > cmd window, or an R session, or just about any other process that lets you > change environment variables. > > - If you want to make global changes to the PATH, you need to do it in the > control panel "System|Advanced|Environment variables" entries. Note it is also possible to make global changes using the powershell by setting the target to "Machine". [Environment]::SetEnvironmentVariable("TestVariable", "Test value.", "Machine") Josh > > - Often it is good enough to use a more Unix-like approach, and only make > the change at startup of the cmd processor. You use the /k option when > starting cmd if you want to run something on startup. > > Duncan Murdoch From friendly at yorku.ca Wed Jan 5 17:06:46 2011 From: friendly at yorku.ca (Michael Friendly) Date: Wed, 05 Jan 2011 11:06:46 -0500 Subject: [R] OT: Reprinting of Bertin's Semiology of Graphics Message-ID: <4D249716.9080406@yorku.ca> Aficionados of graphics may be interested to know that the English translation (1984) of Jacques Bertin's Semiology of Graphics has been reprinted by ESRI.
http://www.amazon.com/Semiology-Graphics-Diagrams-Networks-Maps/dp/0299090604 new edition: http://www.amazon.com/Semiology-Graphics-Diagrams-Networks-Maps/dp/1589482611/ref=tmm_hrd_title_0 The long out-of-print 1984 edition sells for $380, but the new printing is a bargain at ~$49. It is all the more remarkable in that most of the diagrams and graphs were drawn by hand, yet show a palette of graphical techniques richer than our graphical software provides even today. best, -Michael -- Michael Friendly Email: friendly AT yorku DOT ca Professor, Psychology Dept. York University Voice: 416 736-5115 x66249 Fax: 416 736-5814 4700 Keele Street Web: http://www.datavis.ca Toronto, ONT M3J 1P3 CANADA From ligges at statistik.tu-dortmund.de Wed Jan 5 17:17:30 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Wed, 05 Jan 2011 17:17:30 +0100 Subject: [R] R(D) Com under R1070 In-Reply-To: <18656_1294212685_4D241E4D_18656_96717_1_4D241E4C.2040103@fr.thalesgroup.com> References: <18656_1294212685_4D241E4D_18656_96717_1_4D241E4C.2040103@fr.thalesgroup.com> Message-ID: <4D24999A.1060208@statistik.tu-dortmund.de> ???? Can you please quote what you are referring to? The subject seems to refer to an R version R-1.7.0 which is for almost a decade outdated. Uwe Ligges On 05.01.2011 08:31, Henri Leblond wrote: > I get the same trouble > Please finally did you succeed fixing this trouble ? > > Henri > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
From marchywka at hotmail.com Wed Jan 5 17:37:04 2011 From: marchywka at hotmail.com (Mike Marchywka) Date: Wed, 5 Jan 2011 11:37:04 -0500 Subject: [R] Simulation - Natrual Selection In-Reply-To: References: Message-ID: > Date: Wed, 5 Jan 2011 15:48:46 +0000 > From: benjamin.ward at bathspa.org > To: r-help at r-project.org > Subject: [R] Simulation - Natrual Selection > > Hi, > > I've been modelling some data over the past few days, of my work, > repeatedly challenging microbes to a certain concentration of cleaner, > until the required concentration to inhibit or kill them increaces, at > which point they are challenged to a slightly higher concentration each > day. I'm doing ths for two different cleaners and I'm collecting the > required concentration to kill them as a percentage, the challenge > number, the cleaner as a two level variable, and the lineage theyre in, > because I have several different lineages. I'm expecting the values to > rise for one cleaner but not the other as they aqquire resistance for > one but not the other. Which has happened, but I have wide variation > because one linage aqquired a very dramatic change which has made it > immune to 50%, whereas the others, have exhibited a much more gradual > increace, and so I have very weak p values for the cleaner variable, > because it is secondary to the challenge vector, which has the most > explanatory power, because without time and these challenges, the > selection would no happen. I was using two bacterium species, but one > was keen on giving hight erratic results, and insisted on becoming cross > contaminated, BUT if I include it's data, It shoves cleaner over the > p0.05 threshold, so i may just be having a problem with lack of data. So > I've been asking about bootstrapping, which I plan to do to my cases, > and thenfit a model to see what the confidence is like then. 
I assume if > I bootstrap then it will re-select whole cases, and not jumble > everything up, otherwise a microbe (totake the most extreme value as an > example) with 50% concentration tolerance at the beginning, would make > no sense at all. I'm also planning on doing models lineage by lineage, > rather than putting them into one whole, just to have a look at what > happens. > You can't really have a p-value without a specific hypothesis to test; if you have that, then all your other questions are probably easy to answer. Generally you want to sample from things that are "iid", or maybe you want to test the "identically distributed" part. Generally you want to have done a lit search ahead of time and have some idea of the likely evolution dynamics of your system given your design and things like your forcing functions etc. Most statisticians would not take a posteriori designs seriously as anything other than exploratory or hypothesis-generating, and indeed it can be hard to avoid rationalization and selection bias (problems that always and only affect people who disagree with me LOL); you are looking for predictive value. While it is not always worthwhile doing blind tests, it may be something worth considering (do you know which group gets what thing?) > But what I really wanted to know from this email, was if there's a > package or function for natrual selection simulation I could make use > of, to see if I can simulate the experiment. I want to start with a http://www.google.com/#sclient=psy&hl=en&q=%22R+package%22+natural+selection but as implied above, R has lots of analysis stuff and maybe you would find something more useful that is not linked to the keywords you suggest. You may find, for whatever reason, you could write a differential equation to express your results but that isn't often used with "natural selection." > distribution of concentration tolerance values, taken from the > inhibitory concentration values from my first lot of microbes, back when > term began.
Draw 3000 from this. Then values in that draw that fall > below the exposure concentration I did in my experiment, are removed, or > have a high chance of being removed. Then, from what is left, a draw is > made again - or perhaps a copy operation (rather than a random draw) > until I have 3000 again, rather than have all exactly the same > concentration, then a value can be added to some of them, that increaces > their concentration tolerance slightly, but not by a great deal, except > in a few individuals, where it may be increaced dramatically(some sort > of exponential dstribution perhaps). Then when the distribution of this > simulated population of microbes has reached the next concentration > (possibly the mean or mode of the distribution) (I have a series of 1 in > 2 dilutions, so 100% 50%, 25% and so on), then they move on to the next > concentration. > > I know it's probably quite a heavy thing, it was just a thought that > came to me, if anybody has any experience in this area of R or knows of > something that allows this to be done, please let me know. > > Thanks, > Ben. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From landronimirc at gmail.com Wed Jan 5 17:47:52 2011 From: landronimirc at gmail.com (Liviu Andronic) Date: Wed, 5 Jan 2011 17:47:52 +0100 Subject: [R] Cost-benefit/value for money analysis In-Reply-To: References: <7DFDB350-42F8-4968-BCD6-48E9CFF7EF3B@comcast.net> Message-ID: On Wed, Jan 5, 2011 at 12:29 PM, Graham Smith wrote: >> maximal choices would break the budget. This sounds like a homework problem >> and I don't see any student effort yet. Search terms include: "decision >> analysis" , "cost-benefit analysis", or "utility theory". 
>> > > Hopefully, ?my response to Ben will clarify my question, and why I am asking > it. ?At the moment (and that may change) I'm not specifically interested in > how you do it R, just as to whether there is a package aimed at this kind of > Cost Benefit analysis. > Try this: > require(sos) > findFn('cost benefit') found 12 matches Regards Liviu From benjamin.ward at bathspa.org Wed Jan 5 17:56:40 2011 From: benjamin.ward at bathspa.org (Ben Ward) Date: Wed, 5 Jan 2011 16:56:40 +0000 Subject: [R] Simulation - Natrual Selection In-Reply-To: References: Message-ID: On 05/01/2011 16:37, Mike Marchywka wrote: > > > > >> Date: Wed, 5 Jan 2011 15:48:46 +0000 >> From: benjamin.ward at bathspa.org >> To: r-help at r-project.org >> Subject: [R] Simulation - Natrual Selection >> >> Hi, >> >> I've been modelling some data over the past few days, of my work, >> repeatedly challenging microbes to a certain concentration of cleaner, >> until the required concentration to inhibit or kill them increaces, at >> which point they are challenged to a slightly higher concentration each >> day. I'm doing ths for two different cleaners and I'm collecting the >> required concentration to kill them as a percentage, the challenge >> number, the cleaner as a two level variable, and the lineage theyre in, >> because I have several different lineages. I'm expecting the values to >> rise for one cleaner but not the other as they aqquire resistance for >> one but not the other. Which has happened, but I have wide variation >> because one linage aqquired a very dramatic change which has made it >> immune to 50%, whereas the others, have exhibited a much more gradual >> increace, and so I have very weak p values for the cleaner variable, >> because it is secondary to the challenge vector, which has the most >> explanatory power, because without time and these challenges, the >> selection would no happen. 
I was using two bacterium species, but one >> was keen on giving hight erratic results, and insisted on becoming cross >> contaminated, BUT if I include it's data, It shoves cleaner over the >> p0.05 threshold, so i may just be having a problem with lack of data. So >> I've been asking about bootstrapping, which I plan to do to my cases, >> and thenfit a model to see what the confidence is like then. I assume if >> I bootstrap then it will re-select whole cases, and not jumble >> everything up, otherwise a microbe (totake the most extreme value as an >> example) with 50% concentration tolerance at the beginning, would make >> no sense at all. I'm also planning on doing models lineage by lineage, >> rather than putting them into one whole, just to have a look at what >> happens. >> > You can't really have a p-value without a specific hypothesis to test, > if you have that then all your other questions are probably easy to answer. > Generally you want to sample from things that are "iid" or maybe you > want to test the "identical" i. My Hypothesis is that Cleaner A (I don't really want to go into names or brands), will exhbit a rise in concentration tolerance values, or rather, the microbial culture I keep exposed to it, will, reflecting aqquisition of antimicrobial resistance. And this has largely happened. And that in cleaner B, this will not happen, or if it does, it will not be as dramatic and take longer. So I expecting in my model, the cleaner variable to have a p below 0.05, and quite hight explanatory power, and a satisfying coefficient. The notion behind the hypothesis being that one might have a more difficult complex chemical structure, requiring more mutations to develop some resistance. I can't really do anything with genes or chemical structure at my current institution and at my level because of no equippment for that sort of thing, and that they felt it would be too far for a 3rd year project. 
So I'm using the concentration required to kill them - or stop them from growing, as a indication. > Generally you want to have done a lit search ahead of time and > had some idea of likely evolution dynamics of your system given > your design and things like your forcing functions etc. > Most statisticians would not take seriously a posteriori designs and > indeed it can be hard to avoid rationalization and selection bias ( problems > that always and only effect people who disagree with me LOL) as being > anything other than exploratory or hypothesis generating- you are looking > for predictive value. While it is not always worthwhile doing blind tests, > it may be something worth considering ( do you know which group gets what thing?) > > >> But what I really wanted to know from this email, was if there's a >> package or function for natrual selection simulation I could make use >> of, to see if I can simulate the experiment. I want to start with a > > http://www.google.com/#sclient=psy&hl=en&q=%22R+package%22+natural+selection > > but as implied above, R has lots of analysis stuff and maybe you > would find something more useful that is not linked to the keywords > you suggest. You may find, for whatever reason, you could write a differential > equation to express your results but that isn't often used with "natural selection." > > >> distribution of concentration tolerance values, taken from th > e >> inhibitory concentration values from my first lot of microbes, back when >> term began. Draw 3000 from this. Then values in that draw that fall >> below the exposure concentration I did in my experiment, are removed, or >> have a high chance of being removed. 
Then, from what is left, a draw is >> made again - or perhaps a copy operation (rather than a random draw) >> until I have 3000 again, rather than have all exactly the same >> concentration, then a value can be added to some of them, that increaces >> their concentration tolerance slightly, but not by a great deal, except >> in a few individuals, where it may be increaced dramatically(some sort >> of exponential dstribution perhaps). Then when the distribution of this >> simulated population of microbes has reached the next concentration >> (possibly the mean or mode of the distribution) (I have a series of 1 in >> 2 dilutions, so 100% 50%, 25% and so on), then they move on to the next >> concentration. >> >> I know it's probably quite a heavy thing, it was just a thought that >> came to me, if anybody has any experience in this area of R or knows of >> something that allows this to be done, please let me know. >> >> Thanks, >> Ben. >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > From sebastian.daza at gmail.com Wed Jan 5 18:04:23 2011 From: sebastian.daza at gmail.com (=?ISO-8859-1?Q?Sebasti=E1n_Daza?=) Date: Wed, 05 Jan 2011 11:04:23 -0600 Subject: [R] integration Sweave and TexMakerX In-Reply-To: References: <1290201233425-3051003.post@n4.nabble.com> Message-ID: <4D24A497.5060303@gmail.com> Hi, Does anyone know how to integrate texmakerx and sweave on Windows? I mean, to run .rnw files directly from texmakerx and get a pdf or dvi file. 
Thank you in advance,

--
Sebastián Daza
sebastian.daza at gmail.com

From Greg.Snow at imail.org Wed Jan 5 18:25:38 2011
From: Greg.Snow at imail.org (Greg Snow)
Date: Wed, 5 Jan 2011 10:25:38 -0700
Subject: [R] get() within a command, specifically lmer
In-Reply-To:
References:
Message-ID:

Formula syntax is different from regular syntax, it is "quoted" and not
evaluated in the same way as regular commands (otherwise operations like
'+' and '-' would do very different things).

For what you are trying to do, I would suggest creating the formula as a
string using paste or sprintf, then use as.formula on that string. You can
also use the substitute function, but that tends to be a bit more
complicated.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Patrick McKann
> Sent: Wednesday, January 05, 2011 8:26 AM
> To: r-help at r-project.org
> Subject: [R] get() within a command, specifically lmer
>
> Hello all. Why doesn't this work?
>
> d=data.frame(y=rpois(10,1),x=rnorm(10),z=rnorm(10),grp=rep(c('a','b'),each=5))
> library(lme4)
> model=lmer(y~x+z+(1|grp),family=poisson,data=d)
> update(model,~.-z)###works, removes z
> var='z'
> update(model,~.-get(var))##doesn't remove z
> update(model,~. -get(var,pos=d))###doesn't remove z
>
> I am trying to remove z from the model in the update, but I can't do it
> using get(), which is what I would like to do for a more complicated
> program. There's something about environments and get() that I don't
> understand.
>
> Any suggestions?
>
> Thanks.
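
[Editorial sketch] The paste/as.formula suggestion in the reply above could look roughly like the following. This is only a minimal illustration under stated assumptions: lm() is swapped in for lmer() so that the example runs with base R alone, the random-effect term is dropped for the same reason, and the names drop_term and model2 are made up for the example. The same update() call pattern should carry over to a fitted lmer model, but that is not checked here.

```r
# Illustrative data; lm() stands in for lmer() so only base R is needed
# (assumption: the formula-building idea is the same for both).
set.seed(1)
d <- data.frame(y = rpois(10, 1), x = rnorm(10), z = rnorm(10))
model <- lm(y ~ x + z, data = d)

# Name of the term to remove, held in a character variable
var <- "z"

# Build the update formula as text, then convert it to a formula object
drop_term <- as.formula(paste("~ . -", var))

# update() now sees "~ . - z" rather than an unevaluated get(var)
model2 <- update(model, drop_term)
print(formula(model2))
```

The point of the design is that the variable name is substituted into the formula text before R ever parses it as a formula, so nothing inside the formula needs to be evaluated later.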
> > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. From wwwhsd at gmail.com Wed Jan 5 18:25:39 2011 From: wwwhsd at gmail.com (Henrique Dallazuanna) Date: Wed, 5 Jan 2011 15:25:39 -0200 Subject: [R] Stop and call objects In-Reply-To: <4D247EEC.4030004@cognigencorp.com> References: <4D247EEC.4030004@cognigencorp.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From benjamin.ward at bathspa.org Wed Jan 5 18:27:23 2011 From: benjamin.ward at bathspa.org (Ben Ward) Date: Wed, 5 Jan 2011 17:27:23 +0000 Subject: [R] Fwd: Re: Simulation - Natrual Selection Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From moshersteven at gmail.com Wed Jan 5 18:40:04 2011 From: moshersteven at gmail.com (steven mosher) Date: Wed, 5 Jan 2011 09:40:04 -0800 Subject: [R] Navigating web pages using R In-Reply-To: <512422.96077.qm@web37407.mail.mud.yahoo.com> References: <512422.96077.qm@web37407.mail.mud.yahoo.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From andy_liaw at merck.com Wed Jan 5 19:01:13 2011 From: andy_liaw at merck.com (Liaw, Andy) Date: Wed, 5 Jan 2011 13:01:13 -0500 Subject: [R] randomForest speed improvements In-Reply-To: References: <1294084769056-3172523.post@n4.nabble.com><1294097299183-3172834.post@n4.nabble.com><1294183823780-3174621.post@n4.nabble.com> Message-ID: From: Liaw, Andy > > Note that that isn't exactly what I recommended. 
If you look at the > example in the help page for combine(), you'll see that it is > combining > RF objects trained on the same data; i.e., instead of having > one RF with > 500 trees, you can combine five RFs trained on the same data with 100 > trees each into one 500-tree RF. > > The way you are using combine() is basically using sample > size to limit > tree size, which you can do by playing with the nodesize argument in > randomForest() as I suggested previously. Either way is fine > as long as > you don't see prediction performance degrading. I should also mention that another way you can do something similar is by making use of the sampsize argument in randomForest(). For example, if you call randomForest() with sampsize=500, it will randomly draw 500 data points to grow each tree. This way you don't even need to run the RFs separately and combine them. Andy > Andy > > > -----Original Message----- > > From: r-help-bounces at r-project.org > > [mailto:r-help-bounces at r-project.org] On Behalf Of apresley > > Sent: Tuesday, January 04, 2011 6:30 PM > > To: r-help at r-project.org > > Subject: Re: [R] randomForest speed improvements > > > > > > Andy, > > > > Thanks for the reply. I had no idea I could combine them > > back ... that > > actually will work pretty well. We can have several "worker > > threads" load > > up the RF's on different machines and/or cores, and then > > re-assemble them. > > RMPI might be an option down the road, but would be a bit of > > overhead for us > > now. > > > > Using the method of combine() ... I was able to drastically > reduce the > > amount of time to build randomForest objects. IE, using > > about 25,000 rows > > (6 columns), it takes maybe 5 minutes on my laptop. Using 5 > > randomForest > > objects (each with 5k rows), and then combining them, takes < > > 1 minute. 
> > > > -- > > Anthony > > -- > > View this message in context: > > http://r.789695.n4.nabble.com/randomForest-speed-improvements- > > tp3172523p3174621.html > > Sent from the R help mailing list archive at Nabble.com. > > > > ______________________________________________ > > R-help at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > Notice: This e-mail message, together with any > attachme...{{dropped:11}} > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > Notice: This e-mail message, together with any attachme...{{dropped:11}} From felvds at yahoo.com.br Wed Jan 5 18:46:35 2011 From: felvds at yahoo.com.br (felipe araujo) Date: Wed, 5 Jan 2011 09:46:35 -0800 (PST) Subject: [R] Forecasting with STL Message-ID: <746990.28365.qm@web38707.mail.mud.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From frodo.jedi at yahoo.com Wed Jan 5 12:55:00 2011 From: frodo.jedi at yahoo.com (Frodo Jedi) Date: Wed, 5 Jan 2011 03:55:00 -0800 (PST) Subject: [R] t-test or ANOVA...who wins? Help please! In-Reply-To: References: <183012.52698.qm@web57908.mail.re3.yahoo.com> Message-ID: <635745.44416.qm@web57905.mail.re3.yahoo.com> Dear Tal Galili, thanks a lot for your answer! I agree with you, the t-test is comparing 2 conditions at one level of stimulus, while the ANOVA table is testing the significance of the interaction between condition and stimuls....the two tests are testing two different things. But still I don?t understand which is the right way to perform the analysis in order to solve my problem. 
Let?s consider now only the table I posted before. The same stimuli in the table have been presented to subjects in two conditions: A and AH, where AH is the condition A plus something elese (let?s call it "H"). I want to know if AT GLOBAL LEVEL adding "H" bring to better results in the participants evaluations of the stimuli rather than the stimulus presented only with condition "A". Data in column "response" are evaluation on realism of the stimulus from a 7 point scale. If I calculate the mean for each stmulus in each condition, the results show that for each stimulus the AH condition is always greater than the first. Anyway, doing a t-test to compare the stimuli by couple (es. flat_550_W_realism in condition A, flat_550_W_realism in condition AH) I get that only sometimes the differences are statistically significant. I ask you if there is a way to say that condition AH is better than condition A, at global level. In attachment you find the table in .txt and also in .csv format. Is it possible for you to make an example in R, including also the R results in order to tell me what to see in the console to see if my problem is solved or not? For example, I was checking in the anova results the stimulus:conditon line.....but I don?t know if my anova analysis was correct or not. I am not an expert of R, nor of statistics ;-( Anyway I am doing my best to study and understand. Please enlighten me. Thanks in advance Best regards ________________________________ From: Tal Galili To: Frodo Jedi Cc: r-help at r-project.org Sent: Wed, January 5, 2011 10:15:41 AM Subject: Re: [R] t-test or ANOVA...who wins? Help please! Hello Frodo, It is not clear to me from your questions some of the basics of your analysis. If you only have two levels of a factor, and one response - why in the anova do you use more factors (and their interactions)? In that sense, it is obvious that your results would differ from the t-test. 
In either case, I am not sure if any of these methods are valid, since your
data doesn't seem to be normal.

Here is some example code showing how to get the same results from aov and
t.test, and also a nonparametric option (that might be more fitting):

flat_550_W_realism    <- c(3,3,5,3,3,3,3,5,3,3,5,7,5,2,3)
flat_550_W_realism_AH <- c(7,4,5,3,6,5,3,5,5,7,2,7,5,5)
x <- c(rep(1, length(flat_550_W_realism)),
       rep(2, length(flat_550_W_realism_AH)))
y <- c(flat_550_W_realism, flat_550_W_realism_AH)

# equal results between t test and anova
t.test(y ~ x, var.equal = TRUE)
summary(aov(y ~ x))

# plotting the data:
boxplot(y ~ x)  # group 1 is not at all symmetrical...

wilcox.test(y ~ x)  # a more fitting test

----------------Contact Details:-------------------------------------------------------
Contact me: Tal.Galili at gmail.com | 972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)
----------------------------------------------------------------------------------------------

On Wed, Jan 5, 2011 at 12:37 AM, Frodo Jedi wrote:

> I kindly ask you for help because I really don't know how to solve this problem.
>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: table_realism_wood.txt
URL:

From frodo.jedi at yahoo.com Wed Jan 5 12:28:36 2011
From: frodo.jedi at yahoo.com (Frodo Jedi)
Date: Wed, 5 Jan 2011 03:28:36 -0800 (PST)
Subject: [R] Assumptions for ANOVA: the right way to check the normality
Message-ID: <9521.53053.qm@web57907.mail.re3.yahoo.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From frodo.jedi at yahoo.com Wed Jan 5 18:10:11 2011
From: frodo.jedi at yahoo.com (Frodo Jedi)
Date: Wed, 5 Jan 2011 09:10:11 -0800 (PST)
Subject: [R] Comparing fitting models
Message-ID: <459095.84075.qm@web57907.mail.re3.yahoo.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From iurie.malai at gmail.com Wed Jan 5 13:39:08 2011
From: iurie.malai at gmail.com (Iurie Malai)
Date: Wed, 5 Jan 2011 04:39:08 -0800 (PST)
Subject: [R] R Commander - how to disable the alphabetical sorting of variable names?
Message-ID: <1294231148896-3175426.post@n4.nabble.com>

I try to disable alphabetical sorting of the variable names but I fail, R
Commander does not store any changes made in the "Commander Options" menu /
window. I tried to insert "options(sort.names = FALSE)" in Rprofile.site and
.Rprofile config files but without success. Does anyone know the solution?

--
View this message in context: http://r.789695.n4.nabble.com/R-Commander-how-to-disable-the-alphabetical-sorting-of-variable-names-tp3175426p3175426.html
Sent from the R help mailing list archive at Nabble.com.

From kevinummel at gmail.com Wed Jan 5 13:03:34 2011
From: kevinummel at gmail.com (Kevin Ummel)
Date: Wed, 5 Jan 2011 12:03:34 +0000
Subject: [R] How to 'explode' a matrix
Message-ID: <0EA333E4-1739-4516-BDDA-9BFDF2F74916@gmail.com>

Hi everyone,

I'm looking for a way to 'explode' a matrix like this:

> matrix(1:4,2,2)
     [,1] [,2]
[1,]    1    3
[2,]    2    4

into a matrix like this:

> matrix(c(1,1,2,2,1,1,2,2,3,3,4,4,3,3,4,4),4,4)
     [,1] [,2] [,3] [,4]
[1,]    1    1    3    3
[2,]    1    1    3    3
[3,]    2    2    4    4
[4,]    2    2    4    4

My current kludge is this:

v1=rep(1:4,each=2,times=2)
v2=v1[order(rep(1:2,each=4,times=2))]
matrix(v2,4,4)

But I'm hoping there's a more efficient solution that I'm not aware of.

Many thanks,
Kevin

From nostef at gmail.com Wed Jan 5 17:10:40 2011
From: nostef at gmail.com (Marcelo Barbudas)
Date: Wed, 05 Jan 2011 18:10:40 +0200
Subject: [R] real time R
Message-ID: <4D249800.7070705@gmail.com>

Hi,

We're using R in an application where asking for a probability of an
event takes about 130ms.

What could we do to take that down to 30ms-40ms? The query code uses
randomforest, knn.

--
M.
From P.Wilson at sheffield.ac.uk Wed Jan 5 12:36:06 2011
From: P.Wilson at sheffield.ac.uk (P Wilson)
Date: Wed, 5 Jan 2011 11:36:06 +0000
Subject: [R] loop variable names as function arguments
Message-ID: <1294227366.4d2457a665f2e@webmail.shef.ac.uk>

Dear all,

is there a way to loop the rp.doublebutton function in the rpanel package?
The difficulty I'm having lies with the variable name argument.

library(rpanel)
if (interactive()) {
  draw <- function(panel) {
    plot(unlist(panel$V),ylim=0:1)
    panel
  }
  panel <- rp.control(V=as.list(rep(.5,3)))
  rp.doublebutton(panel, var = V[[1]], step = 0.05, action = draw, range = c(0, 1))
  rp.doublebutton(panel, var = V[[2]], step = 0.05, action = draw, range = c(0, 1))
  rp.doublebutton(panel, var = V[[3]], step = 0.05, action = draw, range = c(0, 1))
}

Regards,
Philip

From ptit_bleu at yahoo.fr Wed Jan 5 14:31:34 2011
From: ptit_bleu at yahoo.fr (PtitBleu)
Date: Wed, 5 Jan 2011 05:31:34 -0800 (PST)
Subject: [R] looking for the RMySQL package for R 2.12.0 under XP
In-Reply-To: <4CED8F71.7090603@auckland.ac.nz>
References: <1290613721424-3057537.post@n4.nabble.com> <4CED8F71.7090603@auckland.ac.nz>
Message-ID: <1294234294947-3175513.post@n4.nabble.com>

Hello David,

As I had no time to try to compile the RMySQL package, I finally followed
your advice and moved to RODBC. The decision to modify my scripts was taken
after I discovered the function odbcDriverConnect, which allows connecting
directly to a database (like RMySQL) without declaring the database through
the ODBC window.

With the following command,

ch<-odbcDriverConnect(connection="SERVER=localhost;DRIVER=MySQL ODBC 5.1 Driver;DATABASE=my_db;UID=my_user;PWD=my_pwd;case=tolower")

the connection runs nicely.

Thanks again for the link.
Happy New Year to all the R-users,
Ptit Bleu.

--
View this message in context: http://r.789695.n4.nabble.com/looking-for-the-RMySQL-package-for-R-2-12-0-under-XP-tp3057537p3175513.html
Sent from the R help mailing list archive at Nabble.com.
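
[Editorial sketch] The DSN-less connection described in the message above can be wrapped so that the connection string is assembled from named parts instead of being typed as one long literal. This is a hedged illustration, not the poster's code: make_conn_string is a hypothetical helper invented here, and the server, driver, database and credential values are simply the placeholders from the message. The RODBC calls (odbcDriverConnect, sqlTables, odbcClose) are only attempted in an interactive session with RODBC installed, since they need a real MySQL ODBC driver and database.

```r
# Hypothetical helper: join named key/value pairs into an ODBC
# connection string of the form "KEY1=value1;KEY2=value2;..."
make_conn_string <- function(server, driver, database, uid, pwd, ...) {
  parts <- c(SERVER = server, DRIVER = driver, DATABASE = database,
             UID = uid, PWD = pwd, ...)
  paste(names(parts), parts, sep = "=", collapse = ";")
}

# Placeholder values taken from the message; replace with real ones
conn <- make_conn_string("localhost", "MySQL ODBC 5.1 Driver",
                         "my_db", "my_user", "my_pwd", case = "tolower")
print(conn)

# Only try to connect when RODBC is available and a user is present
if (interactive() && requireNamespace("RODBC", quietly = TRUE)) {
  ch <- RODBC::odbcDriverConnect(connection = conn)
  if (inherits(ch, "RODBC")) {        # a failed connection is not an RODBC object
    print(RODBC::sqlTables(ch))       # list the tables visible to this user
    RODBC::odbcClose(ch)
  }
}
```

Keeping the string construction separate from the connection attempt makes it easy to check the string by eye (or in a test) before any driver is involved.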
From Roger.Bivand at nhh.no Wed Jan 5 12:30:02 2011 From: Roger.Bivand at nhh.no (Roger Bivand) Date: Wed, 5 Jan 2011 03:30:02 -0800 (PST) Subject: [R] update.views("Spatial") does not seem to be able to find RPyGeo package In-Reply-To: References: Message-ID: <1294227002977-3175299.post@n4.nabble.com> The package is stated only to run under Windows (see the SystemRequirements field on its CRAN page), and you are on Linux - does this explain anything? Maybe ask the package maintainer? Roger Linder, Eric wrote: > > I have this problem with loading RPyGeo package when using update.views. > How can I fix this. I have tried to use other CRAN mirrors with the same > result. > Below is a copy of my session. > ---------------------session----------------------- > R version 2.12.1 (2010-12-16) > Copyright (C) 2010 The R Foundation for Statistical Computing > ISBN 3-900051-07-0 > Platform: i486-pc-linux-gnu (32-bit) > > R is free software and comes with ABSOLUTELY NO WARRANTY. > You are welcome to redistribute it under certain conditions. > Type 'license()' or 'licence()' for distribution details. > > R is a collaborative project with many contributors. > Type 'contributors()' for more information and > 'citation()' on how to cite R or R packages in publications. > > Type 'demo()' for some demos, 'help()' for on-line help, or > 'help.start()' for an HTML browser interface to help. > Type 'q()' to quit R. > > [Previously saved workspace restored] > >> library(ctv) >> update.views('Spatial') > --- Please select a CRAN mirror for use in this session --- > Loading Tcl/Tk interface ... 
done > Warning message: > In update.views("Spatial") : > The following packages are not available: RPyGeo >> > ---------------------session----------------------- > > > > > > > > The information contained in this communication may be C...{{dropped:11}} > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > ----- Roger Bivand Economic Geography Section Department of Economics Norwegian School of Economics and Business Administration Helleveien 30 N-5045 Bergen, Norway -- View this message in context: http://r.789695.n4.nabble.com/update-views-Spatial-does-not-seem-to-be-able-to-find-RPyGeo-package-tp3174870p3175299.html Sent from the R help mailing list archive at Nabble.com. From myotistwo at gmail.com Wed Jan 5 19:12:02 2011 From: myotistwo at gmail.com (Graham Smith) Date: Wed, 5 Jan 2011 18:12:02 +0000 Subject: [R] Cost-benefit/value for money analysis In-Reply-To: References: <7DFDB350-42F8-4968-BCD6-48E9CFF7EF3B@comcast.net> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From singhi at cs.ucr.edu Wed Jan 5 16:04:57 2011 From: singhi at cs.ucr.edu (Indrajeet Singh) Date: Wed, 5 Jan 2011 07:04:57 -0800 Subject: [R] unique limited to 536870912 In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From surfprjab at hotmail.com Wed Jan 5 16:50:13 2011 From: surfprjab at hotmail.com (jose Bartolomei) Date: Wed, 5 Jan 2011 15:50:13 +0000 Subject: [R] vector of character with unequal width Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available
URL:

From young.stat at gmail.com Wed Jan 5 16:03:29 2011
From: young.stat at gmail.com (Young Cho)
Date: Wed, 5 Jan 2011 09:03:29 -0600
Subject: [R] speed up in R apply
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From wdunlap at tibco.com Wed Jan 5 19:14:50 2011
From: wdunlap at tibco.com (William Dunlap)
Date: Wed, 5 Jan 2011 10:14:50 -0800
Subject: [R] Stop and call objects
In-Reply-To:
References: <4D247EEC.4030004@cognigencorp.com>
Message-ID: <77EB52C6DD32BA4D87471DCD70C8D70003C2E1D9@NA-PA-VBE03.na.tibco.com>

> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Henrique Dallazuanna
> Sent: Wednesday, January 05, 2011 9:26 AM
> To: Sebastien Bihorel
> Cc: R-help
> Subject: Re: [R] Stop and call objects
>
> Try this:
>
> f <- function(x)
>     tryCatch(sum(x),error=function(e)sprintf("Error in %s: %s", deparse(sys.call(1)), e$message))
> f('a')

The argument e to the error handler contains a call component
so you don't have to rely on the unreliable sys.call(1) to
get the offending call. E.g.,

   > f2 <- function(x) {
         tryCatch(sum(x),
                  error=function(e) {
                      sprintf("Error in %s: %s", deparse(e$call)[1], e$message)
                  }
         )
     }
   > f2('char')
   [1] "Error in sum(x): invalid 'type' (character) of argument"

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> On Wed, Jan 5, 2011 at 12:23 PM, Sebastien Bihorel <
> Sebastien.Bihorel at cognigencorp.com> wrote:
>
> > Dear R-users,
> >
> > Let's consider the following snippet:
> >
> > f <- function(x) tryCatch(sum(x),error=function(e) stop(e))
> > f('a')
> >
> > As expected, the last call returns an error message: Error in sum(x) :
> > invalid 'type' (character) of argument
> >
> > My questions are the following:
> > 1- can I easily ask the stop function to reference the "f" function in
> > addition to "sum(x)" in the error message?
> > 2- If not, I guess I would have to extract the call and > message objects > > from e, coerce the call as a character object, build a > custom string, and > > pass it to the stop function using call.=F. How can I > coerce a call object > > to a character and maintain the "aspect" of the printed > call (i.e. "sum(x)" > > instead of the character vector "sum" "x" returned by > as.character(e$call))? > > > > Thank you > > > > Sebastien > > > > ______________________________________________ > > R-help at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > > > > -- > Henrique Dallazuanna > Curitiba-Paran?-Brasil > 25? 25' 40" S 49? 16' 22" O > > [[alternative HTML version deleted]] > > From ligges at statistik.tu-dortmund.de Wed Jan 5 19:30:05 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Wed, 05 Jan 2011 19:30:05 +0100 Subject: [R] real time R In-Reply-To: <4D249800.7070705@gmail.com> References: <4D249800.7070705@gmail.com> Message-ID: <4D24B8AD.20701@statistik.tu-dortmund.de> On 05.01.2011 17:10, Marcelo Barbudas wrote: > Hi, > > We're using R in an application where asking for a probability of an > event takes about 130ms. > > What could we do to take that down to 30ms-40ms? The query code uses > randomforest, knn. > Use a machine that is 4 times faster? Otherwise: Use another method or a more efficient implementation. Don't use R at all if you want _guaranteed_ real time processing. Uwe Ligges From wwwhsd at gmail.com Wed Jan 5 19:30:30 2011 From: wwwhsd at gmail.com (Henrique Dallazuanna) Date: Wed, 5 Jan 2011 16:30:30 -0200 Subject: [R] How to 'explode' a matrix In-Reply-To: <0EA333E4-1739-4516-BDDA-9BFDF2F74916@gmail.com> References: <0EA333E4-1739-4516-BDDA-9BFDF2F74916@gmail.com> Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available
URL:

From bbolker at gmail.com Wed Jan 5 19:30:36 2011
From: bbolker at gmail.com (Ben Bolker)
Date: Wed, 5 Jan 2011 18:30:36 +0000 (UTC)
Subject: [R] How to 'explode' a matrix
References: <0EA333E4-1739-4516-BDDA-9BFDF2F74916@gmail.com>
Message-ID:

Kevin Ummel <kevinummel at gmail.com> writes:

> I'm looking for a way to 'explode' a matrix like this:
>
> > matrix(1:4,2,2)
>      [,1] [,2]
> [1,]    1    3
> [2,]    2    4
>

This is the Kronecker product of your matrix with the matrix (1 1 ; 1 1)

m <- matrix(1:4,2,2)
kronecker(m,matrix(1,2,2))

cheers
Ben Bolker

From ligges at statistik.tu-dortmund.de Wed Jan 5 19:30:58 2011
From: ligges at statistik.tu-dortmund.de (Uwe Ligges)
Date: Wed, 05 Jan 2011 19:30:58 +0100
Subject: [R] multipanel plots
In-Reply-To:
References:
Message-ID: <4D24B8E2.4020208@statistik.tu-dortmund.de>

On 05.01.2011 06:16, smriti Sebastian wrote:
> hi,
> i have attached a doc file.

Maybe, but it cannot make it through the list.

> Is this graph can be plotted using R?Plz help

We do not know. Make it available on some webserver and refer to it with
an URL.

Uwe Ligges

>
> regards,
> smriti
>
>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

From b.rowlingson at lancaster.ac.uk Wed Jan 5 19:32:45 2011
From: b.rowlingson at lancaster.ac.uk (Barry Rowlingson)
Date: Wed, 5 Jan 2011 18:32:45 +0000
Subject: [R] real time R
In-Reply-To: <4D249800.7070705@gmail.com>
References: <4D249800.7070705@gmail.com>
Message-ID:

On Wed, Jan 5, 2011 at 4:10 PM, Marcelo Barbudas wrote:
> Hi,
>
> We're using R in an application where asking for a probability of an
> event takes about 130ms.
>
> What could we do to take that down to 30ms-40ms? The query code uses
> randomforest, knn.
>

That's a fairly vague question....
So some vague answers: Firstly, profile your query to identify bottlenecks and then concentrate your effort on removing them. Anything else is a waste of time. Secondly, get a faster computer - whether that means faster CPU, faster hard disks, faster RAM depends on where the bottleneck is in your process. Or get parallel and use multiple CPUs. Or rewrite in C. Or machine code. Or do it on a GPU. Thirdly, give us something more specific! Like examples perhaps? Barry From jorgeivanvelez at gmail.com Wed Jan 5 19:35:46 2011 From: jorgeivanvelez at gmail.com (Jorge Ivan Velez) Date: Wed, 5 Jan 2011 13:35:46 -0500 Subject: [R] How to 'explode' a matrix In-Reply-To: <0EA333E4-1739-4516-BDDA-9BFDF2F74916@gmail.com> References: <0EA333E4-1739-4516-BDDA-9BFDF2F74916@gmail.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From benjamin.ward at bathspa.org Wed Jan 5 19:39:40 2011 From: benjamin.ward at bathspa.org (Ben Ward) Date: Wed, 5 Jan 2011 18:39:40 +0000 Subject: [R] Simulation - Natrual Selection In-Reply-To: References: Message-ID: On 05/01/2011 17:40, Bert Gunter wrote: >> My hypothesis was specified before I did my experiment. Whilst far from >> perfect, I've tried to do the best I can to assess rise in resistance, >> without going into genetics as it's not possible. (Although may be at the >> next institution I've applied for MSc). >> >> With my hypothesis (I mentioned it below), I was of the frame of mind that a >> nonsignificant p-value on the cleaner variable (for now - experiment is far >> from over), indicated a lack of evidence for rejecting the null. And so at >> the minute, it looks like the type of cleaner makes no difference. > I have no fundamental objection, but be careful. I would simply > qualify your last sentence by saying that it means that the > experimental noise is to great to precisely determine the size of the > cleaner effect. 
> Scientific reality tells us that it is never exactly
> 0; what your results show is that your uncertainty about the value of
> the difference encompasses both positive and negative values. This
> does NOT mean that the difference might not be scientifically large
> enough to be of interest -- a confidence interval for the difference
> (MUCH better than a P value) would help you determine that. If the
> interval is narrow enough that the difference, positive or negative,
> is too small to be of scientific interest, then you're done. If the
> interval is large, then it tells you that you need more data, a
> better experiment (less noisy) etc.
>
> -- Bert
>

At the moment I wouldn't call the confidence interval small, it's definitely wide, and at the minute the confidence interval covers zero. My R-squared at the minute is also 0.5, this is mostly due to the few extreme cases of adaptation as I mentioned before, but I'm hesitant to remove it as papers in my literature study which also evolve bacteria show that there is often (sometimes wide) variation in the paths populations take. So whilst mathematically a bit undesirable, and makes me and the model uncertain, it does fall into place with what is known, or has been previously shown of the reality of selection. Again if I include the data from the bacteria dropped from the study, all that "improves", and uncertainty is reduced.

It may also be worth me mentioning, I am also taking a more traditional approach (by that I mean a more "Statistics 101" approach, indeed that is all the stats tuition covered in my course as a taught element), in case what I've described above did not work or was not ideal, because we (me and my supervisor) did foresee a model report may contain a lot of uncertainty. Indeed we did expect some populations to adapt and some to not etc.
So I've also been collecting data on the width of the zones of inhibition shown by putting disks of cleaner on plates of growth, and measuring the dead zone that results. I can get lots of data from this with only a few plates, and doing this at the start of the study, a few times in the middle, and at the end. Will allow me to do more traditional analysis, for example t.test on the dead zone widths at the end of the study, between cleaner a and b. Or a non parametric equivalent, maybe even a permutation test. The modelling stuff is already beyond what my supervisor expects of me, but I felt it would add value and a lot more insight to the study, allowing more variables to be accounted for, than a more short-sighted traditional "test". From wwwhsd at gmail.com Wed Jan 5 19:41:39 2011 From: wwwhsd at gmail.com (Henrique Dallazuanna) Date: Wed, 5 Jan 2011 16:41:39 -0200 Subject: [R] vector of character with unequal width In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From savicky at cs.cas.cz Wed Jan 5 19:50:28 2011 From: savicky at cs.cas.cz (Petr Savicky) Date: Wed, 5 Jan 2011 19:50:28 +0100 Subject: [R] vector of character with unequal width In-Reply-To: References: Message-ID: <20110105185028.GA16539@cs.cas.cz> On Wed, Jan 05, 2011 at 03:50:13PM +0000, jose Bartolomei wrote: [...] > > I was thinking to create a character vector of 0's 9-nchar(xx). > Then paste it to xx. > 9-nchar(xx) > [1] 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 > [38] 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 6 6 6 6 6 5 5 5 5 5 5 5 5 > [75] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 ......1 > > > > > Nevertheless, I have not been able to create this vector nor I do not know if this is the best option. Did you consider something like the following? 
xx <- c("abc", "abcd", "abcde")
z1 <- rep("000000000", times=length(xx))
z2 <- substr(z1, 1, 9 - nchar(xx))
yy <- paste(z2, xx, sep="")
cbind(yy)
#     yy
#[1,] "000000abc"
#[2,] "00000abcd"
#[3,] "0000abcde"

Petr Savicky.

From anjan.purkayastha at gmail.com Wed Jan 5 20:00:51 2011
From: anjan.purkayastha at gmail.com (ANJAN PURKAYASTHA)
Date: Wed, 5 Jan 2011 14:00:51 -0500
Subject: [R] Plotting colour-coded points
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From jorgeivanvelez at gmail.com Wed Jan 5 20:13:37 2011
From: jorgeivanvelez at gmail.com (Jorge Ivan Velez)
Date: Wed, 5 Jan 2011 14:13:37 -0500
Subject: [R] Plotting colour-coded points
In-Reply-To:
References:
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From dwinsemius at comcast.net Wed Jan 5 20:22:11 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Wed, 5 Jan 2011 14:22:11 -0500
Subject: [R] speed up in R apply
In-Reply-To:
References:
Message-ID: <7B9A8B27-6E58-409F-BC0F-50BFDD1A60EC@comcast.net>

On Jan 5, 2011, at 10:03 AM, Young Cho wrote:

> Hi,
>
> I am doing some simulations and found a bottle neck in my R script. I made
> an example:
>
>> a = matrix(rnorm(5000000),1000000,5)
>> tt = Sys.time(); sum(a[,1]*a[,2]*a[,3]*a[,4]*a[,5]); Sys.time() - tt
> [1] -1291.026
> Time difference of 0.2354031 secs
>>
>> tt = Sys.time(); sum(apply(a,1,prod)); Sys.time() - tt
> [1] -1291.026
> Time difference of 20.23150 secs
>
> Is there a faster way of calculating sum of products (of columns, or of
> rows)?

You should look at crossprod and tcrossprod.

> And is this an expected behavior?

Yes. For loops and *apply strategies are slower than the proper use of vectorized functions.
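
The remark about vectorized functions can be illustrated with a small sketch. This is not code from the thread: the matrix size is deliberately small, and row_prod is a hypothetical helper name chosen only for this example. It computes the row-wise product by multiplying whole columns, so the R-level loop runs over the few columns rather than over every row.

```r
# Illustrative sketch; row_prod is a hypothetical helper, not a base R function.
set.seed(1)
a <- matrix(rnorm(2000 * 5), nrow = 2000, ncol = 5)

# Row-wise product via apply(): one R-level call to prod() per row.
p_apply <- apply(a, 1, prod)

# Same quantity by multiplying whole columns: the loop runs over the
# (few) columns, and each multiplication is vectorized over all rows.
row_prod <- function(m) {
  out <- rep(1, nrow(m))
  for (j in seq_len(ncol(m))) {
    out <- out * m[, j]
  }
  out
}
p_vec <- row_prod(a)

# Compare the two results up to floating-point tolerance.
all.equal(p_apply, p_vec)
```

Wrapping either version in system.time() on a larger matrix should show the gap in run time; the exact figures will depend on the machine.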
> > Thanks for your advice in advance, > -- David Winsemius, MD West Hartford, CT From f.harrell at vanderbilt.edu Wed Jan 5 20:22:32 2011 From: f.harrell at vanderbilt.edu (Frank Harrell) Date: Wed, 5 Jan 2011 11:22:32 -0800 (PST) Subject: [R] OT: Reprinting of Bertin's Semiology of Graphics In-Reply-To: <4D249716.9080406@yorku.ca> References: <4D249716.9080406@yorku.ca> Message-ID: <1294255352098-3176233.post@n4.nabble.com> This is a major publishing event for statistical graphics. I have long possessed Bertin's shorter book Graphics and Graphic Information Processing but Semiology is the one I've been waiting for. Thanks for the good news Michael! Frank ----- Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/OT-Reprinting-of-Bertin-s-Semiology-of-Graphics-tp3175859p3176233.html Sent from the R help mailing list archive at Nabble.com. From dwinsemius at comcast.net Wed Jan 5 20:32:07 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Wed, 5 Jan 2011 14:32:07 -0500 Subject: [R] Plotting colour-coded points In-Reply-To: References: Message-ID: On Jan 5, 2011, at 2:00 PM, ANJAN PURKAYASTHA wrote: > Hi, > I have a file of the following type: > > id a b > 1 0.5 5 > 2 0.7 15 > 3 1.6 7 > 4 0.5 25 > .................... > > I would like to plot the data in column a on the y-axis and the > corresponding data in column id on the x-axis, so plot(a~id). > However I > would like to colour these points according to the data in column b. > column b data may be colour coded into the following bins: 0-9; 10-19; > 20-29. > Any idea on how to accomplish this? Something along the lines of this code: plot(a ~ id, data=dfrm, col=c("red", "green", "blue")[findInterval(dfrm$b, c(0,10,20,30) )] ) -- David. > TIA, > Anjan > > -- > =================================== > anjan purkayastha, phd. 
> research associate > fas center for systems biology, > harvard university > 52 oxford street > cambridge ma 02138 > phone-703.740.6939 > =================================== > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT From jfox at mcmaster.ca Wed Jan 5 20:33:46 2011 From: jfox at mcmaster.ca (John Fox) Date: Wed, 5 Jan 2011 14:33:46 -0500 Subject: [R] R Commander - how to disable the alphabetical sorting of variable names? In-Reply-To: <1294231148896-3175426.post@n4.nabble.com> References: <1294231148896-3175426.post@n4.nabble.com> Message-ID: <008901cbad0f$78d327d0$6a797770$@ca> Dear Iurie Malai, How Rcmdr options are set is described in ?Commander, which is also accessible via the R Commander menus, "Help -> Commander help". You need options(Rcmdr=list(sort.names=FALSE)) which you can put in Rprofile.site. Best, John -------------------------------- John Fox Senator William McMaster Professor of Social Statistics Department of Sociology McMaster University Hamilton, Ontario, Canada web: socserv.mcmaster.ca/jfox > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On > Behalf Of Iurie Malai > Sent: January-05-11 7:39 AM > To: r-help at r-project.org > Subject: [R] R Commander - how to disable the alphabetical sorting of > variable names? > > > I try to disable alphabetical sorting of the variable names but I fail, R > Commander does not store any changes made in the "Commander Options" menu / > window. I tried to insert "options(sort.names = FALSE)" in Rprofile.site and > .Rprofile config files but without success. Does anyone know the solution? 
> -- > View this message in context: http://r.789695.n4.nabble.com/R-Commander-how- > to-disable-the-alphabetical-sorting-of-variable-names-tp3175426p3175426.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From bates at stat.wisc.edu Wed Jan 5 20:40:26 2011 From: bates at stat.wisc.edu (Douglas Bates) Date: Wed, 5 Jan 2011 13:40:26 -0600 Subject: [R] speed up in R apply In-Reply-To: <7B9A8B27-6E58-409F-BC0F-50BFDD1A60EC@comcast.net> References: <7B9A8B27-6E58-409F-BC0F-50BFDD1A60EC@comcast.net> Message-ID: On Wed, Jan 5, 2011 at 1:22 PM, David Winsemius wrote: > > On Jan 5, 2011, at 10:03 AM, Young Cho wrote: > >> Hi, >> >> I am doing some simulations and found a bottle neck in my R script. I made >> an example: >> >>> a = matrix(rnorm(5000000),1000000,5) >>> tt ?= Sys.time(); sum(a[,1]*a[,2]*a[,3]*a[,4]*a[,5]); Sys.time() - tt >> >> [1] -1291.026 >> Time difference of 0.2354031 secs >>> >>> tt ?= Sys.time(); sum(apply(a,1,prod)); Sys.time() - tt >> >> [1] -1291.026 >> Time difference of 20.23150 secs >> >> Is there a faster way of calculating sum of products (of columns, or of >> rows)? > > You should look at crossprod and tcrossprod. Hmm. Not sure that would help, David. You could use a matrix multiplication of a %*% rep(1, ncol(a)) if you wanted the row sums but of course you could also use rowSums to get those. >> And is this an expected behavior? > > Yes. For loops and *apply strategies are slower than the proper use of > vectorized functions. To expand a bit on David's point, the apply function isn't magic. It essentially loops over the rows, in this case. 
By multiplying columns together you are performing the looping over the rows in compiled code, which is much, much faster. If you want to do this kind of operation effectively in R for a general matrix (i.e. not knowing in advance that it has exactly 5 columns) you could use Reduce

> a <- matrix(rnorm(5000000),1000000,5)
> system.time(pr1 <- a[,1]*a[,2]*a[,3]*a[,4]*a[,5])
   user  system elapsed
   0.15    0.09    0.37
> system.time(pr2 <- apply(a, 1, prod))
   user  system elapsed
 22.090   0.140  22.902
> all.equal(pr1, pr2)
[1] TRUE
> system.time(pr3 <- Reduce(get("*"), as.data.frame(a), rep(1, nrow(a))))
   user  system elapsed
  0.410   0.010   0.575
> all.equal(pr3, pr2)
[1] TRUE

>>
>> Thanks for your advice in advance,
>>
>
> --
>
> David Winsemius, MD
> West Hartford, CT
>

From rbaer at atsu.edu Wed Jan 5 20:56:50 2011
From: rbaer at atsu.edu (Robert Baer)
Date: Wed, 5 Jan 2011 13:56:50 -0600
Subject: [R] Assumptions for ANOVA: the right way to check the normality
In-Reply-To: <9521.53053.qm@web57907.mail.re3.yahoo.com>
References: <9521.53053.qm@web57907.mail.re3.yahoo.com>
Message-ID:

> Someone suggested me that I don't have to check the normality of the data, but
> the normality of the residuals I get after the fitting of the linear model.
> I really ask you to help me to understand this point as I don't find enough
> material online where to solve it.
Try the following:

# using your scrd data and your proposed models
fit1<- lm(response ~ stimulus + condition + stimulus:condition, data=scrd)
fit2<- lm(response ~ stimulus + condition, data=scrd)
fit3<- lm(response ~ condition, data=scrd)

# Set up for 6 plots on 1 panel
op = par(mfrow=c(2,3))

# residuals function extracts residuals
# Visual inspection is a good start for checking normality
# You get a much better feel than from some "magic number" statistic
hist(residuals(fit1))
hist(residuals(fit2))
hist(residuals(fit3))

# especially qqnorm() plots which are linear for normal data
qqnorm(residuals(fit1))
qqnorm(residuals(fit2))
qqnorm(residuals(fit3))

# Restore plot parameters
par(op)

>
> If the data are not normally distributed I have to use the Kruskal-Wallis test
> and not the ANOVA...so please help
> me to understand.

Indeed - Kruskal-Wallis is a good test to use for one factor data that is ordinal so it is a good alternative to your fit3.

Your "response" seems to be a discrete variable rather than a continuous variable. You must decide if it is reasonable to approximate it with a normal distribution which is by definition continuous.

>
> I make a numerical example, could you please tell me if the data in this table
> are normally distributed or not?
>
> Help!
>
>
> number stimulus condition response
> 1 flat_550_W_realism A 3
> 2 flat_550_W_realism A 3
> 3 flat_550_W_realism A 5
> 4 flat_550_W_realism A 3
> 5 flat_550_W_realism A 3
> 6 flat_550_W_realism A 3
> 7 flat_550_W_realism A 3
> 8 flat_550_W_realism A 5
> 9 flat_550_W_realism A 3
> 10 flat_550_W_realism A 3
> 11 flat_550_W_realism A 5
> 12 flat_550_W_realism A 7
> 13 flat_550_W_realism A 5
> 14 flat_550_W_realism A 2
> 15 flat_550_W_realism A 3
> 16 flat_550_W_realism AH 7
> 17 flat_550_W_realism AH 4
> 18 flat_550_W_realism AH 5
> 19 flat_550_W_realism AH 3
> 20 flat_550_W_realism AH 6
> 21 flat_550_W_realism AH 5
> 22 flat_550_W_realism AH 3
> 23 flat_550_W_realism AH 5
> 24 flat_550_W_realism AH 5
> 25 flat_550_W_realism AH 7
> 26 flat_550_W_realism AH 2
> 27 flat_550_W_realism AH 7
> 28 flat_550_W_realism AH 5
> 29 flat_550_W_realism AH 5
> 30 bump_2_step_W_realism A 1
> 31 bump_2_step_W_realism A 3
> 32 bump_2_step_W_realism A 5
> 33 bump_2_step_W_realism A 1
> 34 bump_2_step_W_realism A 3
> 35 bump_2_step_W_realism A 2
> 36 bump_2_step_W_realism A 5
> 37 bump_2_step_W_realism A 4
> 38 bump_2_step_W_realism A 4
> 39 bump_2_step_W_realism A 4
> 40 bump_2_step_W_realism A 4
> 41 bump_2_step_W_realism AH 3
> 42 bump_2_step_W_realism AH 5
> 43 bump_2_step_W_realism AH 1
> 44 bump_2_step_W_realism AH 5
> 45 bump_2_step_W_realism AH 4
> 46 bump_2_step_W_realism AH 4
> 47 bump_2_step_W_realism AH 5
> 48 bump_2_step_W_realism AH 4
> 49 bump_2_step_W_realism AH 3
> 50 bump_2_step_W_realism AH 4
> 51 bump_2_step_W_realism AH 5
> 52 bump_2_step_W_realism AH 4
> 53 hole_2_step_W_realism A 3
> 54 hole_2_step_W_realism A 3
> 55 hole_2_step_W_realism A 4
> 56 hole_2_step_W_realism A 1
> 57 hole_2_step_W_realism A 4
> 58 hole_2_step_W_realism A 3
> 59 hole_2_step_W_realism A 5
> 60 hole_2_step_W_realism A 4
> 61 hole_2_step_W_realism A 3
> 62 hole_2_step_W_realism A 4
> 63 hole_2_step_W_realism A 7
> 64 hole_2_step_W_realism A 5
> 65 hole_2_step_W_realism A 1
> 66 hole_2_step_W_realism A 4
> 67 hole_2_step_W_realism AH 7
> 68 hole_2_step_W_realism AH 5
> 69 hole_2_step_W_realism AH 5
> 70 hole_2_step_W_realism AH 1
> 71 hole_2_step_W_realism AH 5
> 72 hole_2_step_W_realism AH 5
> 73 hole_2_step_W_realism AH 5
> 74 hole_2_step_W_realism AH 2
> 75 hole_2_step_W_realism AH 6
> 76 hole_2_step_W_realism AH 5
> 77 hole_2_step_W_realism AH 5
> 78 hole_2_step_W_realism AH 6
> 79 bump_2_heel_toe_W_realism A 3
> 80 bump_2_heel_toe_W_realism A 3
> 81 bump_2_heel_toe_W_realism A 3
> 82 bump_2_heel_toe_W_realism A 2
> 83 bump_2_heel_toe_W_realism A 3
> 84 bump_2_heel_toe_W_realism A 3
> 85 bump_2_heel_toe_W_realism A 4
> 86 bump_2_heel_toe_W_realism A 3
> 87 bump_2_heel_toe_W_realism A 4
> 88 bump_2_heel_toe_W_realism A 4
> 89 bump_2_heel_toe_W_realism A 6
> 90 bump_2_heel_toe_W_realism A 5
> 91 bump_2_heel_toe_W_realism A 4
> 92 bump_2_heel_toe_W_realism AH 7
> 93 bump_2_heel_toe_W_realism AH 3
> 94 bump_2_heel_toe_W_realism AH 4
> 95 bump_2_heel_toe_W_realism AH 2
> 96 bump_2_heel_toe_W_realism AH 5
> 97 bump_2_heel_toe_W_realism AH 6
> 98 bump_2_heel_toe_W_realism AH 4
> 99 bump_2_heel_toe_W_realism AH 4
> 100 bump_2_heel_toe_W_realism AH 4
> 101 bump_2_heel_toe_W_realism AH 5
> 102 bump_2_heel_toe_W_realism AH 2
> 103 bump_2_heel_toe_W_realism AH 6
> 104 bump_2_heel_toe_W_realism AH 5
> 105 hole_2_heel_toe_W_realism A 3
> 106 hole_2_heel_toe_W_realism A 3
> 107 hole_2_heel_toe_W_realism A 1
> 108 hole_2_heel_toe_W_realism A 3
> 109 hole_2_heel_toe_W_realism A 3
> 110 hole_2_heel_toe_W_realism A 5
> 111 hole_2_heel_toe_W_realism A 2
> 112 hole_2_heel_toe_W_realism AH 5
> 113 hole_2_heel_toe_W_realism AH 1
> 114 hole_2_heel_toe_W_realism AH 3
> 115 hole_2_heel_toe_W_realism AH 6
> 116 hole_2_heel_toe_W_realism AH 5
> 117 hole_2_heel_toe_W_realism AH 4
> 118 hole_2_heel_toe_W_realism AH 4
> 119 hole_2_heel_toe_W_realism AH 3
> 120 hole_2_heel_toe_W_realism AH 3
> 121 hole_2_heel_toe_W_realism AH 1
> 122 hole_2_heel_toe_W_realism AH 5
> 123 bump_2_combination_W_realism A 4
> 124 bump_2_combination_W_realism A 2
> 125 bump_2_combination_W_realism A 4
> 126 bump_2_combination_W_realism A 1
> 127 bump_2_combination_W_realism A 4
> 128 bump_2_combination_W_realism A 4
> 129 bump_2_combination_W_realism A 2
> 130 bump_2_combination_W_realism A 4
> 131 bump_2_combination_W_realism A 2
> 132 bump_2_combination_W_realism A 4
> 133 bump_2_combination_W_realism A 2
> 134 bump_2_combination_W_realism A 6
> 135 bump_2_combination_W_realism AH 7
> 136 bump_2_combination_W_realism AH 3
> 137 bump_2_combination_W_realism AH 4
> 138 bump_2_combination_W_realism AH 1
> 139 bump_2_combination_W_realism AH 6
> 140 bump_2_combination_W_realism AH 5
> 141 bump_2_combination_W_realism AH 5
> 142 bump_2_combination_W_realism AH 6
> 143 bump_2_combination_W_realism AH 5
> 144 bump_2_combination_W_realism AH 4
> 145 bump_2_combination_W_realism AH 2
> 146 bump_2_combination_W_realism AH 4
> 147 bump_2_combination_W_realism AH 2
> 148 bump_2_combination_W_realism AH 5
> 149 hole_2_combination_W_realism A 5
> 150 hole_2_combination_W_realism A 2
> 151 hole_2_combination_W_realism A 4
> 152 hole_2_combination_W_realism A 1
> 153 hole_2_combination_W_realism A 5
> 154 hole_2_combination_W_realism A 4
> 155 hole_2_combination_W_realism A 3
> 156 hole_2_combination_W_realism A 5
> 157 hole_2_combination_W_realism A 2
> 158 hole_2_combination_W_realism A 5
> 159 hole_2_combination_W_realism A 5
> 160 hole_2_combination_W_realism A 1
> 161 hole_2_combination_W_realism AH 7
> 162 hole_2_combination_W_realism AH 5
> 163 hole_2_combination_W_realism AH 3
> 164 hole_2_combination_W_realism AH 1
> 165 hole_2_combination_W_realism AH 6
> 166 hole_2_combination_W_realism AH 4
> 167 hole_2_combination_W_realism AH 7
> 168 hole_2_combination_W_realism AH 5
> 169 hole_2_combination_W_realism AH 5
> 170 hole_2_combination_W_realism AH 2
> 171 hole_2_combination_W_realism AH 6
> 172 hole_2_combination_W_realism AH 2
> 173 hole_2_combination_W_realism AH 4
>
> Thanks in advance
>

From benrhelp at yahoo.co.uk Wed Jan 5 20:58:54 2011
From: benrhelp at yahoo.co.uk (Ben Rhelp)
Date: Wed, 5 Jan 2011 19:58:54 +0000 (GMT)
Subject: [R] Nnet and AIC: selection of a parsimonious parameterisation
Message-ID: <382933.11790.qm@web29213.mail.ird.yahoo.com>

Hi All,

I am trying to use a neural network for my work, but I am not sure about my approach to select a parsimonious model. In R with nnet, the AIC has not been defined for a feed-forward neural network with a single hidden layer. Is this because it does not make sense mathematically in this case? For example, is this pseudo code sensible?

Thanks in advance for your help. I am sorry if this has been answered before, but I haven't found an answer for this in the archive. Below, I have added an implementation of this idea based on the MASS (Modern Applied Statistics with S) code of chapter 8.

Cheers,
Ben

--------------------------------------------------------------------------------
Pseudo code
--------------------------------------------------------------------------------
Define RSS as:
RSS = (1-alpha)*RSS(identification set) + alpha* RSS(validation set)
and AIC as:
AIC = 2*np + N*log(RSS)
where np corresponds to the non-null parameters of the neural network and N is the sample size (based on http://en.wikipedia.org/wiki/Akaike_information_criterion).
Assuming a feed-forward neural network with a single hidden layer and a maximum number of neurons (maxSize),
For size = 1 to maxSize
    Optimise the decay
    Select the neural network with the smallest AIC for a given size and decay using random starting parameterisation and random identification set
    For the lowest to the largest diagonal element of the Hessian,
        Equate the corresponding parameter to 0
        If AIC(i)>AIC(i-1), break;
The neural network selected is the one with the smallest AIC.

--------------------------------------------------------------------------------
an example based on cpus data in Chapter 8 of MASS
--------------------------------------------------------------------------------
library(nnet)
library(MASS)

# From Chapter 6, for comparisons
set.seed(123)
cpus.samp <- c(3, 5, 6, 7, 8, 10, 11, 16, 20, 21, 22, 23, 24, 25, 29, 33, 39, 41, 44, 45,
               46, 49, 57, 58, 62, 63, 65, 66, 68, 69, 73, 74, 75, 76, 78, 83, 86, 88, 98, 99,
               100, 103, 107, 110, 112, 113, 115, 118, 119, 120, 122, 124, 125, 126, 127, 132,
               136, 141, 144, 146, 147, 148, 149, 150, 151, 152, 154, 156, 157, 158, 159, 160,
               161, 163, 166, 167, 169, 170, 173, 174, 175, 176, 177, 183, 184, 187, 188, 189,
               194, 195, 196, 197, 198, 199, 202, 204, 205, 206, 208, 209)

cpus2 <- cpus[, 2:8]  # excludes names, authors' predictions
attach(cpus2)
cpus3 <- data.frame(syct = syct-2, mmin = mmin-3, mmax = mmax-4, cach=cach/256,
                    chmin=chmin/100, chmax=chmax/100, perf)
detach()

CVnn.cpus <- function(formula, data = cpus3[cpus.samp, ], maxSize = 10,
                      decayRange = c(0,0.2), nreps = 5, nifold = 10, alpha= 9/10,
                      linout = TRUE, skip = TRUE, maxit = 1000,...){
  #nreps=number of attempts to fit a nnet model with randomly chosen initial parameters
  # The one with the smallest RSS on the training data is then chosen
  nnWtsPrunning <-function(nn,data,alpha,i){
    truth <- log10(data$perf)
    RSS=(1-alpha)*sum((truth[ri != i] - predict(nn, data[ri != i,]))^2) +
        alpha* sum((truth[ri == i] - predict(nn, data[ri == i,]))^2)
    AIC=2*sum(nn$wts!=0) + length(data$perf)*log(RSS)
    nn.tmp=nn
    for (j in (1:length(nn$wts))) {
      nn.tmp$wts[order(diag(nn.tmp$Hessian))[j]]=0
      RSS.tmp=(1-alpha)*sum((truth[ri != i] - predict(nn.tmp, data[ri != i,]))^2) +
              alpha* sum((truth[ri == i] - predict(nn.tmp, data[ri == i,]))^2)
      AIC.tmp=2*sum(nn.tmp$wts!=0) + length(data$perf)*log(RSS.tmp)
      if (is.nan(AIC.tmp) || AIC.tmp>AIC ) {
        cat('\n j',j,'AIC'=AIC.tmp,'AIC_1',AIC,'\n')
        break
      } else {
        nn=nn.tmp; AIC=AIC.tmp; RSS=RSS.tmp
      }
    }
    list(choice=sqrt(RSS/100),nparam=sum(nn$wts!=0),AIC=AIC,nn=nn)
  }

  #Modified function for optimisation
  CVnn1 <- function(decay, formula, data, nreps=1, ri, size, linout, skip, maxit,
                    optimFlag=FALSE, alpha) {
    truth <- log10(data$perf)
    nn <- nnet(formula, data[ri !=1,], trace=FALSE, size=size, linout=linout,
               skip=skip, maxit=maxit, Hess = TRUE)
    RSS=(1-alpha)*sum((truth[ri != 1] - predict(nn, data[ri != 1,]))^2) +
        alpha* sum((truth[ri == 1] - predict(nn, data[ri == 1,]))^2)
    ii=1
    for (i in sort(unique(ri))) {
      for(rep in 1:nreps) {
        nn.tmp <- nnet(formula, data[ri !=i,], trace=FALSE, size=size, linout=linout,
                       skip=skip, maxit=maxit, Hess = TRUE)
        RSS.tmp=(1-alpha)*sum((truth[ri != i] - predict(nn.tmp, data[ri != i,]))^2) +
                alpha* sum((truth[ri == i] - predict(nn.tmp, data[ri == i,]))^2)
        if (RSS.tmp

A while back I asked about
getting a list of points that R considers influential after fitting a linear model, and very quickly got a helpful pointer to influence.measures(). But "it has happened again." The trouble I am having is that points marked on plots are not flagged in the output from influence.measures(), and I can't read them on the plots. I tried some successive deletion, but then other points (naturally) start to look troublesome). Is there a good way to get a list of suspicious entries at the beginning? In this case, I am trying to help identify possible data entry errors, and I am interested in knowing what R bothered to mark up front. Perhaps the defaults should be telling me that what I want to do is silly, but it sure _seems_ like it would be helpful. Is there a way to control the threshold used by influence.measures() to get it to flag more items at one time? I am learning the hard way, so feel free to tell me that I should be trying to do this some other way. Bill From Kurt_Helf at nps.gov Wed Jan 5 21:12:25 2011 From: Kurt_Helf at nps.gov (Kurt_Helf at nps.gov) Date: Wed, 5 Jan 2011 14:12:25 -0600 Subject: [R] OT: Reducing pdf file size In-Reply-To: Message-ID: Greetings Does anyone have any suggestions for reducing pdf file size, particularly pdfs containing photos, without sacrificing quality? Thanks for any tips in advance. Cheers Kurt *************************************************************** Kurt Lewis Helf, Ph.D. Ecologist EEO Counselor National Park Service Cumberland Piedmont Network P.O. Box 8 Mammoth Cave, KY 42259 Ph: 270-758-2163 Lab: 270-758-2151 Fax: 270-758-2609 **************************************************************** Science, in constantly seeking real explanations, reveals the true majesty of our world in all its complexity. -Richard Dawkins The scientific tradition is distinguished from the pre-scientific tradition in having two layers. Like the latter it passes on its theories but it also passes on a critical attitude towards them. 
The theories are passed on not as dogmas but rather with the challenge to discuss them and improve upon them. -Karl Popper ...consider yourself a guest in the home of other creatures as significant as yourself. -Wayside at Wilderness Threshold in McKittrick Canyon, Guadalupe Mountains National Park, TX Cumberland Piedmont Network (CUPN) Homepage: http://tiny.cc/e7cdx CUPN Forest Pest Monitoring Website: http://bit.ly/9rhUZQ CUPN Cave Cricket Monitoring Website: http://tiny.cc/ntcql CUPN Cave Aquatic Biota Monitoring Website: http://tiny.cc/n2z1o From murdoch.duncan at gmail.com Wed Jan 5 21:25:15 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Wed, 05 Jan 2011 15:25:15 -0500 Subject: [R] integration Sweave and TexMakerX In-Reply-To: <4D24A497.5060303@gmail.com> References: <1290201233425-3051003.post@n4.nabble.com> <4D24A497.5060303@gmail.com> Message-ID: <4D24D3AB.7080806@gmail.com> On 05/01/2011 12:04 PM, Sebasti?n Daza wrote: > Hi, > > Does anyone know how to integrate texmakerx and sweave on Windows? I > mean, to run .rnw files directly from texmakerx and get a pdf or dvi file. I don't know texmakerx, but the patchDVI package (on R-forge, see https://r-forge.r-project.org/R/?group_id=233) contains some functions for hooking up Sweave with other LaTeX editors. If it's not flexible enough to handle yours I'd like to hear what's missing, and I'd probably add it. Duncan Murdoch From pearlmayd at yahoo.com Wed Jan 5 21:28:59 2011 From: pearlmayd at yahoo.com (pearl may dela cruz) Date: Wed, 5 Jan 2011 12:28:59 -0800 (PST) Subject: [R] Prediction error for Ordinary Kriging Message-ID: <972575.99511.qm@web30508.mail.mud.yahoo.com> An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From dwinsemius at comcast.net Wed Jan 5 22:09:55 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Wed, 5 Jan 2011 16:09:55 -0500 Subject: [R] speed up in R apply In-Reply-To: References: <7B9A8B27-6E58-409F-BC0F-50BFDD1A60EC@comcast.net> Message-ID: <8AC3D4FE-1B19-42E5-AC62-5CEB3D57A839@comcast.net> On Jan 5, 2011, at 2:40 PM, Douglas Bates wrote: > On Wed, Jan 5, 2011 at 1:22 PM, David Winsemius > wrote: >> >> On Jan 5, 2011, at 10:03 AM, Young Cho wrote: >> >>> Hi, >>> >>> I am doing some simulations and found a bottle neck in my R >>> script. I made >>> an example: >>> >>>> a = matrix(rnorm(5000000),1000000,5) >>>> tt = Sys.time(); sum(a[,1]*a[,2]*a[,3]*a[,4]*a[,5]); Sys.time() >>>> - tt >>> >>> [1] -1291.026 >>> Time difference of 0.2354031 secs >>>> >>>> tt = Sys.time(); sum(apply(a,1,prod)); Sys.time() - tt >>> >>> [1] -1291.026 >>> Time difference of 20.23150 secs >>> >>> Is there a faster way of calculating sum of products (of columns, >>> or of >>> rows)? >> >> You should look at crossprod and tcrossprod. > > Hmm. Not sure that would help, David. You could use a matrix > multiplication of a %*% rep(1, ncol(a)) if you wanted the row sums but > of course you could also use rowSums to get those. Thanks for pointing that out. I misread the OP's code. > >>> And is this an expected behavior? >> >> Yes. For loops and *apply strategies are slower than the proper use >> of >> vectorized functions. > > To expand a bit on David's point, the apply function isn't magic. It > essentially loops over the rows, in this case. By multiplying columns > together you are performing the looping over the rows in compiled > code, which is much, much faster. If you want to do this kind of > operation effectively in R for a general matrix (i.e. 
> not knowing in
> advance that it has exactly 5 columns) you could use Reduce
>
>> a <- matrix(rnorm(5000000),1000000,5)
>> system.time(pr1 <- a[,1]*a[,2]*a[,3]*a[,4]*a[,5])
>    user  system elapsed
>    0.15    0.09    0.37
>> system.time(pr2 <- apply(a, 1, prod))
>    user  system elapsed
>  22.090   0.140  22.902
>> all.equal(pr1, pr2)
> [1] TRUE
>> system.time(pr3 <- Reduce(get("*"), as.data.frame(a), rep(1,
>> nrow(a))))

Slightly faster would be:

system.time(pr3 <- Reduce("*", as.data.frame(a)))

And thanks for the nice example. Using a data.frame to feed Reduce materially enhances its value to me.

>    user  system elapsed
>   0.410   0.010   0.575
>> all.equal(pr3, pr2)
> [1] TRUE

--
David Winsemius, MD
West Hartford, CT

From anthony.staines at dcu.ie Wed Jan 5 22:19:49 2011
From: anthony.staines at dcu.ie (Anthony Staines)
Date: Wed, 05 Jan 2011 21:19:49 +0000
Subject: [R] Advice on obscuring unique IDs in R
Message-ID: <4D24E075.9040508@dcu.ie>

Dear colleagues,

This may be a question with a really obvious answer, but I can't find it. I have access to a large file with real medical record identifiers (mixed strings of characters and numbers) in it. These represent medical events for many thousands of people. It's important to be able to link events for the same people. It's much more important that the real record numbers are strongly obscured.

I'm interested in some kind of strong one-way hash function to which I can feed the real numbers and get back unique codes for each record identifier fed in. I can do this on the health service system, and I have to do this before making further use of the data!

There is the 'digest' function, in the digest package, but this seems to work on the whole vector of IDs, producing, in my case, a vector with 60,000 identical entries.

H.Out$P_ID = digest(H.In$MRNr,serialize=FALSE, algo='md5')

I could do this in Perl, but I'd have to do quite a bit of work to get it installed. Any quick suggestions?
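
On the digest() symptom described above: digest() hashes its whole argument as a single object and returns one string, which R then recycles down the column. One possible approach, sketched here under assumptions, is to call it once per identifier. The example IDs and the salt string below are made up for illustration, and the sketch assumes the CRAN package digest is installed; it is not a vetted anonymisation procedure.

```r
library(digest)  # assumes the CRAN package 'digest' is installed

# Made-up identifiers standing in for the real record numbers.
ids <- c("AB123", "CD456", "AB123", "EF789")

# Hypothetical secret prefix (an assumption of this sketch); without one,
# hashes of a small ID space can be matched by hashing every candidate ID.
salt <- "replace-with-a-long-secret-string"

# digest() returns one hash per call, so apply it to each element in turn.
hashed <- vapply(
  paste0(salt, ids),
  digest,
  character(1),
  algo = "sha256",
  serialize = FALSE,
  USE.NAMES = FALSE
)

# Repeated identifiers map to the same code, so events can still be linked.
hashed[1] == hashed[3]
length(unique(hashed)) == length(unique(ids))
```

The salt would need to be kept with the original data and not shipped with the hashed file, otherwise it adds nothing.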
Anthony Staines -- Anthony Staines, Professor of Health Systems Research, School of Nursing, Dublin City University, Dublin 9,Ireland. Tel:- +353 1 700 7807. Mobile:- +353 86 606 9713 From Greg.Snow at imail.org Wed Jan 5 22:34:15 2011 From: Greg.Snow at imail.org (Greg Snow) Date: Wed, 5 Jan 2011 14:34:15 -0700 Subject: [R] Comparing fitting models In-Reply-To: <459095.84075.qm@web57907.mail.re3.yahoo.com> References: <459095.84075.qm@web57907.mail.re3.yahoo.com> Message-ID: Just do anova(fit3, fit1) This compares those 2 models directly. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.snow at imail.org 801.408.8111 > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r- > project.org] On Behalf Of Frodo Jedi > Sent: Wednesday, January 05, 2011 10:10 AM > To: r-help at r-project.org > Subject: [R] Comparing fitting models > > > Dear all, > I have 3 models (from simple to complex) and I want to compare them in > order to > see if they fit equally well or not. > From the R prompt I am not able to see where I can get this > information. 
> Let's do an example: > > fit1<- lm(response ~ stimulus + condition + stimulus:condition, > data=scrd) > #EQUIVALENT TO lm(response ~ stimulus*condition, data=scrd) > > > fit2<- lm(response ~ stimulus + condition, data=scrd) > > fit3<- lm(response ~ condition, data=scrd) > > > > anova(fit2, fit1) #compare models > Analysis of Variance Table > > Model 1: response ~ stimulus + condition > Model 2: response ~ stimulus + condition + stimulus:condition > Res.Df RSS Df Sum of Sq F Pr(>F) > 1 165 364.13 > 2 159 362.67 6 1.4650 0.1071 0.9955 > > > > anova(fit3, fit2, fit1) #compare models > Analysis of Variance Table > > Model 1: response ~ condition > Model 2: response ~ stimulus + condition > Model 3: response ~ stimulus + condition + stimulus:condition > Res.Df RSS Df Sum of Sq F Pr(>F) > 1 171 382.78 > 2 165 364.13 6 18.650 1.3628 0.2328 > 3 159 362.67 6 1.465 0.1071 0.9955 > > > > How can I understand that the simple model fits as well as the complex > model > (the one with the interaction)? > > Thanks in advance > > All the best > > > > [[alternative HTML version deleted]] From santanu.pramanik at gmail.com Wed Jan 5 21:04:23 2011 From: santanu.pramanik at gmail.com (Santanu Pramanik) Date: Wed, 5 Jan 2011 15:04:23 -0500 Subject: [R] Reading large SAS dataset in R Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From kevinummel at gmail.com Wed Jan 5 20:16:47 2011 From: kevinummel at gmail.com (Kevin Ummel) Date: Wed, 5 Jan 2011 19:16:47 +0000 Subject: [R] Match numeric vector against rows in a matrix?
Message-ID: <0AF8C77E-FD6B-4CFA-BA18-2430F16F56E9@gmail.com> Two posts in one day is not a good day...and this question seems like it should have an obvious answer: I have a matrix where rows are unique combinations of 1's and 0's: > combs=as.matrix(expand.grid(c(0,1),c(0,1))) > combs Var1 Var2 [1,] 0 0 [2,] 1 0 [3,] 0 1 [4,] 1 1 I want a single function that will give the row index containing an exact match with vector x: > x=c(0,1) The solution needs to be applied many times, so I need something quick -- I was hoping a base function would do it, but I'm drawing a blank. Thanks! Kevin From kevinummel at gmail.com Wed Jan 5 21:02:53 2011 From: kevinummel at gmail.com (Kevin Ummel) Date: Wed, 5 Jan 2011 20:02:53 +0000 Subject: [R] How to 'explode' a matrix In-Reply-To: References: <0EA333E4-1739-4516-BDDA-9BFDF2F74916@gmail.com> Message-ID: <68375078-2F9D-4DA4-BF1C-0A5E61310FED@gmail.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From maj at stats.waikato.ac.nz Wed Jan 5 22:01:51 2011 From: maj at stats.waikato.ac.nz (Murray Jorgensen) Date: Thu, 06 Jan 2011 10:01:51 +1300 Subject: [R] Converting Fortran or C++ etc to R In-Reply-To: References: <4D23B526.2020906@stats.waikato.ac.nz> Message-ID: <4D24DC3F.2000100@stats.waikato.ac.nz> Thanks Barry and thanks to others who applied off-list. I can see that I should have given more details about my motives for wanting to replace a Fortran program by an R one. At this stage I want to get something working in pure R because it is easier to fool around with and tweak with than Fortran and I have a few things that I want to try out that will involve perturbing the original code and I think I'd rather be doing them in R than in a 3GL. Now that I have publicly asked the question I find that the answer to it occurs to me: The program that I want to port to R is an ML estimation by the EM algorithm. 
The iterative steps are fairly simple except they need to be repeated a large number of times. What I have noticed is that I can replace (maybe) the within-step loops by matrix multiplications. This means that I will, by using %*%, be effectively handing a lot of the work to external Fortran (or similar) routines without calling .Fortran(). OK, I know that you can see though me and I accept that I am just rationalising my reluctance to get into package-writing. I will bite the bullet on that in due course but for the meantime I'm just going to fool around with straight R. Barry came closest to answering my real question and I will formulate a follow-up question as follows: Does anyone know of a helpful set of examples of the vectorization of code? Cheers, Murray On 6/01/2011 12:32 a.m., Barry Rowlingson wrote: > On Wed, Jan 5, 2011 at 7:33 AM, lcn wrote: > >> As for your actual requirement to do the "convertion", I guess there'd not >> exist any quick ways. You have to be both familiar with R and the other >> language to make the rewrite work. > > To make the rewrite work _well_ is the bigger problem! The easiest > way to big performance wins is going to be spotting vectorisation > possibilities in the Fortran code. Any time you see a DO K=1,N loop > then look to see if its just a single vector operation in R. > > Another way to big wins is to write test code, so you can check if > your R code gives the same results as the Fortran (C/C++) code at > every stage of the rewrite. Don't just write it all in one go and then > hope it works! Small steps.... 
> > Barry -- Dr Murray Jorgensen http://www.stats.waikato.ac.nz/Staff/maj.html Department of Statistics, University of Waikato, Hamilton, New Zealand Email: maj at waikato.ac.nz Fax 7 838 4155 Phone +64 7 838 4773 wk Home +64 7 825 0441 Mobile 021 0200 8350 From lraeburn at sfu.ca Wed Jan 5 22:33:55 2011 From: lraeburn at sfu.ca (lraeburn at sfu.ca) Date: Wed, 5 Jan 2011 13:33:55 -0800 (PST) Subject: [R] Heat map in R Message-ID: <1294263235534-3176478.post@n4.nabble.com> Hello, I am trying to make a heatmap in R and am having some trouble. I am very new to the world of R, but have been told that what I am trying to do should be possible. I want to make a heat map that looks like a gene expression heatmap (see http://en.wikipedia.org/wiki/Heat_map). I have 43 samples and 900 genes (yes I know this will be a huge map). I also have copy numbers associated with each gene/sample and need these to be represented as the colour intensities on the heat map. There are multiple genes per sample with different copy numbers. I think my trouble may be how I am setting up my data frame. My data frame was created in Excel as a tab-delimited text file: Gene Copy Number Sample ID A 1935 01 B 2057 01 C 2184 02 D 1498 03 E 2294 03 F 2485 03 G 1560 04 H 3759 04 I 2792 05 J 7081 05 K 1922 06 . . . . . . . . . ZZZ 1354 43 My code in R is something like this: data<-read.table("/Users/jsmt/desktop/test.txt",header=T) data_matrix<-data.matrix(data) data_heatmap <- heatmap(data_matrix, Rowv=NA, Colv=NA, col = cm.colors(256), scale="column", margins=c(5,10)) I end up getting a heat map split into 3 columns: sample, depth, gene and the colours are just in big blocks that don't mean anything. Can anyone help me with my dataframe or my R code? Again, I am fairly new to R, so if you can help, please give me very detailed help :) Thanks in advance!
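A minimal sketch of one possible fix, assuming the problem is that heatmap() expects a numeric matrix with genes in rows and samples in columns rather than a three-column "long" table. The data frame dat and its column names (Gene, CopyNumber, SampleID) are made up here to mirror the layout described above, not taken from the real file; with a tab-delimited file whose header names contain spaces, read.delim() would probably be the safer reader.

```r
# Hypothetical long-format data mirroring the layout described in the post
dat <- data.frame(
  Gene       = c("A", "B", "C", "D", "E", "F"),
  CopyNumber = c(1935, 2057, 2184, 1498, 2294, 2485),
  SampleID   = c("01", "01", "02", "03", "03", "03"),
  stringsAsFactors = FALSE
)

# Reshape to a matrix: one row per gene, one column per sample.
# Gene/sample combinations that were never observed stay NA.
mat <- tapply(dat$CopyNumber, list(dat$Gene, dat$SampleID), sum)

# Colour intensity now reflects copy number; no clustering, no rescaling
heatmap(mat, Rowv = NA, Colv = NA, scale = "none",
        col = cm.colors(256), margins = c(5, 10))
```

If each gene occurs in only one sample, most cells of the matrix will be NA and will be drawn blank, so the picture is only informative when genes are shared across samples.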
-- View this message in context: http://r.789695.n4.nabble.com/Heat-map-in-R-tp3176478p3176478.html Sent from the R help mailing list archive at Nabble.com. From Seeliger.Curt at epamail.epa.gov Wed Jan 5 22:42:53 2011 From: Seeliger.Curt at epamail.epa.gov (Seeliger.Curt at epamail.epa.gov) Date: Wed, 5 Jan 2011 13:42:53 -0800 Subject: [R] Advice on obscuring unique IDs in R In-Reply-To: <4D24E075.9040508@dcu.ie> References: <4D24E075.9040508@dcu.ie> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From marc_schwartz at me.com Wed Jan 5 22:43:33 2011 From: marc_schwartz at me.com (Marc Schwartz) Date: Wed, 05 Jan 2011 15:43:33 -0600 Subject: [R] Advice on obscuring unique IDs in R In-Reply-To: <4D24E075.9040508@dcu.ie> References: <4D24E075.9040508@dcu.ie> Message-ID: On Jan 5, 2011, at 3:19 PM, Anthony Staines wrote: > Dear colleagues, > > This may be a question with a really obvious answer, but I > can't find it. I have access to a large file with real > medical record identifiers (mixed strings of characters and > numbers) in it. These represent medical events for many > thousands of people. It's important to be able to link > events for the same people. > > It's much more important that the real record numbers are > strongly obscured. I'm interested in some kind of strong > one-way hash function to which I can feed the real numbers > and get back unique codes for each record identifier fed > in. I can do this on the health service system, and I have > to do this before making further use of the data! > > There is the 'digest' function, in the digest package, but > this seems to work on the whole vector of IDs, producing, in > my case, a vector with 60,000 identical entries. > > H.Out$P_ID = digest(H.In$MRNr,serialize=FALSE, algo='md5') > > I could do this in Perl, but I'd have to do quite a bit of > work to get it installed. > > Any quick suggestions? 
> Anthony Staines Try using sapply(): L <- replicate(60000, paste(sample(letters, 10, replace = TRUE), collapse = "")) > str(L) chr [1:60000] "dfederergw" "nwphehurvb" "avzmvltrhn" ... > head(L) [1] "dfederergw" "nwphehurvb" "avzmvltrhn" "ecmeiasmbk" "kmlcxydygl" [6] "wpftnyrzwe" # Use sapply() to run digest() over each element of L > system.time(L.Digest <- sapply(L, digest)) user system elapsed 6.920 0.031 7.361 > str(L.Digest) Named chr [1:60000] "6d5861904ee004d251504cb0f731a69a" ... - attr(*, "names")= chr [1:60000] "dfederergw" "nwphehurvb" "avzmvltrhn" "ecmeiasmbk" ... > head(L.Digest) dfederergw nwphehurvb "6d5861904ee004d251504cb0f731a69a" "bf8ee61f69c83468988cad681a9f7ad0" avzmvltrhn ecmeiasmbk "ba1c66af41359cf1a3f5e91f22c6dfe5" "95ca2deaa6c1118852c9ffed71994a7f" kmlcxydygl wpftnyrzwe "f3647a7937a2c484123ef33bb52a27ac" "e84f17180703e4805493d88a760be682" HTH, Marc Schwartz From xie at yihui.name Wed Jan 5 22:47:56 2011 From: xie at yihui.name (Yihui Xie) Date: Wed, 5 Jan 2011 15:47:56 -0600 Subject: [R] convert expressions to characters Message-ID: Hi, Suppose I have x = parse(text = " {y=50+50+50#'asfasf' } ") now x is an expression with some src attributes. > x expression({y=50+50+50#'asfasf' }) attr(,"srcfile") attr(,"wholeSrcref") {y=50+50+50#'asfasf' } My question is, how can I get my string back (the string passed to parse() as the text argument)? > as.character(x) [1] "{" as.character() only returns "{". > as.character(expression({1})) [1] "{" > as.character(expression("1","2+3")) [1] "1" "2+3" Thanks a lot! 
Regards, Yihui -- Yihui Xie Phone: 515-294-2465 Web: http://yihui.name Department of Statistics, Iowa State University 2215 Snedecor Hall, Ames, IA From jrkrideau at yahoo.ca Wed Jan 5 23:00:35 2011 From: jrkrideau at yahoo.ca (John Kane) Date: Wed, 5 Jan 2011 14:00:35 -0800 (PST) Subject: [R] Plotting colour-coded points In-Reply-To: Message-ID: <414552.9042.qm@web38407.mail.mud.yahoo.com> With xx as your data.frame library(ggplot2) qplot(id, a, data=xx, color=b) --- On Wed, 1/5/11, ANJAN PURKAYASTHA wrote: > From: ANJAN PURKAYASTHA > Subject: [R] Plotting colour-coded points > To: r-help at r-project.org > Received: Wednesday, January 5, 2011, 2:00 PM > Hi, > I have a file of the following type: > > id a b > 1 0.5 5 > 2 0.7 15 > 3 1.6 7 > 4 0.5 25 > .................... > > I would like to plot the data in column a on the y-axis and > the > corresponding data in column id on the x-axis, so > plot(a~id). However I > would like to colour these points according to the data in > column b. > column b data may be colour coded into the following bins: > 0-9; 10-19; > 20-29. > Any idea on how to accomplish this? > TIA, > Anjan > > -- > =================================== > anjan purkayastha, phd. > research associate > fas center for systems biology, > harvard university > 52 oxford street > cambridge ma 02138 > phone-703.740.6939 > =================================== > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org > mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, > reproducible code.
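For the binned colours asked about in the quoted question, a base-graphics sketch may also help. It assumes the data frame is called xx as above, rebuilds the four rows shown in the question, and uses an arbitrary three-colour palette; cut() does the binning into 0-9, 10-19 and 20-29.

```r
# The four rows shown in the question
xx <- data.frame(id = 1:4, a = c(0.5, 0.7, 1.6, 0.5), b = c(5, 15, 7, 25))

# Bin b into [0,10), [10,20), [20,30); right = FALSE closes intervals on the left
bins <- cut(xx$b, breaks = c(0, 10, 20, 30), right = FALSE,
            labels = c("0-9", "10-19", "20-29"))

# Arbitrary colours, one per bin (an illustrative choice, not from the thread)
bin_cols <- c("blue", "orange", "red")

# a on the y-axis, id on the x-axis, colour picked by the bin of b
plot(a ~ id, data = xx, col = bin_cols[as.integer(bins)], pch = 19)
legend("topleft", legend = levels(bins), col = bin_cols, pch = 19, title = "b")
```

Values of b outside 0 to 30 would come back as NA from cut() and be drawn without colour, so the breaks would need widening for other data.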
> From dwinsemius at comcast.net Wed Jan 5 23:07:48 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Wed, 5 Jan 2011 17:07:48 -0500 Subject: [R] convert expressions to characters In-Reply-To: References: Message-ID: On Jan 5, 2011, at 4:47 PM, Yihui Xie wrote: > Hi, > > Suppose I have > > x = parse(text = " > {y=50+50+50#'asfasf' > } > ") > > now x is an expression with some src attributes. > >> x > expression({y=50+50+50#'asfasf' > }) > attr(,"srcfile") > > attr(,"wholeSrcref") > > {y=50+50+50#'asfasf' > } > > My question is, how can I get my string back (the string passed to > parse() as the text argument)? > attr(x, "wholeSrcref") {y=50+50+50#'asfasf' } -- David. > >> as.character(x) > [1] "{" > > as.character() only returns "{". > >> as.character(expression({1})) > [1] "{" >> as.character(expression("1","2+3")) > [1] "1" "2+3" > > > Thanks a lot! > > Regards, > Yihui > -- > Yihui Xie > Phone: 515-294-2465 Web: http://yihui.name > Department of Statistics, Iowa State University > 2215 Snedecor Hall, Ames, IA > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT From murdoch.duncan at gmail.com Wed Jan 5 23:10:11 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Wed, 05 Jan 2011 17:10:11 -0500 Subject: [R] convert expressions to characters In-Reply-To: References: Message-ID: <4D24EC43.6060000@gmail.com> On 11-01-05 4:47 PM, Yihui Xie wrote: > Hi, > > Suppose I have > > x = parse(text = " > {y=50+50+50#'asfasf' > } > ") > > now x is an expression with some src attributes. 
> >> x > expression({y=50+50+50#'asfasf' > }) > attr(,"srcfile") > > attr(,"wholeSrcref") > > {y=50+50+50#'asfasf' > } > > My question is, how can I get my string back (the string passed to > parse() as the text argument)? You can use as.character(attr(x, "wholeSrcref")) If length(x) > 1, you can get the parts corresponding to each expression within it as for (i in 1:length(x)) { print (as.character(attr(x, "srcref")[[i]])) } but this leaves off comments and whitespace that are not embedded within the expressions the way your comment is. Duncan Murdoch > >> as.character(x) > [1] "{" > > as.character() only returns "{". > >> as.character(expression({1})) > [1] "{" >> as.character(expression("1","2+3")) > [1] "1" "2+3" > > > Thanks a lot! > > Regards, > Yihui > -- > Yihui Xie > Phone: 515-294-2465 Web: http://yihui.name > Department of Statistics, Iowa State University > 2215 Snedecor Hall, Ames, IA > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From dwinsemius at comcast.net Wed Jan 5 23:16:54 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Wed, 5 Jan 2011 17:16:54 -0500 Subject: [R] Match numeric vector against rows in a matrix? 
In-Reply-To: <0AF8C77E-FD6B-4CFA-BA18-2430F16F56E9@gmail.com> References: <0AF8C77E-FD6B-4CFA-BA18-2430F16F56E9@gmail.com> Message-ID: <5BFDE7ED-F3ED-464E-B4FC-39C354CFE8D7@comcast.net> On Jan 5, 2011, at 2:16 PM, Kevin Ummel wrote: > Two posts in one day is not a good day...and this question seems > like it should have an obvious answer: > > I have a matrix where rows are unique combinations of 1's and 0's: > >> combs=as.matrix(expand.grid(c(0,1),c(0,1))) >> combs > Var1 Var2 > [1,] 0 0 > [2,] 1 0 > [3,] 0 1 > [4,] 1 1 > > I want a single function that will give the row index containing an > exact match with vector x: > >> x=c(0,1) > intersect( which(combs[,1]==x[1]), which(combs[,2]==x[2]) ) [1] 3 Or maybe even faster: > which( combs[,1]==x[1] & combs[,2]==x[2]) [1] 3 > > The solution needs to be applied many times, so I need something > quick -- I was hoping a base function would do it, but I'm drawing a > blank. > -- David Winsemius, MD West Hartford, CT From polidore at gmail.com Wed Jan 5 23:24:32 2011 From: polidore at gmail.com (Benjamin Polidore) Date: Wed, 5 Jan 2011 17:24:32 -0500 Subject: [R] pattern recognition with paths Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From Greg.Snow at imail.org Wed Jan 5 23:27:53 2011 From: Greg.Snow at imail.org (Greg Snow) Date: Wed, 5 Jan 2011 15:27:53 -0700 Subject: [R] Comparing fitting models In-Reply-To: <520184.62241.qm@web57901.mail.re3.yahoo.com> References: <459095.84075.qm@web57907.mail.re3.yahoo.com> <520184.62241.qm@web57901.mail.re3.yahoo.com> Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From spector at stat.berkeley.edu Wed Jan 5 23:51:04 2011 From: spector at stat.berkeley.edu (Phil Spector) Date: Wed, 5 Jan 2011 14:51:04 -0800 (PST) Subject: [R] Reading large SAS dataset in R In-Reply-To: References: Message-ID: Santanu - If you have sas installed on your computer, you may find using the sas.get function of the Hmisc package useful. If the only message that read.ssd produced was "Sas failed", it would be difficult to figure out what went wrong. Usually the location of the log file, which would explain the error more thoroughly, is included in the error message. - Phil Spector Statistical Computing Facility Department of Statistics UC Berkeley spector at stat.berkeley.edu On Wed, 5 Jan 2011, Santanu Pramanik wrote: > Hi all, > > I have a large (approx. 1 GB) SAS dataset (test.sas7bdat) located in the > server ("R:/" directory). I have SAS 9.1 installed in my PC and I can read > the SAS dataset in SAS, under a windows environment, after assigning libname > in "R:\" directory. > > > > Now I am trying to read the SAS dataset in R (R 2.12.0) using the read.ssd > function of the "foreign" package, but I get an error message "SAS failed". > I believe I have specified the paths correctly (after reading some previous > posts I made sure that I do it right). Below is the small code: > > > > sashome<- "C:/Program Files/SAS/SAS 9.1" > > read.ssd(libname="R:/", sectionnames="test", sascmd=file.path(sashome, > "sas.exe")) > > > > Please let me know where I am making the mistake. Is it because of the size > of the file or the location of the file (in server instead of local hard > drive)?
> > > > Thanks in advance, > > Santanu > > > -- > -------------------------------------------------------------------- > Santanu Pramanik > Survey Statistician > NORC at the University of Chicago > Bethesda, MD > > [[alternative HTML version deleted]] > > From dwinsemius at comcast.net Wed Jan 5 23:51:32 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Wed, 5 Jan 2011 17:51:32 -0500 Subject: [R] pattern recognition with paths In-Reply-To: References: Message-ID: On Jan 5, 2011, at 5:24 PM, Benjamin Polidore wrote: > I'm trying to identify patterns among various "paths" like the > following: > > http://i.imgur.com/bQPI3.png > > If I plot these, I can observe intuitively two different patterns: a > front > loaded (1 and 3) and a backloaded (2,4) progress path: > > http://i.imgur.com/L5qwZ.png > > I have thousands of observations like the above table, and I want to > use R > to identify clusters of these paths. I looked at spatstat, but it > seems > more relevant to points than paths. You need some sort of distance measure. Perhaps get signed maximum deviation from a diagonal progress = (1:13)/13. Or you could classify by how wavy they were with max(dev.positive) - min(dev.negative). Or for a two-D measure, you could divide the bin x Percentage space into boxes and see which ones get entered. progress1 and progress 2 might enter mostly the diagonal boxes while progress 3 and 4 would be in the lower-right-hand corner. If you gave the boxes associated measures you could transform a trajectory back to the max(measure) paradigm. Alas, as I think about the possibilities I am reminded that the set of possible functions on the interval [0, 1] is infinite. But perhaps some sort of functional data analysis approach can put the pieces of my dashed hopes back together.
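A rough sketch of the first suggestion (signed maximum deviation from the diagonal), under the assumption that each path is a row of 13 cumulative-progress fractions ending at 1. The four paths below are simulated stand-ins, not the poster's data, and the helper names make_path and signed_max_dev are invented for illustration.

```r
n_bins   <- 13
diagonal <- (1:n_bins) / n_bins

# Simulated paths: exponent < 1 gives a front-loaded curve, > 1 a back-loaded one
make_path <- function(shape) diagonal^shape
paths <- rbind(make_path(0.4), make_path(2.5), make_path(0.5), make_path(3))

# Signed maximum deviation from the diagonal for a single path
signed_max_dev <- function(p) {
  d <- p - diagonal
  d[which.max(abs(d))]
}
dev <- apply(paths, 1, signed_max_dev)

# One-dimensional summary per path, then a simple two-group hierarchical clustering
groups <- cutree(hclust(dist(dev)), k = 2)
```

With this summary the front-loaded paths get a positive score and the back-loaded ones a negative score, so even a crude clustering separates them; real data would likely need a richer summary than one number per path.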
Come to think of it, there _is_ an fda package: http://www.psych.mcgill.ca/misc/fda/ -- David Winsemius, MD West Hartford, CT From RTan at panagora.com Wed Jan 5 23:55:01 2011 From: RTan at panagora.com (Tan, Richard) Date: Wed, 5 Jan 2011 17:55:01 -0500 Subject: [R] categorize a character column Message-ID: <3303FA84CE4F7244B27BE264EC4AE2A7158DC441@panemail.panagora.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From RTan at panagora.com Thu Jan 6 00:01:54 2011 From: RTan at panagora.com (Tan, Richard) Date: Wed, 5 Jan 2011 18:01:54 -0500 Subject: [R] categorize a character column In-Reply-To: <3303FA84CE4F7244B27BE264EC4AE2A7153FC725@panemail.panagora.com> References: <3303FA84CE4F7244B27BE264EC4AE2A7153FC725@panemail.panagora.com> Message-ID: <3303FA84CE4F7244B27BE264EC4AE2A7158DC444@panemail.panagora.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From davidD at qimr.edu.au Thu Jan 6 00:27:12 2011 From: davidD at qimr.edu.au (David Duffy) Date: Thu, 6 Jan 2011 09:27:12 +1000 (EST) Subject: [R] Converting Fortran or C++ etc to R In-Reply-To: References: Message-ID: > Murray Jorgensen wrote: > > I'm going to try my hand at converting some Fortran programs to R. Does > anyone know of any good articles giving hints at such tasks? I will post > a selective summary of my gleanings. Presuming you don't mean .Fortran(), I have gone both ways. Aside from the obvious fact that a single R vectorized command can replace either just a loop or an entire Fortran subroutine, I don't have any deep insights. I simply did a line-by-line translation to R, confirmed the code still worked, then looked for simple optimizations/refactorings. If you have a lot of code to port, and are hinting you would like an automated tool, I think you are out of luck ;) There is a Fortran to Lisp translator (f2cl), but I think the resulting code will not get you a lot closer (it is aimed at compilation). 
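To make the loop-versus-vectorised point concrete, here is a small sketch of the translate-then-check workflow described above. The weighted sum is an invented stand-in for a Fortran DO loop, not code from the original program, and the function names are hypothetical.

```r
# Line-by-line translation of a Fortran-style DO K = 1, N accumulation
weighted_sum_loop <- function(x, w) {
  total <- 0
  for (k in seq_along(x)) {
    total <- total + w[k] * x[k]
  }
  total
}

# Vectorised rewrite: the whole loop collapses into one expression
weighted_sum_vec <- function(x, w) sum(w * x)

# Confirm the two versions agree before moving on to the next piece of code
set.seed(42)
x <- rnorm(1000)
w <- runif(1000)
same <- isTRUE(all.equal(weighted_sum_loop(x, w), weighted_sum_vec(x, w)))
```

Keeping the slow loop version around as a reference makes it possible to check each refactoring step against known-good results.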
Cheers, David Duffy. -- | David Duffy (MBBS PhD) ,-_|\ | email: davidD at qimr.edu.au ph: INT+61+7+3362-0217 fax: -0101 / * | Epidemiology Unit, Queensland Institute of Medical Research \_,-._/ | 300 Herston Rd, Brisbane, Queensland 4029, Australia GPG 4D0B994A v From peter.langfelder at gmail.com Thu Jan 6 00:29:38 2011 From: peter.langfelder at gmail.com (Peter Langfelder) Date: Wed, 5 Jan 2011 15:29:38 -0800 Subject: [R] Converting Fortran or C++ etc to R In-Reply-To: <4D23B526.2020906@stats.waikato.ac.nz> References: <4D23B526.2020906@stats.waikato.ac.nz> Message-ID: On Tue, Jan 4, 2011 at 4:02 PM, Murray Jorgensen wrote: > I'm going to try my hand at converting some Fortran programs to R. Does > anyone know of any good articles giving hints at such tasks? I will post a > selective summary of my gleanings. If the code uses functions/subroutines, keep in mind that Fortran passes arguments by reference, whereas R passes arguments by value. Peter From xie at yihui.name Thu Jan 6 00:36:15 2011 From: xie at yihui.name (Yihui Xie) Date: Wed, 5 Jan 2011 17:36:15 -0600 Subject: [R] convert expressions to characters In-Reply-To: <4D24EC43.6060000@gmail.com> References: <4D24EC43.6060000@gmail.com> Message-ID: I see. Thanks! Regards, Yihui -- Yihui Xie Phone: 515-294-2465 Web: http://yihui.name Department of Statistics, Iowa State University 2215 Snedecor Hall, Ames, IA On Wed, Jan 5, 2011 at 4:10 PM, Duncan Murdoch wrote: > On 11-01-05 4:47 PM, Yihui Xie wrote: >> Hi, >> >> Suppose I have >> >> x = parse(text = " >> {y=50+50+50#'asfasf' >> } >> ") >> >> now x is an expression with some src attributes. >> >>> x >> expression({y=50+50+50#'asfasf' >> }) >> attr(,"srcfile") >> >> attr(,"wholeSrcref") >> >> {y=50+50+50#'asfasf' >> } >> >> My question is, how can I get my string back (the string passed to >> parse() as the text argument)? 
> > You can use > > as.character(attr(x, "wholeSrcref")) > > If length(x) > 1, you can get the parts corresponding to each expression > within it as > > for (i in 1:length(x)) { > > print (as.character(attr(x, "srcref")[[i]])) > > } > > but this leaves off comments and whitespace that are not embedded within the > expressions the way your comment is. > > Duncan Murdoch > > > >> >>> as.character(x) >> [1] "{" >> >> as.character() only returns "{". >> >>> as.character(expression({1})) >> [1] "{" >>> as.character(expression("1","2+3")) >> [1] "1" ? "2+3" >> >> >> Thanks a lot! >> >> Regards, >> Yihui >> -- >> Yihui Xie >> Phone: 515-294-2465 Web: http://yihui.name >> Department of Statistics, Iowa State University >> 2215 Snedecor Hall, Ames, IA >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > From cberry at tajo.ucsd.edu Thu Jan 6 01:01:31 2011 From: cberry at tajo.ucsd.edu (Charles C. Berry) Date: Wed, 5 Jan 2011 16:01:31 -0800 Subject: [R] pattern recognition with paths In-Reply-To: References: Message-ID: On Wed, 5 Jan 2011, Benjamin Polidore wrote: > I'm trying to identify patterns among various "paths" like the following: > > http://i.imgur.com/bQPI3.png > > If I plot these, I can observe intuitively two different patterns: a front > loaded (1 and 3) and a backloaded (2,4) progress path: > > http://i.imgur.com/L5qwZ.png > > I have thousands of observations like the above table, and I want to use R > to identify clusters of these paths. I looked at spatstat, but it seems > more relevant to points than paths. Hmmm. Is this what you are after? http://en.wikipedia.org/wiki/Functional_data_analysis It is a hefty topic. There is a substantial literature on characterizing curves. 
Just Google Functional Data Analysis for a start and look at the 'fda' and 'MFDA' packages. HTH, Chuck > > Thanks, > bp > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > Charles C. Berry Dept of Family/Preventive Medicine cberry at tajo.ucsd.edu UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901 From jan at henckens.de Thu Jan 6 01:57:01 2011 From: jan at henckens.de (Jan Henckens) Date: Thu, 06 Jan 2011 01:57:01 +0100 Subject: [R] memisc-Tables with robost standard errors Message-ID: <4D25135D.50708@henckens.de> Hello, I've got a question concerning the usage of robust standard errors in regression using lm() and exporting the summaries to LaTeX using the memisc-packages function mtable(): Is there any possibility to use robust errors which are obtained by vcovHC() when generating the LateX-output by mtable()? I tried to manipulate the lm-object by appending the "new" covariance matrix but mtable seems to generate the summary itself since it is not possible to call mtable(summary(lm1)). I'd like to obtain a table with the following structure (using standard errors I already worked out how to archieve it): Variable & Coeff. & robust S.E. & lower 95% KI & upper 95% KI \\ Var1 & x.22^(*) & (xxxxx) & [xxxxx & xxxxx] \\ . . . Maybe someone has any suggestions how to implement this kind of table? Best regards, Jan Henckens -- jan.henckens | j?llenbecker str. 
58 | 33613 bielefeld
tel 0521-5251970

From rstuff.miles at gmail.com Thu Jan 6 02:34:47 2011
From: rstuff.miles at gmail.com (Andrew Miles)
Date: Wed, 5 Jan 2011 20:34:47 -0500
Subject: [R] OT: Reducing pdf file size
In-Reply-To:
References:
Message-ID: <0828E022-CB82-4FAD-8A18-8543805BCBCA@gmail.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From rstuff.miles at gmail.com Thu Jan 6 02:38:25 2011
From: rstuff.miles at gmail.com (Andrew Miles)
Date: Wed, 5 Jan 2011 20:38:25 -0500
Subject: [R] memisc-Tables with robost standard errors
In-Reply-To: <4D25135D.50708@henckens.de>
References: <4D25135D.50708@henckens.de>
Message-ID: <94560555-00FB-4A16-A8C0-8A07D44E6F8B@gmail.com>

I always use apsrtable in the apsrtable package, which allows you to
specify a vcov matrix using the "se" option. The only trick is that
you have to append it to your model object, something like this:

fit=lm(y ~ x)
fit$se=vcovHC(fit)
apsrtable(fit, se="robust")

Andrew Miles

On Jan 5, 2011, at 7:57 PM, Jan Henckens wrote:

> Hello,
>
> I've got a question concerning the usage of robust standard errors
> in regression using lm() and exporting the summaries to LaTeX using
> the memisc-packages function mtable():
>
> Is there any possibility to use robust errors which are obtained by
> vcovHC() when generating the LateX-output by mtable()?
>
> I tried to manipulate the lm-object by appending the "new"
> covariance matrix but mtable seems to generate the summary itself
> since it is not possible to call mtable(summary(lm1)).
>
> I'd like to obtain a table with the following structure (using
> standard errors I already worked out how to archieve it):
>
>
> Variable & Coeff. & robust S.E. & lower 95% KI & upper 95% KI \\
> Var1 & x.22^(*) & (xxxxx) & [xxxxx & xxxxx] \\
> .
> .
> .
>
>
>
> Maybe someone has any suggestions how to implement this kind of table?
>
> Best regards,
> Jan Henckens
>
>
> --
> jan.henckens | jöllenbecker str. 58 | 33613 bielefeld
> tel 0521-5251970
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

From Achim.Zeileis at uibk.ac.at Thu Jan 6 02:37:38 2011
From: Achim.Zeileis at uibk.ac.at (Achim Zeileis)
Date: Thu, 6 Jan 2011 02:37:38 +0100 (CET)
Subject: [R] memisc-Tables with robost standard errors
In-Reply-To: <4D25135D.50708@henckens.de>
References: <4D25135D.50708@henckens.de>
Message-ID:

On Thu, 6 Jan 2011, Jan Henckens wrote:

> Hello,
>
> I've got a question concerning the usage of robust standard errors in
> regression using lm() and exporting the summaries to LaTeX using the
> memisc-packages function mtable():
>
> Is there any possibility to use robust errors which are obtained by vcovHC()
> when generating the LateX-output by mtable()?
>
> I tried to manipulate the lm-object by appending the "new" covariance matrix
> but mtable seems to generate the summary itself since it is not possible to
> call mtable(summary(lm1)).

I'm not a "memisc" user but had a quick look at the mtable() function. It
seems to rely on specification of some function getSummary() which
produces the numbers that are displayed. By default the getSummary()
method for "lm" objects is called which contains a list element "coefs"
with the coefficient table and confidence intervals.

So one quick hack would be to modify that to employ robust standard
errors:

mySummary <- function(obj, alpha = 0.05, ...) {
  ## get original summary
  s <- getSummary(obj, alpha = alpha, ...)

  ## replace Wald tests of coefficients
  s$coef[,1:4] <- coeftest(obj, vcov = vcovHC(obj))

  ## replace confidence intervals
  crit <- qt(alpha/2, obj$df.residual)
  s$coef[,5] <- s$coef[,1] + crit * s$coef[,2]
  s$coef[,6] <- s$coef[,1] - crit * s$coef[,2]

  return(s)
}

Then you can do

mtable(lm1, getSummary = mySummary)

and add also all the other options of mtable() that you would like to use.

hth,
Z

> I'd like to obtain a table with the following structure (using standard
> errors I already worked out how to archieve it):
>
>
> Variable & Coeff. & robust S.E. & lower 95% KI & upper 95% KI \\
> Var1 & x.22^(*) & (xxxxx) & [xxxxx & xxxxx] \\
> .
> .
> .
>
>
>
> Maybe someone has any suggestions how to implement this kind of table?
>
> Best regards,
> Jan Henckens
>
>
> --
> jan.henckens | jöllenbecker str. 58 | 33613 bielefeld
> tel 0521-5251970
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

From eduardo.oliveirahorta at gmail.com Thu Jan 6 03:38:39 2011
From: eduardo.oliveirahorta at gmail.com (Eduardo de Oliveira Horta)
Date: Thu, 6 Jan 2011 00:38:39 -0200
Subject: [R] Cairo pdf canvas size
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From dwinsemius at comcast.net Thu Jan 6 04:00:34 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Wed, 5 Jan 2011 22:00:34 -0500
Subject: [R] Cairo pdf canvas size
In-Reply-To:
References:
Message-ID:

On Jan 5, 2011, at 9:38 PM, Eduardo de Oliveira Horta wrote:

> Hello,
>
> I want to save a pdf plot using Cairo, but the canvas of the saved
> file
> seems too large when compared to the actual plotted area.
>
> Is there a way to control the relation between the canvas size and
> the size
> of actual plotting area?
>

OS?, ... example?
== David Winsemius, MD West Hartford, CT From eduardo.oliveirahorta at gmail.com Thu Jan 6 04:35:03 2011 From: eduardo.oliveirahorta at gmail.com (Eduardo de Oliveira Horta) Date: Thu, 6 Jan 2011 01:35:03 -0200 Subject: [R] Cairo pdf canvas size In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dwinsemius at comcast.net Thu Jan 6 04:43:52 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Wed, 5 Jan 2011 22:43:52 -0500 Subject: [R] Cairo pdf canvas size In-Reply-To: References: Message-ID: <36ED95DE-BBEA-496E-90F7-75885251497B@comcast.net> On Jan 5, 2011, at 10:35 PM, Eduardo de Oliveira Horta wrote: > Something like this: > > u=seq(from=-pi, to=pi, length=1000) > f=sin(u) > Cairo("example.pdf", type="pdf",width=12,height=12,units="cm",dpi=300) > par(cex.axis=.6,col.axis="grey",ann=FALSE, lwd=.25,bty="n", las=1, > tcl=-.2, mgp=c(3,.5,0)) > xlim=c(-pi,pi) > ylim=round(c(min(f),max(f))) > plot(u,f,xlim,ylim,type="l",col="firebrick3", axes=FALSE) > axis(side=1, lwd=.25, col="darkgrey", at=seq(from=xlim[1], > to=xlim[2], length=5)) > axis(side=2, lwd=.25, col="darkgrey", at=seq(from=ylim[1], > to=ylim[2], length=5)) > abline(v=seq(from=xlim[1], to=xlim[2], length=5), lwd=. > 25,lty="dotted", col="grey") > abline(h=seq(from=ylim[1], to=ylim[2], length=5), lwd=. > 25,lty="dotted", col="grey") > dev.off() > > Notice how the canvas' margins are relatively far from the plotting > area. > 'frraid I an't help ya' padna' First I tried your code: > Cairo("example.pdf", type="pdf",width=12,height=12,units="cm",dpi=300) Error: could not find function "Cairo" Then I tried: > cairo_pdf("example.pdf", type="pdf",width=12,height=12,units="cm",dpi=300) Error in cairo_pdf("example.pdf", type = "pdf", width = 12, height = 12, : unused argument(s) (type = "pdf", units = "cm", dpi = 300) So I guess someone with your as yet unstated OS can take over now. -- David. 
> Thanks, > > Eduardo > > On Thu, Jan 6, 2011 at 1:00 AM, David Winsemius > wrote: > > On Jan 5, 2011, at 9:38 PM, Eduardo de Oliveira Horta wrote: > > Hello, > > I want to save a pdf plot using Cairo, but the canvas of the saved > file > seems too large when compared to the actual plotted area. > > Is there a way to control the relation between the canvas size and > the size > of actual plotting area? > > > OS?, ... example? > > == > > David Winsemius, MD > West Hartford, CT > > David Winsemius, MD West Hartford, CT From peter.langfelder at gmail.com Thu Jan 6 04:47:06 2011 From: peter.langfelder at gmail.com (Peter Langfelder) Date: Wed, 5 Jan 2011 19:47:06 -0800 Subject: [R] Cairo pdf canvas size In-Reply-To: References: Message-ID: On Wed, Jan 5, 2011 at 7:35 PM, Eduardo de Oliveira Horta wrote: > Something like this: > > u=seq(from=-pi, to=pi, length=1000) > f=sin(u) > Cairo("example.pdf", type="pdf",width=12,height=12,units="cm",dpi=300) > par(cex.axis=.6,col.axis="grey",ann=FALSE, lwd=.25,bty="n", las=1, tcl=-.2, > mgp=c(3,.5,0)) > xlim=c(-pi,pi) > ylim=round(c(min(f),max(f))) > plot(u,f,xlim,ylim,type="l",col="firebrick3", axes=FALSE) > axis(side=1, lwd=.25, col="darkgrey", at=seq(from=xlim[1], to=xlim[2], > length=5)) > axis(side=2, lwd=.25, col="darkgrey", at=seq(from=ylim[1], to=ylim[2], > length=5)) > abline(v=seq(from=xlim[1], to=xlim[2], length=5), lwd=.25,lty="dotted", > col="grey") > abline(h=seq(from=ylim[1], to=ylim[2], length=5), lwd=.25,lty="dotted", > col="grey") > dev.off() > > Wow, you must like light colors :) To the point, just set margins, for example par(mar = c(2,2,0.5, 0.5)) (margins are bottom, left, top, right) after the Cairo command. BTW, Cairo doesn't work for me either... but I tried your example by plotting to the screen. Peter Notice how the canvas' margins are relatively far from the plotting area. 
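
A minimal sketch of the margin suggestion above, under stated assumptions: the base pdf() device stands in for Cairo() (which two respondents could not load), the output goes to a temporary file, and the 12 cm canvas from the example is converted to inches because base devices take sizes in inches. The variable names out, cm and mar_used are illustrative only.

```r
# Sketch only: base pdf() is used as a stand-in for Cairo().
out <- file.path(tempdir(), "example.pdf")
cm  <- 1 / 2.54                     # base devices take sizes in inches

pdf(out, width = 12 * cm, height = 12 * cm)

# mar = c(bottom, left, top, right), in lines of text. The default,
# c(5, 4, 4, 2) + 0.1, is what leaves the wide empty border.
# par() must be called after the device is opened, because graphical
# parameters are stored per device.
par(mar = c(2, 2, 0.5, 0.5))

u <- seq(from = -pi, to = pi, length.out = 1000)
plot(u, sin(u), type = "l", col = "firebrick3", ann = FALSE)

mar_used <- par("mar")              # margins actually in effect
dev.off()
```

Shrinking mar enlarges the plotting region inside a canvas of fixed size; the canvas itself is still controlled by width and height in the device call.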
> > Thanks, > > Eduardo > > On Thu, Jan 6, 2011 at 1:00 AM, David Winsemius wrote: > >> >> On Jan 5, 2011, at 9:38 PM, Eduardo de Oliveira Horta wrote: >> >> ?Hello, >>> >>> I want to save a pdf plot using Cairo, but the canvas of the saved file >>> seems too large when compared to the actual plotted area. >>> >>> Is there a way to control the relation between the canvas size and the >>> size >>> of actual plotting area? >>> >>> >> OS?, ?... example? >> >> == >> >> David Winsemius, MD >> West Hartford, CT >> >> > > ? ? ? ?[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From frodo.jedi at yahoo.com Wed Jan 5 22:45:55 2011 From: frodo.jedi at yahoo.com (Frodo Jedi) Date: Wed, 5 Jan 2011 13:45:55 -0800 (PST) Subject: [R] Comparing fitting models In-Reply-To: References: <459095.84075.qm@web57907.mail.re3.yahoo.com> Message-ID: <520184.62241.qm@web57901.mail.re3.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From young.stat at gmail.com Wed Jan 5 22:49:31 2011 From: young.stat at gmail.com (Young Cho) Date: Wed, 5 Jan 2011 15:49:31 -0600 Subject: [R] speed up in R apply In-Reply-To: <8AC3D4FE-1B19-42E5-AC62-5CEB3D57A839@comcast.net> References: <7B9A8B27-6E58-409F-BC0F-50BFDD1A60EC@comcast.net> <8AC3D4FE-1B19-42E5-AC62-5CEB3D57A839@comcast.net> Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From frodo.jedi at yahoo.com Wed Jan 5 23:22:13 2011 From: frodo.jedi at yahoo.com (Frodo Jedi) Date: Wed, 5 Jan 2011 14:22:13 -0800 (PST) Subject: [R] Assumptions for ANOVA: the right way to check the normality In-Reply-To: References: <9521.53053.qm@web57907.mail.re3.yahoo.com> Message-ID: <262855.6389.qm@web57903.mail.re3.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From rmanesiya at gmail.com Thu Jan 6 07:17:24 2011 From: rmanesiya at gmail.com (Rustamali Manesiya) Date: Thu, 6 Jan 2011 00:17:24 -0600 Subject: [R] Interpolation Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From frodo.jedi at yahoo.com Wed Jan 5 23:42:28 2011 From: frodo.jedi at yahoo.com (Frodo Jedi) Date: Wed, 5 Jan 2011 14:42:28 -0800 (PST) Subject: [R] Comparing fitting models In-Reply-To: References: <459095.84075.qm@web57907.mail.re3.yahoo.com> <520184.62241.qm@web57901.mail.re3.yahoo.com> Message-ID: <440827.19072.qm@web57906.mail.re3.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From savicky at cs.cas.cz Wed Jan 5 23:01:15 2011 From: savicky at cs.cas.cz (Petr Savicky) Date: Wed, 5 Jan 2011 23:01:15 +0100 Subject: [R] Advice on obscuring unique IDs in R In-Reply-To: <4D24E075.9040508@dcu.ie> References: <4D24E075.9040508@dcu.ie> Message-ID: <20110105220115.GA29610@cs.cas.cz> On Wed, Jan 05, 2011 at 09:19:49PM +0000, Anthony Staines wrote: > Dear colleagues, > > This may be a question with a really obvious answer, but I > can't find it. I have access to a large file with real > medical record identifiers (mixed strings of characters and > numbers) in it. These represent medical events for many > thousands of people. It's important to be able to link > events for the same people. > > It's much more important that the real record numbers are > strongly obscured. 
I'm interested in some kind of strong > one-way hash function to which I can feed the real numbers > and get back unique codes for each record identifier fed > in. I can do this on the health service system, and I have > to do this before making further use of the data! Producing unique integer codes for character values may be done using a factor, for example s <- c("cd", "bc", "ab", "bc", "ab") f <- factor(s) as.integer(f) # [1] 3 2 1 2 1 levels(f) # [1] "ab" "bc" "cd" If the codes should be ordered by the first ocurrence in the data, then use f <- factor(s, levels=unique(s)) as.integer(f) # [1] 1 2 3 2 3 levels(f) # [1] "cd" "bc" "ab" This does not perform any approximate matching. The codes are assigned based on exact equality. If an approximate matching is required, then an example of the identifiers would be helpful. Filtering out different types of delimiters may be done as a preprocessing step, for example, using gsub() s <- c("ab cd", "ab cd", "a b cd") gsub(" ", "", s) # [1] "abcd" "abcd" "abcd" where a general regular expression may also be used. Petr Savicky. From frodo.jedi at yahoo.com Thu Jan 6 00:10:24 2011 From: frodo.jedi at yahoo.com (Frodo Jedi) Date: Wed, 5 Jan 2011 15:10:24 -0800 (PST) Subject: [R] Problem with 2-ways ANOVA interactions Message-ID: <498111.46032.qm@web57908.mail.re3.yahoo.com> Dear All, I have a problem in understanding how the interactions of 2 ways ANOVA work, because I get conflicting results from a t-test and an anova. For most of you my problem is very simple I am sure. I need an help with an example, looking at one table I am analyzing. The table is in attachment and can be imported in R by means of this command: scrd<- read.table('/Users/luca/Documents/Analisi_passi/Codice_R/Statistics_results_bump_hole_Audio_Haptic/tables_for_R/table_realism_wood.txt', header=TRUE, colClasse=c('numeric','factor','factor','numeric')) This table is the result of a simple experiment. 
Subjects were exposed to some stimuli and were asked to evaluate the degree of
realism of the stimuli on a 7 point scale (i.e., data in column "response").
Each stimulus was presented in two conditions, "A" and "AH", where AH is the
condition A plus another thing (let's call it "H").

Now, what exactly does the interaction stimulus:condition mean in my table?

I think that if I do the analysis anova(response ~ stimulus*condition) I will
get the comparison between the same stimulus in condition A and in condition
AH. Am I wrong?

For instance, the comparison of stimulus flat_550_W_realism presented in
condition A with the same stimulus, flat_550_W_realism, presented in
condition AH.

The problem is that if I do a t-test between the values of this stimulus in
the A and AH conditions I get a significant difference, while if I do the
test with the 2-way ANOVA I don't get any difference.
How is this possible?

Here are the results of the analysis.

#Here the result of ANOVA:
> fit1<- lm(response ~ stimulus + condition + stimulus:condition, data=scrd)
>#EQUIVALENT TO lm(response ~ stimulus*condition, data=scrd)
>
> anova(fit1)
Analysis of Variance Table

Response: response
                    Df Sum Sq Mean Sq F value   Pr(>F)
stimulus             6  15.05   2.509  1.1000   0.3647
condition            1  36.51  36.515 16.0089 9.64e-05 ***
stimulus:condition   6   1.47   0.244  0.1071   0.9955
Residuals          159 362.67   2.281
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

#As you can see the p-value for stimulus:condition is high.
#Now I do the t-test with the same values of the table concerning the stimulus
#presented in A and AH conditions:

flat_550_W_realism =c(3,3,5,3,3,3,3,5,3,3,5,7,5,2,3)
flat_550_W_realism_AH =c(7,4,5,3,6,5,3,5,5,7,2,7,5, 5)

> t.test(flat_550_W_realism,flat_550_W_realism_AH, var.equal=TRUE)

        Two Sample t-test

data:  flat_550_W_realism and flat_550_W_realism_AH
t = -2.2361, df = 27, p-value = 0.03381
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -2.29198603 -0.09849016
sample estimates:
mean of x mean of y
 3.733333  4.928571

#Now we have a significant difference between these two stimuli
#(p-value = 0.03381)

Why do I get this behaviour?

Moreover, how could I use ANOVA to track the significant differences between
the stimuli presented in the A and AH conditions without doing the t-test?

Please help!

Thanks in advance

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: table_realism_wood.txt
URL:

From schamber at rice.edu Thu Jan 6 06:32:57 2011
From: schamber at rice.edu (Scott Chamberlain)
Date: Wed, 5 Jan 2011 23:32:57 -0600
Subject: [R] Potential biparite problem?
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From landronimirc at gmail.com Thu Jan 6 09:53:44 2011
From: landronimirc at gmail.com (Liviu Andronic)
Date: Thu, 6 Jan 2011 09:53:44 +0100
Subject: [R] speed up in R apply
In-Reply-To:
References: <7B9A8B27-6E58-409F-BC0F-50BFDD1A60EC@comcast.net> <8AC3D4FE-1B19-42E5-AC62-5CEB3D57A839@comcast.net>
Message-ID:

On Wed, Jan 5, 2011 at 10:49 PM, Young Cho wrote:
> When introduced to R, I learned how to use *apply whenever I could to avoid
> for-loops and all. And, getting the habit, I think I somehow got the
> mis-conception that it is a magic source, always an optimal way of coding in
> R.
>
See [1] for an article on vectorisation and loops in R.
Liviu [1] http://www.r-project.org/doc/Rnews/Rnews_2008-1.pdf > Thanks a lot for all of your helpful advice and comment! > > Young > > On Wed, Jan 5, 2011 at 3:09 PM, David Winsemius wrote: > >> >> On Jan 5, 2011, at 2:40 PM, Douglas Bates wrote: >> >> ?On Wed, Jan 5, 2011 at 1:22 PM, David Winsemius >>> wrote: >>> >>>> >>>> On Jan 5, 2011, at 10:03 AM, Young Cho wrote: >>>> >>>> ?Hi, >>>>> >>>>> I am doing some simulations and found a bottle neck in my R script. I >>>>> made >>>>> an example: >>>>> >>>>> ?a = matrix(rnorm(5000000),1000000,5) >>>>>> tt ?= Sys.time(); sum(a[,1]*a[,2]*a[,3]*a[,4]*a[,5]); Sys.time() - tt >>>>>> >>>>> >>>>> [1] -1291.026 >>>>> Time difference of 0.2354031 secs >>>>> >>>>>> >>>>>> tt ?= Sys.time(); sum(apply(a,1,prod)); Sys.time() - tt >>>>>> >>>>> >>>>> [1] -1291.026 >>>>> Time difference of 20.23150 secs >>>>> >>>>> Is there a faster way of calculating sum of products (of columns, or of >>>>> rows)? >>>>> >>>> >>>> You should look at crossprod and tcrossprod. >>>> >>> >>> Hmm. ?Not sure that would help, David. ?You could use a matrix >>> multiplication of a %*% rep(1, ncol(a)) if you wanted the row sums but >>> of course you could also use rowSums to get those. >>> >> >> Thanks for pointing ?that out. I misread the OP's code. >> >> >>> ?And is this an expected behavior? >>>>> >>>> >>>> Yes. For loops and *apply strategies are slower than the proper use of >>>> vectorized functions. >>>> >>> >>> To expand a bit on David's point, the apply function isn't magic. ?It >>> essentially loops over the rows, in this case. ?By multiplying columns >>> together you are performing the looping over the rows in compiled >>> code, which is much, much faster. ?If you want to do this kind of >>> operation effectively in R for a general matrix (i.e. 
not knowing in >>> advance that it has exactly 5 columns) you could use Reduce >>> >>> ?a <- matrix(rnorm(5000000),1000000,5) >>>> system.time(pr1 <- a[,1]*a[,2]*a[,3]*a[,4]*a[,5]) >>>> >>> ?user ?system elapsed >>> ?0.15 ? ?0.09 ? ?0.37 >>> >>>> system.time(pr2 <- apply(a, 1, prod)) >>>> >>> ?user ?system elapsed >>> 22.090 ? 0.140 ?22.902 >>> >>>> all.equal(pr1, pr2) >>>> >>> [1] TRUE >>> >>>> system.time(pr3 <- Reduce(get("*"), as.data.frame(a), rep(1, nrow(a)))) >>>> >>> >> Slightly faster would be: >> >> system.time(pr3 <- Reduce("*", as.data.frame(a))) >> >> And thanks for the nice example. Using a data.frame to feed Reduce >> materially enhances its value to me. >> >> >> ? user ?system elapsed >>> ?0.410 ? 0.010 ? 0.575 >>> >>>> all.equal(pr3, pr2) >>>> >>> [1] TRUE >>> >> >> -- >> David Winsemius, MD >> West Hartford, CT >> >> > > ? ? ? ?[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Do you know how to read? http://www.alienetworks.com/srtest.cfm http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader Do you know how to write? http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail From ligges at statistik.tu-dortmund.de Thu Jan 6 10:09:40 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Thu, 06 Jan 2011 10:09:40 +0100 Subject: [R] speed up in R apply In-Reply-To: References: <7B9A8B27-6E58-409F-BC0F-50BFDD1A60EC@comcast.net> <8AC3D4FE-1B19-42E5-AC62-5CEB3D57A839@comcast.net> Message-ID: <4D2586D4.2060803@statistik.tu-dortmund.de> On 05.01.2011 22:49, Young Cho wrote: > When introduced to R, I learned how to use *apply whenever I could to avoid > for-loops and all. 
And, getting the habit, I think I somehow got the > mis-conception that it is a magic source, always an optimal way of coding in > R. That is right, but your apply emulates a loop over all rows. And vectorized solutions are almost always preferable. If you try to run the apply() way in the other dimension of the matrix you will find that it is as fast the vectorizes solution (since only 5 iterations are required then). Uwe Ligges > Thanks a lot for all of your helpful advice and comment! > > Young > > On Wed, Jan 5, 2011 at 3:09 PM, David Winsemiuswrote: > >> >> On Jan 5, 2011, at 2:40 PM, Douglas Bates wrote: >> >> On Wed, Jan 5, 2011 at 1:22 PM, David Winsemius >>> wrote: >>> >>>> >>>> On Jan 5, 2011, at 10:03 AM, Young Cho wrote: >>>> >>>> Hi, >>>>> >>>>> I am doing some simulations and found a bottle neck in my R script. I >>>>> made >>>>> an example: >>>>> >>>>> a = matrix(rnorm(5000000),1000000,5) >>>>>> tt = Sys.time(); sum(a[,1]*a[,2]*a[,3]*a[,4]*a[,5]); Sys.time() - tt >>>>>> >>>>> >>>>> [1] -1291.026 >>>>> Time difference of 0.2354031 secs >>>>> >>>>>> >>>>>> tt = Sys.time(); sum(apply(a,1,prod)); Sys.time() - tt >>>>>> >>>>> >>>>> [1] -1291.026 >>>>> Time difference of 20.23150 secs >>>>> >>>>> Is there a faster way of calculating sum of products (of columns, or of >>>>> rows)? >>>>> >>>> >>>> You should look at crossprod and tcrossprod. >>>> >>> >>> Hmm. Not sure that would help, David. You could use a matrix >>> multiplication of a %*% rep(1, ncol(a)) if you wanted the row sums but >>> of course you could also use rowSums to get those. >>> >> >> Thanks for pointing that out. I misread the OP's code. >> >> >>> And is this an expected behavior? >>>>> >>>> >>>> Yes. For loops and *apply strategies are slower than the proper use of >>>> vectorized functions. >>>> >>> >>> To expand a bit on David's point, the apply function isn't magic. It >>> essentially loops over the rows, in this case. 
By multiplying columns >>> together you are performing the looping over the rows in compiled >>> code, which is much, much faster. If you want to do this kind of >>> operation effectively in R for a general matrix (i.e. not knowing in >>> advance that it has exactly 5 columns) you could use Reduce >>> >>> a<- matrix(rnorm(5000000),1000000,5) >>>> system.time(pr1<- a[,1]*a[,2]*a[,3]*a[,4]*a[,5]) >>>> >>> user system elapsed >>> 0.15 0.09 0.37 >>> >>>> system.time(pr2<- apply(a, 1, prod)) >>>> >>> user system elapsed >>> 22.090 0.140 22.902 >>> >>>> all.equal(pr1, pr2) >>>> >>> [1] TRUE >>> >>>> system.time(pr3<- Reduce(get("*"), as.data.frame(a), rep(1, nrow(a)))) >>>> >>> >> Slightly faster would be: >> >> system.time(pr3<- Reduce("*", as.data.frame(a))) >> >> And thanks for the nice example. Using a data.frame to feed Reduce >> materially enhances its value to me. >> >> >> user system elapsed >>> 0.410 0.010 0.575 >>> >>>> all.equal(pr3, pr2) >>>> >>> [1] TRUE >>> >> >> -- >> David Winsemius, MD >> West Hartford, CT >> >> > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From thomas.carrie at bnpparibas.com Thu Jan 6 10:11:03 2011 From: thomas.carrie at bnpparibas.com (thomas.carrie at bnpparibas.com) Date: Thu, 6 Jan 2011 10:11:03 +0100 Subject: [R] What are the necessary Oracle software to install and run ROracle ? In-Reply-To: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From benj.bad.ac at googlemail.com Thu Jan 6 10:30:23 2011 From: benj.bad.ac at googlemail.com (Benjamin B.) 
Date: Thu, 6 Jan 2011 10:30:23 +0100 Subject: [R] Problem with timeSequence {timeDate} - wrong end date Message-ID: <4d258bb1.4c02cc0a.25a2.245a@mx.google.com> Dear help-list, I have a problem with timeSequence {timeDate}. When I use it like > timeSequence(from = "2008-01-01", to = "2010-12-13", by = "1 month") GMT [1] [2008-01-01] [2008-02-01] [2008-03-01] [2008-04-01] [2008-05-01] [2008-06-01] [2008-07-01] [2008-08-01] [2008-09-01] [2008-10-01] [2008-11-01] [12] [2008-12-01] [2009-01-01] [2009-02-01] [2009-03-01] [2009-04-01] [2009-05-01] [2009-06-01] [2009-07-01] [2009-08-01] [2009-09-01] [2009-10-01] [23] [2009-11-01] [2009-12-01] [2010-01-01] [2010-02-01] [2010-03-01] [2010-04-01] [2010-05-01] [2010-06-01] [2010-07-01] [2010-08-01] [2010-09-01] [34] [2010-10-01] [2010-11-01] [2010-12-01] The result is as expected: a list of dates with all dates smaller then the "to" date. But somehow it behaves strange when I use it with a different starting date: > test <- timeSequence(from = "2008-01-15", to = "2010-12-13", by = "1 month") GMT [1] [2008-01-15] [2008-02-15] [2008-03-15] [2008-04-15] [2008-05-15] [2008-06-15] [2008-07-15] [2008-08-15] [2008-09-15] [2008-10-15] [2008-11-15] [12] [2008-12-15] [2009-01-15] [2009-02-15] [2009-03-15] [2009-04-15] [2009-05-15] [2009-06-15] [2009-07-15] [2009-08-15] [2009-09-15] [2009-10-15] [23] [2009-11-15] [2009-12-15] [2010-01-15] [2010-02-15] [2010-03-15] [2010-04-15] [2010-05-15] [2010-06-15] [2010-07-15] [2010-08-15] [2010-09-15] [34] [2010-10-15] [2010-11-15] [2010-12-15] In this case the last calculated date is obviously LARGER than the "to" date: > as.Date(test[length(test)]) < "2010-12-13" [1] FALSE This seem to occur also with other parameters: > timeSequence(from = "1999-12-12", to = "2000-06-08", by = "2 months") GMT [1] [1999-12-12] [2000-02-12] [2000-04-12] [2000-06-12] > timeSequence(from = as.Date("1999-12-12"), to = as.Date("2000-06-08"), by = "2 months") GMT [1] [1999-12-12] [2000-02-12] [2000-04-12] [2000-06-12] 
Am I missing something essential in using timeSequence? Is this behavior wanted (then I don't get why it should...)? Is there a better way to get those dates? Thanks for reading and greetings, Benjamin Benjamin B. Hamburg, Germany From djmuser at gmail.com Thu Jan 6 11:51:14 2011 From: djmuser at gmail.com (Dennis Murphy) Date: Thu, 6 Jan 2011 02:51:14 -0800 Subject: [R] Interpolation In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From stp08emj at shef.ac.uk Thu Jan 6 11:56:45 2011 From: stp08emj at shef.ac.uk (emj83) Date: Thu, 6 Jan 2011 02:56:45 -0800 (PST) Subject: [R] How to join matrices of different row length from a list Message-ID: <1294311405773-3177212.post@n4.nabble.com> Hi, I have several matrix in a list, for example: e [[1]] [,1] [,2] [1,] 1 3 [2,] 2 4 [[2]] [,1] [,2] [1,] 1 4 [2,] 2 5 [3,] 3 6 [[3]] [,1] [,2] [1,] 2 1 I would like to join them by column i.e. [,1] [,2] [,3] [,4][,5] [,6] [1,] 1 3 1 4 2 1 [2,] 2 4 2 5 NA NA [3,] NA NA 3 6 NA NA I have tried do.call(cbind,e) but I get this error message as the rows are of different length- Error in function (..., deparse.level = 1) : number of rows of matrices must match (see arg 2) Can anyone advise me please? Thanks Emma -- View this message in context: http://r.789695.n4.nabble.com/How-to-join-matrices-of-different-row-length-from-a-list-tp3177212p3177212.html Sent from the R help mailing list archive at Nabble.com. 
From d.rizopoulos at erasmusmc.nl Thu Jan 6 12:23:09 2011
From: d.rizopoulos at erasmusmc.nl (Dimitris Rizopoulos)
Date: Thu, 06 Jan 2011 12:23:09 +0100
Subject: [R] How to join matrices of different row length from a list
In-Reply-To: <1294311405773-3177212.post@n4.nabble.com>
References: <1294311405773-3177212.post@n4.nabble.com>
Message-ID: <4D25A61D.6020600@erasmusmc.nl>

try this:

matLis <- list(matrix(1:4, 2, 2), matrix(1:6, 3, 2),
     matrix(2:1, 1, 2))

n <- max(sapply(matLis, nrow))
do.call(cbind, lapply(matLis, function (x)
     rbind(x, matrix(, n-nrow(x), ncol(x)))))

I hope it helps.

Best,
Dimitris

On 1/6/2011 11:56 AM, emj83 wrote:
>
> Hi,
>
> I have several matrix in a list, for example:
> e
> [[1]]
>      [,1] [,2]
> [1,]    1    3
> [2,]    2    4
>
> [[2]]
>      [,1] [,2]
> [1,]    1    4
> [2,]    2    5
> [3,]    3    6
>
> [[3]]
>      [,1] [,2]
> [1,]    2    1
>
> I would like to join them by column i.e.
>      [,1] [,2] [,3] [,4] [,5] [,6]
> [1,]    1    3    1    4    2    1
> [2,]    2    4    2    5   NA   NA
> [3,]   NA   NA    3    6   NA   NA
>
> I have tried do.call(cbind,e) but I get this error message as the rows are
> of different length-
> Error in function (..., deparse.level = 1) :
>   number of rows of matrices must match (see arg 2)
>
> Can anyone advise me please?
>
> Thanks Emma
>
>

--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web: http://www.erasmusmc.nl/biostatistiek/

From stp08emj at shef.ac.uk Thu Jan 6 12:25:27 2011
From: stp08emj at shef.ac.uk (emj83)
Date: Thu, 6 Jan 2011 03:25:27 -0800 (PST)
Subject: [R] How to join matrices of different row length from a list
In-Reply-To: <4D25A61D.6020600@erasmusmc.nl>
References: <1294311405773-3177212.post@n4.nabble.com> <4D25A61D.6020600@erasmusmc.nl>
Message-ID: <1294313127011-3177252.post@n4.nabble.com>

Excellent- that is just what I need.
Thank you so much for your prompt help, Emma -- View this message in context: http://r.789695.n4.nabble.com/How-to-join-matrices-of-different-row-length-from-a-list-tp3177212p3177252.html Sent from the R help mailing list archive at Nabble.com. From alaios at yahoo.com Thu Jan 6 13:06:28 2011 From: alaios at yahoo.com (Alaios) Date: Thu, 6 Jan 2011 04:06:28 -0800 (PST) Subject: [R] Deselect one of the array's matrix Message-ID: <994722.65685.qm@web120119.mail.ne1.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From ken.knoblauch at inserm.fr Thu Jan 6 13:25:34 2011 From: ken.knoblauch at inserm.fr (Ken Knoblauch) Date: Thu, 6 Jan 2011 12:25:34 +0000 (UTC) Subject: [R] Deselect one of the array's matrix References: <994722.65685.qm@web120119.mail.ne1.yahoo.com> Message-ID: Alaios yahoo.com> writes: > > Hello everyone and season's greetings. > I have an array that looks like that R<-c(1,2,3,4,5,6). > Is it possible to select all the elements but except one? For example to not select the third element and get back > (1,2,4,5,6)? > > How can I do that? > > I would like to thank you in advance for your help > Best Regards > Alex > > > [[alternative HTML version deleted]] Read section 2.7 of An Introduction to R that comes with the distribution. 
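
For readers without the manual to hand, the idiom that section describes is negative indexing. A short sketch using the vector from the question (the names kept, kept_many and kept_by_value are illustrative, not from the thread):

```r
R <- c(1, 2, 3, 4, 5, 6)

# A negative index drops the element at that position.
kept <- R[-3]

# Several positions can be dropped at once.
kept_many <- R[-c(2, 5)]

# Dropping by value rather than by position uses a logical index.
kept_by_value <- R[R != 3]

print(kept)
```

Negative and positive indices cannot be mixed in one subscript, so R[c(-3, 1)] is an error.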
-- Ken Knoblauch Inserm U846 Stem-cell and Brain Research Institute Department of Integrative Neurosciences 18 avenue du Doyen L?pine 69500 Bron France tel: +33 (0)4 72 91 34 77 fax: +33 (0)4 72 91 34 61 portable: +33 (0)6 84 10 64 10 http://www.sbri.fr/members/kenneth-knoblauch.html From ligges at statistik.tu-dortmund.de Thu Jan 6 13:57:29 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Thu, 06 Jan 2011 13:57:29 +0100 Subject: [R] Problem with timeSequence {timeDate} - wrong end date In-Reply-To: <4d258bb1.4c02cc0a.25a2.245a@mx.google.com> References: <4d258bb1.4c02cc0a.25a2.245a@mx.google.com> Message-ID: <4D25BC39.7080903@statistik.tu-dortmund.de> Please report bugs in contributed packages to the corresponding package maintainer. Uwe Ligges On 06.01.2011 10:30, Benjamin B. wrote: > Dear help-list, > > I have a problem with timeSequence {timeDate}. > > When I use it like > >> timeSequence(from = "2008-01-01", to = "2010-12-13", by = "1 month") > GMT > [1] [2008-01-01] [2008-02-01] [2008-03-01] [2008-04-01] [2008-05-01] > [2008-06-01] [2008-07-01] [2008-08-01] [2008-09-01] [2008-10-01] > [2008-11-01] > [12] [2008-12-01] [2009-01-01] [2009-02-01] [2009-03-01] [2009-04-01] > [2009-05-01] [2009-06-01] [2009-07-01] [2009-08-01] [2009-09-01] > [2009-10-01] > [23] [2009-11-01] [2009-12-01] [2010-01-01] [2010-02-01] [2010-03-01] > [2010-04-01] [2010-05-01] [2010-06-01] [2010-07-01] [2010-08-01] > [2010-09-01] > [34] [2010-10-01] [2010-11-01] [2010-12-01] > > The result is as expected: a list of dates with all dates smaller then the > "to" date. 
> > But somehow it behaves strange when I use it with a different starting date: > >> test<- timeSequence(from = "2008-01-15", to = "2010-12-13", by = "1 > month") > GMT > [1] [2008-01-15] [2008-02-15] [2008-03-15] [2008-04-15] [2008-05-15] > [2008-06-15] [2008-07-15] [2008-08-15] [2008-09-15] [2008-10-15] > [2008-11-15] > [12] [2008-12-15] [2009-01-15] [2009-02-15] [2009-03-15] [2009-04-15] > [2009-05-15] [2009-06-15] [2009-07-15] [2009-08-15] [2009-09-15] > [2009-10-15] > [23] [2009-11-15] [2009-12-15] [2010-01-15] [2010-02-15] [2010-03-15] > [2010-04-15] [2010-05-15] [2010-06-15] [2010-07-15] [2010-08-15] > [2010-09-15] > [34] [2010-10-15] [2010-11-15] [2010-12-15] > > In this case the last calculated date is obviously LARGER than the "to" > date: > >> as.Date(test[length(test)])< "2010-12-13" > [1] FALSE > > This seem to occur also with other parameters: > >> timeSequence(from = "1999-12-12", to = "2000-06-08", by = "2 months") > GMT > [1] [1999-12-12] [2000-02-12] [2000-04-12] [2000-06-12] > >> timeSequence(from = as.Date("1999-12-12"), to = as.Date("2000-06-08"), by > = "2 months") > GMT > [1] [1999-12-12] [2000-02-12] [2000-04-12] [2000-06-12] > > Am I missing something essential in using timeSequence? > Is this behavior wanted (then I don't get why it should...)? > Is there a better way to get those dates? > > Thanks for reading and greetings, > > Benjamin > > > Benjamin B. > Hamburg, Germany > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
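
On the quoted question of whether there is a better way to get those dates, one possible sketch, assuming plain Date objects are sufficient and the timeDate class is not needed downstream. It uses base seq() with by = "month" and then applies an explicit filter, so the result does not depend on how any particular sequence generator treats the end point. The names from, to and d are illustrative.

```r
from <- as.Date("2008-01-15")
to   <- as.Date("2010-12-13")

# Monthly steps that keep the day-of-month of `from`.
d <- seq(from, to, by = "month")

# Explicit guard: keep only dates that do not exceed `to`.
d <- d[d <= to]

range(d)
```

The same guard can be applied to a timeSequence() result by comparing as.Date() of the sequence against as.Date() of the end date.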
From benjamin.ward at bathspa.org Thu Jan 6 14:09:07 2011 From: benjamin.ward at bathspa.org (Ben Ward) Date: Thu, 6 Jan 2011 13:09:07 +0000 Subject: [R] Splitting a Vector Message-ID: Hi all, I read in a text book, that you can examine a variable that is colinear with others, and giving different ANOVA output and explanatory power when ordered differently in the model forula, by modelling that explanatory variable, against the others colinear with it. Then, using that information to split the vector (explanatory variable) in question, into two new vectors, one should correspond to the fitted values and one the residuals of the (I think you could call it nested) model. One vector therefore should be aligned with the subspacespace defined by the other variables colinear with it, and the other will be residual, and so orthogonal to the subspace of the colinear variables. Then by including these two variables in the origional model - the one that showed the order dependency, you can see how much explanatory power the othogonal part of the order dependent variable has, at different orders, and in principle it shouldn't change, but the vector made from the part co-aligned with the co-variates, will change as the order changes - it's explanatory power should decreace in ANOVA is it moves away from being the first explanatory variable in the model. Obviously finding the fitted model values and residual required to split the vector in two is a simple lm() with the right variables. But how would I create two new vectors from this and append them to my dataframe? Is there a package or function specially designed with this sort of task in mind? Thanks, Ben Ward. From kw.stat at gmail.com Thu Jan 6 14:09:39 2011 From: kw.stat at gmail.com (Kevin Wright) Date: Thu, 6 Jan 2011 07:09:39 -0600 Subject: [R] Where is a package NEWS.Rd located? Message-ID: Hopefully a quick question. My package has a NEWS.Rd file that is not being found by "news". 
The "news" function calls "tools:::.build_news_db" which has this line: nfile <- file.path(dir, "inst", "NEWS.Rd") So it appears that the "news" function is searching for "mypackage/inst/NEWS.Rd". However, "Writing R extensions" says "The contents of the inst subdirectory will be copied recursively to the installation directory" During the installation, mypackage/inst/NEWS.Rd is copied into the "mypackage" directory, not "mypackage/inst". What am I doing wrong, or is this a bug? Kevin Wright -- Kevin Wright From jwiley.psych at gmail.com Thu Jan 6 14:27:40 2011 From: jwiley.psych at gmail.com (Joshua Wiley) Date: Thu, 6 Jan 2011 05:27:40 -0800 Subject: [R] Problem with timeSequence {timeDate} - wrong end date In-Reply-To: <4d258bb1.4c02cc0a.25a2.245a@mx.google.com> References: <4d258bb1.4c02cc0a.25a2.245a@mx.google.com> Message-ID: Dear Benjamin, timeSequence() ultimately is relying on seq.POSIXt(). If you look at the last paragraph of the "Details" section in ?seq.POSIXt it seems to basically indicate that using by = "month" just sequences through the months and the day is only changed if it is invalid for a particular month (the relevant lines are around 88--100 in the seq.POSIXt code). It is fairly straightforward to ensure the output is less than "to". Here is one option: test <- timeSequence(from = "2008-01-01", to = "2010-12-13", by = "1 month") test <- test[as.Date(test) < as.Date("2010-12-13")] test Cheers, Josh On Thu, Jan 6, 2011 at 1:30 AM, Benjamin B. wrote: > Dear help-list, > > I have a problem with timeSequence {timeDate}.
> > When I use it like > >> timeSequence(from = "2008-01-01", to = "2010-12-13", by = "1 month") > GMT > [1] [2008-01-01] [2008-02-01] [2008-03-01] [2008-04-01] [2008-05-01] > [2008-06-01] [2008-07-01] [2008-08-01] [2008-09-01] [2008-10-01] > [2008-11-01] > [12] [2008-12-01] [2009-01-01] [2009-02-01] [2009-03-01] [2009-04-01] > [2009-05-01] [2009-06-01] [2009-07-01] [2009-08-01] [2009-09-01] > [2009-10-01] > [23] [2009-11-01] [2009-12-01] [2010-01-01] [2010-02-01] [2010-03-01] > [2010-04-01] [2010-05-01] [2010-06-01] [2010-07-01] [2010-08-01] > [2010-09-01] > [34] [2010-10-01] [2010-11-01] [2010-12-01] > > The result is as expected: a list of dates with all dates smaller then the > "to" date. > > But somehow it behaves strange when I use it with a different starting date: > >> test <- timeSequence(from = "2008-01-15", to = "2010-12-13", by = "1 > month") > GMT > [1] [2008-01-15] [2008-02-15] [2008-03-15] [2008-04-15] [2008-05-15] > [2008-06-15] [2008-07-15] [2008-08-15] [2008-09-15] [2008-10-15] > [2008-11-15] > [12] [2008-12-15] [2009-01-15] [2009-02-15] [2009-03-15] [2009-04-15] > [2009-05-15] [2009-06-15] [2009-07-15] [2009-08-15] [2009-09-15] > [2009-10-15] > [23] [2009-11-15] [2009-12-15] [2010-01-15] [2010-02-15] [2010-03-15] > [2010-04-15] [2010-05-15] [2010-06-15] [2010-07-15] [2010-08-15] > [2010-09-15] > [34] [2010-10-15] [2010-11-15] [2010-12-15] > > In this case the last calculated date is obviously LARGER than the "to" > date: > >> as.Date(test[length(test)]) < "2010-12-13" > [1] FALSE > > This seem to occur also with other parameters: > >> timeSequence(from = "1999-12-12", to = "2000-06-08", by = "2 months") > GMT > [1] [1999-12-12] [2000-02-12] [2000-04-12] [2000-06-12] > >> timeSequence(from = as.Date("1999-12-12"), to = as.Date("2000-06-08"), by > = "2 months") > GMT > [1] [1999-12-12] [2000-02-12] [2000-04-12] [2000-06-12] > > Am I missing something essential in using timeSequence? > Is this behavior wanted (then I don't get why it should...)?
> Is there a better way to get those dates? > > Thanks for reading and greetings, > > Benjamin > > > Benjamin B. > Hamburg, Germany > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ From kw.stat at gmail.com Thu Jan 6 14:29:05 2011 From: kw.stat at gmail.com (Kevin Wright) Date: Thu, 6 Jan 2011 07:29:05 -0600 Subject: [R] Where is a package NEWS.Rd located? In-Reply-To: References: Message-ID: If you look at tools:::.build_news_db, the plain text NEWS file is searched for in pkg/NEWS and pkg/inst/NEWS, but NEWS.Rd is only searched for in pkg/inst/NEWS.Rd. Looks like a bug to me. I *think*. Thanks, Kevin On Thu, Jan 6, 2011 at 7:09 AM, Kevin Wright wrote: > Hopefully a quick question. My package has a NEWS.Rd file that is not > being found by "news". > > The "news" function calls "tools:::.build_news_db" which has this line: > > nfile <- file.path(dir, "inst", "NEWS.Rd") > > So it appears that the "news" function is searching for > "mypackage/inst/NEWS.Rd". > > However, "Writing R extensions" says "The contents of the inst > subdirectory will be copied recursively to the installation directory" > > During the installation, mypackage/inst/NEWS.Rd is copied into the > "mypackage" directory, not "mypackage/inst". > > What am I doing wrong, or is this a bug? > > Kevin Wright > > > > -- > Kevin Wright > -- Kevin Wright From eduardo.oliveirahorta at gmail.com Thu Jan 6 14:36:40 2011 From: eduardo.oliveirahorta at gmail.com (Eduardo de Oliveira Horta) Date: Thu, 6 Jan 2011 11:36:40 -0200 Subject: [R] Cairo pdf canvas size In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed...
Name: not available URL: From k.unger at imperial.ac.uk Thu Jan 6 15:28:43 2011 From: k.unger at imperial.ac.uk (Unger, Kristian) Date: Thu, 6 Jan 2011 14:28:43 +0000 Subject: [R] survival analysis microarray expression data Message-ID: <4D25D19B.7070803@imperial.ac.uk> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From friendly at yorku.ca Thu Jan 6 15:33:25 2011 From: friendly at yorku.ca (Michael Friendly) Date: Thu, 06 Jan 2011 09:33:25 -0500 Subject: [R] defining a formula method for a weighted lm() In-Reply-To: <4D0A259B.7010105@yorku.ca> References: <4D0A259B.7010105@yorku.ca> Message-ID: <4D25D2B5.6020005@yorku.ca> No one replied to this, so I'll try again, with a simple example. I calculate a set of log odds ratios, and turn them into a data frame as follows: > library(vcdExtra) > (lor.CM <- loddsratio(CoalMiners)) log odds ratios for Wheeze and Breathlessness by Age 25-29 30-34 35-39 40-44 45-49 50-54 55-59 60-64 3.695261 3.398339 3.140658 3.014687 2.782049 2.926395 2.440571 2.637954 > > (lor.CM.df <- as.data.frame(lor.CM)) Wheeze Breathlessness Age LOR ASE 1 W:NoW B:NoB 25-29 3.695261 0.16471778 2 W:NoW B:NoB 30-34 3.398339 0.07733658 3 W:NoW B:NoB 35-39 3.140658 0.03341311 4 W:NoW B:NoB 40-44 3.014687 0.02866111 5 W:NoW B:NoB 45-49 2.782049 0.01875164 6 W:NoW B:NoB 50-54 2.926395 0.01585918 7 W:NoW B:NoB 55-59 2.440571 0.01452057 8 W:NoW B:NoB 60-64 2.637954 0.02159903 Now I want to fit a linear model by WLS, LOR ~ Age, which can do directly as > lm(LOR ~ as.numeric(Age), weights=1/ASE, data=lor.CM.df) Call: lm(formula = LOR ~ as.numeric(Age), data = lor.CM.df, weights = 1/ASE) Coefficients: (Intercept) as.numeric(Age) 3.5850 -0.1376 But, I want to do the fitting in my own function, the simplest version is my.lm <- function(formula, data, subset, weights) { lm(formula, data, subset, weights) } But there is obviously some magic about formula objects and evaluation environments, because I don't understand why this 
doesn't work. > my.lm(LOR ~ as.numeric(Age), weights=1/ASE, data=lor.CM.df) Error in model.frame.default(formula = formula, data = data, subset = subset, : invalid type (closure) for variable '(weights)' > A second question: Age is a factor, and as.numeric(Age) gives me 1:8. What simple expression on lor.CM.df$Age would give me either the lower limits (here: seq(25, 60, by = 5)) or midpoints of these Age intervals (here: seq(27, 62, by = 5))? best, -Michael On 12/16/2010 9:43 AM, Michael Friendly wrote: > In the vcdExtra package on R-Forge, I have functions and generic methods > for calculating log odds ratios > for R x C x strata tables. I'd like to define methods for fitting > weighted lm()s to the resulting loddsratio objects, > but I'm having problems figuring out how to do this generally. > > # install.packages("vcdExtra", repos="http://R-Forge.R-Project.org") > library(vcdExtra) > > > fung.lor <- loddsratio(Fungicide) > > fung.lor > log odds ratios for group and outcome by sex, strain > > strain > sex 1 2 > M -1.596015 -0.8266786 > F -1.386294 -0.6317782 > > > > fung.lor.df <- as.data.frame(fung.lor) > > fung.lor.df > group outcome sex strain LOR ASE > 1 Control:Treated Tumor:NoTumor M 1 -1.5960149 0.7394909 > 2 Control:Treated Tumor:NoTumor F 1 -1.3862944 0.9574271 > 3 Control:Treated Tumor:NoTumor M 2 -0.8266786 0.6587325 > 4 Control:Treated Tumor:NoTumor F 2 -0.6317782 1.1905545 > > > > Now, I want to test whether the odds ratios differ by sex or strain, so > I do a weighted lm() > > > fung.mod <- lm(LOR ~ sex + strain, data=fung.lor.df, weights=1/ASE^2) > > anova(fung.mod) > Analysis of Variance Table > > Response: LOR > Df Sum Sq Mean Sq F value Pr(>F) > sex 1 0.00744 0.00744 112.3 0.05990 . > strain 1 0.84732 0.84732 12788.1 0.00563 ** > Residuals 1 0.00007 0.00007 > --- > Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' '
1 > > > > I tried to write a generic formula method to do this, but I keep running > into errors: > > lor <- function(x, ...) > UseMethod("lor") > > lor.formula <- function(formula, data, subset, weights, > model = TRUE, x = FALSE, y = FALSE, > contrasts = NULL, ...) { > > data <- as.data.frame(data) > if (missing(weights)) { > if (! "ASE" %in% names(data)) stop("data does not contain an ASE column") > data$weights <- 1/data$ASE^2 > } > lm(formula, data, subset, weights=weights, > model = model, x = x, y = y, > contrasts = contrasts, ...) > } > > > lor(LOR ~ strain+sex, fung.lor) > Error in xj[i] : invalid subscript type 'closure' > > lor(LOR ~ strain+sex, fung.lor.df) > Error in xj[i] : invalid subscript type 'closure' > > > > traceback() > 8: `[.data.frame`(list(LOR = c(-1.59601489210196, -1.38629436111989, > -0.826678573184468, -0.631778234183653), strain = c(1L, 1L, 2L, > 2L), sex = c(1L, 2L, 1L, 2L), `(weights)` = c(1.82866556836903, > 1.09090909090909, 2.30452674897119, 0.705507123112907)), function (x, > ...) > UseMethod("subset"), , FALSE) > 7: model.frame.default(formula = formula, data = data, subset = subset, > weights = weights, drop.unused.levels = TRUE) > 6: model.frame(formula = formula, data = data, subset = subset, > weights = weights, drop.unused.levels = TRUE) > 5: eval(expr, envir, enclos) > 4: eval(mf, parent.frame()) > 3: lm(formula, data, subset, weights = weights, model = model, x = x, > y = y, contrasts = contrasts, ...) > 2: lor.formula(LOR ~ strain + sex, fung.lor.df) > 1: lor(LOR ~ strain + sex, fung.lor.df) > > > > How can I make this work? > > > -- Michael Friendly Email: friendly AT yorku DOT ca Professor, Psychology Dept. 
York University Voice: 416 736-5115 x66249 Fax: 416 736-5814 4700 Keele Street Web: http://www.datavis.ca Toronto, ONT M3J 1P3 CANADA From jwiley.psych at gmail.com Thu Jan 6 15:47:43 2011 From: jwiley.psych at gmail.com (Joshua Wiley) Date: Thu, 6 Jan 2011 06:47:43 -0800 Subject: [R] Problem with timeSequence {timeDate} - wrong end date In-Reply-To: References: <4d258bb1.4c02cc0a.25a2.245a@mx.google.com> Message-ID: On Thu, Jan 6, 2011 at 5:27 AM, Joshua Wiley wrote: > timeSequence() ultimately is relying on seq.POSIXt(). If you look at My apologies, I spoke nonsense---timeSequence() does NOT rely on seq.POSIXt(). The timeDate package has its own method defined for seq() which is what gets dispatched. Still, the behavior is similar: > timeSequence(from = "2010-01-15", to = "2010-02-01", by = "1 month") GMT [1] [2010-01-15] [2010-02-15] > > seq.POSIXt(from = as.POSIXct("2010-01-15"), + to = as.POSIXct("2010-02-01"), by = "1 month") [1] "2010-01-15 PST" "2010-02-15 PST" which is not surprising because the code responsible for this behavior is similar: ###### seq.timeDate* ###### else if (valid == 6) { if (missing(to)) { mon <- seq.int(r1$mon, by = by, length.out = length.out) } else { to <- as.POSIXlt(to) mon <- seq.int(r1$mon, 12 * (to$year - r1$year) + to$mon, by) } r1$mon <- mon r1$isdst <- -1 res <- as.POSIXct(r1) } ###### seq.POSIXt ###### else if (valid == 6L) { if (missing(to)) { mon <- seq.int(r1$mon, by = by, length.out = length.out) } else { to <- as.POSIXlt(to) mon <- seq.int(r1$mon, 12 * (to$year - r1$year) + to$mon, by) } r1$mon <- mon r1$isdst <- -1 res <- as.POSIXct(r1) } and "res" is the object returned in both cases (I believe).
My system: R version 2.12.1 (2010-12-16) Platform: i486-pc-linux-gnu (32-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] timeDate_2130.91 From vdimitrakas at gmail.com Thu Jan 6 14:33:18 2011 From: vdimitrakas at gmail.com (Vassilis) Date: Thu, 6 Jan 2011 05:33:18 -0800 (PST) Subject: [R] weighed mean of a data frame row-by-row Message-ID: <1294320798093-3177421.post@n4.nabble.com> Dear list, This must be an easy one. I have a data frame like this one: test.df <- data.frame(x1=c(2,3,5), x2=c(5, 3, 4), w=c(0.8, 0.3, 0.5)) and I want to construct a weighted mean of the first two columns using the third column for weighting; i.e. y[1] = x1[1]*w[1] + x2[1]*(1-w[1]) y[2] = ... One way to do this is to use a loop like test.df$y <-numeric(3) with(test.df, for(i in 1:length(w)) { test.df$y[[i]] <<- weighted.mean(c(x1[[i]],x2[[i]]),c(w[[i]],1-w[[i]]) ) } ) My question is whether you can suggest a way to do the same without using a `for' loop, a vectorized version of this snippet - My actual dataset is large and it involves calculating the weighted mean of many columns. Such a loop becomes ugly to write and quite slow.... Thanks in advance, Vassilis -- View this message in context: http://r.789695.n4.nabble.com/weighed-mean-of-a-data-frame-row-by-row-tp3177421p3177421.html Sent from the R help mailing list archive at Nabble.com. 
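A hedged sketch for the row-wise weighted mean asked about just above. Arithmetic on data frame columns is already vectorised, so the loop can be replaced by a single expression. The second half generalises to several columns using a weight matrix; the names X, W and y_general are illustrative and not from the original post.

```r
test.df <- data.frame(x1 = c(2, 3, 5), x2 = c(5, 3, 4), w = c(0.8, 0.3, 0.5))

# Two columns: element-wise arithmetic handles every row at once.
test.df$y <- with(test.df, x1 * w + x2 * (1 - w))

# Many columns (assumed layout): put the values in one matrix and the
# per-row weights in another of the same shape, then use rowSums().
X <- as.matrix(test.df[, c("x1", "x2")])
W <- cbind(test.df$w, 1 - test.df$w)
y_general <- rowSums(X * W) / rowSums(W)

test.df$y
```

Dividing by rowSums(W) means the weights in each row do not have to sum to one.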
From chrismcowen at gmail.com Thu Jan 6 12:29:26 2011 From: chrismcowen at gmail.com (Chris Mcowen) Date: Thu, 6 Jan 2011 11:29:26 +0000 Subject: [R] Multiple subsets of data Message-ID: <0833E90D-A57B-4217-B1FC-7DE7C5B52C28@gmail.com> Dear List, I have a data frame called trait with roughly 800 species in, each species have 15 columns of information: Species 1 2 3 etc.. a t y h b f j u c r y u etc.. I then have another data frame called com with the composition of species in each region, there are 506 different communities: community species NA1102 a NA1102 c NA0402 b NA0402 c AT1302 a AT1302 b etc.. What i want to do is extract the information held in the first data frame for each community and save this as a new data frame. Resulting in : - community_NA1102 a t y h c r y u community_NA0402 b f j u c r y u Thanks in advance for any suggestions / code. From cm744 at st-andrews.ac.uk Thu Jan 6 12:36:11 2011 From: cm744 at st-andrews.ac.uk (Chris Mcowen) Date: Thu, 6 Jan 2011 11:36:11 +0000 Subject: [R] Extract data Message-ID: <061B1B8E-7121-4F16-B357-48EF59816077@st-andrews.ac.uk> Dear List, I have a data frame called trait with roughly 800 species in, each species have 15 columns of information: Species 1 2 3 etc.. a t y h b f j u c r y u etc.. I then have another data frame called com with the composition of species in each region, there are 506 different communities: community species NA1102 a NA1102 c NA0402 b NA0402 c AT1302 a AT1302 b etc.. What i want to do is extract the information held in the first data frame for each community and save this as a new data frame. Resulting in : - community_NA1102 a t y h c r y u community_NA0402 b f j u c r y u Thanks in advance for any suggestions / code. From marcell at insart.com Thu Jan 6 15:04:58 2011 From: marcell at insart.com (Vladyslav Kolbasin) Date: Thu, 06 Jan 2011 16:04:58 +0200 Subject: [R] RSymphony appcrash Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From nvanzuydam at gmail.com Thu Jan 6 15:38:32 2011 From: nvanzuydam at gmail.com (Newbie19_02) Date: Thu, 6 Jan 2011 06:38:32 -0800 (PST) Subject: [R] Rserve: failed to find config file Message-ID: <1294324712382-3177518.post@n4.nabble.com> Dear R users, I've installed Rserve for R version 2.11.0 on x64 Windows 7. I've added the Rserve_d and Rserve files to the /bin/ folder where the R.exe is installed in the program files. I have also created an Rserv.cfg file that contains the following text: remote enable auth disable plaintext disable fileio enable when I run "R CMD Rserve_d" from the dos prompt I get the following error message: Rserve 0.6-2 (289) (C)Copyright 2002-2010 Simon Urbanek $Id: Rserv.c 289 2010-05-24 14:53:25Z urbanek $ Loading config file Rserv.cfg Failed to find config file Rserv.cfg The only time that it will run is if I run it in the C:\Program Files\R\R-2.11.0-x64\bin. What do I need to change for Rserve to run in any directory? Thanks, Natalie -- View this message in context: http://r.789695.n4.nabble.com/Rserve-failed-to-find-config-file-tp3177518p3177518.html Sent from the R help mailing list archive at Nabble.com. From jorgeivanvelez at gmail.com Thu Jan 6 15:55:02 2011 From: jorgeivanvelez at gmail.com (Jorge Ivan Velez) Date: Thu, 6 Jan 2011 09:55:02 -0500 Subject: [R] weighed mean of a data frame row-by-row In-Reply-To: <1294320798093-3177421.post@n4.nabble.com> References: <1294320798093-3177421.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From pearlmayd at yahoo.com Thu Jan 6 16:00:56 2011 From: pearlmayd at yahoo.com (pearl may dela cruz) Date: Thu, 6 Jan 2011 07:00:56 -0800 (PST) Subject: [R] Cross validation for Ordinary Kriging Message-ID: <523289.24057.qm@web30501.mail.mud.yahoo.com> An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From crabak at acm.org Thu Jan 6 16:01:30 2011 From: crabak at acm.org (csrabak) Date: Thu, 06 Jan 2011 13:01:30 -0200 Subject: [R] weighed mean of a data frame row-by-row In-Reply-To: <1294320798093-3177421.post@n4.nabble.com> References: <1294320798093-3177421.post@n4.nabble.com> Message-ID: <4D25D94A.9060701@acm.org> Em 6/1/2011 11:33, Vassilis escreveu: > > Dear list, > > This must be an easy one. I have a data frame like this one: > > test.df <- data.frame(x1=c(2,3,5), x2=c(5, 3, 4), w=c(0.8, 0.3, 0.5)) > > and I want to construct a weighted mean of the first two columns using the > third column for weighting; i.e. > > y[1] = x1[1]*w[1] + x2[1]*(1-w[1]) > y[2] = ... > > One way to do this is to use a loop like > > test.df$y <-numeric(3) > > with(test.df, > for(i in 1:length(w)) { > test.df$y[[i]]<<- weighted.mean(c(x1[[i]],x2[[i]]),c(w[[i]],1-w[[i]]) ) } > ) > > My question is whether you can suggest a way to do the same without using a > `for' loop, a vectorized version of this snippet - My actual dataset is > large and it involves calculating the weighted mean of many columns. Such a > loop becomes ugly to write and quite slow.... > > Thanks in advance, > Vassilis, Is this what you're looking for? > test.df$y <- test.df$x1 * test.df$w + test.df$x2 * (1 - test.df$ w) > test.df$y [1] 2.6 3.0 4.5 -- Cesar Rabak From alaios at yahoo.com Thu Jan 6 16:07:25 2011 From: alaios at yahoo.com (Alaios) Date: Thu, 6 Jan 2011 07:07:25 -0800 (PST) Subject: [R] Find and remove elemnts of a data frame Message-ID: <91280.8321.qm@web120102.mail.ne1.yahoo.com> An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From jrkrideau at yahoo.ca Thu Jan 6 16:09:19 2011 From: jrkrideau at yahoo.ca (John Kane) Date: Thu, 6 Jan 2011 07:09:19 -0800 (PST) Subject: [R] Multiple subsets of data In-Reply-To: <0833E90D-A57B-4217-B1FC-7DE7C5B52C28@gmail.com> Message-ID: <164370.98953.qm@web38407.mail.mud.yahoo.com> Is there any common variable? From your description, I don't see how you would link a species to a community. I mean if you select species a in df1 how would you know what community it is in? --- On Thu, 1/6/11, Chris Mcowen wrote: > From: Chris Mcowen > Subject: [R] Multiple subsets of data > To: r-help at r-project.org > Received: Thursday, January 6, 2011, 6:29 AM > Dear List, > > I have a data frame called trait with roughly 800 species > in, each species have 15 columns of information: > > Species 1 2 3 etc.. > a t y h > b f j u > c r y u > > etc.. > > > I then have another data frame called com with the > composition of species in each region, there are 506 > different communities: > > community species > NA1102 a > NA1102 c > NA0402 b > NA0402 c > AT1302 a > AT1302 b > > etc.. > > > What i want to do is extract the information held in the > first data frame for each community and save this as a new > data frame. > > Resulting in : - > > community_NA1102 > > a t y h > c r y u > > community_NA0402 > > b f j u > c r y u > > Thanks in advance for any suggestions / code. > ______________________________________________ > R-help at r-project.org > mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, > reproducible code.
> From jrkrideau at yahoo.ca Thu Jan 6 16:27:57 2011 From: jrkrideau at yahoo.ca (John Kane) Date: Thu, 6 Jan 2011 07:27:57 -0800 (PST) Subject: [R] Find and remove elemnts of a data frame In-Reply-To: <91280.8321.qm@web120102.mail.ne1.yahoo.com> Message-ID: <59016.41896.qm@web38402.mail.mud.yahoo.com> With dataframe xx (naming a data.frame as data.frame is a bit dicey): subset(xx, xx[,3] > -45) --- On Thu, 1/6/11, Alaios wrote: > From: Alaios > Subject: [R] Find and remove elemnts of a data frame > To: r-help at r-project.org > Received: Thursday, January 6, 2011, 10:07 AM > Dear all, > I have a data frame that is created like that > data.frame(x=CRX[-1],y=CRY[-1],z=CRagent[[1]]$sr) > > the output looks like > 45 116 162 -30.89105988567164 > 46 128 79 -42.66296679571184 > 47 180 195 -30.45626175641315 > 48 114 83 -45.26843476475688 > 49 118 73 -46.85389245327003 > > > How can I select only the rows that their third column is > higher that -45? > This will return the following > 116 162 -30.89105988567164 > 128 79 -42.66296679571184 > 180 195 -30.45626175641315 > > I would like to thank you in advance for your help > > Best Regards > Alex > > > > [[alternative HTML version deleted]] > > > -----Inline Attachment Follows----- > > ______________________________________________ > R-help at r-project.org > mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, > reproducible code.
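For the row-selection question above, a minimal sketch follows. It assumes the leftmost printed column (45, 46, ...) is row names and not data, so the value to test is the third column z. The name xx and the hard-coded values are placeholders copied from the quoted output, not the poster's real objects.

```r
# Placeholder data mirroring the five rows quoted in the question.
xx <- data.frame(
  x = c(116, 128, 180, 114, 118),
  y = c(162, 79, 195, 83, 73),
  z = c(-30.89105988567164, -42.66296679571184, -30.45626175641315,
        -45.26843476475688, -46.85389245327003)
)

# subset() evaluates the condition inside the data frame ...
kept <- subset(xx, z > -45)

# ... and logical indexing gives the same rows without subset().
kept_idx <- xx[xx$z > -45, ]

kept
```

Referring to the column by name avoids having to count whether the row names are included in the column positions.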
> From dwinsemius at comcast.net Thu Jan 6 16:32:23 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Thu, 6 Jan 2011 10:32:23 -0500 Subject: [R] How to join matrices of different row length from a list In-Reply-To: <4D25A61D.6020600@erasmusmc.nl> References: <1294311405773-3177212.post@n4.nabble.com> <4D25A61D.6020600@erasmusmc.nl> Message-ID: <007F8078-065F-4B57-BAFD-97EE7B8C2A43@comcast.net> On Jan 6, 2011, at 6:23 AM, Dimitris Rizopoulos wrote: > try this: > > matLis <- list(matrix(1:4, 2, 2), matrix(1:6, 3, 2), > matrix(2:1, 1, 2)) > > n <- max(sapply(matLis, nrow)) > do.call(cbind, lapply(matLis, function (x) > rbind(x, matrix(, n-nrow(x), ncol(x))))) It's good that you solved the OP's question so neatly, since the alternate solution I was going to propose turns out to be for a different problem. Had the problem been for binding by row and padding with NA's, there is a ready-made function in the plyr package, rbind.fill.matrix(). No cbind.fill or cbind.fill.matrix, yet. It looks as though switching the roles of column and row in either of your respective solutions could create a general solution though. -- David. > > > I hope it helps. > > Best, > Dimitris > > > On 1/6/2011 11:56 AM, emj83 wrote: >> >> Hi, >> >> I have several matrix in a list, for example: >> e >> [[1]] >> [,1] [,2] >> [1,] 1 3 >> [2,] 2 4 >> >> [[2]] >> [,1] [,2] >> [1,] 1 4 >> [2,] 2 5 >> [3,] 3 6 >> >> [[3]] >> [,1] [,2] >> [1,] 2 1 >> >> I would like to join them by column i.e. >> [,1] [,2] [,3] [,4][,5] [,6] >> [1,] 1 3 1 4 2 1 >> [2,] 2 4 2 5 NA NA >> [3,] NA NA 3 6 NA NA >> >> I have tried do.call(cbind,e) but I get this error message as the >> rows are >> of different length- >> Error in function (..., deparse.level = 1) : >> number of rows of matrices must match (see arg 2) >> >> Can anyone advise me please? 
>> >> Thanks Emma >> >> > > -- > Dimitris Rizopoulos > Assistant Professor > Department of Biostatistics > Erasmus University Medical Center > > Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands > Tel: +31/(0)10/7043478 > Fax: +31/(0)10/7043014 > Web: http://www.erasmusmc.nl/biostatistiek/ > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT From ligges at statistik.tu-dortmund.de Thu Jan 6 16:36:12 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Thu, 06 Jan 2011 16:36:12 +0100 Subject: [R] RSymphony appcrash In-Reply-To: References: Message-ID: <4D25E16C.203@statistik.tu-dortmund.de> On 06.01.2011 15:04, Vladyslav Kolbasin wrote: > > > Hi list, > > Please tell me why sometimes RSymphony.dll crashes on > Windows 7 (may be on other Windows too). > > I saw that it depends on > input data. For example this code bring to crash: > > > library('Rsymphony') > mat = c(0, 1, -100.37967, 0, 0, 1, 0, 0, -200, 0, > 0, 1, -0.4, 0, 0, 1, 0, 0, -0.4, 0, 0.5, 0, 0, 0, 0, 0, -1, 0, 0, 0) > mat > = matrix(mat, nrow=6, byrow=TRUE) > mat > > dir = c("=", "==", "==") > rhs = > rep(0, 6) > obj = c(1, 1, 1, 1, 1) > types = c("C", "C", "B", "B", > "B") > bounds = NULL > > solution Hmmmm, there is no Rsymphony specific code in your message. Note that Rsymphony is not provided in binary form for recent versions of R. Uwe Ligges > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
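Returning briefly to the "How to join matrices of different row length from a list" thread above: the pad-then-cbind idea can be wrapped in a small helper. This is only a sketch; cbind_fill is a hypothetical name and is not a function from base R or plyr.

```r
# Hypothetical helper: pad each matrix with NA rows up to the height of
# the tallest one, then bind everything together by column.
cbind_fill <- function(mats) {
  n <- max(vapply(mats, nrow, integer(1)))
  padded <- lapply(mats, function(m) {
    rbind(m, matrix(NA, nrow = n - nrow(m), ncol = ncol(m)))
  })
  do.call(cbind, padded)
}

# The three example matrices from the thread.
e <- list(matrix(1:4, 2, 2), matrix(1:6, 3, 2), matrix(2:1, 1, 2))
res <- cbind_fill(e)
res
```

Swapping the roles of rows and columns in the helper (pad with cbind, bind with rbind) should give the row-wise counterpart.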
From jrkrideau at yahoo.ca Thu Jan 6 16:45:37 2011 From: jrkrideau at yahoo.ca (John Kane) Date: Thu, 6 Jan 2011 07:45:37 -0800 (PST) Subject: [R] Multiple subsets of data In-Reply-To: <64AC0F8A-766D-415B-BFFC-4FC60AD99350@gmail.com> Message-ID: <160527.75747.qm@web38404.mail.mud.yahoo.com> You're definitely doing something I don't understand. Here's a quick mock-up of the sample data sets you provided. How do you know if species a in trait is from NA1102 or AT1302 in com? I may be blind but I just don't see what you're matching when you use which. Perhaps you might want to post the which command you're using? By the way dput is a valuable way of providing sample data. trait <- structure(list(Species = c("a", "b", "c"), v1 = c("t", "f", "r" ), v2 = c("y", "j", "y"), v3 = c("h", "u", "u")), .Names = c("Species", "v1", "v2", "v3"), class = "data.frame", row.names = c(NA, -3L )) com <- structure(list(community = c("NA1102", "NA1102", "NA0402", "NA0402", "AT1302", "AT1302"), species = c("a", "c", "b", "c", "a", "b" )), .Names = c("community", "species"), class = "data.frame", row.names = c(NA, -6L)) > From: Chris Mcowen > Subject: Re: [R] Multiple subsets of data > To: "John Kane" > Cc: r-help at r-project.org > Received: Thursday, January 6, 2011, 10:19 AM > Hi, > > Sorry the formatting has messed up, the common variable is > species ( spelt different in both but i have corrected this > now) > > I am having some luck with the which function but am > struggling to automate it for all communities > On 6 Jan 2011, at 15:09, John Kane wrote: > > Is there any common variable? From your description, > I don't see how you would link a species to a > community. I mean if you select species a in df1 how > would you know what community it is in?
> > --- On Thu, 1/6/11, Chris Mcowen wrote: > > > From: Chris Mcowen > > Subject: [R] Multiple subsets of data > > To: r-help at r-project.org > > Received: Thursday, January 6, 2011, 6:29 AM > > Dear List, > > > > I have a data frame called trait with roughly 800 species > > in, each species have 15 columns of information: > > > > Species 1 2 3 etc.. > > a t y h > > b f j u > > c r y u > > > > etc.. > > > > > > I then have another data frame called com with the > > composition of species in each region, there are 506 > > different communities: > > > > community species > > NA1102 a > > NA1102 c > > NA0402 b > > NA0402 c > > AT1302 a > > AT1302 b > > > > etc.. > > > > > > What i want to do is extract the information held in the > > first data frame for each community and save this as a new > > data frame. > > > > Resulting in : - > > > > community_NA1102 > > > > a t y h > > c r y u > > > > community_NA0402 > > > > b f j u > > c r y u > > > > Thanks in advance for any suggestions / code. > > ______________________________________________ > > R-help at r-project.org > > mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, > > reproducible code.
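Since the poster confirms that species is the common variable, one hedged way to get a per-community data frame is to merge the two tables on species and then split the result by community. The sketch rebuilds the mock-up data from this thread; the column names v1 to v3 come from that mock-up, and the names merged and by_comm are illustrative.

```r
# Mock-up data as posted in this thread (three species, three communities).
trait <- data.frame(Species = c("a", "b", "c"),
                    v1 = c("t", "f", "r"),
                    v2 = c("y", "j", "y"),
                    v3 = c("h", "u", "u"),
                    stringsAsFactors = FALSE)
com <- data.frame(community = c("NA1102", "NA1102", "NA0402",
                                "NA0402", "AT1302", "AT1302"),
                  species = c("a", "c", "b", "c", "a", "b"),
                  stringsAsFactors = FALSE)

# Attach the trait columns to every community/species pair.
merged <- merge(com, trait, by.x = "species", by.y = "Species")

# One data frame per community, collected in a named list.
by_comm <- split(merged[, c("species", "v1", "v2", "v3")], merged$community)
names(by_comm) <- paste0("community_", names(by_comm))

by_comm$community_NA1102
```

Keeping the 506 results in one list, instead of 506 separate objects, makes it straightforward to loop over them later with lapply().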
> > > > > > From dwinsemius at comcast.net Thu Jan 6 16:53:17 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Thu, 6 Jan 2011 10:53:17 -0500 Subject: [R] Extract data In-Reply-To: <061B1B8E-7121-4F16-B357-48EF59816077@st-andrews.ac.uk> References: <061B1B8E-7121-4F16-B357-48EF59816077@st-andrews.ac.uk> Message-ID: <71E401F6-E349-4D73-A6B5-DA7D7EEFD848@comcast.net> On Jan 6, 2011, at 6:36 AM, Chris Mcowen wrote: > Dear List, > > I have a data frame called trait with roughly 800 species in, each > species have 15 columns of information: > > Species 1 2 3 etc.. > a t y h > b f j u > c r y u > > etc.. > > > I then have another data frame called com with the composition of > species in each region, there are 506 different communities: > > community species > NA1102 a > NA1102 c > NA0402 b > NA0402 c > AT1302 a > AT1302 b > > etc.. > > > What i want to do is extract the information held in the first data > frame for each community and save this as a new data frame. > tapply(comm.info$species, comm.info$community, c) $AT1302 [1] 1 2 $NA0402 [1] 2 3 $NA1102 [1] 1 3 > lapply( tapply(comm.info$species, comm.info$community, c), function(x){ sp.info[x, ]} ) $AT1302 Species X1 X2 X3 1 a t y h 2 b f j u $NA0402 Species X1 X2 X3 2 b f j u 3 c r y u $NA1102 Species X1 X2 X3 1 a t y h 3 c r y u Might have looked more compact if I had assigned the output of tapply to an intermediate list: comm.sp <- tapply(comm.info$species, comm.info$community, c) lapply( comm.sp , function(x){ sp.info[x, ]} ) > > Resulting in : - > > community_NA1102 > > a t y h > c r y u > > community_NA0402 > > b f j u > c r y u > > Thanks in advance for any suggestions / code. 
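A self-contained variant of the same idea, offered as a sketch rather than as the code from the reply. The tapply() result shown above is a set of integer codes, which suggests that species was stored as a factor whose level order happens to match the row order of sp.info. Matching on the species name avoids that assumption. The data frames are the trait and com mock-ups posted with dput() in the related thread; by_comm is a name introduced here for illustration.

```r
# Mock-up data, as posted with dput() in the related thread
trait <- data.frame(
  Species = c("a", "b", "c"),
  v1 = c("t", "f", "r"),
  v2 = c("y", "j", "y"),
  v3 = c("h", "u", "u"),
  stringsAsFactors = FALSE
)
com <- data.frame(
  community = c("NA1102", "NA1102", "NA0402", "NA0402", "AT1302", "AT1302"),
  species   = c("a", "c", "b", "c", "a", "b"),
  stringsAsFactors = FALSE
)

# Split the species names by community, then look each name up in trait.
# match() returns row positions in trait, so this works whether the
# columns are character vectors or factors.
by_comm <- lapply(
  split(as.character(com$species), com$community),
  function(sp) trait[match(sp, as.character(trait$Species)), , drop = FALSE]
)

by_comm$NA1102   # rows for species a and c
```

Each element of by_comm is an ordinary data frame, so a loop over names(by_comm) that calls write.csv() on each element would save one file per community.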
David Winsemius, MD West Hartford, CT From alaios at yahoo.com Thu Jan 6 16:53:20 2011 From: alaios at yahoo.com (Alaios) Date: Thu, 6 Jan 2011 07:53:20 -0800 (PST) Subject: [R] Find and remove elemnts of a data frame In-Reply-To: <59016.41896.qm@web38402.mail.mud.yahoo.com> Message-ID: <200911.98831.qm@web120111.mail.ne1.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From chrismcowen at gmail.com Thu Jan 6 16:19:13 2011 From: chrismcowen at gmail.com (Chris Mcowen) Date: Thu, 6 Jan 2011 15:19:13 +0000 Subject: [R] Multiple subsets of data In-Reply-To: <164370.98953.qm@web38407.mail.mud.yahoo.com> References: <164370.98953.qm@web38407.mail.mud.yahoo.com> Message-ID: <64AC0F8A-766D-415B-BFFC-4FC60AD99350@gmail.com> Hi, Sorry the formatting has messed up, the common variable is species ( spelt different in both but i have corrected this now) I am having some luck with the which function but am struggling to automate it for all communities On 6 Jan 2011, at 15:09, John Kane wrote: Is there any common variable? From your description, I don't see how you would link a species to a community. I mean if you select species a in df1 how would you know what community it is in? --- On Thu, 1/6/11, Chris Mcowen wrote: > From: Chris Mcowen > Subject: [R] Multiple subsets of data > To: r-help at r-project.org > Received: Thursday, January 6, 2011, 6:29 AM > Dear List, > > I have a data frame called trait with roughly 800 species > in, each species have 15 columns of information: > > Species > 1 2 3 > etc.. > a > t y h > b > f j u > c > r y u > > etc.. > > > I then have another data frame called com with the > composition of species in each region, there are 506 > different communities: > > community species > NA1102 a > NA1102 c > NA0402 b > NA0402 c > AT1302 a > AT1302 b > > etc.. > > > What i want to do is extract the information held in the > first data frame for each community and save this as a new > data frame. 
> > Resulting in : - > > community_NA1102 > > a > t y h > c > r y u > > community_NA0402 > > b > f j u > c > r y u > > Thanks in advance for any suggestions / code. > ______________________________________________ > R-help at r-project.org > mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, > reproducible code. > From harwood262 at gmail.com Thu Jan 6 16:28:13 2011 From: harwood262 at gmail.com (Mike Harwood) Date: Thu, 6 Jan 2011 07:28:13 -0800 (PST) Subject: [R] Set axis limits in mixtools plot Message-ID: <72f8f84d-c63a-43db-9f76-669bcad291fd@g26g2000vbz.googlegroups.com> Hello, Can the x and y axis limits be specified in a density plot with the mixtools package for a finite mixture model? Uncommenting the xlim2/ ylim2 lines in the plot command below generates 'not a graphical parameter' warnings (and does not change the axis settings), and uncommenting the xlim/ylim lines generates a 'formal argument "ylim" matched by multiple actual arguments' error. I want to use a consistent set of axis setting for four different sets of data. Thanks! 
test.data <- c(0.0289,0.0342,0.0379,0.0405,0.0421,0.0429,0.0430,0.0426,0.0423 , 0.0425,0.0435,0.0451,0.0466,0.0477,0.0480,0.0477,0.0473,0.0473,0.0479 , 0.0492,0.0507,0.0519,0.0526,0.0523,0.0507,0.0482,0.0452,0.0428,0.0414 , 0.0409,0.0404,0.0388,0.0358,0.0319,0.0283,0.0263,0.0269,0.0298,0.0339 , 0.0378,0.0404,0.0417,0.0425,0.0436,0.0457,0.0489,0.0522,0.0544,0.0546 , 0.0529,0.0501,0.0474,0.0457,0.0454,0.0460,0.0467,0.0466,0.0450,0.0423 , 0.0395,0.0378,0.0377,0.0388,0.0401,0.0404,0.0394,0.0378,0.0366,0.0371 , 0.0395,0.0433,0.0474,0.0512,0.0546,0.0581,0.0622,0.0673,0.0729,0.0782 , 0.0824,0.0854,0.0877,0.0902,0.0934,0.0973,0.1009,0.1032,0.1038,0.1029 , 0.1017,0.1017,0.1034,0.1065,0.1099,0.1126,0.1137,0.1133,0.1123,0.1118 , 0.1124,0.1140,0.1159,0.1175,0.1182,0.1179,0.1170,0.1160,0.1153,0.1153 , 0.1159,0.1164,0.1161,0.1146,0.1118,0.1080,0.1038,0.1002,0.0976,0.0961 , 0.0954,0.0948,0.0940,0.0929,0.0920,0.0916,0.0921,0.0934,0.0951,0.0964 , 0.0964,0.0951,0.0925,0.0894,0.0864,0.0841,0.0827,0.0821,0.0819,0.0817 , 0.0814,0.0808,0.0797,0.0783,0.0768,0.0756,0.0749,0.0749,0.0752,0.0752 , 0.0743,0.0724,0.0694,0.0662,0.0635,0.0616,0.0607,0.0603,0.0600,0.0592 , 0.0580,0.0567,0.0553,0.0539,0.0522,0.0499,0.0470,0.0435,0.0394,0.0346 , 0.0290,0.0227,0.0160,0.0098,0.0045,0.0009,-0.0012,-0.0021,-0.0025,-0.0029 ,-0.0038,-0.0052,-0.0074,-0.0103,-0.0141,-0.0189,-0.0246,-0.0308,-0.0370 ,-0.0432,-0.0495,-0.0561,-0.0627,-0.0681,-0.0716,-0.0732,-0.0736,-0.0740 ,-0.0750,-0.0760,-0.0758,-0.0732,-0.0677,-0.0603,-0.0531,-0.0481,-0.0459 ,-0.0448,-0.0421,-0.0364,-0.0285,-0.0219,-0.0202,-0.0245,-0.0321,-0.0386 ,-0.0399,-0.0352,-0.0272,-0.0206,-0.0187,-0.0220,-0.0280,-0.0330 ) library(mixtools) fit <- normalmixEM(test.data) plot(fit ,whichplots=2 # ,xlim2=c(-0.1, 0.15) # ,ylim2=c(0, 20) # ,xlim=c(-0.1, 0.15) # ,ylim=c(0, 20) ) grid() From marc_schwartz at me.com Thu Jan 6 17:04:59 2011 From: marc_schwartz at me.com (Marc Schwartz) Date: Thu, 06 Jan 2011 10:04:59 -0600 Subject: [R] What are the necessary 
Oracle software to install and run ROracle ? In-Reply-To: References: Message-ID: <4E8AFCEA-B605-49E1-8E91-826B8A022A81@me.com> On Jan 6, 2011, at 3:11 AM, thomas.carrie at bnpparibas.com wrote: > Hello, > > I have applied all tips (except moving to different DB lib) : > > move to R-2.12.1 > try R CMD INSTALL instead of install.packages('ROracle'); > run as root > checked that I have full 32 bit env > > It still fails with same error at installation when trying to load ROracle > lib > > --------------------- > ** testing if installed package can be loaded > Error in dyn.load(file, DLLpath = DLLpath, ...) : > unable to load shared object > '/opt/R-2.12.1/lib/R/library/ROracle/libs/ROracle.so': > /opt/R-2.12.1/lib/R/library/ROracle/libs/ROracle.so: undefined symbol: > sqlprc > --------------------- > > I now have additional warnings because of my upgrade to R-2.12.1 : > > Rd warning: ./man/DBIPreparedStatement-class.Rd:17: missing file link > 'dbPrepareStatement' > > I now have 2 problems to solve :-) > > what are the required Oracle tarballs for ROracle to install and run ? > how do I get rid of these warnings ? > > Note that I try to use ROracle with Oracle 11. > > Thanks for tips > Thomas, Three thoughts: 1. How did you install R? From source or from a pre-built binary? If the former, it is possible that ROracle requires that R be built as a shared library. The default for this option is 'no' when compiling from source, whereas the pre-built binaries typically are built as a shared library. If you did build from source, use --enable-R-shlib when you run ./configure and then recompile and install R. 2. Mathieu Drapeau posted about a possibly similar issue back in 2006 (http://tolstoy.newcastle.edu.au/R/e2/help/06/09/1192.html). You may wish to contact him as he references a manual compilation. Hopefully his e-mail in that post is still valid. 3. If the above fails, I would contact the package maintainer (David James) for additional assistance.
The package appears to not have been updated since late 2007 and who knows, perhaps something has changed in the intervening time frame that may require his attention either for the package itself or simply more clarity in the documentation of the installation process. The INSTALL file mentions that ROracle was last tested with R 2.3.0, which was released back in 2006. HTH, Marc Schwartz From sovo0815 at gmail.com Thu Jan 6 17:06:44 2011 From: sovo0815 at gmail.com (=?UTF-8?B?U8O2cmVuIFZvZ2Vs?=) Date: Thu, 6 Jan 2011 17:06:44 +0100 Subject: [R] Different LLRs on multinomial logit models in R and SPSS Message-ID: Hello, after calculating a multinomial logit regression on my data, I compared the output to an output retrieved with SPSS 18 (Mac). The coefficients appear to be the same, but the logLik (and therefore fit) values differ widely. Why? The regression in R: set.seed(1234) df <- data.frame( "y"=factor(sample(LETTERS[1:3], 143, repl=T, prob=c(4, 1, 10))), "a"=sample(1:5, 143, repl=T), "b"=sample(1:7, 143, repl=T), "c"=sample(1:2, 143, repl=T) ) library(nnet) mod1 <- multinom(y ~ ., data=df, trace=F) deviance(mod1) # 199.0659 mod0 <- update(mod1, . ~ 1, trace=FALSE) deviance(mod0) # 204.2904 Output data and syntax for SPSS: df2 <- df df2[, 1] <- as.numeric(df[, 1]) write.csv(df2, file="dfxy.csv", row.names=F, na="") syntaxfile <- "dfxy.sps" cat('GET DATA /TYPE=TXT /FILE=\'', getwd(), '/dfxy.csv\' /DELCASE=LINE /DELIMITERS="," /QUALIFIER=\'"\' /ARRANGEMENT=DELIMITED /FIRSTCASE=2 /IMPORTCASE=ALL /VARIABLES= y "F1.0" a "F8.4" b "F8.4" c "F8.4". CACHE. EXECUTE. DATASET NAME DataSet1 WINDOW=FRONT. VALUE LABELS /y 1 "A" 2 "B" 3 "C". EXECUTE. NOMREG y (BASE=1 ORDER=ASCENDING) WITH a b c /CRITERIA CIN(95) DELTA(0) MXITER(100) MXSTEP(5) CHKSEP(20) LCONVERGE(0) PCONVERGE(0.000001) SINGULAR(0.00000001) /MODEL /STEPWISE=PIN(.05) POUT(0.1) MINEFFECT(0) RULE(SINGLE) ENTRYMETHOD(LR) REMOVALMETHOD(LR) /INTERCEPT=INCLUDE /PRINT=FIT PARAMETER SUMMARY LRT CPS STEP MFI IC. 
', file=syntaxfile, sep="", append=F) -> Loglik0: 135.02 -> Loglik1: 129.80 Thanks, Sören
From ggrothendieck at gmail.com Thu Jan 6 17:10:37 2011 From: ggrothendieck at gmail.com (Gabor Grothendieck) Date: Thu, 6 Jan 2011 11:10:37 -0500 Subject: [R] How to join matrices of different row length from a list In-Reply-To: <1294311405773-3177212.post@n4.nabble.com> References: <1294311405773-3177212.post@n4.nabble.com> Message-ID:
On Thu, Jan 6, 2011 at 5:56 AM, emj83 wrote:
>
> Hi,
>
> I have several matrix in a list, for example:
> e
> [[1]]
>      [,1] [,2]
> [1,]    1    3
> [2,]    2    4
>
> [[2]]
>      [,1] [,2]
> [1,]    1    4
> [2,]    2    5
> [3,]    3    6
>
> [[3]]
>      [,1] [,2]
> [1,]    2    1
>
> I would like to join them by column i.e.
>      [,1] [,2] [,3] [,4] [,5] [,6]
> [1,]    1    3    1    4    2    1
> [2,]    2    4    2    5   NA   NA
> [3,]   NA   NA    3    6   NA   NA
>
> I have tried do.call(cbind,e) but I get this error message as the rows are
> of different length-
> Error in function (..., deparse.level = 1) :
>  number of rows of matrices must match (see arg 2)
>
One reasonably simple approach is to convert your matrices to time series (either ts series or zoo series) as cbind.ts and cbind.zoo both NA fill.
L <- list(matrix(1:4, 2, 2), matrix(1:6, 3, 2), matrix(2:1, 1, 2))
# using ts
M <- unclass(do.call(cbind, lapply(L, ts)))
tsp(M) <- colnames(M) <- NULL
# With zoo it's slightly shorter:
library(zoo)
M <- coredata(do.call(cbind, lapply(L, zoo)))
colnames(M) <- NULL
We can omit the colnames(M) <- NULL part in both cases if the list itself or the constituent matrices have column names, e.g.
L <- list(A = matrix(1:4, 2, 2), B = matrix(1:6, 3, 2), C = matrix(2:1, 1, 2))
# or
L <- list(cbind(a = 1:2, b = 3:4), cbind(c = 1:3, d = 4:6), cbind(e = 2, f = 1))
-- Statistics & Software Consulting GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com From dwinsemius at comcast.net Thu Jan 6 17:16:03 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Thu, 6 Jan 2011 11:16:03 -0500 Subject: [R] Different LLRs on multinomial logit models in R and SPSS In-Reply-To: References: Message-ID: <01E7DB58-AFAB-4872-8236-8E3F9E41AEFF@comcast.net> On Jan 6, 2011, at 11:06 AM, S?ren Vogel wrote: > Hello, after calculating a multinomial logit regression on my data, I > compared the output to an output retrieved with SPSS 18 (Mac). The > coefficients appear to be the same, but the logLik (and therefore fit) > values differ widely. Why? The likelihood is arbitrary. It is the difference in likelihoods that is important and in this respect the answers you got from the two software packages (delta deviance= 5.22) is equivalent. If you run logistic models with individual records versus using grouped data for equivalent models with the same software, the reported deviance will be widely different but the comparison of nested models will imply the same inferential conclusions. -- David. > > The regression in R: > > set.seed(1234) > df <- data.frame( > "y"=factor(sample(LETTERS[1:3], 143, repl=T, prob=c(4, 1, 10))), > "a"=sample(1:5, 143, repl=T), > "b"=sample(1:7, 143, repl=T), > "c"=sample(1:2, 143, repl=T) > ) > library(nnet) > mod1 <- multinom(y ~ ., data=df, trace=F) > deviance(mod1) # 199.0659 > mod0 <- update(mod1, . ~ 1, trace=FALSE) > deviance(mod0) # 204.2904 > > Output data and syntax for SPSS: > > df2 <- df > df2[, 1] <- as.numeric(df[, 1]) > write.csv(df2, file="dfxy.csv", row.names=F, na="") > syntaxfile <- "dfxy.sps" > cat('GET DATA > /TYPE=TXT > /FILE=\'', getwd(), '/dfxy.csv\' > /DELCASE=LINE > /DELIMITERS="," > /QUALIFIER=\'"\' > /ARRANGEMENT=DELIMITED > /FIRSTCASE=2 > /IMPORTCASE=ALL > /VARIABLES= > y "F1.0" > a "F8.4" > b "F8.4" > c "F8.4". > CACHE. > EXECUTE. > DATASET NAME DataSet1 WINDOW=FRONT. > > VALUE LABELS > /y 1 "A" 2 "B" 3 "C". > EXECUTE. 
> > NOMREG y (BASE=1 ORDER=ASCENDING) WITH a b c > /CRITERIA CIN(95) DELTA(0) MXITER(100) MXSTEP(5) CHKSEP(20) > LCONVERGE(0) PCONVERGE(0.000001) > SINGULAR(0.00000001) > /MODEL > /STEPWISE=PIN(.05) POUT(0.1) MINEFFECT(0) RULE(SINGLE) > ENTRYMETHOD(LR) REMOVALMETHOD(LR) > /INTERCEPT=INCLUDE > /PRINT=FIT PARAMETER SUMMARY LRT CPS STEP MFI IC. > ', file=syntaxfile, sep="", append=F) > > -> Loglik0: 135.02 > -> Loglik1: 129.80 > > Thanks, S?ren > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT From bbolker at gmail.com Thu Jan 6 17:17:13 2011 From: bbolker at gmail.com (Ben Bolker) Date: Thu, 6 Jan 2011 16:17:13 +0000 (UTC) Subject: [R] Different LLRs on multinomial logit models in R and SPSS References: Message-ID: S?ren Vogel gmail.com> writes: > > Hello, after calculating a multinomial logit regression on my data, I > compared the output to an output retrieved with SPSS 18 (Mac). The > coefficients appear to be the same, but the logLik (and therefore fit) > values differ widely. Why? Since constants that are independent of the model parameters can be left out of a log-likelihood calculation without affecting inference among models, it is quite common for likelihoods to be calculated differently in different software packages (for example, the (n/2 log(2*pi)) constant in the Gaussian log-likelihood) -- this is true even among different computations within the R ecosystem. What's important is the difference among log-likelihoods between models, which is the same (up to the numeric precision of what you've shown us) for both software packages. 
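The point about dropped constants can be checked numerically. The sketch below uses a Gaussian linear model on simulated data rather than the multinomial model from the question; ll_full and ll_kernel are helper names invented here, the second one simply leaving out the (n/2) log(2*pi) constant mentioned above.

```r
set.seed(42)
n <- 50
x <- rnorm(n)
y <- 1 + 0.5 * x + rnorm(n)

fit0 <- lm(y ~ 1)   # null model
fit1 <- lm(y ~ x)   # model with one predictor

# Full Gaussian log-likelihood, as reported by logLik()
ll_full <- function(fit) as.numeric(logLik(fit))

# Same quantity with the parameter-free constant -(n/2) * log(2 * pi) left out
ll_kernel <- function(fit) {
  ll_full(fit) + length(residuals(fit)) / 2 * log(2 * pi)
}

# The individual values differ between the two conventions ...
c(full = ll_full(fit1), kernel = ll_kernel(fit1))

# ... but the difference between the nested models does not
diff_full   <- ll_full(fit1)   - ll_full(fit0)
diff_kernel <- ll_kernel(fit1) - ll_kernel(fit0)
all.equal(diff_full, diff_kernel)
```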
> 135.02-129.8
[1] 5.22
> 204.2904-199.0659
[1] 5.2245
Ben Bolker
From sovo0815 at gmail.com Thu Jan 6 17:23:55 2011 From: sovo0815 at gmail.com (Sören Vogel) Date: Thu, 6 Jan 2011 17:23:55 +0100 Subject: [R] Different LLRs on multinomial logit models in R and SPSS In-Reply-To: References: Message-ID:
Thanks for your replies. I am no mathematician or statistician by far, however, it appears to me that the actual value of any of the two LLs is indeed important when it comes to calculation of Pseudo-R-Squared-s. If Rnagel divides by (some transformation of) the actual value of llnull then any calculation of Rnagel should differ. How come? Or is my function wrong? And if my function is right, how can I calculate a R-Squared independent from the software used?
Rfits <- function(mod) {
  llnull <- deviance(update(mod, . ~ 1, trace=F))
  llmod <- deviance(mod)
  n <- length(predict(mod))
  Rcs <- 1 - exp( (llmod - llnull) / n )
  Rnagel <- Rcs / (1 - exp(-llnull/n))
  out <- list(
    "Rcs"=Rcs,
    "Rnagel"=Rnagel
  )
  class(out) <- c("list", "table")
  return(out)
}
From shigesong at gmail.com Thu Jan 6 17:42:03 2011 From: shigesong at gmail.com (Shige Song) Date: Thu, 6 Jan 2011 11:42:03 -0500 Subject: [R] RGtk2 compilation problem In-Reply-To: References: Message-ID: Look forward to it. Thanks. Shige On Sat, Jan 1, 2011 at 8:45 AM, Michael Lawrence wrote: > Please watch for 2.20.5 and let me know if it helps. Not really sure what is > going on here, but someone else has reported the same issue. > > Thanks, > Michael > > On Wed, Dec 29, 2010 at 6:44 AM, Shige Song wrote: >> >> Dear All, >> >> I am trying to compile&install the package "RGtk2" on my Ubuntu 10.04 >> box. I did not have problem with earlier versions, but with the new >> version, I got the following error message : >> >> ------------------------------------------------------------------------------------------------- >> * installing *source* package 'RGtk2' ... >> checking for pkg-config...
/usr/bin/pkg-config >> checking pkg-config is at least version 0.9.0... yes >> checking for INTROSPECTION... no >> checking for GTK... yes >> checking for GTHREAD... yes >> checking for gcc... gcc >> checking whether the C compiler works... yes >> checking for C compiler default output file name... a.out >> checking for suffix of executables... >> checking whether we are cross compiling... no >> checking for suffix of object files... o >> checking whether we are using the GNU C compiler... yes >> checking whether gcc accepts -g... yes >> checking for gcc option to accept ISO C89... none needed >> checking how to run the C preprocessor... gcc -E >> checking for grep that handles long lines and -e... /bin/grep >> checking for egrep... /bin/grep -E >> checking for ANSI C header files... yes >> checking for sys/types.h... yes >> checking for sys/stat.h... yes >> checking for stdlib.h... yes >> checking for string.h... yes >> checking for memory.h... yes >> checking for strings.h... yes >> checking for inttypes.h... yes >> checking for stdint.h... yes >> checking for unistd.h... yes >> checking for uintptr_t... yes >> configure: creating ./config.status >> config.status: creating src/Makevars >> ** libs >> gcc -std=gnu99 -I/usr/local/lib/R/include -g -D_R_=1 -pthread >> -D_REENTRANT -I/usr/include/gtk-2.0 -I/usr/lib/gtk-2.0/include >> -I/usr/include/atk-1.0 -I/usr/include/cairo -I/usr/include/pango-1.0 >> -I/usr/include/gio-unix-2.0/ -I/usr/include/glib-2.0 >> -I/usr/lib/glib-2.0/include -I/usr/include/pixman-1 >> -I/usr/include/freetype2 -I/usr/include/directfb >> -I/usr/include/libpng12 ? -I. ?-DHAVE_UINTPTR_T ?-I/usr/local/include >> ?-fpic ?-g -O2 -c RGtkDataFrame.c -o RGtkDataFrame.o >> In file included from RGtk2/gtk.h:19, >> ? ? ? ? ? ? ? ? from RGtkDataFrame.h:1, >> ? ? ? ? ? ? ? ? 
from RGtkDataFrame.c:1: >> ./RGtk2/gdkClasses.h:4:23: error: RGtk2/gdk.h: No such file or directory >> make: *** [RGtkDataFrame.o] Error 1 >> ERROR: compilation failed for package ?RGtk2? >> * removing ?/usr/local/lib/R/library/RGtk2? >> * restoring previous ?/usr/local/lib/R/library/RGtk2? >> >> The downloaded packages are in >> ? ? ? ??/tmp/RtmprSWbka/downloaded_packages? >> Updating HTML index of packages in '.Library' >> Warning message: >> In install.packages("RGtk2", dep = T) : >> ?installation of package 'RGtk2' had non-zero exit status >> >> ---------------------------------------------------------------------------------------- >> >> I noticed the requirement for the package >> (http://cran.r-project.org/web/packages/RGtk2/index.html) saying >> "...GTK+ (>= 2.8.0)..." The latest GTK+ is 2.20, could this be the >> problem? >> >> Many thanks. >> >> Best, >> Shige >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > From ripley at stats.ox.ac.uk Thu Jan 6 17:53:41 2011 From: ripley at stats.ox.ac.uk (Prof Brian Ripley) Date: Thu, 6 Jan 2011 16:53:41 +0000 (GMT) Subject: [R] RGtk2 compilation problem In-Reply-To: References: Message-ID: You need RGtk2 2.20.7 which is now on CRAN. Others have seen this, but it has taken a while to track down the exact cause. The diagnosis was that ML used a recent GNU tar which created a tarball with hard links that R's untar was not prepared to deal with. We consider that is a bug in GNU tar, but untar() has been updated in R-patched to cope. If you have such a tarball, try setting the environment variable R_INSTALL_TAR to 'tar' (or whatever GNU tar is called on your system) when installing the tarball. 
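From within R, the workaround described above might look like the following sketch. It assumes GNU tar is available on the search path as tar; the tarball path in the commented line is a placeholder, not a real file.

```r
# Ask R CMD INSTALL (which install.packages() runs for source packages)
# to unpack tarballs with an external tar program.
# "tar" is an assumption; substitute whatever GNU tar is called locally.
Sys.setenv(R_INSTALL_TAR = "tar")
Sys.getenv("R_INSTALL_TAR")

# Then install the downloaded source tarball (placeholder path):
# install.packages("path/to/RGtk2.tar.gz", repos = NULL, type = "source")
```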
For those packaging source packages: in the unusual event that your package sources contain symbolic (or even hard) links, don't use GNU tar 1.24 or 1.25. On Thu, 6 Jan 2011, Shige Song wrote: > Look forward to it. > > Thanks. > > Shige > > On Sat, Jan 1, 2011 at 8:45 AM, Michael Lawrence > wrote: >> Please watch for 2.20.5 and let me know if it helps. Not really sure what is >> going on here, but someone else has reported the same issue. >> >> Thanks, >> Michael >> >> On Wed, Dec 29, 2010 at 6:44 AM, Shige Song wrote: >>> >>> Dear All, >>> >>> I am trying to compile&install the package "RGtk2" on my Ubuntu 10.04 >>> box. I did not have problem with earlier versions, but with the new >>> version, I got the following error message : ... -- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 From dwinsemius at comcast.net Thu Jan 6 18:03:17 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Thu, 6 Jan 2011 12:03:17 -0500 Subject: [R] Different LLRs on multinomial logit models in R and SPSS In-Reply-To: References: Message-ID: <8D4AC2A1-ABCB-4B51-B057-05A98745C28E@comcast.net> On Jan 6, 2011, at 11:23 AM, Sören Vogel wrote: > Thanks for your replies. I am no mathematician or statistician by far, > however, it appears to me that the actual value of any of the two LLs > is indeed important when it comes to calculation of > Pseudo-R-Squared-s. If Rnagel divides by (some transformation of) the > actual value of llnull then any calculation of Rnagel should differ. > How come? Or is my function wrong? And if my function is right, how > can I calculate a R-Squared independent from the software used?
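Plugging the four values reported in this thread into the formulas used by Rfits makes the concern concrete. This is only arithmetic on the posted numbers (n = 143), not a statement about how either package computes its own pseudo-R-squared; cox_snell, nagelkerke and null_deviance are helper names introduced here, and all inputs are taken to be on the -2 log-likelihood (deviance) scale, as in Rfits.

```r
n <- 143  # number of observations in the posted example

# Values reported in the thread, all on the -2 log-likelihood scale
r_null    <- 204.2904; r_model    <- 199.0659   # R (nnet::multinom)
spss_null <- 135.02;   spss_model <- 129.80     # SPSS NOMREG

# Cox-Snell uses only the difference between the two values
cox_snell <- function(d_null, d_model, n) 1 - exp((d_model - d_null) / n)

# Nagelkerke rescales by a term that depends on the null value itself
nagelkerke <- function(d_null, d_model, n) {
  cox_snell(d_null, d_model, n) / (1 - exp(-d_null / n))
}

rcs  <- c(R    = cox_snell(r_null, r_model, n),
          SPSS = cox_snell(spss_null, spss_model, n))
rnag <- c(R    = nagelkerke(r_null, r_model, n),
          SPSS = nagelkerke(spss_null, spss_model, n))

round(rbind(cox_snell = rcs, nagelkerke = rnag), 4)

# For ungrouped records the null value needs no fitted model at all:
# it follows from the observed class counts alone.
null_deviance <- function(y) {
  counts <- table(y)
  counts <- counts[counts > 0]
  -2 * sum(counts * log(counts / sum(counts)))
}
null_deviance(c("A", "A", "B", "C", "C", "C"))
```

The Cox-Snell values agree closely (the small gap presumably reflects rounding in the posted SPSS figures), while the Nagelkerke values do not. Computing the null value from the class counts, as in null_deviance, is one way to get a denominator that does not depend on which constants a package keeps.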
You have two models in that function, the null one with ".~ 1" and the original one, and you are getting a ratio on the likelihood scale (which is a difference on the log-likelihood or deviance scale). > > Rfits <- function(mod) { > llnull <- deviance(update(mod, . ~ 1, trace=F)) > llmod <- deviance(mod) > n <- length(predict(mod)) > Rcs <- 1 - exp( (llmod - llnull) / n ) > Rnagel <- Rcs / (1 - exp(-llnull/n)) > out <- list( > "Rcs"=Rcs, > "Rnagel"=Rnagel > ) > class(out) <- c("list", "table") > return(out) > } -- David Winsemius, MD West Hartford, CT From chrismcowen at gmail.com Thu Jan 6 16:55:42 2011 From: chrismcowen at gmail.com (Chris Mcowen) Date: Thu, 6 Jan 2011 15:55:42 +0000 Subject: [R] Extract data In-Reply-To: <71E401F6-E349-4D73-A6B5-DA7D7EEFD848@comcast.net> References: <061B1B8E-7121-4F16-B357-48EF59816077@st-andrews.ac.uk> <71E401F6-E349-4D73-A6B5-DA7D7EEFD848@comcast.net> Message-ID: <7327EBEA-0F40-44C4-B0A1-F3964D93D404@gmail.com> Dear David, That's great, thanks very much for the help, much appreciated. On 6 Jan 2011, at 15:53, David Winsemius wrote: On Jan 6, 2011, at 6:36 AM, Chris Mcowen wrote: > Dear List, > > I have a data frame called trait with roughly 800 species in, each species have 15 columns of information: > > Species 1 2 3 etc.. > a t y h > b f j u > c r y u > > etc.. > > > I then have another data frame called com with the composition of species in each region, there are 506 different communities: > > community species > NA1102 a > NA1102 c > NA0402 b > NA0402 c > AT1302 a > AT1302 b > > etc.. > > > What i want to do is extract the information held in the first data frame for each community and save this as a new data frame.
> tapply(comm.info$species, comm.info$community, c) $AT1302 [1] 1 2 $NA0402 [1] 2 3 $NA1102 [1] 1 3 > lapply( tapply(comm.info$species, comm.info$community, c), function(x){ sp.info[x, ]} ) $AT1302 Species X1 X2 X3 1 a t y h 2 b f j u $NA0402 Species X1 X2 X3 2 b f j u 3 c r y u $NA1102 Species X1 X2 X3 1 a t y h 3 c r y u Might have looked more compact if I had assigned the output of tapply to an intermediate list: comm.sp <- tapply(comm.info$species, comm.info$community, c) lapply( comm.sp , function(x){ sp.info[x, ]} ) > > Resulting in : - > > community_NA1102 > > a t y h > c r y u > > community_NA0402 > > b f j u > c r y u > > Thanks in advance for any suggestions / code. David Winsemius, MD West Hartford, CT From Horace.Tso at pgn.com Thu Jan 6 18:11:45 2011 From: Horace.Tso at pgn.com (Horace Tso) Date: Thu, 6 Jan 2011 09:11:45 -0800 Subject: [R] Solved : RE: problem installing R on ubuntu In-Reply-To: <5C3F9922B1D5FB4886B2D2045AB952F305300D8B27@IPEXMAIL.corp.dom> References: <5C3F9922B1D5FB4886B2D2045AB952F305300D8B27@IPEXMAIL.corp.dom> Message-ID: <5C3F9922B1D5FB4886B2D2045AB952F3056DF5501A@IPEXMAIL.corp.dom> This question of mine is now solved, thanks to a suggestion by Homer Strong, the organizer of the R user group in Portland, Oregon. The "unmet dependencies" as reported by install was caused by an incorrect entry in /etc/apt/sources.list. Previously I had deb http:///bin/linux/ubuntu hardy/ It should say lucid, deb http:///bin/linux/ubuntu lucid/ Once it's changed, the apt-get works without a glitch. Thanks again to Homer. Horace W Tso -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Horace Tso Sent: Thursday, December 23, 2010 10:31 AM To: r-help Subject: [R] problem installing R on ubuntu Following the official instructions to install R on ubuntu 10.04, I issued this command on the prompt, sudo apt-get install r-base Here is the error msg, Reading package lists... 
Done Building dependency tree Reading state information... Done Some packages could not be installed. This may mean that you have requested an impossible situation or if you are using the unstable distribution that some required packages have not yet been created or been moved out of Incoming. The following information may help to resolve the situation: The following packages have unmet dependencies: r-base: Depends: r-base-core (>= 2.12.1-1hardy0) but 2.10.1-2 is to be installed Depends: r-recommended (= 2.12.1-1hardy0) but 2.10.1-2 is to be installed E: Broken packages Note I already have 2.10.1 installed. Any comments/advices appreciated. H [[alternative HTML version deleted]] ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. From adasgupta at araastat.com Thu Jan 6 18:15:24 2011 From: adasgupta at araastat.com (Abhijit Dasgupta) Date: Thu, 6 Jan 2011 12:15:24 -0500 Subject: [R] Reading large SAS dataset in R In-Reply-To: References: Message-ID: <0293FAC0-26CC-4597-8D77-9D645FCC635F@araastat.com> Santanu, I second Phil's suggestion. sas.get is actually quite nice. Another current option is using a command-line utility called dsread (http://www.oview.co.uk/dsread/) to convert the sas7bdat file to a csv or tsv format, which can then easily be read into R using read.table and its derivatives. Frank Harrell (author of the Hmisc package) commented positively on this approach on the list a couple of months back. Abhijit On Jan 5, 2011, at 5:51 PM, Phil Spector wrote: > Santanu - > If you have sas installed on your computer, you may find using > the sas.get function of the Hmisc package useful. > If the only message that read.ssd produced was "Sas failed", it > would be difficult to figure out what went wrong. 
Usually the location of the log file, which would explain the error more thoroughly, is included in the error message. > > - Phil Spector > Statistical Computing Facility > Department of Statistics > UC Berkeley > spector at stat.berkeley.edu > > > On Wed, 5 Jan 2011, Santanu Pramanik wrote: > >> Hi all, >> >> I have a large (approx. 1 GB) SAS dataset (test.sas7bdat) located in the >> server (?R:/? directory). I have SAS 9.1 installed in my PC and I can read >> the SAS dataset in SAS, under a windows environment, after assigning libname >> in "R:\" directory. >> >> >> >> Now I am trying to read the SAS dataset in R (R 2.12.0) using the read.ssd >> function of the ?foreign? package, but I get an error message ?SAS failed?. >> I believe I have specified the paths correctly (after reading some previous >> posts I made sure that I do it right). Below is the small code: >> >> >> >> sashome<- "C:/Program Files/SAS/SAS 9.1" >> >> read.ssd(libname="R:/", sectionnames="test", sascmd=file.path(sashome, >> "sas.exe")) >> >> >> >> Please let me know where I am making the mistake. Is it because of the size >> of the file or the location of the file (in server instead of local hard >> drive)? >> >> >> >> Thanks in advance, >> >> Santanu >> >> >> -- >> -------------------------------------------------------------------- >> Santanu Pramanik >> Survey Statistician >> NORC at the University of Chicago >> Bethesda, MD >> >> [[alternative HTML version deleted]] >> >> > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
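Once an external conversion has produced a CSV file, the R side of the route described earlier in this message is short. A minimal sketch, with a tiny temporary file standing in for the converted data set (the column names are invented for illustration); declaring colClasses up front is a common way to speed up read.csv() on large files.

```r
# Stand-in for the CSV that a converter would write from the SAS data set
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,weight,group",
             "1,2.5,a",
             "2,3.1,b",
             "3,4.0,a"),
           csv_path)

# Declaring column types avoids type guessing on every column
dat <- read.csv(csv_path,
                colClasses = c("integer", "numeric", "character"))
str(dat)
```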
From jholtman at gmail.com Thu Jan 6 18:24:20 2011 From: jholtman at gmail.com (jim holtman) Date: Thu, 6 Jan 2011 12:24:20 -0500 Subject: [R] Extract data In-Reply-To: <061B1B8E-7121-4F16-B357-48EF59816077@st-andrews.ac.uk> References: <061B1B8E-7121-4F16-B357-48EF59816077@st-andrews.ac.uk> Message-ID: 'merge' comes in handy: > spec <- read.table(textConnection("Species 1 2 3 + a t y h + b f j u + c r y u"), header=TRUE) > comm <- read.table(textConnection("community species + NA1102 a + NA1102 c + NA0402 b + NA0402 c + AT1302 a + AT1302 b"), header = TRUE) > closeAllConnections() > # use merge > x <- merge(spec, comm, by.x="Species", by.y='species') > x Species X1 X2 X3 community 1 a t y h NA1102 2 a t y h AT1302 3 b f j u NA0402 4 b f j u AT1302 5 c r y u NA1102 6 c r y u NA0402 > split(x, x$community) $AT1302 Species X1 X2 X3 community 2 a t y h AT1302 4 b f j u AT1302 $NA0402 Species X1 X2 X3 community 3 b f j u NA0402 6 c r y u NA0402 $NA1102 Species X1 X2 X3 community 1 a t y h NA1102 5 c r y u NA1102 On Thu, Jan 6, 2011 at 6:36 AM, Chris Mcowen wrote: > Dear List, > > I have a data frame called trait with roughly 800 species in, each species have 15 columns of information: > > Species ? ? ? ? 1 ? ? ? 2 ? ? ? 3 ? ? ? etc.. > a ? ? ? ? ? ? ? ? ? ? ? t ? ? ? y ? ? ? h > b ? ? ? ? ? ? ? ? ? ? ? f ? ? ? j ? ? ? u > c ? ? ? ? ? ? ? ? ? ? ? r ? ? ? y ? ? ? u > > etc.. > > > I then have another data frame called com with the composition of species in each region, there are 506 different communities: > > community ? ? ? species > NA1102 ? ? ? ? ?a > NA1102 ? ? ? ? ?c > NA0402 ? ? ? ? ?b > NA0402 ? ? ? ? ?c > AT1302 ? ? ? ? ?a > AT1302 ? ? ? ? ?b > > etc.. > > > What i want to do is extract the information held in the first data frame for each community and save this as a new data frame. > > Resulting in : - > > community_NA1102 > > a ? ? ? ? ? ? ? ? ? ? ? t ? ? ? y ? ? ? h > c ? ? ? ? ? ? ? ? ? ? ? r ? ? ? y ? ? ? u > > community_NA0402 > > b ? ? ? ? ? ? ? ? ? ? ? 
f ? ? ? j ? ? ? u > c ? ? ? ? ? ? ? ? ? ? ? r ? ? ? y ? ? ? u > > Thanks in advance for any suggestions / code. > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? From Greg.Snow at imail.org Thu Jan 6 18:41:49 2011 From: Greg.Snow at imail.org (Greg Snow) Date: Thu, 6 Jan 2011 10:41:49 -0700 Subject: [R] Problem with 2-ways ANOVA interactions In-Reply-To: <498111.46032.qm@web57908.mail.re3.yahoo.com> References: <498111.46032.qm@web57908.mail.re3.yahoo.com> Message-ID: You really need to spend more time with a good aov textbook and probably a consultant who can explain things to you face to face. But here is a basic explanation to get you pointed in the right direction: Consider a simple 2x2 example with factors A and B each with 2 levels (1 and 2). Draw a 2x2 grid to represent this; there are 4 groups and the theory would be that they have means mu11, mu12, mu21, and mu22 (mu12 is for the group with A at level 1 and B at level 2, etc.). Now you fit the full model with 2 main effects and 1 interaction. If we assume treatment contrasts (the default in R; the coefficients/tests will be different for different contrasts, but the general idea is the same) then the intercept/mean/constant piece will correspond to mu11; the coefficient (only seen if treated as lm instead of aov object) for testing A will be (mu21-mu11) and for testing B will be (mu12-mu11). Now the interaction piece gets a bit more complex, it is (mu11 - mu12 - mu21 + mu22), this makes a bit more sense if we rearrange it to be one of ( (mu22-mu21) - (mu12-mu11) ) or ( (mu22-mu12) - (mu21-mu11) ); it represents the difference in the differences, i.e.
we find how much going from A1 to A2 changes things when B is 1, then we find how much going from A1 to A2 changes things when B is 2, then we find the difference in these changes, that is the interaction (and if it is 0, then the effects of A and B are additive and independent, i.e. the amount A changes things does not depend on the value of B and vice versa). So testing the interaction term is asking whether how much a change in A affects things depends on the value of B. This is very different from comparing mu11 to mu12 (or mu21 to mu22) which is what I think you did in the t-test, it is asking a very different question and using different base assumptions (ignoring any effect of B, additional data, etc.). Note that your test on condition is very significant, this would be more similar to your t-test, but still not match exactly because of the differences. Now your case is more complicated since stimulus has 7 levels (6 df), so the interaction is a combination of 6 different differences of differences, which is why you need to spend some time in a good textbook/class to really understand what model(s) you are fitting. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.snow at imail.org 801.408.8111 > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r- > project.org] On Behalf Of Frodo Jedi > Sent: Wednesday, January 05, 2011 4:10 PM > To: r-help at r-project.org > Subject: [R] Problem with 2-ways ANOVA interactions > > Dear All, > I have a problem in understanding how the interactions of 2 ways ANOVA > work, > because I get conflicting results > from a t-test and an anova. For most of you my problem is very simple I > am sure. > > I need an help with an example, looking at one table I am analyzing.
> The table > is in attachment > and can be imported in R by means of this command: > scrd<- > read.table('/Users/luca/Documents/Analisi_passi/Codice_R/Statistics_res > ults_bump_hole_Audio_Haptic/tables_for_R/table_realism_wood.txt', > header=TRUE, colClasse=c('numeric','factor','factor','numeric')) > > > This table is the result of a simple experiment. Subjects where exposed > to some > stimuli and they where asked to evaluate the degree of realism > of the stimuli on a 7 point scale (i.e., data in column "response"). > Each stimulus was presented in two conditions, "A" and "AH", where AH > is the > condition A plus another thing (let?s call it "H"). > > Now, what means exactly in my table the interaction stimulus:condition? > > I think that if I do the analysis anova(response ~ stimulus*condition) > I will > get the comparison between > > the same stimulus in condition A and in condition AH. Am I wrong? > > For instance the comparison of stimulus flat_550_W_realism presented in > condition A with the same stimulus, flat_550_W_realism, > > presented in condition AH. > > The problem is that if I do a t-test between the values of this > stimulus in the > A and AH condition I get significative difference, > while if I do the test with 2-ways ANOVA I don?t get any difference. > How is this possible? > > Here I put the results analysis > > > #Here the result of ANOVA: > > fit1<- lm(response ~ stimulus + condition + stimulus:condition, > data=scrd) > >#EQUIVALE A lm(response ~ stimulus*condition, data=scrd) > > > > anova(fit1) > Analysis of Variance Table > > Response: response > Df Sum Sq Mean Sq F value Pr(>F) > stimulus 6 15.05 2.509 1.1000 0.3647 > condition 1 36.51 36.515 16.0089 9.64e-05 *** > stimulus:condition 6 1.47 0.244 0.1071 0.9955 > Residuals 159 362.67 2.281 > --- > Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 > > > #As you can see the p-value for stimulus:condition is high. 
> > > #Now I do the t-test with the same values of the table concerning the > stimulus > presented in A and AH conditions: > > flat_550_W_realism > =c(3,3,5,3,3,3,3,5,3,3,5,7,5,2,3) > flat_550_W_realism_AH =c(7,4,5,3,6,5,3,5,5,7,2,7,5, > 5) > > > t.test(flat_550_W_realism,flat_550_W_realism_AH, var.equal=TRUE) > > Two Sample t-test > > data: flat_550_W_realism and flat_550_W_realism_AH > t = -2.2361, df = 27, p-value = 0.03381 > alternative hypothesis: true difference in means is not equal to 0 > 95 percent confidence interval: > -2.29198603 -0.09849016 > sample estimates: > mean of x mean of y > 3.733333 4.928571 > > > #Now we have a significative difference between these two stimuli (p- > value = > 0.03381) > > > > Why I get this beheaviour? > > > Moreover, how by means of ANOVA I could track the significative > differences > between the stimuli presented in A and AH condition > whitout doing the t-test? > > Please help! > > Thanks in advance > > > From xie at yihui.name Thu Jan 6 18:46:19 2011 From: xie at yihui.name (Yihui Xie) Date: Thu, 6 Jan 2011 11:46:19 -0600 Subject: [R] OT: Reducing pdf file size In-Reply-To: <0828E022-CB82-4FAD-8A18-8543805BCBCA@gmail.com> References: <0828E022-CB82-4FAD-8A18-8543805BCBCA@gmail.com> Message-ID: Has anyone succeeded in porting any PDF compression tools to R so far? Regards, Yihui -- Yihui Xie Phone: 515-294-2465 Web: http://yihui.name Department of Statistics, Iowa State University 2215 Snedecor Hall, Ames, IA On Wed, Jan 5, 2011 at 7:34 PM, Andrew Miles wrote: > I assume you mean PDFs generated by R. ?This topic has been addressed > here: http://tolstoy.newcastle.edu.au/R/e2/help/07/05/17475.html > > I have always just output the graphics then used an external PDF > program (like Preview on the Mac) to do changes in file type, size > reductions, etc. > > Andrew Miles > > On Jan 5, 2011, at 3:12 PM, Kurt_Helf at nps.gov wrote: > >> Greetings >> ? ? 
Does anyone have any suggestions for reducing pdf file size, >> particularly pdfs containing photos, without sacrificing quality? >> Thanks >> for any tips in advance. >> Cheers >> Kurt >> >> *************************************************************** >> Kurt Lewis Helf, Ph.D. >> Ecologist >> EEO Counselor >> National Park Service >> Cumberland Piedmont Network >> P.O. Box 8 >> Mammoth Cave, KY 42259 >> Ph: 270-758-2163 >> Lab: 270-758-2151 >> Fax: 270-758-2609 >> **************************************************************** From Greg.Snow at imail.org Thu Jan 6 19:07:17 2011 From: Greg.Snow at imail.org (Greg Snow) Date: Thu, 6 Jan 2011 11:07:17 -0700 Subject: [R] Assumptions for ANOVA: the right way to check the normality In-Reply-To: <262855.6389.qm@web57903.mail.re3.yahoo.com> References: <9521.53053.qm@web57907.mail.re3.yahoo.com> <262855.6389.qm@web57903.mail.re3.yahoo.com> Message-ID: Remember that a non-significant result (especially one that is still near alpha like yours) does not give evidence that the null is true. The reason that the 1st 2 tests below don't show significance is more due to lack of power than some of the residuals being normal. The only test that I would trust for this is SnowsPenultimateNormalityTest (TeachingDemos package, the help page is more useful than the function itself). But I think that you are mixing up 2 different concepts (a very common misunderstanding). What is important if we want to do normal theory inference is that the coefficients/effects/estimates are normally distributed. Now since these coefficients can be shown to be linear combinations of the error terms, if the errors are iid normal then the coefficients are also normally distributed. So many people want to show that the residuals come from a perfectly normal distribution. But it is the theoretical errors, not the observed residuals, that are important (the observed residuals are not iid).
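To illustrate the statement that the coefficients are linear combinations of the error terms, here is a minimal sketch in R. Everything in it is an assumption made for illustration: the two-group 0/1 design, the "true" coefficients in beta, and the skewed exponential errors are made up, and none of it uses the scrd data from this thread.

```r
# Illustrative only: made-up design and skewed (exponential) errors.
set.seed(1)
n    <- 40
x    <- rep(c(0, 1), each = n / 2)   # a two-level "condition" coded 0/1
X    <- cbind(1, x)                  # design matrix with intercept
beta <- c(3, 1)                      # hypothetical true coefficients
H    <- solve(crossprod(X), t(X))    # (X'X)^{-1} X', a fixed 2 x n matrix

err <- rexp(n) - 1                   # non-normal errors with mean 0
y   <- drop(X %*% beta) + err

# The least-squares coefficients equal beta plus a linear combination
# (given by H) of the error terms.
b_lm     <- unname(coef(lm(y ~ x)))
b_linear <- beta + drop(H %*% err)
all.equal(b_lm, b_linear)

# Repeating the experiment gives the sampling distribution of the slope,
# which can be inspected with a normal Q-Q plot.
slopes <- replicate(2000, {
  e <- rexp(n) - 1
  beta[2] + drop(H %*% e)[2]
})
qqnorm(slopes); qqline(slopes)
```

In a sketch like this the individual errors are clearly skewed, yet each coefficient is an average-like combination of all of them, so it is the distribution of the slopes, not of any single residual, that matters for the inference.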
You need to think about the source of your data to see if this is a reasonable assumption. Now I cannot fathom any universe (theoretical or real) in which normally distributed errors added to means that they are independent of will result in a finite set of integers, so an assumption of exact normality is not reasonable (some may want to argue this, but convincing me will be very difficult). But looking for exact normality is a bit of a red herring because we also have the Central Limit Theorem, which says that if the errors are not normal (but still iid) then the distribution of the coefficients will approach normality as the sample size increases. This is what makes statistics doable (because no real dataset entered into the computer is exactly normal). The more important question is: are the residuals "normal enough"? For that there is not a definitive test (experience and plots help). But this all depends on another assumption that I don't think that you have even considered. Yes we can use normal theory even when the random part of the data is not normally distributed, but this still assumes that the data is at least interval data, i.e. that we firmly believe that the difference between a response of 1 and a response of 2 is exactly the same as a difference between a 6 and a 7 and that the difference from 4 to 6 is exactly twice that of 1 vs. 2. From your data and other descriptions, I don't think that that is a reasonable assumption. If you are not willing to make that assumption (like me) then means and normal theory tests are meaningless and you should use other approaches. One possibility is to use non-parametric methods (which I believe Frank has already suggested you use), another is to use proportional odds logistic regression. -- Gregory (Greg) L. Snow Ph.D.
Statistical Data Center Intermountain Healthcare greg.snow at imail.org 801.408.8111 > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r- > project.org] On Behalf Of Frodo Jedi > Sent: Wednesday, January 05, 2011 3:22 PM > To: Robert Baer; r-help at r-project.org > Subject: Re: [R] Assumptions for ANOVA: the right way to check the > normality > > Dear Robert, > thanks so much!!! Now I understand! > So you also think that I have to check only the residuals and not the > data > directly. > Now just for curiosity I did the the shapiro test on the residuals. The > problem > is that on fit3 I don?t get from the test > that the data are normally distribuited. Why? Here the data: > > > shapiro.test(residuals(fit1)) > > Shapiro-Wilk normality test > > data: residuals(fit1) > W = 0.9848, p-value = 0.05693 > > #Here the test is ok: the test says that the data are distributed > normally > (p-value greather than 0.05) > > > > > shapiro.test(residuals(fit2)) > > Shapiro-Wilk normality test > > data: residuals(fit2) > W = 0.9853, p-value = 0.06525 > > #Here the test is ok: the test says that the data are distributed > normally > (p-value greather than 0.05) > > > > > shapiro.test(residuals(fit3)) > > Shapiro-Wilk normality test > > data: residuals(fit3) > W = 0.9621, p-value = 0.0001206 > > > > Now the test reveals p-value lower than 0.05: so the residuals for fit3 > are not > distributed normally.... > Why I get this beheaviour? Indeed in the histogram and Q-Q plot for > fit3 > residuals I get a normal distribution. > > > > > > > > > > > > > > > > > ________________________________ > From: Robert Baer > > Sent: Wed, January 5, 2011 8:56:50 PM > Subject: Re: [R] Assumptions for ANOVA: the right way to check the > normality > > > Someone suggested me that I don?t have to check the normality of the > data, but > > the normality of the residuals I get after the fitting of the linear > model. 
> > I really ask you to help me to understand this point as I don?t find > enough > > material online where to solve it. > > Try the following: > # using your scrd data and your proposed models > fit1<- lm(response ~ stimulus + condition + stimulus:condition, > data=scrd) > fit2<- lm(response ~ stimulus + condition, data=scrd) > fit3<- lm(response ~ condition, data=scrd) > > # Set up for 6 plots on 1 panel > op = par(mfrow=c(2,3)) > > # residuals function extracts residuals > # Visual inspection is a good start for checking normality > # You get a much better feel than from some "magic number" statistic > hist(residuals(fit1)) > hist(residuals(fit2)) > hist(residuals(fit3)) > > # especially qqnorm() plots which are linear for normal data > qqnorm(residuals(fit1)) > qqnorm(residuals(fit2)) > qqnorm(residuals(fit3)) > > # Restore plot parameters > par(op) > > > > > If the data are not normally distributed I have to use the kruskal > wallys test > > and not the ANOVA...so please help > > me to understand. > > Indeed - Kruskal-Wallis is a good test to use for one factor data that > is > ordinal so it is a good alternative to your fit3. > Your "response" seems to be a discrete variable rather than a > continuous > variable. > You must decide if it is reasonable to approximate it with a normal > distribution > which is by definition continuous. > > > > > I make a numerical example, could you please tell me if the data in > this table > > are normally distributed or not? > > > > Help! 
> > > > > > number stimulus condition response > > 1 flat_550_W_realism A 3 > > 2 flat_550_W_realism A 3 > > 3 flat_550_W_realism A 5 > > 4 flat_550_W_realism A 3 > > 5 flat_550_W_realism A 3 > > 6 flat_550_W_realism A 3 > > 7 flat_550_W_realism A 3 > > 8 flat_550_W_realism A 5 > > 9 flat_550_W_realism A 3 > > 10 flat_550_W_realism A 3 > > 11 flat_550_W_realism A 5 > > 12 flat_550_W_realism A 7 > > 13 flat_550_W_realism A 5 > > 14 flat_550_W_realism A 2 > > 15 flat_550_W_realism A 3 > > 16 flat_550_W_realism AH 7 > > 17 flat_550_W_realism AH 4 > > 18 flat_550_W_realism AH 5 > > 19 flat_550_W_realism AH 3 > > 20 flat_550_W_realism AH 6 > > 21 flat_550_W_realism AH 5 > > 22 flat_550_W_realism AH 3 > > 23 flat_550_W_realism AH 5 > > 24 flat_550_W_realism AH 5 > > 25 flat_550_W_realism AH 7 > > 26 flat_550_W_realism AH 2 > > 27 flat_550_W_realism AH 7 > > 28 flat_550_W_realism AH 5 > > 29 flat_550_W_realism AH 5 > > 30 bump_2_step_W_realism A 1 > > 31 bump_2_step_W_realism A 3 > > 32 bump_2_step_W_realism A 5 > > 33 bump_2_step_W_realism A 1 > > 34 bump_2_step_W_realism A 3 > > 35 bump_2_step_W_realism A 2 > > 36 bump_2_step_W_realism A 5 > > 37 bump_2_step_W_realism A 4 > > 38 bump_2_step_W_realism A 4 > > 39 bump_2_step_W_realism A 4 > > 40 bump_2_step_W_realism A 4 > > 41 bump_2_step_W_realism AH 3 > > 42 bump_2_step_W_realism AH 5 > > 43 bump_2_step_W_realism AH 1 > > 44 bump_2_step_W_realism AH 5 > > 45 bump_2_step_W_realism AH 4 > > 46 bump_2_step_W_realism AH 4 > > 47 bump_2_step_W_realism AH 5 > > 48 bump_2_step_W_realism AH 4 > > 49 bump_2_step_W_realism AH 3 > > 50 bump_2_step_W_realism AH 4 > > 51 bump_2_step_W_realism AH 5 > > 52 bump_2_step_W_realism AH 4 > > 53 hole_2_step_W_realism A 3 > > 54 hole_2_step_W_realism A 3 > > 55 hole_2_step_W_realism A 4 > > 56 hole_2_step_W_realism A 1 > > 57 hole_2_step_W_realism A 4 > > 58 hole_2_step_W_realism A 3 > > 59 hole_2_step_W_realism A 5 > > 60 hole_2_step_W_realism A 4 > > 61 hole_2_step_W_realism A 3 > > 62 
hole_2_step_W_realism A 4 > > 63 hole_2_step_W_realism A 7 > > 64 hole_2_step_W_realism A 5 > > 65 hole_2_step_W_realism A 1 > > 66 hole_2_step_W_realism A 4 > > 67 hole_2_step_W_realism AH 7 > > 68 hole_2_step_W_realism AH 5 > > 69 hole_2_step_W_realism AH 5 > > 70 hole_2_step_W_realism AH 1 > > 71 hole_2_step_W_realism AH 5 > > 72 hole_2_step_W_realism AH 5 > > 73 hole_2_step_W_realism AH 5 > > 74 hole_2_step_W_realism AH 2 > > 75 hole_2_step_W_realism AH 6 > > 76 hole_2_step_W_realism AH 5 > > 77 hole_2_step_W_realism AH 5 > > 78 hole_2_step_W_realism AH 6 > > 79 bump_2_heel_toe_W_realism A 3 > > 80 bump_2_heel_toe_W_realism A 3 > > 81 bump_2_heel_toe_W_realism A 3 > > 82 bump_2_heel_toe_W_realism A 2 > > 83 bump_2_heel_toe_W_realism A 3 > > 84 bump_2_heel_toe_W_realism A 3 > > 85 bump_2_heel_toe_W_realism A 4 > > 86 bump_2_heel_toe_W_realism A 3 > > 87 bump_2_heel_toe_W_realism A 4 > > 88 bump_2_heel_toe_W_realism A 4 > > 89 bump_2_heel_toe_W_realism A 6 > > 90 bump_2_heel_toe_W_realism A 5 > > 91 bump_2_heel_toe_W_realism A 4 > > 92 bump_2_heel_toe_W_realism AH 7 > > 93 bump_2_heel_toe_W_realism AH 3 > > 94 bump_2_heel_toe_W_realism AH 4 > > 95 bump_2_heel_toe_W_realism AH 2 > > 96 bump_2_heel_toe_W_realism AH 5 > > 97 bump_2_heel_toe_W_realism AH 6 > > 98 bump_2_heel_toe_W_realism AH 4 > > 99 bump_2_heel_toe_W_realism AH 4 > > 100 bump_2_heel_toe_W_realism AH 4 > > 101 bump_2_heel_toe_W_realism AH 5 > > 102 bump_2_heel_toe_W_realism AH 2 > > 103 bump_2_heel_toe_W_realism AH 6 > > 104 bump_2_heel_toe_W_realism AH 5 > > 105 hole_2_heel_toe_W_realism A 3 > > 106 hole_2_heel_toe_W_realism A 3 > > 107 hole_2_heel_toe_W_realism A 1 > > 108 hole_2_heel_toe_W_realism A 3 > > 109 hole_2_heel_toe_W_realism A 3 > > 110 hole_2_heel_toe_W_realism A 5 > > 111 hole_2_heel_toe_W_realism A 2 > > 112 hole_2_heel_toe_W_realism AH 5 > > 113 hole_2_heel_toe_W_realism AH 1 > > 114 hole_2_heel_toe_W_realism AH 3 > > 115 hole_2_heel_toe_W_realism AH 6 > > 116 
hole_2_heel_toe_W_realism AH 5 > > 117 hole_2_heel_toe_W_realism AH 4 > > 118 hole_2_heel_toe_W_realism AH 4 > > 119 hole_2_heel_toe_W_realism AH 3 > > 120 hole_2_heel_toe_W_realism AH 3 > > 121 hole_2_heel_toe_W_realism AH 1 > > 122 hole_2_heel_toe_W_realism AH 5 > > 123 bump_2_combination_W_realism A 4 > > 124 bump_2_combination_W_realism A 2 > > 125 bump_2_combination_W_realism A 4 > > 126 bump_2_combination_W_realism A 1 > > 127 bump_2_combination_W_realism A 4 > > 128 bump_2_combination_W_realism A 4 > > 129 bump_2_combination_W_realism A 2 > > 130 bump_2_combination_W_realism A 4 > > 131 bump_2_combination_W_realism A 2 > > 132 bump_2_combination_W_realism A 4 > > 133 bump_2_combination_W_realism A 2 > > 134 bump_2_combination_W_realism A 6 > > 135 bump_2_combination_W_realism AH 7 > > 136 bump_2_combination_W_realism AH 3 > > 137 bump_2_combination_W_realism AH 4 > > 138 bump_2_combination_W_realism AH 1 > > 139 bump_2_combination_W_realism AH 6 > > 140 bump_2_combination_W_realism AH 5 > > 141 bump_2_combination_W_realism AH 5 > > 142 bump_2_combination_W_realism AH 6 > > 143 bump_2_combination_W_realism AH 5 > > 144 bump_2_combination_W_realism AH 4 > > 145 bump_2_combination_W_realism AH 2 > > 146 bump_2_combination_W_realism AH 4 > > 147 bump_2_combination_W_realism AH 2 > > 148 bump_2_combination_W_realism AH 5 > > 149 hole_2_combination_W_realism A 5 > > 150 hole_2_combination_W_realism A 2 > > 151 hole_2_combination_W_realism A 4 > > 152 hole_2_combination_W_realism A 1 > > 153 hole_2_combination_W_realism A 5 > > 154 hole_2_combination_W_realism A 4 > > 155 hole_2_combination_W_realism A 3 > > 156 hole_2_combination_W_realism A 5 > > 157 hole_2_combination_W_realism A 2 > > 158 hole_2_combination_W_realism A 5 > > 159 hole_2_combination_W_realism A 5 > > 160 hole_2_combination_W_realism A 1 > > 161 hole_2_combination_W_realism AH 7 > > 162 hole_2_combination_W_realism AH 5 > > 163 hole_2_combination_W_realism AH 3 > > 164 hole_2_combination_W_realism 
AH 1 > > 165 hole_2_combination_W_realism AH 6 > > 166 hole_2_combination_W_realism AH 4 > > 167 hole_2_combination_W_realism AH 7 > > 168 hole_2_combination_W_realism AH 5 > > 169 hole_2_combination_W_realism AH 5 > > 170 hole_2_combination_W_realism AH 2 > > 171 hole_2_combination_W_realism AH 6 > > 172 hole_2_combination_W_realism AH 2 > > 173 hole_2_combination_W_realism AH 4 > > > > > > > > > > Thanks in advance > > > > > > > > [[alternative HTML version deleted]] > > > > > > > > > ______________________________________________ > > R-help at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > > > > > [[alternative HTML version deleted]] From amelia_vettori at yahoo.co.nz Thu Jan 6 19:07:33 2011 From: amelia_vettori at yahoo.co.nz (Amelia Vettori) Date: Thu, 6 Jan 2011 10:07:33 -0800 (PST) Subject: [R] Calcuting returns Message-ID: <568079.93841.qm@web121408.mail.ne1.yahoo.com> Dear R forum helpers,I have following datatrans <- data.frame(currency_transacted = c("EURO", "USD", "USD", "GBP", "USD", "AUD"), position_amt = c(10000, 25000, 20000, 15000, 22000, 30000))date <- c("12/31/2010", "12/30/2010", "12/29/2010", "12/28/2010", "12/27/2010", "12/24/2010", "12/23/2010", "12/22/2010", "12/21/2010", "12/20/2010")USD <- c(112.05, 112.9, 110.85, 109.63, 108.08, 111.23, 112.49, 108.87, 109.33, 111.88)GBP <- c(171.52, 168.27,169.03, 169.64, 169.29, 169.47, 170.9, 168.69, 170.9, 169.96)EURO <- c(42.71, 42.68, 41.86, 44.71, 44.14, 44.58, 41.07, 42.23, 44.55, 41.12)CHF <- c(41.5, 41.47, 42.84, 43.44, 43.69, 42.3, 42.05, 41.23, 42.76, 43.79)AUD <- c(109.55, 102.52, 114.91, 122.48, 122.12, 123.96, 100.36, 110.19, 121.58, 103.46)These are the exchange rates and I am trying calculating the returns. 
I am giving only a small portion of actual data as I can't send this as an attachment.Actually, I have stored these as csv files (i.e. as transactions.csv and currency.csv files respectively) in my From Greg.Snow at imail.org Thu Jan 6 19:09:20 2011 From: Greg.Snow at imail.org (Greg Snow) Date: Thu, 6 Jan 2011 11:09:20 -0700 Subject: [R] Splitting a Vector In-Reply-To: References: Message-ID: I think that you are looking for the 'resid' and 'fitted' functions, these will give you the residuals and fitted values from an lm object (that added together gives the original response but are orthogonal to each other). Those values can then be assigned to a data frame or used by themselves. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.snow at imail.org 801.408.8111 > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r- > project.org] On Behalf Of Ben Ward > Sent: Thursday, January 06, 2011 6:09 AM > To: r-help > Subject: [R] Splitting a Vector > > Hi all, > > I read in a text book, that you can examine a variable that is colinear > with others, and giving different ANOVA output and explanatory power > when ordered differently in the model forula, by modelling that > explanatory variable, against the others colinear with it. Then, using > that information to split the vector (explanatory variable) in > question, > into two new vectors, one should correspond to the fitted values and > one > the residuals of the (I think you could call it nested) model. One > vector therefore should be aligned with the subspacespace defined by > the > other variables colinear with it, and the other will be residual, and > so > orthogonal to the subspace of the colinear variables. 
Then by including > these two variables in the origional model - the one that showed the > order dependency, you can see how much explanatory power the othogonal > part of the order dependent variable has, at different orders, and in > principle it shouldn't change, but the vector made from the part > co-aligned with the co-variates, will change as the order changes - > it's > explanatory power should decreace in ANOVA is it moves away from being > the first explanatory variable in the model. > > Obviously finding the fitted model values and residual required to > split > the vector in two is a simple lm() with the right variables. But how > would I create two new vectors from this and append them to my > dataframe? Is there a package or function specially designed with this > sort of task in mind? > > Thanks, > Ben Ward. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. From pburns at pburns.seanet.com Thu Jan 6 19:12:42 2011 From: pburns at pburns.seanet.com (Patrick Burns) Date: Thu, 06 Jan 2011 18:12:42 +0000 Subject: [R] Calcuting returns In-Reply-To: <568079.93841.qm@web121408.mail.ne1.yahoo.com> References: <568079.93841.qm@web121408.mail.ne1.yahoo.com> Message-ID: <4D26061A.6@pburns.seanet.com> I'm guessing this page will answer your question: http://www.portfolioprobe.com/2010/10/04/a-tale-of-two-returns/ If not, then you need to be more specific. 
On 06/01/2011 18:07, Amelia Vettori wrote: > Dear R forum helpers,I have following datatrans<- data.frame(currency_transacted = c("EURO", "USD", "USD", "GBP", "USD", "AUD"), position_amt = c(10000, 25000, 20000, 15000, 22000, 30000))date<- c("12/31/2010", "12/30/2010", "12/29/2010", "12/28/2010", "12/27/2010", "12/24/2010", "12/23/2010", "12/22/2010", "12/21/2010", "12/20/2010")USD<- c(112.05, 112.9, 110.85, 109.63, 108.08, 111.23, 112.49, 108.87, 109.33, 111.88)GBP<- c(171.52, 168.27,169.03, 169.64, 169.29, 169.47, 170.9, 168.69, 170.9, 169.96)EURO<- c(42.71, 42.68, 41.86, 44.71, 44.14, 44.58, 41.07, 42.23, 44.55, 41.12)CHF<- c(41.5, 41.47, 42.84, 43.44, 43.69, 42.3, 42.05, 41.23, 42.76, 43.79)AUD<- c(109.55, 102.52, 114.91, 122.48, 122.12, 123.96, 100.36, 110.19, 121.58, 103.46)These are the exchange rates and I am trying calculating the returns. I am giving only a small portion of actual data as I can't send this as an attachment.Actually, I have stored these as csv files (i.e. as > transactions.csv and currency.csv files respectively) in my > > > > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Patrick Burns pburns at pburns.seanet.com twitter: @portfolioprobe http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of 'Some hints for the R beginner' and 'The R Inferno') From albertonegron at gmail.com Thu Jan 6 19:44:07 2011 From: albertonegron at gmail.com (altons) Date: Thu, 6 Jan 2011 10:44:07 -0800 (PST) Subject: [R] Problem with package twitteR and converting S4 obj to data frame Message-ID: <1294339447659-3177948.post@n4.nabble.com> Hi, I wrote a simple script to retrieve an n number of followers for a given user in Twitter. 
I used a sample of n=10 to test my script and it worked perfectly, but once I started to change n I started to get the following error:

Error in list_to_dataframe(res, attr(.data, "split_labels")) :
  Results do not have equal lengths

I have no clue why it works fine with n=10 but falls over for higher values. Any suggestions? Below is my script. Thanks, Alberto

library(plyr)
library(twitteR)
getuser <- getUser('altons')
count <- 50
UserFollowers <- userFollowers(getuser, n=count, session = getCurlHandle())
dffollow <- ldply(userFollowers(getuser, n=count, session = getCurlHandle()),
                  function(x) c(screenName=x@screenName
                                ,name=x@name
                                ,statusesCount=x@statusesCount
                                ,followersCount=x@followersCount
                                ,friendsCount=x@friendsCount
                                ,TwitterBirth=x@created
                                ,description=x@description
                                ,location=x@location)
                  )

-- View this message in context: http://r.789695.n4.nabble.com/Problem-with-package-twitteR-and-converting-S4-obj-to-data-frame-tp3177948p3177948.html Sent from the R help mailing list archive at Nabble.com.
From kiotoqq at googlemail.com Thu Jan 6 19:23:37 2011 From: kiotoqq at googlemail.com (kiotoqq) Date: Thu, 6 Jan 2011 10:23:37 -0800 (PST) Subject: [R] need help for chi-squared test Message-ID: <1294338217815-3177925.post@n4.nabble.com>
I've got a dataset which looks like this in the beginning:

  cbr dust smoking expo
1   0 0.20       1    5
2   0 0.25       1    4
3   0 0.25       1    8
4   0 0.25       1    4
5   0 0.25       1    4

(till no. 1240, anyway, a huge set) I have to analyse cbr and smoking. I know it works with chisq.test() for the whole set, but I only need cbr and smoking, and I have no idea how to extract them.

-- View this message in context: http://r.789695.n4.nabble.com/need-help-for-chi-squared-test-tp3177925p3177925.html Sent from the R help mailing list archive at Nabble.com.
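For the cbr/smoking question just above, one possible approach is sketched here. It is only a sketch under stated assumptions: the data frame name dat is hypothetical, the values are simulated stand-ins (only the column names cbr, dust, smoking and expo come from the message), and both variables are assumed to be coded 0/1.

```r
# Hypothetical stand-in for the poster's data: only the column names
# cbr, dust, smoking and expo come from the message; the values are made up.
set.seed(42)
dat <- data.frame(
  cbr     = rbinom(200, 1, 0.3),
  dust    = round(runif(200, 0.2, 8), 2),
  smoking = rbinom(200, 1, 0.6),
  expo    = sample(3:40, 200, replace = TRUE)
)

# Extract the two columns of interest with $ and cross-tabulate them
tab <- table(cbr = dat$cbr, smoking = dat$smoking)
tab

# Chi-squared test of independence on the resulting 2x2 table
res <- chisq.test(tab)
res$p.value

# Equivalent shortcut: pass the two vectors directly
chisq.test(dat$cbr, dat$smoking)
```

Individual columns can also be selected by name with dat[, c("cbr", "smoking")] if a two-column data frame is wanted rather than two vectors.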
From gunter.berton at gene.com Thu Jan 6 19:56:34 2011 From: gunter.berton at gene.com (Bert Gunter) Date: Thu, 6 Jan 2011 10:56:34 -0800 Subject: [R] Waaaayy off topic...Statistical methods, pub bias, scientific validity Message-ID: Folks: The following has NOTHING (obvious) to do with R. But I believe that all on this list would find it relevant and, I hope, informative. It is LONG. I apologize in advance to those who feel I have wasted their time. http://www.newyorker.com/reporting/2010/12/13/101213fa_fact_lehrer Best regards to all, Bert -- Bert Gunter Genentech Nonclinical Biostatistics From ying.zhang at struq.com Thu Jan 6 20:11:03 2011 From: ying.zhang at struq.com (ying zhang) Date: Thu, 6 Jan 2011 19:11:03 -0000 Subject: [R] JRI & plot( ) Message-ID: <020f01cbadd5$77537ab0$65fa7010$@struq.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From maechler at stat.math.ethz.ch Thu Jan 6 20:16:48 2011 From: maechler at stat.math.ethz.ch (Martin Maechler) Date: Thu, 6 Jan 2011 20:16:48 +0100 Subject: [R] defining a formula method for a weighted lm() In-Reply-To: <4D25D2B5.6020005@yorku.ca> References: <4D0A259B.7010105@yorku.ca> <4D25D2B5.6020005@yorku.ca> Message-ID: <19750.5408.744169.57312@lynne.math.ethz.ch> >>>>> Michael Friendly >>>>> on Thu, 06 Jan 2011 09:33:25 -0500 writes: > No one replied to this, so I'll try again, with a simple example. 
I > calculate a set of log odds ratios, and turn them into a data frame as > follows: >> library(vcdExtra) >> (lor.CM <- loddsratio(CoalMiners)) > log odds ratios for Wheeze and Breathlessness by Age > 25-29 30-34 35-39 40-44 45-49 50-54 55-59 60-64 > 3.695261 3.398339 3.140658 3.014687 2.782049 2.926395 2.440571 2.637954 >> >> (lor.CM.df <- as.data.frame(lor.CM)) > Wheeze Breathlessness Age LOR ASE > 1 W:NoW B:NoB 25-29 3.695261 0.16471778 > 2 W:NoW B:NoB 30-34 3.398339 0.07733658 > 3 W:NoW B:NoB 35-39 3.140658 0.03341311 > 4 W:NoW B:NoB 40-44 3.014687 0.02866111 > 5 W:NoW B:NoB 45-49 2.782049 0.01875164 > 6 W:NoW B:NoB 50-54 2.926395 0.01585918 > 7 W:NoW B:NoB 55-59 2.440571 0.01452057 > 8 W:NoW B:NoB 60-64 2.637954 0.02159903 > Now I want to fit a linear model by WLS, LOR ~ Age, which can do directly as >> lm(LOR ~ as.numeric(Age), weights=1/ASE, data=lor.CM.df) > Call: > lm(formula = LOR ~ as.numeric(Age), data = lor.CM.df, weights = 1/ASE) > Coefficients: > (Intercept) as.numeric(Age) > 3.5850 -0.1376 > But, I want to do the fitting in my own function, the simplest version is > my.lm <- function(formula, data, subset, weights) { > lm(formula, data, subset, weights) > } > But there is obviously some magic about formula objects and evaluation > environments, because I don't understand why this doesn't work. >> my.lm(LOR ~ as.numeric(Age), weights=1/ASE, data=lor.CM.df) > Error in model.frame.default(formula = formula, data = data, subset = > subset, : > invalid type (closure) for variable '(weights)' >> Yes, the "magic" has been called "standard non-standard evaluation" for a while (since August 2002, to be precise), and the http://developer.r-project.org/ web page has had two very relevant links since then, namely those mentioned in the following two lines there: ---------------------------- # Description of the nonstandard evaluation rules in R 1.5.1 and some suggestions. (updated). Also an R function and docn for making model frames from multiple formulas. 
# Notes on model-fitting functions in R, and especially on how to enable all the safety features.
----------------------------

For what you want, I think (but haven't tried) the second link,
which is http://developer.r-project.org/model-fitting-functions.txt
is still very relevant.

Many many people (package authors) had to use something like
that or just directly taken the lm function as an example..
{{ but then probably failed the more subtle points on how to
   program residuals() , predict() , etc functions which you can
   also learn from model-fitting-functions.txt}}

> A second question: Age is a factor, and as.numeric(Age) gives me 1:8.
> What simple expression on lor.CM.df$Age would give me either the lower
> limits (here: seq(25, 60, by = 5)) or midpoints of these Age intervals
> (here: seq(27, 62, by = 5))?

With
  data(CoalMiners, package = "vcd")

here are some variations :

  > (Astr <- dimnames(CoalMiners)[[3]])
  [1] "25-29" "30-34" "35-39" "40-44" "45-49" "50-54" "55-59" "60-64"
  > sapply(lapply(strsplit(Astr, "-"), as.numeric), `[[`, 1)
  [1] 25 30 35 40 45 50 55 60
  > sapply(lapply(strsplit(Astr, "-"), as.numeric), `[[`, 2)
  [1] 29 34 39 44 49 54 59 64
  > sapply(lapply(strsplit(Astr, "-"), as.numeric), mean)
  [1] 27 32 37 42 47 52 57 62

Or use the 2-row matrix and apply(*, 1) to that :

  > sapply(strsplit(Astr, "-"), as.numeric)
       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
  [1,]   25   30   35   40   45   50   55   60
  [2,]   29   34   39   44   49   54   59   64

Regards,
Martin Maechler, ETH Zurich

From kw.stat at gmail.com Thu Jan 6 20:19:38 2011
From: kw.stat at gmail.com (Kevin Wright)
Date: Thu, 6 Jan 2011 13:19:38 -0600
Subject: [R] Where is a package NEWS.Rd located?
In-Reply-To:
References:
Message-ID:

Yes, exactly. But the problem is with NEWS.Rd, not NEWS.

pkg/inst/NEWS.Rd is moved to pkg/NEWS.Rd at build time, but for
installed packages, "news" tried to load "pkg/inst/NEWS.Rd".

I'm going to file a bug report.
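A quick way to see where NEWS.Rd actually ends up in an installed package is to test both candidate paths. This is only a sketch: news_rd_locations is a made-up helper name, not something in R or in the tools package, and "stats" is used merely because it is always installed.

```r
# Sketch: report where NEWS.Rd sits inside an *installed* package.
# `news_rd_locations` is a hypothetical helper, not part of R or tools.
news_rd_locations <- function(pkg) {
  root <- system.file(package = pkg)  # "" when the package is not installed
  c(top_level  = file.exists(file.path(root, "NEWS.Rd")),
    under_inst = file.exists(file.path(root, "inst", "NEWS.Rd")))
}

# "stats" ships with R, so this call runs anywhere; substitute your own package.
locs <- news_rd_locations("stats")
print(locs)
```

Running it on the package in question shows which of the two paths exists after installation, which is the kind of evidence a bug report would need.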
Kevin On Thu, Jan 6, 2011 at 7:29 AM, Kevin Wright wrote: > If you look at tools:::.build_news_db, the plain text NEWS file is > searched for in pkg/NEWS and pkg/inst/NEWS, but NEWS.Rd in only > searched for in pkg/inst/NEWS.Rd. > > Looks like a bug to me. > > I *think*. > > Thanks, > > Kevin > > > On Thu, Jan 6, 2011 at 7:09 AM, Kevin Wright wrote: >> Hopefully a quick question. ?My package has a NEWS.Rd file that is not >> being found by "news". >> >> The "news" function calls "tools:::.build_news_db" which has this line: >> >> nfile <- file.path(dir, "inst", "NEWS.Rd") >> >> So it appears that the "news" function is searching for >> "mypackage/inst/NEWS.Rd". >> >> However, "Writing R extensions" says "The contents of the inst >> subdirectory will be copied recursively to the installation directory" >> >> During the installation, mypackage/inst/NEWS.Rd is copied into the >> "mypackage" directory, not "mypackage/inst". >> >> What am I doing wrong, or is this a bug? >> >> Kevin Wright >> >> >> >> -- >> Kevin Wright >> > > > > -- > Kevin Wright > -- Kevin Wright From Art.Burke at educationnorthwest.org Thu Jan 6 20:20:34 2011 From: Art.Burke at educationnorthwest.org (Art Burke) Date: Thu, 6 Jan 2011 11:20:34 -0800 Subject: [R] Help spruce up a ggplot graph Message-ID: <2A85392715B78D4E87D89D4AD8199C0E0E5F1656D5@W0803.EducationNorthWest.Local> An embedded and charset-unspecified text was scrubbed... 
Name: not available
URL:

From dwinsemius at comcast.net Thu Jan 6 20:24:27 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Thu, 6 Jan 2011 14:24:27 -0500
Subject: [R] need help for chi-squared test
In-Reply-To: <1294338217815-3177925.post@n4.nabble.com>
References: <1294338217815-3177925.post@n4.nabble.com>
Message-ID: <22E28994-346B-4DA6-8255-B0A6BFD399F8@comcast.net>

On Jan 6, 2011, at 1:23 PM, kiotoqq wrote:

>
> I've got a dataset which looks like this in the beginning:
>
>
>   cbr dust smoking expo
> 1   0 0.20       1    5
> 2   0 0.25       1    4
> 3   0 0.25       1    8
> 4   0 0.25       1    4
> 5   0 0.25       1    4
>
> (till no. 1240, anyway, a huge set)
>
> I have to analyse cbr and smoking, I know it works with chisq.test()
> for the
> whole set, but I only need cbr and smoking, and I have no idea how to
> extract them.

This is not a sufficiently complex example on which to offer a
solution, nor is it even clear enough to understand definitively what
you want. So here is a guess:

dfrm[which(dfrm$cbr==1 & dfrm$smoking==1), ]

... which would return a dataframe (or matrix depending on what form
that data exists in) with only those cases where those two conditions
hold. You can either assign this value to an R object or you can apply
the chisq.test() in whatever (unstated) manner you think has been
"working" to the returned value as a whole.

Please read the message at the bottom and follow its encouragement to
read the Posting Guide.

> --
> View this message in context: http://r.789695.n4.nabble.com/need-help-for-chi-squared-test-tp3177925p3177925.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
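If the question is read instead as "test cbr against smoking and ignore the other columns" (an interpretation; the poster does not confirm it), a minimal sketch is to cross-tabulate just those two columns and hand the table to chisq.test(). The data frame below is invented for illustration, since only five rows of the real file are shown; the column names follow the poster's listing.

```r
# Made-up stand-in for the poster's data (the real dust.asc is not available).
# Only the column names cbr, smoking and expo come from the original post.
dust <- data.frame(
  cbr     = rep(c(0, 1, 0, 1), times = c(30, 20, 25, 40)),
  smoking = rep(c(0, 0, 1, 1), times = c(30, 20, 25, 40)),
  expo    = seq_len(115)          # extra column, ignored by the test below
)

# Cross-tabulate only the two variables of interest ...
tab <- table(cbr = dust$cbr, smoking = dust$smoking)
print(tab)

# ... and test independence on the resulting 2 x 2 table.
res <- chisq.test(tab)
print(res)

# Equivalent call: pass the two columns directly as x and y.
res2 <- chisq.test(dust$cbr, dust$smoking)
```

Passing the whole data frame, by contrast, makes chisq.test() treat every column as part of one large contingency table, which is rarely what is wanted.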
David Winsemius, MD West Hartford, CT From murdoch.duncan at gmail.com Thu Jan 6 20:29:52 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Thu, 06 Jan 2011 14:29:52 -0500 Subject: [R] Where is a package NEWS.Rd located? In-Reply-To: References: Message-ID: <4D261830.8000701@gmail.com> On 06/01/2011 2:19 PM, Kevin Wright wrote: > Yes, exactly. But the problem is with NEWS.Rd, not NEWS. I'm not sure who you are arguing with, but if you do file a bug report, please also put together a simple reproducible example, e.g. a small package containing NEWS.Rd in the inst directory (which is where the docs say it should go) and code that shows why this is bad. Don't just talk about internal functions used for building packages; as far as we can tell so far tools:::.build_news_db is doing exactly what it should be doing. Duncan Murdoch > pkg/inst/NEWS.Rd is moved to pkg/NEWS.Rd at build time, but for > installed packages, "news" tried to load "pkg/inst/NEWS.Rd". > > I'm going to file a bug report. > > Kevin > > > On Thu, Jan 6, 2011 at 7:29 AM, Kevin Wright wrote: > > If you look at tools:::.build_news_db, the plain text NEWS file is > > searched for in pkg/NEWS and pkg/inst/NEWS, but NEWS.Rd in only > > searched for in pkg/inst/NEWS.Rd. > > > > Looks like a bug to me. > > > > I *think*. > > > > Thanks, > > > > Kevin > > > > > > On Thu, Jan 6, 2011 at 7:09 AM, Kevin Wright wrote: > >> Hopefully a quick question. My package has a NEWS.Rd file that is not > >> being found by "news". > >> > >> The "news" function calls "tools:::.build_news_db" which has this line: > >> > >> nfile<- file.path(dir, "inst", "NEWS.Rd") > >> > >> So it appears that the "news" function is searching for > >> "mypackage/inst/NEWS.Rd". 
> >> > >> However, "Writing R extensions" says "The contents of the inst > >> subdirectory will be copied recursively to the installation directory" > >> > >> During the installation, mypackage/inst/NEWS.Rd is copied into the > >> "mypackage" directory, not "mypackage/inst". > >> > >> What am I doing wrong, or is this a bug? > >> > >> Kevin Wright > >> > >> > >> > >> -- > >> Kevin Wright > >> > > > > > > > > -- > > Kevin Wright > > > > > From jrkrideau at yahoo.ca Thu Jan 6 20:53:43 2011 From: jrkrideau at yahoo.ca (John Kane) Date: Thu, 6 Jan 2011 11:53:43 -0800 (PST) Subject: [R] Accessing data via url Message-ID: <352569.97781.qm@web38406.mail.mud.yahoo.com> # Can anyone suggest why this works datafilename <- "http://personality-project.org/r/datasets/maps.mixx.epi.bfi.data" person.data <- read.table(datafilename,header=TRUE) # but this does not? dd <- "https://sites.google.com/site/jrkrideau/home/general-stores/trees.txt" treedata <- read.table(dd, header=TRUE) =================================================================== Error in file(file, "rt") : cannot open the connection In addition: Warning message: In file(file, "rt") : unsupported URL scheme # I can access both through a hyperlink in OOO Calc. t # Thanks From frodo.jedi at yahoo.com Thu Jan 6 20:36:39 2011 From: frodo.jedi at yahoo.com (Frodo Jedi) Date: Thu, 6 Jan 2011 11:36:39 -0800 (PST) Subject: [R] Problem with 2-ways ANOVA interactions In-Reply-To: References: <498111.46032.qm@web57908.mail.re3.yahoo.com> Message-ID: <948126.53730.qm@web57907.mail.re3.yahoo.com> An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From kiotoqq at googlemail.com Thu Jan 6 20:34:43 2011 From: kiotoqq at googlemail.com (kiotoqq) Date: Thu, 6 Jan 2011 11:34:43 -0800 (PST) Subject: [R] need help for chi-squared test In-Reply-To: <22E28994-346B-4DA6-8255-B0A6BFD399F8@comcast.net> References: <1294338217815-3177925.post@n4.nabble.com> <22E28994-346B-4DA6-8255-B0A6BFD399F8@comcast.net> Message-ID: <1294342483043-3178052.post@n4.nabble.com> I used chisq.test(read.table("C:/Users/Maggy/Downloads/dust.asc", header=TRUE)) and got this Pearson's Chi-squared test data: read.table("C:/Users/Maggy/Downloads/dust.asc", header = TRUE) X-squared = 5226.164, df = 3735, p-value < 2.2e-16 and I think it should be right for the whole set, but that's not what I need, because I only have to use it for "cbr" and "smoking" -- View this message in context: http://r.789695.n4.nabble.com/need-help-for-chi-squared-test-tp3177925p3178052.html Sent from the R help mailing list archive at Nabble.com. From frodo.jedi at yahoo.com Thu Jan 6 20:56:37 2011 From: frodo.jedi at yahoo.com (Frodo Jedi) Date: Thu, 6 Jan 2011 11:56:37 -0800 (PST) Subject: [R] Assumptions for ANOVA: the right way to check the normality In-Reply-To: References: <9521.53053.qm@web57907.mail.re3.yahoo.com> <262855.6389.qm@web57903.mail.re3.yahoo.com> Message-ID: <333057.17814.qm@web57904.mail.re3.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From andy_liaw at merck.com Thu Jan 6 21:06:54 2011 From: andy_liaw at merck.com (Liaw, Andy) Date: Thu, 6 Jan 2011 15:06:54 -0500 Subject: [R] Where is a package NEWS.Rd located? In-Reply-To: <4D261830.8000701@gmail.com> References: <4D261830.8000701@gmail.com> Message-ID: I was communicating with Kevin off-list. The problem seems to be run time, not install time. 
News() calls tools:::.build_news_db(), and the 2nd line of that function is: nfile <- file.path(dir, "inst", "NEWS.Rd") and that's the problem: an installed package shouldn't have an inst/ subdirectory, right? Andy > -----Original Message----- > From: r-help-bounces at r-project.org > [mailto:r-help-bounces at r-project.org] On Behalf Of Duncan Murdoch > Sent: Thursday, January 06, 2011 2:30 PM > To: Kevin Wright > Cc: R list > Subject: Re: [R] Where is a package NEWS.Rd located? > > On 06/01/2011 2:19 PM, Kevin Wright wrote: > > Yes, exactly. But the problem is with NEWS.Rd, not NEWS. > > I'm not sure who you are arguing with, but if you do file a > bug report, > please also put together a simple reproducible example, e.g. a small > package containing NEWS.Rd in the inst directory (which is where the > docs say it should go) and code that shows why this is bad. > Don't just > talk about internal functions used for building packages; as > far as we > can tell so far tools:::.build_news_db is doing exactly what > it should > be doing. > > Duncan Murdoch > > > pkg/inst/NEWS.Rd is moved to pkg/NEWS.Rd at build time, but for > > installed packages, "news" tried to load "pkg/inst/NEWS.Rd". > > > > I'm going to file a bug report. > > > > Kevin > > > > > > On Thu, Jan 6, 2011 at 7:29 AM, Kevin > Wright wrote: > > > If you look at tools:::.build_news_db, the plain text > NEWS file is > > > searched for in pkg/NEWS and pkg/inst/NEWS, but NEWS.Rd in only > > > searched for in pkg/inst/NEWS.Rd. > > > > > > Looks like a bug to me. > > > > > > I *think*. > > > > > > Thanks, > > > > > > Kevin > > > > > > > > > On Thu, Jan 6, 2011 at 7:09 AM, Kevin > Wright wrote: > > >> Hopefully a quick question. My package has a NEWS.Rd > file that is not > > >> being found by "news". 
> > >> > > >> The "news" function calls "tools:::.build_news_db" > which has this line: > > >> > > >> nfile<- file.path(dir, "inst", "NEWS.Rd") > > >> > > >> So it appears that the "news" function is searching for > > >> "mypackage/inst/NEWS.Rd". > > >> > > >> However, "Writing R extensions" says "The contents of the inst > > >> subdirectory will be copied recursively to the > installation directory" > > >> > > >> During the installation, mypackage/inst/NEWS.Rd is > copied into the > > >> "mypackage" directory, not "mypackage/inst". > > >> > > >> What am I doing wrong, or is this a bug? > > >> > > >> Kevin Wright > > >> > > >> > > >> > > >> -- > > >> Kevin Wright > > >> > > > > > > > > > > > > -- > > > Kevin Wright > > > > > > > > > > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > Notice: This e-mail message, together with any attachme...{{dropped:11}} From dwinsemius at comcast.net Thu Jan 6 21:14:16 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Thu, 6 Jan 2011 15:14:16 -0500 Subject: [R] need help for chi-squared test In-Reply-To: <1294342483043-3178052.post@n4.nabble.com> References: <1294338217815-3177925.post@n4.nabble.com> <22E28994-346B-4DA6-8255-B0A6BFD399F8@comcast.net> <1294342483043-3178052.post@n4.nabble.com> Message-ID: <59D9A712-2B31-464D-99A6-B8FDD853658E@comcast.net> On Jan 6, 2011, at 2:34 PM, kiotoqq wrote: > > I used chisq.test(read.table("C:/Users/Maggy/Downloads/dust.asc", > header=TRUE)) So, where did you download this data and when is your homework due? 
> > and got this > > Pearson's Chi-squared test > > data: read.table("C:/Users/Maggy/Downloads/dust.asc", header = TRUE) > X-squared = 5226.164, df = 3735, p-value < 2.2e-16 > > and I think it should be right for the whole set, I, on the other hand. now suspect it is a meaningless set of numbers. > but that's not what I > need, because I only have to use it for "cbr" and "smoking" Do you mean you have an understanding of the potential values of cbr and smoking in that data and that you want to restrict your analysis to some subset defined by particular values of those variables? > -- > View this message in context: http://r.789695.n4.nabble.com/need-help-for-chi-squared-test-tp3177925p3178052.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT From Greg.Snow at imail.org Thu Jan 6 21:29:36 2011 From: Greg.Snow at imail.org (Greg Snow) Date: Thu, 6 Jan 2011 13:29:36 -0700 Subject: [R] Assumptions for ANOVA: the right way to check the normality In-Reply-To: <333057.17814.qm@web57904.mail.re3.yahoo.com> References: <9521.53053.qm@web57907.mail.re3.yahoo.com> <262855.6389.qm@web57903.mail.re3.yahoo.com> <333057.17814.qm@web57904.mail.re3.yahoo.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From kw.stat at gmail.com Thu Jan 6 21:35:25 2011 From: kw.stat at gmail.com (Kevin Wright) Date: Thu, 6 Jan 2011 14:35:25 -0600 Subject: [R] Where is a package NEWS.Rd located? In-Reply-To: References: <4D261830.8000701@gmail.com> Message-ID: Andy, thanks for providing a clear way of saying it. I thought I was clear in the first place, but oh well). 
Here is the structure of my source files: hwpkg/DESCRIPTION hwpkg/R/hw.R hwpkg/inst/NEWS.Rd I'm using Windows XP. When I install this package, I do this: Rcmd INSTALL hwpkg Which results in ls c:/r/r-2.12.0/library/hwpkg/ -rwxr-x---+ 1 wrightkevi 355 Jan 6 14:19 DESCRIPTION drwxrwx---+ 2 wrightkevi 0 Jan 6 14:19 Meta -rwxr-x---+ 1 wrightkevi 18 Jan 6 14:19 NEWS.Rd drwxrwx---+ 2 wrightkevi 0 Jan 6 14:19 R drwxrwx---+ 2 wrightkevi 0 Jan 6 14:19 help drwxrwx---+ 2 wrightkevi 0 Jan 6 14:19 html As you see, there is no "inst/NEWS.Rd" file (NEWS.Rd has been moved UP a level), and so news(package="hwpkg") returns nothing. If I build the package into a zipfile and then install.packages(zipfile), the same problem occurs. Kevin On Thu, Jan 6, 2011 at 2:06 PM, Liaw, Andy wrote: > I was communicating with Kevin off-list. > > The problem seems to be run time, not install time. ?News() calls > tools:::.build_news_db(), and the 2nd line of that function is: > > ?nfile <- file.path(dir, "inst", "NEWS.Rd") > > and that's the problem: ?an installed package shouldn't have an inst/ > subdirectory, right? > > Andy > > >> -----Original Message----- >> From: r-help-bounces at r-project.org >> [mailto:r-help-bounces at r-project.org] On Behalf Of Duncan Murdoch >> Sent: Thursday, January 06, 2011 2:30 PM >> To: Kevin Wright >> Cc: R list >> Subject: Re: [R] Where is a package NEWS.Rd located? >> >> On 06/01/2011 2:19 PM, Kevin Wright wrote: >> > Yes, exactly. ?But the problem is with NEWS.Rd, not NEWS. >> >> I'm not sure who you are arguing with, but if you do file a >> bug report, >> please also put together a simple reproducible example, e.g. a small >> package containing NEWS.Rd in the inst directory (which is where the >> docs say it should go) and code that shows why this is bad. >> Don't just >> talk about internal functions used for building packages; as >> far as we >> can tell so far tools:::.build_news_db is doing exactly what >> it should >> be doing. 
>> >> Duncan Murdoch >> >> > pkg/inst/NEWS.Rd is moved to pkg/NEWS.Rd at build time, but for >> > installed packages, "news" tried to load "pkg/inst/NEWS.Rd". >> > >> > I'm going to file a bug report. >> > >> > Kevin >> > >> > >> > On Thu, Jan 6, 2011 at 7:29 AM, Kevin >> Wright ?wrote: >> > > ?If you look at tools:::.build_news_db, the plain text >> NEWS file is >> > > ?searched for in pkg/NEWS and pkg/inst/NEWS, but NEWS.Rd in only >> > > ?searched for in pkg/inst/NEWS.Rd. >> > > >> > > ?Looks like a bug to me. >> > > >> > > ?I *think*. >> > > >> > > ?Thanks, >> > > >> > > ?Kevin >> > > >> > > >> > > ?On Thu, Jan 6, 2011 at 7:09 AM, Kevin >> Wright ?wrote: >> > >> ?Hopefully a quick question. ?My package has a NEWS.Rd >> file that is not >> > >> ?being found by "news". >> > >> >> > >> ?The "news" function calls "tools:::.build_news_db" >> which has this line: >> > >> >> > >> ?nfile<- file.path(dir, "inst", "NEWS.Rd") >> > >> >> > >> ?So it appears that the "news" function is searching for >> > >> ?"mypackage/inst/NEWS.Rd". >> > >> >> > >> ?However, "Writing R extensions" says "The contents of the inst >> > >> ?subdirectory will be copied recursively to the >> installation directory" >> > >> >> > >> ?During the installation, mypackage/inst/NEWS.Rd is >> copied into the >> > >> ?"mypackage" directory, not "mypackage/inst". >> > >> >> > >> ?What am I doing wrong, or is this a bug? >> > >> >> > >> ?Kevin Wright >> > >> >> > >> >> > >> >> > >> ?-- >> > >> ?Kevin Wright >> > >> >> > > >> > > >> > > >> > > ?-- >> > > ?Kevin Wright >> > > >> > >> > >> > >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > Notice: ?This e-mail message, together with any attachments, contains > information of Merck & Co., Inc. 
> (One Merck Drive, Whitehouse Station,
> New Jersey, USA 08889), and/or its affiliates Direct contact information
> for affiliates is available at
> http://www.merck.com/contact/contacts.html) that may be confidential,
> proprietary copyrighted and/or legally privileged. It is intended solely
> for the use of the individual or entity named on this message. If you are
> not the intended recipient, and have received this message in error,
> please notify us immediately by reply e-mail and then delete it from
> your system.
>
>

--
Kevin Wright

From Greg.Snow at imail.org Thu Jan 6 21:40:33 2011
From: Greg.Snow at imail.org (Greg Snow)
Date: Thu, 6 Jan 2011 13:40:33 -0700
Subject: [R] Problem with 2-ways ANOVA interactions
In-Reply-To: <948126.53730.qm@web57907.mail.re3.yahoo.com>
References: <498111.46032.qm@web57908.mail.re3.yahoo.com> <948126.53730.qm@web57907.mail.re3.yahoo.com>
Message-ID:

See inline

> From: Frodo Jedi [mailto:frodo.jedi at yahoo.com]
> Sent: Thursday, January 06, 2011 12:37 PM
> To: Greg Snow; r-help at r-project.org
> Subject: Re: [R] Problem with 2-ways ANOVA interactions
>
> Dear Greg,
> thanks so much, I think that now I have understood. Please confirm me this reading what follows ;-)
>
> To summarize from the beginning, the table I analyzed is the result of a simple experiment. Subjects were exposed to some stimuli
> and they were asked to evaluate the degree of realism of the stimuli on a 7 point scale (i.e., data in column "response").
> Each stimulus was presented in two conditions, "A" and "AH", where AH is the condition A plus another thing (let's call it "H").
>
> Before I wrongly thought that if I do the analysis anova(response ~ stimulus*condition) I would have got the comparison between
> the same stimulus in condition A and in condition AH (e.g. stimulus_1_A, stimulus_1_AH).
> Instead, apparently, the interaction stimulus:condition means that I find the differences between the stimuli keeping fixed the condition!!
> If this is true then doing the anova with the interaction stimulus:condition is equivalent to do the ONE WAY ANOVA first on
> the subset where all the conditions are A and then on the subset where all the conditions are AH? Right?

I think you are closer, but not quite there. The test on the interaction tests if the difference between A and AH is the same across the different stimuli. The main effect for condition tests if there is a difference between A and AH.

>
> So if all before is correct, my final question is: how by means of ANOVA can I track the significant differences between the stimuli
> presented in A and AH condition without passing for the t-test? Indeed my goal was to find in one hand if globally the condition
> AH bring to better results than condition A, and on the other hand I needed to know for which stimuli the condition AH brings
> better results than condition A.
>
>
>
> Finally, I am burning with curiosity to know the answers to the following two questions:
> 1- is there a difference between anova(response ~ stimulus*condition) and anova(response ~ condition*stimulus)
> concerning the interaction part?

The interaction part should be identical (with the exception of possible rounding error).

> 2- doing the anova(response ~ stimulus + condition) give the same results of two ONE WAY ANOVA
> anova(response ~ stimulus) and anova(response ~ condition) but the advantage is that they are presented together in one single output?

Only if there is absolutely no effect in one or more of the terms. Doing both together allows you to compare the differences in the conditions given the different stimuli. The one-way anova on condition groups and differences from the different stimuli in with the overall error (in your case that was not much, so you will not see much difference, but in general there can be big differences).

>
>
> Looking forward to knowing your response!
>
>
> Best regards

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center Intermountain Healthcare greg.snow at imail.org 801.408.8111 From Greg.Snow at imail.org Thu Jan 6 21:52:56 2011 From: Greg.Snow at imail.org (Greg Snow) Date: Thu, 6 Jan 2011 13:52:56 -0700 Subject: [R] need help for chi-squared test In-Reply-To: <59D9A712-2B31-464D-99A6-B8FDD853658E@comcast.net> References: <1294338217815-3177925.post@n4.nabble.com> <22E28994-346B-4DA6-8255-B0A6BFD399F8@comcast.net> <1294342483043-3178052.post@n4.nabble.com> <59D9A712-2B31-464D-99A6-B8FDD853658E@comcast.net> Message-ID: David, I think the poster wants to use one of the columns as x and the other as y, ignoring the remaining columns. If that is the case then he/she needs to read the section in "Introduction to R" on subsetting data frames. I agree that the output so far is meaningless, from the degrees of freedom it looks like chisq.test is interpreting the data frame as a 1245 by 4 contingency table. The thing that concerns me is the lack of any warnings or errors, if there truly were not any warnings then how is it interpreting counts of 0.25? and I would expect the number of 0's in the part shown to generate the warning about cell sizes too small. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.snow at imail.org 801.408.8111 > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r- > project.org] On Behalf Of David Winsemius > Sent: Thursday, January 06, 2011 1:14 PM > To: kiotoqq > Cc: r-help at r-project.org > Subject: Re: [R] need help for chi-squared test > > > On Jan 6, 2011, at 2:34 PM, kiotoqq wrote: > > > > > I used chisq.test(read.table("C:/Users/Maggy/Downloads/dust.asc", > > header=TRUE)) > > So, where did you download this data and when is your homework due? 
> > > > > and got this > > > > Pearson's Chi-squared test > > > > data: read.table("C:/Users/Maggy/Downloads/dust.asc", header = TRUE) > > X-squared = 5226.164, df = 3735, p-value < 2.2e-16 > > > > and I think it should be right for the whole set, > > I, on the other hand. now suspect it is a meaningless set of numbers. > > > > but that's not what I > > need, because I only have to use it for "cbr" and "smoking" > > Do you mean you have an understanding of the potential values of cbr > and smoking in that data and that you want to restrict your analysis > to some subset defined by particular values of those variables? > > > > -- > > View this message in context: http://r.789695.n4.nabble.com/need- > help-for-chi-squared-test-tp3177925p3178052.html > > Sent from the R help mailing list archive at Nabble.com. > > > > ______________________________________________ > > R-help at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > > and provide commented, minimal, self-contained, reproducible code. > > David Winsemius, MD > West Hartford, CT > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. From jrkrideau at yahoo.ca Thu Jan 6 21:59:34 2011 From: jrkrideau at yahoo.ca (John Kane) Date: Thu, 6 Jan 2011 12:59:34 -0800 (PST) Subject: [R] Accessing data via url In-Reply-To: Message-ID: <886007.8826.qm@web38407.mail.mud.yahoo.com> An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From benjamin.ward at bathspa.org Thu Jan 6 22:00:17 2011 From: benjamin.ward at bathspa.org (Ben Ward) Date: Thu, 6 Jan 2011 21:00:17 +0000 Subject: [R] Assumptions for ANOVA: the right way to check the normality In-Reply-To: References: <9521.53053.qm@web57907.mail.re3.yahoo.com> <262855.6389.qm@web57903.mail.re3.yahoo.com> <333057.17814.qm@web57904.mail.re3.yahoo.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From djmuser at gmail.com Thu Jan 6 22:13:57 2011 From: djmuser at gmail.com (Dennis Murphy) Date: Thu, 6 Jan 2011 13:13:57 -0800 Subject: [R] Cairo pdf canvas size In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From jrkrideau at yahoo.ca Thu Jan 6 22:30:39 2011 From: jrkrideau at yahoo.ca (John Kane) Date: Thu, 6 Jan 2011 13:30:39 -0800 (PST) Subject: [R] Accessing data via url In-Reply-To: Message-ID: <304671.81823.qm@web38405.mail.mud.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From djmuser at gmail.com Thu Jan 6 22:32:52 2011 From: djmuser at gmail.com (Dennis Murphy) Date: Thu, 6 Jan 2011 13:32:52 -0800 Subject: [R] Help spruce up a ggplot graph In-Reply-To: <2A85392715B78D4E87D89D4AD8199C0E0E5F1656D5@W0803.EducationNorthWest.Local> References: <2A85392715B78D4E87D89D4AD8199C0E0E5F1656D5@W0803.EducationNorthWest.Local> Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available
URL:

From Sebastien.Bihorel at cognigencorp.com Thu Jan 6 22:37:15 2011
From: Sebastien.Bihorel at cognigencorp.com (Sebastien Bihorel)
Date: Thu, 06 Jan 2011 16:37:15 -0500
Subject: [R] Stop and call objects
In-Reply-To: <77EB52C6DD32BA4D87471DCD70C8D70003C2E1D9@NA-PA-VBE03.na.tibco.com>
References: <4D247EEC.4030004@cognigencorp.com> <77EB52C6DD32BA4D87471DCD70C8D70003C2E1D9@NA-PA-VBE03.na.tibco.com>
Message-ID: <4D26360B.6020407@cognigencorp.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From Sebastien.Bihorel at cognigencorp.com Thu Jan 6 22:45:06 2011
From: Sebastien.Bihorel at cognigencorp.com (Sebastien Bihorel)
Date: Thu, 06 Jan 2011 16:45:06 -0500
Subject: [R] Global variables
Message-ID: <4D2637E2.8030409@cognigencorp.com>

Dear R-users,

Is there a way I can prevent global variables to be visible within my
functions?

Sebastien

From murdoch.duncan at gmail.com Thu Jan 6 22:59:08 2011
From: murdoch.duncan at gmail.com (Duncan Murdoch)
Date: Thu, 06 Jan 2011 16:59:08 -0500
Subject: [R] Global variables
In-Reply-To: <4D2637E2.8030409@cognigencorp.com>
References: <4D2637E2.8030409@cognigencorp.com>
Message-ID: <4D263B2C.1070208@gmail.com>

On 06/01/2011 4:45 PM, Sebastien Bihorel wrote:
> Dear R-users,
>
> Is there a way I can prevent global variables to be visible within my
> functions?

Yes, but you probably shouldn't. You would do it by setting the
environment of the function to something that doesn't have the global
environment as a parent, or grandparent, etc. The only common examples
of that are baseenv() and emptyenv(). For example,

x <- 1
f <- function() print(x)

Then f() will work, and print the 1. But if I do

environment(f) <- baseenv()

then it won't work:

> f()
Error in print(x) : object 'x' not found

The problem with doing this is that it is not the way users expect
functions to work, and it will probably have weird side effects.
It is not the way things work in packages (even packages with namespaces
will eventually search the global environment, the namespace just comes
first). There's no simple way to do it and yet get access to functions
in other packages besides base without explicitly specifying them (e.g.
you'd need to use stats::lm(), not just lm(), etc.)

Duncan Murdoch

From moleps2 at gmail.com Thu Jan 6 23:12:43 2011
From: moleps2 at gmail.com (moleps)
Date: Thu, 6 Jan 2011 23:12:43 +0100
Subject: [R] Hmisc, summary.formula and catTest
Message-ID:

Dear all,

I'm specifying the fisher.exact test for use with summary.formula as follows:

u<-function(a,b){
j<-fisher.test(a)
p<-list(P=j$p.value,stat=NA,df=NA,testname=j$method,statname="")
return(p)
}

However I'm also required to specify stat & df. However this doesn't apply to the fisher test. I've tried specifying them as NA and "" without success-throws either a blank or an error msg trying to round a non-numeric value respectively.

reproducible example:

ex<-pbc
summary(trt~sex+ascites,data=ex,test=T,method="reverse")
summary(trt~sex+ascites,data=ex,test=T,method="reverse",catTest=u)

The closest I get is

u<-function(a,b){
j<-fisher.test(a)
p<-list(P=j$p.value,stat=1,df=1,testname=j$method,statname="")
return(p)
}

However then I manually have to edit the output. Is there a smart way of doing this?

//M

From ggrothendieck at gmail.com Thu Jan 6 23:15:31 2011
From: ggrothendieck at gmail.com (Gabor Grothendieck)
Date: Thu, 6 Jan 2011 17:15:31 -0500
Subject: [R] Global variables
In-Reply-To: <4D263B2C.1070208@gmail.com>
References: <4D2637E2.8030409@cognigencorp.com> <4D263B2C.1070208@gmail.com>
Message-ID:

On Thu, Jan 6, 2011 at 4:59 PM, Duncan Murdoch wrote:
> On 06/01/2011 4:45 PM, Sebastien Bihorel wrote:
>>
>> Dear R-users,
>>
>> Is there a way I can prevent global variables to be visible within my
>> functions?
>
>
> Yes, but you probably shouldn't.
You would do it by setting the environment
> of the function to something that doesn't have the global environment as a
> parent, or grandparent, etc. The only common examples of that are baseenv()
> and emptyenv(). For example,
>
> x <- 1
> f <- function() print(x)
>
> Then f() will work, and print the 1. But if I do
>
> environment(f) <- baseenv()
>
> then it won't work:
>
>> f()
> Error in print(x) : object 'x' not found
>
> The problem with doing this is that it is not the way users expect functions
> to work, and it will probably have weird side effects. It is not the way
> things work in packages (even packages with namespaces will eventually
> search the global environment, the namespace just comes first). There's no
> simple way to do it and yet get access to functions in other packages
> besides base without explicitly specifying them (e.g. you'd need to use
> stats::lm(), not just lm(), etc.)
>

A variation of this would be:

environment(f) <- as.environment(2)

which would skip over the global environment, .GlobalEnv, but would
still search the loaded packages. In the example above x would not be
found but it still could find lm, etc.

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

From arrayprofile at yahoo.com Thu Jan 6 23:16:38 2011
From: arrayprofile at yahoo.com (array chip)
Date: Thu, 6 Jan 2011 14:16:38 -0800 (PST)
Subject: [R] algorithm help
Message-ID: <708598.4676.qm@web56306.mail.re3.yahoo.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From eriki at ccbr.umn.edu Thu Jan 6 23:21:10 2011
From: eriki at ccbr.umn.edu (Erik Iverson)
Date: Thu, 06 Jan 2011 16:21:10 -0600
Subject: [R] Hmisc, summary.formula and catTest
In-Reply-To:
References:
Message-ID: <4D264056.1050505@ccbr.umn.edu>

> The closest I get is
>
>
> u<-function(a,b){
>
> j<-fisher.test(a)
> p<-list(P=j$p.value,stat=1,df=1,testname=j$method,statname="")
> return(p)
>
> }
>
> However then I manually have to edit the output. Is there a smart way of doing this?

You're not explaining what the output is and what you are doing when
you 'manually have to edit' it.

Does the prtest argument help when you actually use the 'print' function
around your summary.formula object? I think that's how I
solve it.

prtest: a vector of test statistic components to print if ‘test=TRUE’
        was in effect when ‘summary.formula’ was called. Defaults to
        printing all components. Specify ‘prtest=FALSE’ or
        ‘prtest="none"’ to not print any tests. This applies to
        ‘print’, ‘latex’, and ‘plot’ methods for ‘method='reverse'’.

From wdunlap at tibco.com Thu Jan 6 23:21:37 2011
From: wdunlap at tibco.com (William Dunlap)
Date: Thu, 6 Jan 2011 14:21:37 -0800
Subject: [R] Stop and call objects
In-Reply-To: <4D26360B.6020407@cognigencorp.com>
References: <4D247EEC.4030004@cognigencorp.com> <77EB52C6DD32BA4D87471DCD70C8D70003C2E1D9@NA-PA-VBE03.na.tibco.com> <4D26360B.6020407@cognigencorp.com>
Message-ID: <77EB52C6DD32BA4D87471DCD70C8D70003C2E4A5@NA-PA-VBE03.na.tibco.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available URL: From eriki at ccbr.umn.edu Thu Jan 6 23:24:56 2011 From: eriki at ccbr.umn.edu (Erik Iverson) Date: Thu, 06 Jan 2011 16:24:56 -0600 Subject: [R] Hmisc, summary.formula and catTest In-Reply-To: <4D264056.1050505@ccbr.umn.edu> References: <4D264056.1050505@ccbr.umn.edu> Message-ID: <4D264138.4010204@ccbr.umn.edu> > > Does the prtest argument help when you actually use the 'print' function > around your summary.formula object? I think that's how I > solve it. I.e., sf1 <- summary(trt~sex+ascites,data=ex,test=T,method="reverse",catTest=u) print(sf1, prtest = "P") Descriptive Statistics by trt +-------+---+---------+---------+-------+ | |N |1 |2 |P-value| | | |(N=158) |(N=154) | | +-------+---+---------+---------+-------+ |sex : f|418|87% (137)|90% (139)| 0.377| +-------+---+---------+---------+-------+ |ascites|312| 9% ( 14)| 6% ( 10)| 0.526| +-------+---+---------+---------+-------+ From moleps2 at gmail.com Thu Jan 6 23:29:34 2011 From: moleps2 at gmail.com (moleps) Date: Thu, 6 Jan 2011 23:29:34 +0100 Subject: [R] Hmisc, summary.formula and catTest In-Reply-To: <4D264138.4010204@ccbr.umn.edu> References: <4D264056.1050505@ccbr.umn.edu> <4D264138.4010204@ccbr.umn.edu> Message-ID: Allright..Works like a charm. However I do believe that the prtest vector should have been mentioned in the catTest or conTest option. Appreciate your time and effort. Best, //M On 6. jan. 2011, at 23.24, Erik Iverson wrote: > >> Does the prtest argument help when you actually use the 'print' function >> around your summary.formula object? I think that's how I >> solve it. 
> > I.e., > > sf1 <- summary(trt~sex+ascites,data=ex,test=T,method="reverse",catTest=u) > > print(sf1, prtest = "P") > > > Descriptive Statistics by trt > > +-------+---+---------+---------+-------+ > | |N |1 |2 |P-value| > | | |(N=158) |(N=154) | | > +-------+---+---------+---------+-------+ > |sex : f|418|87% (137)|90% (139)| 0.377| > +-------+---+---------+---------+-------+ > |ascites|312| 9% ( 14)| 6% ( 10)| 0.526| > +-------+---+---------+---------+-------+ From carl at witthoft.com Thu Jan 6 23:39:33 2011 From: carl at witthoft.com (Carl Witthoft) Date: Thu, 06 Jan 2011 17:39:33 -0500 Subject: [R] Waaaayy off topic...Statistical methods, pub bias, scientific validity Message-ID: <4D2644A5.9040303@witthoft.com> The next week's New Yorker has some decent rebuttal letters. The case is hardly as clear-cut as the author would like to believe. Carl From carl at witthoft.com Thu Jan 6 23:41:13 2011 From: carl at witthoft.com (Carl Witthoft) Date: Thu, 06 Jan 2011 17:41:13 -0500 Subject: [R] algorithm help Message-ID: <4D264509.4090506@witthoft.com> try this: ?rle Carl ****** From: array chip Date: Thu, 06 Jan 2011 14:16:38 -0800 (PST) Hi, I am seeking help on designing an algorithm to identify the locations of stretches of 1s in a vector of 0s and 1s. Below is an simple example: > dat<-as.data.frame(cbind(a=c(F,F,T,T,T,T,F,F,T,T,F,T,T,T,T,F,F,F,F,T) ,b=c(4,12,13,16,18,20,28,30,34,46,47,49,61,73,77,84,87,90,95,97))) > dat a b 1 0 4 2 0 12 3 1 13 4 1 16 5 1 18 6 1 20 7 0 28 8 0 30 9 1 34 10 1 46 11 0 47 12 1 49 13 1 61 14 1 73 15 1 77 16 0 84 17 0 87 18 0 90 19 0 95 20 1 97 In this dataset, "b" is sorted and denotes the location for each number in "a". So I would like to find the starting & ending locations for each stretch of 1s within "a", also counting the number of 1s in each stretch as well. 
Hope the results from the algorithm would be:

From ssefick at gmail.com Thu Jan 6 23:41:52 2011
From: ssefick at gmail.com (stephen sefick)
Date: Thu, 6 Jan 2011 16:41:52 -0600
Subject: [R] [zoo] - Individual zoo or data frames from non-continuous zoo series
Message-ID:

#Is there a way to break the below zoo object into non-NA data frames algorithmically
#this is a small example of a much larger problem.
#It is really not even necessary to have the continuous chunks
#end up as zoo objects but it is important to have them end
#up with the index column.
#thanks for all of your help in advance, and
#if you need anything else please let me know

library(zoo)

ind. <- 1:200
data <- c(1:50, rep(NA, 50), 1:50, rep(NA, 50))

z <- zoo(data, ind.)

--
Stephen Sefick
____________________________________
| Auburn University                 |
| Department of Biological Sciences |
| 331 Funchess Hall                 |
| Auburn, Alabama                   |
| 36849                             |
|___________________________________|
| sas0025 at auburn.edu             |
| http://www.auburn.edu/~sas0025    |
|___________________________________|

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods. We are mammals, and have not exhausted the
annoying little problems of being mammals.

                                -K. Mullis

"A big computer, a complex algorithm and a long time does not equal science."
-Robert Gentleman From moleps2 at gmail.com Thu Jan 6 23:43:06 2011 From: moleps2 at gmail.com (moleps) Date: Thu, 6 Jan 2011 23:43:06 +0100 Subject: [R] Hmisc, summary.formula and catTest In-Reply-To: <4D264138.4010204@ccbr.umn.edu> References: <4D264056.1050505@ccbr.umn.edu> <4D264138.4010204@ccbr.umn.edu> Message-ID: <0F348E52-E63A-4A3E-B03E-F1941C05F39A@gmail.com> Is it at all possible to specify this so that different tests display different parameters, ie have the continous test display F, df and p while tes categorical test display only P values? sf1 <- summary(trt~sex+ascites+age,data=ex,test=T,method="reverse",catTest=u) print(sf1, prtest = "P") //M On 6. jan. 2011, at 23.24, Erik Iverson wrote: > >> Does the prtest argument help when you actually use the 'print' function >> around your summary.formula object? I think that's how I >> solve it. > > I.e., > > sf1 <- summary(trt~sex+ascites,data=ex,test=T,method="reverse",catTest=u) > > print(sf1, prtest = "P") > > > Descriptive Statistics by trt > > +-------+---+---------+---------+-------+ > | |N |1 |2 |P-value| > | | |(N=158) |(N=154) | | > +-------+---+---------+---------+-------+ > |sex : f|418|87% (137)|90% (139)| 0.377| > +-------+---+---------+---------+-------+ > |ascites|312| 9% ( 14)| 6% ( 10)| 0.526| > +-------+---+---------+---------+-------+ From ted.harding at wlandres.net Thu Jan 6 23:57:47 2011 From: ted.harding at wlandres.net ( (Ted Harding)) Date: Thu, 06 Jan 2011 22:57:47 -0000 (GMT) Subject: [R] algorithm help In-Reply-To: <708598.4676.qm@web56306.mail.re3.yahoo.com> Message-ID: On 06-Jan-11 22:16:38, array chip wrote: > Hi, I am seeking help on designing an algorithm to identify the > locations of stretches of 1s in a vector of 0s and 1s. 
Below is
> an simple example:
>
>> dat<-as.data.frame(cbind(a=c(F,F,T,T,T,T,F,F,T,T,F,T,T,T,T,F,F,F,F,T)
> ,b=c(4,12,13,16,18,20,28,30,34,46,47,49,61,73,77,84,87,90,95,97)))
>
>> dat
>    a  b
> 1  0  4
> 2  0 12
> 3  1 13
> 4  1 16
> 5  1 18
> 6  1 20
> 7  0 28
> 8  0 30
> 9  1 34
> 10 1 46
> 11 0 47
> 12 1 49
> 13 1 61
> 14 1 73
> 15 1 77
> 16 0 84
> 17 0 87
> 18 0 90
> 19 0 95
> 20 1 97
>
> In this dataset, "b" is sorted and denotes the location for each
> number in "a".
> So I would like to find the starting & ending locations for each
> stretch of 1s within "a", also counting the number of 1s in each
> stretch as well.
> Hope the results from the algorithm would be:
>
> stretch start end No.of.1s
>       1    13  20        4
>       2    34  46        2
>       3    49  77        4
>       4    97  97        1
>
> I can imagine using for loops can do the job, but I feel it's not a
> clever way to do this. Is there an efficient algorithm that can do
> this fast?
>
> Thanks for any suggestions.
> John

The basic information you need can be got using rle() ("run length
encoding"). See '?rle'. In your example:

  rle(dat$a)
  # Run Length Encoding
  #   lengths: int [1:8] 2 4 2 2 1 4 4 1
  #   values : num [1:8] 0 1 0 1 0 1 0 1
  ## Note: F -> 0, T -> 1

The following has a somewhat twisted logic at the end, and may
be flawed, but you can probably adapt it!

  L <- rle(dat$a)$lengths
  V <- rle(dat$a)$values
  pos <- c(1,cumsum(L))
  V1 <- c(-1,V)
  1+pos[V1==0]
  # [1]  3  9 12 20
  ## Positions in the series dat$a where each run of "T" (i.e. 1)
  ## starts

Hoping this helps,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding)
Fax-to-email: +44 (0)870 094 0861
Date: 06-Jan-11                                       Time: 22:57:44
------------------------------ XFMail ------------------------------

From macqueen1 at llnl.gov Fri Jan 7 00:05:17 2011
From: macqueen1 at llnl.gov (MacQueen, Don)
Date: Thu, 6 Jan 2011 15:05:17 -0800
Subject: [R] What are the necessary Oracle software to install and run ROracle ?
In-Reply-To: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From lawrence.michael at gene.com Fri Jan 7 00:25:59 2011 From: lawrence.michael at gene.com (Michael Lawrence) Date: Thu, 6 Jan 2011 15:25:59 -0800 Subject: [R] RGtk2 compilation problem In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From arrayprofile at yahoo.com Fri Jan 7 00:29:03 2011 From: arrayprofile at yahoo.com (array chip) Date: Thu, 6 Jan 2011 15:29:03 -0800 (PST) Subject: [R] algorithm help In-Reply-To: References: Message-ID: <835302.8321.qm@web56304.mail.re3.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From shigesong at gmail.com Fri Jan 7 00:32:49 2011 From: shigesong at gmail.com (Shige Song) Date: Thu, 6 Jan 2011 18:32:49 -0500 Subject: [R] RGtk2 compilation problem In-Reply-To: References: Message-ID: Yes, the new version works fine. Many thanks. Best, Shige On Thu, Jan 6, 2011 at 6:25 PM, Michael Lawrence wrote: > > > On Thu, Jan 6, 2011 at 8:53 AM, Prof Brian Ripley > wrote: >> >> You need RGtk2 2.20.7 which is now on CRAN. ?Others have seen this, but it >> has taken a while to track down the exact cause. >> >> The diagnosis was that ML used a recent GNU tar which created a tarball >> with hard links that R's untar was not prepared to deal with. We consider >> that is a bug in GNU tar, but untar() has been updated in R-patched to cope. >> > > After a lot of back and forth with the GNU tar guys, it turns out they do > not consider this to be a bug. I had to refresh my knowledge of hard linking > to understand. A hard link is from a file name to the actual inode in the > file system. Typically every file has a single hard link (the name of the > file). The -h option used to resolve a symbolic link differently, based on > whether the hard link count of the target was 1 or >=2. 
This was practically > useful in my mind, because symlinks to any files without any explicitly > added hard links would become a regular file in the archive. They have now > dropped this distinction, calling it an inconsistency (apparently other > implementations of tar have never made such a distinction). So symlinks now > become hard links in the archive (as long as the target is in the archive). > We may need to keep the fix in untar() to handle this. Either way, RGtk2 > 2.20.7 should work now. > > Thanks, > Michael > >> >> If you have such a tarball, try setting the environment variable >> R_INSTALL_TAR to 'tar' (or whatever GNU tar is called on your system) when >> installing the tarball. >> >> For those packaging source packages: in the unusual event that your >> package sources contains symbolic (or even hard) links, don't use GNU tar >> 1.24 or 1.25. >> >> On Thu, 6 Jan 2011, Shige Song wrote: >> >>> Look forward to it. >>> >>> Thanks. >>> >>> Shige >>> >>> On Sat, Jan 1, 2011 at 8:45 AM, Michael Lawrence >>> wrote: >>>> >>>> Please watch for 2.20.5 and let me know if it helps. Not really sure >>>> what is >>>> going on here, but someone else has reported the same issue. >>>> >>>> Thanks, >>>> Michael >>>> >>>> On Wed, Dec 29, 2010 at 6:44 AM, Shige Song wrote: >>>>> >>>>> Dear All, >>>>> >>>>> I am trying to compile&install the package "RGtk2" on my Ubuntu 10.04 >>>>> box. I did not have problem with earlier versions, but with the new >>>>> version, I got the following error message : >> >> ... >> >> -- >> Brian D. Ripley, ? ? ? ? ? ? ? ? ?ripley at stats.ox.ac.uk >> Professor of Applied Statistics, ?http://www.stats.ox.ac.uk/~ripley/ >> University of Oxford, ? ? ? ? ? ? Tel: ?+44 1865 272861 (self) >> 1 South Parks Road, ? ? ? ? ? ? ? ? ? ? +44 1865 272866 (PA) >> Oxford OX1 3TG, UK ? ? ? ? ? ? ? 
Fax: +44 1865 272595
>
>

From wdunlap at tibco.com Fri Jan 7 00:52:47 2011
From: wdunlap at tibco.com (William Dunlap)
Date: Thu, 6 Jan 2011 15:52:47 -0800
Subject: [R] algorithm help
In-Reply-To: <835302.8321.qm@web56304.mail.re3.yahoo.com>
References: <835302.8321.qm@web56304.mail.re3.yahoo.com>
Message-ID: <77EB52C6DD32BA4D87471DCD70C8D70003C2E4E9@NA-PA-VBE03.na.tibco.com>

> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of array chip
> Sent: Thursday, January 06, 2011 3:29 PM
> To: ted.harding at wlandres.net
> Cc: r-help at stat.math.ethz.ch
> Subject: Re: [R] algorithm help
>
> Thanks very much, Ted. Yes, it does what I need!
>
> I made a routine to do this:
>
> f.fragment<-function(a,b) {
> dat<-as.data.frame(cbind(a,b))
>
> L <- rle(dat$a)$lengths
> V <- rle(dat$a)$values
> pos <- c(1,cumsum(L))
> V1 <- c(-1,V)
> start<-1+pos[V1==0]
> end<-pos[V1==1]
>
> cbind(stretch=1:length(start),start=dat$b[start]
> ,end=dat$b[end],no.of.1s=L[V==1])
>
> }
>
> f.fragment(dat$a,dat$b)
>
>      stretch start end no.of.1s
> [1,]       1    13  20        4
> [2,]       2    34  46        2
> [3,]       3    49  77        4
> [4,]       4    97  97        1

You need to be more careful about the first
and last rows in the dataset. I think yours
only works when a starts with 0 and ends with 1.

  > f.fragment(c(1,1,0,0), c(11,12,13,14))
       stretch start end no.of.1s
  [1,]       1    NA  12        2
  > f.fragment(c(1,1,0,1), c(11,12,13,14))
       stretch start end no.of.1s
  [1,]       1    14  12        2
  [2,]       1    14  14        1
  > f.fragment(c(0,1,0,1), c(11,12,13,14))
       stretch start end no.of.1s
  [1,]       1    12  12        1
  [2,]       2    14  14        1
  > f.fragment(c(0,1,0,0), c(11,12,13,14))
       stretch start end no.of.1s
  [1,]       1    12  12        1
  [2,]       2    NA  12        1
  > f.fragment(c(1,1,1,1), c(11,12,13,14))
       stretch end no.of.1s
  [1,]       1  14        4
  [2,]       0  14        4
  > f.fragment(c(0,0,0,0), c(11,12,13,14))
       stretch start
  [1,]       1    NA

The following does better. It keeps things as
logical vectors as long as possible, which tends
to work better when dealing with runs.
  f <- function(a, b) {
      isFirstIn1Run <- c(TRUE, a[-1] != a[-length(a)]) & a==1
      isLastIn1Run <- c(a[-1] != a[-length(a)], TRUE) & a==1
      data.frame(stretch=seq_len(sum(isFirstIn1Run)),
                 start = b[isFirstIn1Run],
                 end = b[isLastIn1Run],
                 no.of.1s = which(isLastIn1Run) - which(isFirstIn1Run) + 1)
  }

  > f(c(1,1,0,0), c(11,12,13,14))
    stretch start end no.of.1s
  1       1    11  12        2
  > f(c(1,1,0,1), c(11,12,13,14))
    stretch start end no.of.1s
  1       1    11  12        2
  2       2    14  14        1
  > f(c(0,1,0,1), c(11,12,13,14))
    stretch start end no.of.1s
  1       1    12  12        1
  2       2    14  14        1
  > f(c(0,1,0,0), c(11,12,13,14))
    stretch start end no.of.1s
  1       1    12  12        1
  > f(c(1,1,1,1), c(11,12,13,14))
    stretch start end no.of.1s
  1       1    11  14        4
  > f(c(0,0,0,0), c(11,12,13,14))
  [1] stretch  start    end      no.of.1s
  <0 rows> (or 0-length row.names)

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

>
> John
>
>
>
>
> ________________________________
> From: "ted.harding at wlandres.net"
>
> Cc: r-help at stat.math.ethz.ch
> Sent: Thu, January 6, 2011 2:57:47 PM
> Subject: RE: [R] algorithm help
>
> On 06-Jan-11 22:16:38, array chip wrote:
> > Hi, I am seeking help on designing an algorithm to identify the
> > locations of stretches of 1s in a vector of 0s and 1s. Below is
> > an simple example:
> >
> >>
> dat<-as.data.frame(cbind(a=c(F,F,T,T,T,T,F,F,T,T,F,T,T,T,T,F,F,F,F,T)
> > ,b=c(4,12,13,16,18,20,28,30,34,46,47,49,61,73,77,84,87,90,95,97)))
> >
> >> dat
> >    a  b
> > 1  0  4
> > 2  0 12
> > 3  1 13
> > 4  1 16
> > 5  1 18
> > 6  1 20
> > 7  0 28
> > 8  0 30
> > 9  1 34
> > 10 1 46
> > 11 0 47
> > 12 1 49
> > 13 1 61
> > 14 1 73
> > 15 1 77
> > 16 0 84
> > 17 0 87
> > 18 0 90
> > 19 0 95
> > 20 1 97
> >
> > In this dataset, "b" is sorted and denotes the location for each
> > number in "a".
> > So I would like to find the starting & ending locations for each
> > stretch of 1s within "a", also counting the number of 1s in each
> > stretch as well.
> > Hope the results from the algorithm would be: > > > > stretch start end No.of.1s > > 1 13 20 4 > > 2 34 46 2 > > 3 49 77 4 > > 4 97 97 1 > > > > I can imagine using for loops can do the job, but I feel it's not a > > clever way to do this. Is there an efficient algorithm that can do > > this fast? > > > > Thanks for any suggestions. > > John > > The basic information you need can be got using rle() ("run length > encoding"). See '?rle'. In your example: > > rle(dat$a) > # Run Length Encoding > # lengths: int [1:8] 2 4 2 2 1 4 4 1 > # values : num [1:8] 0 1 0 1 0 1 0 1 > ## Note: F -> 0, T -> 1 > > The following has a somewhat twisted logic at the end, and may > [[elided Yahoo spam]] > > L <- rle(dat$a)$lengths > V <- rle(dat$a)$values > pos <- c(1,cumsum(L)) > V1 <- c(-1,V) > 1+pos[V1==0] > # [1] 3 9 12 20 > ## Positions in the series dat$a where each run of "T" (i.e. 1) > ## starts > > Hoping this helps, > Ted. > > -------------------------------------------------------------------- > E-Mail: (Ted Harding) > Fax-to-email: +44 (0)870 094 0861 > Date: 06-Jan-11 Time: 22:57:44 > ------------------------------ XFMail ------------------------------ > > > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From ggrothendieck at gmail.com Fri Jan 7 00:57:49 2011 From: ggrothendieck at gmail.com (Gabor Grothendieck) Date: Thu, 6 Jan 2011 18:57:49 -0500 Subject: [R] [zoo] - Individual zoo or data frames from non-continuous zoo series In-Reply-To: References: Message-ID: On Thu, Jan 6, 2011 at 5:41 PM, stephen sefick wrote: > #Is there a way to break the below zoo object into non-NA data frames > algorithmically > #this is a small example of a much larger problem. 
> #It is really no even necessary to have the continuous chunks > #end up as zoo objects but it is important to have them end > #up with the index column. > #thanks for all of your help in advance, and > #if you need anything else please let me know > > library(zoo) > > ind. <- 1:200 > data <- c(1:50, rep(NA, 50), 1:50, rep(NA, 50)) > > z <- zoo(data, ind.) > Below c(TRUE, diff(is.na(z)) != 0) gives a logical vector which is TRUE at the first position of any run of NAs or non-NAs. The cumsum of that gives a vector the same length as z such that each position of the first run is 1, each position of the second run is 2, etc. The NA runs are then set to 0 in the second line. In the third line we split z on g and drop the portion that corresponds to the NAs. > g <- cumsum(c(TRUE, diff(is.na(z)) != 0)) > g[is.na(z)] <- 0 > split(z, g)[-1] $`1` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 $`3` 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 141 142 143 144 145 146 147 148 149 150 41 42 43 44 45 46 47 48 49 50 This could be combined into a multivariate series like this: do.call("merge", split(z, g)[-1]) -- Statistics & Software Consulting GKX Group, GKX Associates Inc. 
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

From arrayprofile at yahoo.com Fri Jan 7 01:01:28 2011
From: arrayprofile at yahoo.com (array chip)
Date: Thu, 6 Jan 2011 16:01:28 -0800 (PST)
Subject: [R] algorithm help
In-Reply-To: <77EB52C6DD32BA4D87471DCD70C8D70003C2E4E9@NA-PA-VBE03.na.tibco.com>
References: <835302.8321.qm@web56304.mail.re3.yahoo.com> <77EB52C6DD32BA4D87471DCD70C8D70003C2E4E9@NA-PA-VBE03.na.tibco.com>
Message-ID: <701847.86485.qm@web56301.mail.re3.yahoo.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From moshersteven at gmail.com Fri Jan 7 01:11:48 2011
From: moshersteven at gmail.com (steven mosher)
Date: Thu, 6 Jan 2011 16:11:48 -0800
Subject: [R] Accessing data via url
In-Reply-To: <352569.97781.qm@web38406.mail.mud.yahoo.com>
References: <352569.97781.qm@web38406.mail.mud.yahoo.com>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From moshersteven at gmail.com Fri Jan 7 01:18:34 2011
From: moshersteven at gmail.com (steven mosher)
Date: Thu, 6 Jan 2011 16:18:34 -0800
Subject: [R] Accessing data via url
In-Reply-To: <352569.97781.qm@web38406.mail.mud.yahoo.com>
References: <352569.97781.qm@web38406.mail.mud.yahoo.com>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From marcelolaia at gmail.com Fri Jan 7 02:15:08 2011
From: marcelolaia at gmail.com (Marcelo Luiz de Laia)
Date: Fri, 7 Jan 2011 01:15:08 +0000 (UTC)
Subject: [R] RGtk2 compilation problem
References:
Message-ID:

Shige Song <shigesong at gmail.com> writes:

>
> Dear All,
>
> I am trying to compile&install the package "RGtk2" on my Ubuntu 10.04
(...)
> ./RGtk2/gdkClasses.h:4:23: error: RGtk2/gdk.h: No such file or directory

Hi,

A few days ago I had the same error on my Debian Testing. After spending
a lot of time on it, I found that Debian's developers have already
packaged RGtk2 as a deb package
(http://packages.debian.org/squeeze/r-cran-rgtk2).
So, I did:

apt-get update && apt-get install r-cran-rgtk2

For Ubuntu, there is an equivalent package
(https://launchpad.net/ubuntu/lucid/+package/r-cran-rgtk2) that could be
installed in the same way.

Marcelo
From Brazil

From aaditya.nanduri at gmail.com Fri Jan 7 03:23:46 2011
From: aaditya.nanduri at gmail.com (Aaditya Nanduri)
Date: Thu, 6 Jan 2011 20:23:46 -0600
Subject: [R] R not recognized in command line
In-Reply-To:
References: <4D249120.2000108@gmail.com>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From michael.bedward at gmail.com Fri Jan 7 03:44:07 2011
From: michael.bedward at gmail.com (Michael Bedward)
Date: Fri, 7 Jan 2011 13:44:07 +1100
Subject: [R] How to make a Cluster of Clusters
In-Reply-To:
References:
Message-ID:

Hello Diego,

This might not be relevant, but on reading your question the first
idea that struck me was that ordination trajectories of your lakes
over time might be more informative than clustering.

Michael

On 5 January 2011 01:31, Diego Pujoni wrote:
> Dear R-help,
>
> In my Master thesis I measured 10 variables from 18 lakes. These
> measurements were taken 4 times a year in 3 depths, so I have 12
> samples from each lake. I know that 12 samples can not be treated as
> replications, since they don't correspond to the same environmental
> characteristics and are not statistically independent, but I want to
> use these 12 samples as an estimate of an annual range the 18 lakes
> have of the 10 variables.
>
> I want to make a cluster analysis of the 18 lakes and my known
> possibilities were:
> 1- Make an average of the 12 samples from each lake and make the
> cluster (Using ward's method);
> 2- Use all 216 samples (18*12) to make the cluster (Which yields a mess).
>
> But I thought I could begin the cluster algorithm already with 18
> clusters (Lakes) each with 12 individuals (samples) and normally
> proceed with the calculations (using ward's method). So I will obtain
So I will obtain > a cluster of the 18 lakes, but using the 12 samples. > > I got the cluster Fortran algorithm and I'm trying to translate it to > the R language to see how it works and maybe implement this kind of > cluster of cluster analysis. > > Does anyone knows if there is an algorithm that does this? Actually I > did it by hand and got very good and meaningful results, but I want to > implement it to try another merging criterias. > > Thanks > > Diego Pujoni > Zooplankton Ecology Laboratory > Biological Sciences Institute > Federal University of Minas Gerais > Brazil > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From albertonegron at gmail.com Thu Jan 6 21:38:20 2011 From: albertonegron at gmail.com (Alberto Negron) Date: Thu, 6 Jan 2011 20:38:20 +0000 Subject: [R] Accessing data via url In-Reply-To: <352569.97781.qm@web38406.mail.mud.yahoo.com> References: <352569.97781.qm@web38406.mail.mud.yahoo.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From albertonegron at gmail.com Thu Jan 6 22:20:48 2011 From: albertonegron at gmail.com (Alberto Negron) Date: Thu, 6 Jan 2011 21:20:48 +0000 Subject: [R] Accessing data via url In-Reply-To: <886007.8826.qm@web38407.mail.mud.yahoo.com> References: <886007.8826.qm@web38407.mail.mud.yahoo.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From diasandre at gmail.com Thu Jan 6 21:21:33 2011 From: diasandre at gmail.com (ADias) Date: Thu, 6 Jan 2011 12:21:33 -0800 (PST) Subject: [R] Help with IF operator Message-ID: <1294345293111-3178129.post@n4.nabble.com> Hi, I am with a problem on how to do a comparison of values. My script is as follows: repeat{ cat("How many teams to use? 
(to end write 0) ")
nro<-scan(n=1)
if(nro==0)break
cat("write the", nro, "teams names \n")
teams<-readLines(n=nro)
if (teams[1]==teams[2])next else print(teams)
}

In this example I only compare team 1's name with team 2's name, and if
they are the same the script starts again. If I had 10 teams, how could I
make it compare all "nro" team names in order to check whether the same
name has been written more than once? The idea is, if the same name is
written more than once it should give an error and start the script again
by asking for the team names again.

Two more things:

With the next function the script starts from the top, I mean it starts by
asking the number of teams to use. Can I make it go directly to asking
for the team names?

And when it checks and finds out that a certain name has been written twice,
can it produce a message warning that this error happened before asking
for the team names again?

Many thanks

Regards,
A.Dias.
--
View this message in context: http://r.789695.n4.nabble.com/Help-with-IF-operator-tp3178129p3178129.html
Sent from the R help mailing list archive at Nabble.com.

From diasandre at gmail.com Thu Jan 6 22:34:31 2011
From: diasandre at gmail.com (ADias)
Date: Thu, 6 Jan 2011 13:34:31 -0800 (PST)
Subject: [R] Creating a Matrix from a vector with some conditions
Message-ID: <1294349671070-3178219.post@n4.nabble.com>

Hi

Suppose we have an object with strings:

A<-c("a","b","c","d")

Now I do:

B<-matrix(A,4,4, byrow=F)

and I get

a a a a
b b b b
c c c c
d d d d

But what I really want is:

a b c d
b c d a
c d a b
d a b c

How can I do this?

thank you

A. Dias
--
View this message in context: http://r.789695.n4.nabble.com/Creating-a-Matrix-from-a-vector-with-some-conditions-tp3178219p3178219.html
Sent from the R help mailing list archive at Nabble.com.
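
[Editorial sketch, not part of the original thread.] One possible way to build the matrix asked for in the message above is to compute, for every cell, which element of the vector belongs there: row i holds the vector rotated left by i - 1 positions. The function name cyclic_matrix is hypothetical, chosen only for this illustration; it uses base R only.

```r
# Hypothetical helper (name invented for this sketch):
# row i of the result is x rotated left by i - 1 positions.
cyclic_matrix <- function(x) {
  n <- length(x)
  # For cell (i, j) the 0-based offset into x is (i - 1) + (j - 1),
  # wrapped around with %% n; adding 1 converts back to R's 1-based index.
  idx <- outer(seq_len(n), seq_len(n), function(i, j) (i + j - 2) %% n + 1)
  # idx is an n x n matrix; x[idx] is read column by column, and
  # matrix() fills column by column, so the layout is preserved.
  matrix(x[idx], nrow = n)
}

A <- c("a", "b", "c", "d")
B <- cyclic_matrix(A)
print(B)
```

With the four-letter vector from the question, the rows of B come out as a b c d, b c d a, c d a b and d a b c. The same index arithmetic should work for vectors of other lengths, though that is untested here beyond this case.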
From egregory2007 at yahoo.com Fri Jan 7 04:00:58 2011 From: egregory2007 at yahoo.com (Erik Gregory) Date: Thu, 6 Jan 2011 19:00:58 -0800 (PST) Subject: [R] R not recognized in command line In-Reply-To: References: <4D249120.2000108@gmail.com> Message-ID: <209730.33598.qm@web37403.mail.mud.yahoo.com> Aaditya, I was also having some trouble using RPy2 (spoiler alert: I gave up!) to write a GUI for some R scripts I've written. I found a workaround to integrate R and python without using that module. The idea is: 1. Write R scripts you want to use in python. I have a file with all of the functions I'll need to use. 2. Write python code that writes R code (according to whatever inputs you want...) calling the functions in your script. For example, if the R function I wanted to use was "sum", I'd have users input two numbers in python, and have python create the string "sum(867, 5309)" 3. Open the file with the Rscript using python, and add "sum(867, 5309)" to the end of it. 4. Create the string t = "cd C:\Program Files\R\R-2.11.0\bin&Rscript sum.r" in python, where "sum.r" is a file with the "sum" function, to which we added the string "sum(867, 5309)" using python and "C:\...\bin" is whatever directory your R installation is in. 5. Finally, open the shell as a subprocess using python and send "t" to the shell. This method works if you don't need to see the output of the R script in the python console. If you really want to see that, have R write the output into some file that python can open and open that with python script! I did all this without any experience with python (I'm alright with R), so you may even find some shortcuts. 
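
[Editorial sketch, not part of the original message.] A possible variation on steps 2 to 4 above: instead of appending a generated call to the end of the script file, the calling program could pass the numbers as command-line arguments and let the R script read them with commandArgs(). The file name sum.r is taken from the message; the function name add_args is hypothetical and exists only for this illustration.

```r
# sum.r (file name from the message above; add_args is a hypothetical name)
add_args <- function(args) {
  # Command-line arguments always arrive as character strings,
  # so convert them before doing arithmetic.
  sum(as.numeric(args))
}

# trailingOnly = TRUE keeps only the arguments given after the script name.
args <- commandArgs(trailingOnly = TRUE)

# Do nothing when the script is sourced or run without arguments.
if (length(args) > 0) {
  cat(add_args(args), "\n")
}
```

Under these assumptions the script would be invoked as "Rscript sum.r 867 5309" from the shell (or from a Python subprocess), and the result is written to standard output, where the caller can capture it.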
Good Luck, Erik Gregory, Student Assistant, California EPA CSU Sacramento, Mathematics ----- Original Message ---- From: Aaditya Nanduri To: Gabor Grothendieck Cc: r-help at r-project.org Sent: Thu, January 6, 2011 6:23:46 PM Subject: Re: [R] R not recognized in command line I really appreciate all your help but I've already tried everything that has been suggested. I changed the path to every possible combination that leads to an R executable...and nothing seems to work. I've checked to see that Im typing it right. I've also asked my sister to make sure (a fresh set of eyes is always helpful). The only thing that work are navigating to the folder holding the R executable (R_HOME/bin) and the batchscripts. However, this doesnt help me in working with rpy2. On Wed, Jan 5, 2011 at 9:59 AM, Gabor Grothendieck wrote: > On Wed, Jan 5, 2011 at 10:41 AM, Duncan Murdoch > wrote: > > On 11-01-05 8:51 AM, Joshua Wiley wrote: > >> > >> Hi Aaditya, > >> > >> I assume you are running some variant of Windows and by the "prompt in > >> DOS" you are using cmd.exe. > >> > >> Perhaps you are already, but from your examples it looks like either > >> A) you are not in the same directory as R or B) are not adding the > >> path to R in the command. For example, on Windows I always install R > >> under C:\R\ so for me inside cmd.exe: > >> > >> C:\directory> C:\R\R-devel\bin\x64\R > >> > >> [[[R starts here]]] > >> > >> alternately you could switch directories over and then just type "R" > >> at the console: > >> > >> C:\directory> cd C:\R\R-devel\bin\x64\ > >> C:\R\R-devel\bin\x64> R > >> > >> [[[R starts here]]] > >> > >> or since you have set the environment variables: > >> > >> C:\directory> %R_HOME%\bin\x64\R > >> > >> [[[R starts here]]] > >> > >> Alternately, edit the PATH environment variable in Windows and add the > >> path to R (i.e., R_HOME\bin\i386\ or whatever it is for you), and you > >> should be able to just enter "R" at the command prompt and have it > >> start. 
> > > > Editing the PATH is probably the best approach, but a lot of people get > it > > wrong because of misunderstanding how it works: > > > > - If you change PATH in one process the changes won't propagate > anywhere > > else, and will be lost as soon as you close that process. That could be > a > > cmd window, or an R session, or just about any other process that lets > you > > change environment variables. > > > > - If you want to make global changes to the PATH, you need to do it in > the > > control panel "System|Advanced|Environment variables" entries. > > > > - Often it is good enough to use a more Unix-like approach, and only > make > > the change at startup of the cmd processor. You use the /k option when > > starting cmd if you want to run something on startup. > > > > You can also use Rcmd.bat, R.bat, Rgui.bat, etc. found at > http://batchfiles.googlecode.com > > Just put any you wish to use anywhere on your path and it will work on > all cmd instances and will also work when you install a new version of > R since it looks up R's location in the registry. > > -- > Statistics & Software Consulting > GKX Group, GKX Associates Inc. > tel: 1-877-GKX-GROUP > email: ggrothendieck at gmail.com > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Aaditya Nanduri aaditya.nanduri at gmail.com [[alternative HTML version deleted]] ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
From Eric.Taylor at gov.bc.ca Fri Jan 7 00:13:08 2011 From: Eric.Taylor at gov.bc.ca (Taylor, Eric HLS:EX) Date: Thu, 6 Jan 2011 15:13:08 -0800 Subject: [R] Plotting Factors -- Sorting x-axis Message-ID: <2C27F4FCA119834888BDB11F82F6014F0477C54E@e7mbx05.idir.bcgov> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From frodo.jedi at yahoo.com Thu Jan 6 21:33:04 2011 From: frodo.jedi at yahoo.com (Frodo Jedi) Date: Thu, 6 Jan 2011 12:33:04 -0800 (PST) Subject: [R] Assumptions for ANOVA: the right way to check the normality In-Reply-To: References: <9521.53053.qm@web57907.mail.re3.yahoo.com> <262855.6389.qm@web57903.mail.re3.yahoo.com> <333057.17814.qm@web57904.mail.re3.yahoo.com> Message-ID: <261175.94266.qm@web57907.mail.re3.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From jroll at lcog.org Fri Jan 7 04:10:54 2011 From: jroll at lcog.org (LCOG1) Date: Thu, 6 Jan 2011 19:10:54 -0800 (PST) Subject: [R] Dont show zero values in line graph Message-ID: <1294369854437-3178566.post@n4.nabble.com> Hey everyone, Im getting better at plotting my data but cant for the life of me figure out how to show a line graph with missing data that doesnt continue the line down to zero then back up to the remaining values. Consider the following x<-c(1:5,0,0,8:10) y<-1:10 plot(0,0,xlim=c(0,10), ylim=c(0,10),type="n",main="Dont show the bloody 0 values!!") lines(x~y, col="blue", lwd=2,) My data is missing the 6th and 7th values and they come in as NA's so i change them to 0s but then the plot has these ugly lines that dive toward the x axis then back up. I would do bar plots but i need to show multiple sets of data on the same and side by side bars doesnt do it for me. So i need a line graph that starts and stops where 0s or missing values exist. Thoughts? 
JR -- View this message in context: http://r.789695.n4.nabble.com/Dont-show-zero-values-in-line-graph-tp3178566p3178566.html Sent from the R help mailing list archive at Nabble.com. From kara.mooreoleary at gmail.com Fri Jan 7 03:04:34 2011 From: kara.mooreoleary at gmail.com (karamoo) Date: Thu, 6 Jan 2011 18:04:34 -0800 (PST) Subject: [R] problems with rJava In-Reply-To: References: Message-ID: <1294365874095-3178529.post@n4.nabble.com> Hi All, and Heberto, Did you ever resolve your installation problem with rJava? I have a new Windows 7 machine and can't seem to get it installed correctly. I do have Java installed. I download rJava without problem, then: > install.packages("rJava") Installing package(s) into 'C:\Users\Patrick\Documents/R/win-library/2.12' (as 'lib' is unspecified) trying URL 'http://www.stats.ox.ac.uk/pub/RWin/bin/windows/contrib/2.12/rJava_0.8-8.zip' Content type 'application/zip' length 637125 bytes (622 Kb) opened URL downloaded 622 Kb The downloaded packages are in C:\Users\Patrick\AppData\Local\Temp\Rtmp9zcM7g\downloaded_packages Looks like it downloaded fine to my eyes. But in the next step I get an error: > library(rJava) Error in utils::readRegistry(key, "HLM", 2) : Registry key 'Software\JavaSoft\Java Runtime Environment' not found Error in utils::readRegistry(key, "HLM", 2) : Registry key 'Software\JavaSoft\Java Development Kit' not found Error : .onLoad failed in loadNamespace() for 'rJava', details: call: fun(...) error: JAVA_HOME cannot be found from the Registry Error: package/namespace load failed for 'rJava' I'm not sure how to check where R is looking for rJava vs where it lives, this seems to be an issue in other posts, although they have different errors. I'm a novice still at R. I see a lot of posts on this topic, but few are recent. Help would be much appreciated. Thanks!
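The error above says rJava could not locate a Java installation through the Windows registry. One thing that may be worth checking (an assumption on my part, not something confirmed in this thread) is whether the JAVA_HOME environment variable is set and points at an existing directory, and whether the installed Java matches the architecture of the running R. The helper java_home_status below is hypothetical, and the commented-out path is a placeholder.

```r
# Report whether JAVA_HOME is set and whether the directory exists.
java_home_status <- function(path = Sys.getenv("JAVA_HOME")) {
  if (!nzchar(path)) return("JAVA_HOME is not set")
  if (!file.exists(path)) return("JAVA_HOME points to a missing directory")
  "JAVA_HOME points to an existing directory"
}

cat(java_home_status(), "\n")
cat("R architecture:", R.version$arch, "\n")

# If it is unset, it can be set for the current session before loading rJava.
# The path below is a placeholder, not a real location:
# Sys.setenv(JAVA_HOME = "C:/path/to/your/java")
# library(rJava)
```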
Kara Moore Evolution and Ecology University of California, Davis, USA -- View this message in context: http://r.789695.n4.nabble.com/problems-with-rJava-tp3050524p3178529.html Sent from the R help mailing list archive at Nabble.com. From nik at nikosalexandris.net Fri Jan 7 04:03:39 2011 From: nik at nikosalexandris.net (Nikos Alexandris) Date: Fri, 7 Jan 2011 05:03:39 +0200 Subject: [R] How to export/save an "mrpp" object? In-Reply-To: <201012230713.28514.nikos.alexandris@felis.uni-freiburg.de> References: <201012220557.18804.nikos.alexandris@felis.uni-freiburg.de> <201012230713.28514.nikos.alexandris@felis.uni-freiburg.de> Message-ID: <201101070503.40394.nik@nikosalexandris.net> Greets (again) :-) I finally ran mrpp tests. I think all is fine but one very important issue: I have no idea how to export/save an "mrpp" object. Tried anything I know and searched the archives but found nothing. Any ideas? Is really copy-pasting the mrpp results the only way? Thank you for your attention, Nikos From f.harrell at vanderbilt.edu Fri Jan 7 05:01:58 2011 From: f.harrell at vanderbilt.edu (Frank Harrell) Date: Thu, 6 Jan 2011 20:01:58 -0800 (PST) Subject: [R] Waaaayy off topic...Statistical methods, pub bias, scientific validity In-Reply-To: <4D2644A5.9040303@witthoft.com> References: <4D2644A5.9040303@witthoft.com> Message-ID: <1294372918262-3178603.post@n4.nabble.com> I was very impressed with Lehrer's article. I look forward to seeing what the rebuttals come up with. The picture that Lehrer paints of the quality of scientific publications is very dark, and it seems to me, quite plausible. Note that Lehrer is the author of "Proust Was a Neuroscientist" which is one of the best non-fiction books I've ever come across. 
Frank ----- Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/Waaaayy-off-topic-Statistical-methods-pub-bias-scientific-validity-tp3177982p3178603.html Sent from the R help mailing list archive at Nabble.com. From rspark at fas.harvard.edu Fri Jan 7 05:38:15 2011 From: rspark at fas.harvard.edu (Rachel Park) Date: Thu, 6 Jan 2011 23:38:15 -0500 Subject: [R] Adjusting MaxNwts in MICE Package Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dwinsemius at comcast.net Fri Jan 7 05:53:04 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Thu, 6 Jan 2011 23:53:04 -0500 Subject: [R] Creating a Matrix from a vector with some conditions In-Reply-To: <1294349671070-3178219.post@n4.nabble.com> References: <1294349671070-3178219.post@n4.nabble.com> Message-ID: <60EEB50B-A1D3-41FD-98DC-75CC2C482365@comcast.net> On Jan 6, 2011, at 4:34 PM, ADias wrote: > > Hi > > Suppose we have an object with strings: > > A<-c("a","b","c","d") > > Now I do: > > B<-matrix(A,4,4, byrow=F) > > and I get > > a a a a > b b b b > c c c c > d d d d > > But what I really want is: > > a b c d > b c d a > c d a b > d a b c > > How can I do this? How else? B<-matrix(A,4,4, byrow=TRUE) > > thank you > > A. Dias > -- > View this message in context: http://r.789695.n4.nabble.com/Creating-a-Matrix-from-a-vector-with-some-conditions-tp3178219p3178219.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
David Winsemius, MD West Hartford, CT From dwinsemius at comcast.net Fri Jan 7 05:57:00 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Thu, 6 Jan 2011 23:57:00 -0500 Subject: [R] Plotting Factors -- Sorting x-axis In-Reply-To: <2C27F4FCA119834888BDB11F82F6014F0477C54E@e7mbx05.idir.bcgov> References: <2C27F4FCA119834888BDB11F82F6014F0477C54E@e7mbx05.idir.bcgov> Message-ID: On Jan 6, 2011, at 6:13 PM, Taylor, Eric HLS:EX wrote: > Hello; > > How do I plot these data in R without the Months being ordered > alphabetically? > > Months Prec > 1 Jan 102.1 > 2 Feb 69.7 > 3 Mar 44.7 > 4 Apr 32.1 > 5 May 24.0 > 6 Jun 18.7 > 7 Jul 14.0 > 8 Aug 20.0 > 9 Sep 32.4 > 10 Oct 58.9 > 11 Nov 94.5 > 12 Dec 108.2 Since they are most likely factor variables (but even if not): dfrm$Months <- factor(dfrm$Months, levels= month.abb) Then the level ordering will be as expected. David Winsemius, MD West Hartford, CT From smckinney at bccrc.ca Fri Jan 7 05:57:03 2011 From: smckinney at bccrc.ca (Steven McKinney) Date: Thu, 6 Jan 2011 20:57:03 -0800 Subject: [R] Dont show zero values in line graph In-Reply-To: <10697_1294373249_1294373249_1294369854437-3178566.post@n4.nabble.com> References: <10697_1294373249_1294373249_1294369854437-3178566.post@n4.nabble.com> Message-ID: How about this? > x<-c(1:5,NA,NA,8:10) > y<-1:10 > plot(0,0,xlim=c(0,10), ylim=c(0,10),type="n",main="Dont show the bloody 0 values!!") > lines(x~y, col="blue", lwd=2, subset = !is.na(x)) NAs let you do lots of useful manipulations in R. 
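A small sketch of the NA idea for the line-graph question, assuming every 0 in x stands for a missing value (that assumption is mine; real zeros would be lost). Base graphics do not draw line segments to or from NA points, so passing the NA-containing vector straight to lines() leaves a gap instead of a dip. The name x_gap is illustrative.

```r
x <- c(1:5, 0, 0, 8:10)
y <- 1:10

# Turn the placeholder zeros back into NA so they are treated as missing.
x_gap <- replace(x, x == 0, NA)

# Empty plot first, then the line; segments touching an NA are not drawn,
# so the line stops at position 5 and resumes at position 8.
plot(y, x_gap, type = "n", xlim = c(0, 10), ylim = c(0, 10),
     main = "Gap where values are missing")
lines(y, x_gap, col = "blue", lwd = 2)
```

Several series can be overlaid the same way with further lines() calls, each with its own NA gaps.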
Steven McKinney ________________________________________ From: r-help-bounces at r-project.org [r-help-bounces at r-project.org] On Behalf Of LCOG1 [jroll at lcog.org] Sent: January 6, 2011 7:10 PM To: r-help at r-project.org Subject: [R] Dont show zero values in line graph Hey everyone, Im getting better at plotting my data but cant for the life of me figure out how to show a line graph with missing data that doesnt continue the line down to zero then back up to the remaining values. Consider the following x<-c(1:5,0,0,8:10) y<-1:10 plot(0,0,xlim=c(0,10), ylim=c(0,10),type="n",main="Dont show the bloody 0 values!!") lines(x~y, col="blue", lwd=2,) My data is missing the 6th and 7th values and they come in as NA's so i change them to 0s but then the plot has these ugly lines that dive toward the x axis then back up. I would do bar plots but i need to show multiple sets of data on the same and side by side bars doesnt do it for me. So i need a line graph that starts and stops where 0s or missing values exist. Thoughts? JR -- View this message in context: http://r.789695.n4.nabble.com/Dont-show-zero-values-in-line-graph-tp3178566p3178566.html Sent from the R help mailing list archive at Nabble.com. ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. From dwinsemius at comcast.net Fri Jan 7 06:01:35 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Fri, 7 Jan 2011 00:01:35 -0500 Subject: [R] How to export/save an "mrpp" object? 
In-Reply-To: <201101070503.40394.nik@nikosalexandris.net> References: <201012220557.18804.nikos.alexandris@felis.uni-freiburg.de> <201012230713.28514.nikos.alexandris@felis.uni-freiburg.de> <201101070503.40394.nik@nikosalexandris.net> Message-ID: On Jan 6, 2011, at 10:03 PM, Nikos Alexandris wrote: > Greets (again) :-) > > I finally ran mrpp tests. I think all is fine but one very important > issue: I > have no idea how to export/save an "mrpp" object. Tried anything I > know and > searched the archives but found nothing. And what happened when you tried what seems like the obvious: save(mrpp_obj, file="mrpp_store.Rdata") # rm(list=ls() ) # Only uncomment if you are ready for your workspace to clear load("mrpp_store.Rdata") > > Any ideas? Is really copy-pasting the mrpp results the only way? Many of us have no idea what such an object is, since you have not described the packages and functions used to create it. If you want an ASCII version then dput or dump are also available. -- David Winsemius, MD West Hartford, CT From jeroenooms at gmail.com Fri Jan 7 06:08:06 2011 From: jeroenooms at gmail.com (Jeroen Ooms) Date: Thu, 6 Jan 2011 21:08:06 -0800 (PST) Subject: [R] Parsing JSON records to a dataframe Message-ID: <1294376886645-3178646.post@n4.nabble.com> What is the most efficient method of parsing a dataframe-like structure that has been json encoded in record-based format rather than vector based. For example a structure like this: [ {"name":"joe", "gender":"male", "age":41}, {"name":"anna", "gender":"female", "age":23} ] RJSONIO parses this as a list of lists, which I would then have to apply as.data.frame to and append them to an existing dataframe, which is terribly slow. -- View this message in context: http://r.789695.n4.nabble.com/Parsing-JSON-records-to-a-dataframe-tp3178646p3178646.html Sent from the R help mailing list archive at Nabble.com.
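One way to avoid growing a data frame record by record is to pull out each field across all records first and build the data frame once. The sketch below uses base R only and assumes the parser has already returned a list of records that all share the same field names, with no missing values; the records list is written by hand here to stand in for that parser output, and records_to_df is a hypothetical helper.

```r
# Hand-built stand-in for a parsed list of JSON records.
records <- list(
  list(name = "joe",  gender = "male",   age = 41),
  list(name = "anna", gender = "female", age = 23)
)

# Collect one vector per field, then assemble the data frame in one step.
records_to_df <- function(recs) {
  fields <- names(recs[[1]])
  cols <- lapply(fields, function(f) unlist(lapply(recs, `[[`, f)))
  names(cols) <- fields
  as.data.frame(cols, stringsAsFactors = FALSE)
}

df <- records_to_df(records)
print(df)
```

If some records lack a field, unlist() would silently shorten that column, so a real version would need to check field presence first.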
From raji.sankaran at gmail.com Fri Jan 7 05:48:27 2011 From: raji.sankaran at gmail.com (Raji) Date: Thu, 6 Jan 2011 20:48:27 -0800 (PST) Subject: [R] R packages for R 2.11.1 Message-ID: <1294375707041-3178633.post@n4.nabble.com> Hi, I am using R 2.11.1. I need to download a few packages for it for Windows. But on CRAN I see the latest packages for R 2.12.1 only. Can you help me out with the locations where I can find the packages for R 2.11.1 Windows zip? Thanks in advance -- View this message in context: http://r.789695.n4.nabble.com/R-packages-for-R-2-11-1-tp3178633p3178633.html Sent from the R help mailing list archive at Nabble.com. From peter.langfelder at gmail.com Fri Jan 7 06:10:17 2011 From: peter.langfelder at gmail.com (Peter Langfelder) Date: Thu, 6 Jan 2011 21:10:17 -0800 Subject: [R] Help with IF operator In-Reply-To: <1294345293111-3178129.post@n4.nabble.com> References: <1294345293111-3178129.post@n4.nabble.com> Message-ID: Several possibilities: if (length(teams)!=length(unique(teams))) stop("Some teams are duplicated") or if (max(table(teams))>1) stop("Some teams are duplicated") I'm sure there are others, too. On Thu, Jan 6, 2011 at 12:21 PM, ADias wrote: > > Hi, > > I am with a problem on how to do a comparison of values. My script is as > follows: > > repeat{ > cat("How many teams to use? (to end write 0) ") > nro<-scan(n=1) > if(nro==0)break > cat("write the", nro, "teams names \n") > teams<-readLines(n=nro) > if (teams[1]==teams[2)next > else print(teams) > } > > On this example I only compare teams 1 name with teams 2 name, and if they > are the same the scrip starts again. If I had 10 teams how could I make it > compare the "nro" number of teams names in order to check if the same name > has been written more then once? The idea is, if the same name is written > more then once it should give an error and start the scrip again by asking > the teams names again.
> > Two more things: With the next function the script stats from top, I mean > starts by asking the number of teams to use. Can I make it that it goes > directly to asking teams names? > And when it checks anc find out that a certain name has been written twice > can it produce a message warning that this error happened before asking the > teams names again? > > Many thanks > > Regards, > A.Dias. > -- > View this message in context: http://r.789695.n4.nabble.com/Help-with-IF-operator-tp3178129p3178129.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From spencer.graves at structuremonitoring.com Fri Jan 7 06:13:23 2011 From: spencer.graves at structuremonitoring.com (Spencer Graves) Date: Thu, 06 Jan 2011 21:13:23 -0800 Subject: [R] Waaaayy off topic...Statistical methods, pub bias, scientific validity In-Reply-To: <4D2644A5.9040303@witthoft.com> References: <4D2644A5.9040303@witthoft.com> Message-ID: <4D26A0F3.3030902@structuremonitoring.com> Part of the phenomenon can be explained by the natural censorship in what is accepted for publication: Stronger results tend to have less difficulty getting published. Therefore, given that a result is published, it is evident that the estimated magnitude of the effect is in average larger than it is in reality, just by the fact that weaker results are less likely to be published. A study of the literature on this subject might yield an interesting and valuable estimate of the magnitude of this selection bias. 
A more insidious problem, that may not affect the work of Jonah Lehrer, is political corruption in the way research is funded, with less public and more private funding of research (http://portal.unesco.org/education/en/ev.php-URL_ID=21052&URL_DO=DO_TOPIC&URL_SECTION=201.html). For example, I've heard claims (which I cannot substantiate right now) that cell phone companies allegedly lobbied successfully to block funding for researchers they thought were likely to document health problems with their products. Related claims have been made by scientists in the US Food and Drug Administration that certain therapies were approved on political grounds in spite of substantive questions about the validity of the research backing the request for approval (e.g., www.naturalnews.com/025298_the_FDA_scientists.html). Some of these accusations of political corruption may be groundless. However, as private funding replaces tax money for basic science, we must expect an increase in research results that match the needs of the funding agency while degrading the quality of published research. This produces more research that can not be replicated -- effects that get smaller upon replication. (My wife and I routinely avoid certain therapies recommended by physicians, because the physicians get much of their information on recent drugs from the pharmaceuticals, who have a vested interest in presenting their products in the most positive light.) Spencer On 1/6/2011 2:39 PM, Carl Witthoft wrote: > The next week's New Yorker has some decent rebuttal letters. The case > is hardly as clear-cut as the author would like to believe. > > Carl > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
> From dwinsemius at comcast.net Fri Jan 7 06:18:20 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Fri, 7 Jan 2011 00:18:20 -0500 Subject: [R] R packages for R 2.11.1 In-Reply-To: <1294375707041-3178633.post@n4.nabble.com> References: <1294375707041-3178633.post@n4.nabble.com> Message-ID: <0CEFA3AB-7EB3-44A1-9AF9-AA3C76F5FCC9@comcast.net> On Jan 6, 2011, at 11:48 PM, Raji wrote: > > Hi , > > I am using R 2.11.1 . I need to download few packages for the same > for > Windows.But in CRAN i see the latest packages for R 2.12.1 only. Can > you > help me out with the locations where i can find the packages for R > 2.11.1 > Windows zip? At the bottom of the page of contributes packages is a link to Archives. They may need to be compiled and in Windows that in turn would require Murdoch's Windows RTools. -- David Winsemius, MD West Hartford, CT From jwiley.psych at gmail.com Fri Jan 7 06:23:09 2011 From: jwiley.psych at gmail.com (Joshua Wiley) Date: Thu, 6 Jan 2011 21:23:09 -0800 Subject: [R] R packages for R 2.11.1 In-Reply-To: <1294375707041-3178633.post@n4.nabble.com> References: <1294375707041-3178633.post@n4.nabble.com> Message-ID: I would try using the R 2.12.1 packages first, but if that does not work, then you can go here: http://cran.r-project.org/src/contrib/Archive/ to get older versions of the tar balls. I think you might have to build them yourself. I kind of doubt anyone is keeping entire duplicates of old CRAN packages in all forms. This is not that difficult though. If the packages you want to install do not have other code, I believe you can actually install them without any additional software. If there is compiled code (e.g., C or C++), then I think you'll need RTools as well as a compatible compiler. See the installation manual for details: http://cran.r-project.org/doc/manuals/R-admin.html Cheers, Josh On Thu, Jan 6, 2011 at 8:48 PM, Raji wrote: > > Hi , > > ?I am using R 2.11.1 . 
I need to download few packages for the same for > Windows.But in CRAN i see the latest packages for R 2.12.1 only. Can you > help me out with the locations where i can find the packages for R 2.11.1 > Windows zip? > > Thanks in advance > -- > View this message in context: http://r.789695.n4.nabble.com/R-packages-for-R-2-11-1-tp3178633p3178633.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ From amelia_vettori at yahoo.co.nz Fri Jan 7 07:36:28 2011 From: amelia_vettori at yahoo.co.nz (Amelia Vettori) Date: Thu, 6 Jan 2011 22:36:28 -0800 (PST) Subject: [R] Calculating Returns : (Extremely sorry for earlier incomplete mail) Message-ID: <796439.99888.qm@web121404.mail.ne1.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From raji.sankaran at gmail.com Fri Jan 7 07:38:10 2011 From: raji.sankaran at gmail.com (Raji) Date: Thu, 6 Jan 2011 22:38:10 -0800 (PST) Subject: [R] R packages for R 2.11.1 In-Reply-To: <0CEFA3AB-7EB3-44A1-9AF9-AA3C76F5FCC9@comcast.net> References: <1294375707041-3178633.post@n4.nabble.com> <0CEFA3AB-7EB3-44A1-9AF9-AA3C76F5FCC9@comcast.net> Message-ID: <1294382290224-3178698.post@n4.nabble.com> Hi, Thank you,I will try to build the packages with RTools. I found the following links for few packages. rJava http://www.rforge.net/rJava/files RJDBC http://www.rforge.net/RJDBC/files/ Regards, Raji -- View this message in context: http://r.789695.n4.nabble.com/R-packages-for-R-2-11-1-tp3178633p3178698.html Sent from the R help mailing list archive at Nabble.com. 
From peter.langfelder at gmail.com Fri Jan 7 08:06:44 2011 From: peter.langfelder at gmail.com (Peter Langfelder) Date: Thu, 6 Jan 2011 23:06:44 -0800 Subject: [R] Waaaayy off topic...Statistical methods, pub bias, scientific validity In-Reply-To: <4D26A0F3.3030902@structuremonitoring.com> References: <4D2644A5.9040303@witthoft.com> <4D26A0F3.3030902@structuremonitoring.com> Message-ID: From a purely statistical and maybe somewhat naive point of view, published p-values should be corrected for the multiple testing that is effectively happening because of the large number of published studies. My experience is also that people will often try several statistical methods to get the most significant p-value but neglect to share that fact with the audience and/or at least attempt to correct the p-values for the selection bias. That being said, it would seem that biomedical sciences do make progress, so some of the published results are presumably correct :) Peter On Thu, Jan 6, 2011 at 9:13 PM, Spencer Graves wrote: > Part of the phenomenon can be explained by the natural censorship in > what is accepted for publication: Stronger results tend to have less > difficulty getting published. Therefore, given that a result is published, > it is evident that the estimated magnitude of the effect is in average > larger than it is in reality, just by the fact that weaker results are less > likely to be published. A study of the literature on this subject might > yield an interesting and valuable estimate of the magnitude of this > selection bias. > > > A more insidious problem, that may not affect the work of Jonah Lehrer, > is political corruption in the way research is funded, with less public and > more private funding of research > (http://portal.unesco.org/education/en/ev.php-URL_ID=21052&URL_DO=DO_TOPIC&URL_SECTION=201.html).
> For example, I've heard claims (which I cannot substantiate right now) that > cell phone companies allegedly lobbied successfully to block funding for > researchers they thought were likely to document health problems with their > products. Related claims have been made by scientists in the US Food and > Drug Administration that certain therapies were approved on political > grounds in spite of substantive questions about the validity of the research > backing the request for approval (e.g., > www.naturalnews.com/025298_the_FDA_scientists.html). Some of these > accusations of political corruption may be groundless. However, as private > funding replaces tax money for basic science, we must expect an increase in > research results that match the needs of the funding agency while degrading > the quality of published research. This produces more research that can not > be replicated -- effects that get smaller upon replication. (My wife and I > routinely avoid certain therapies recommended by physicians, because the > physicians get much of their information on recent drugs from the > pharmaceuticals, who have a vested interest in presenting their products in > the most positive light.) > > > Spencer > > > On 1/6/2011 2:39 PM, Carl Witthoft wrote: >> >> The next week's New Yorker has some decent rebuttal letters. The case is >> hardly as clear-cut as the author would like to believe. >> >> Carl From noah at smartmediacorp.com Fri Jan 7 08:10:59 2011 From: noah at smartmediacorp.com (Noah Silverman) Date: Thu, 06 Jan 2011 23:10:59 -0800 Subject: [R] Stepwise SVM Variable selection Message-ID: <4D26BC83.4050906@smartmediacorp.com> I have a data set with about 30,000 training cases and 103 variables. I've trained an SVM (using the e1071 package) for a binary classifier {0,1}. The accuracy isn't great. I used a grid search over the C and G parameters with an RBF kernel to find the best settings.
I remember that for least squares, R has a nice stepwise function that will try combining subsets of variables to find the optimal result. Clearly, this doesn't exist for SVMs as a built-in function. As an experiment, I simply grabbed the first 50 variables and repeated the training/grid search procedure. The results were significantly better. Since the data is VERY noisy, my guess is that eliminating some of the variables eliminated some noise that resulted in better results. With a grid of 100 parameter settings (10 for C, 10 for G) and 106 variables, trying every combination would be prohibitively time consuming. Can anyone suggest an approach to seek the ideal subset of variables for my SVM classifier? Thanks! From mailinglist.honeypot at gmail.com Fri Jan 7 08:34:10 2011 From: mailinglist.honeypot at gmail.com (Steve Lianoglou) Date: Fri, 7 Jan 2011 02:34:10 -0500 Subject: [R] Stepwise SVM Variable selection In-Reply-To: <4D26BC83.4050906@smartmediacorp.com> References: <4D26BC83.4050906@smartmediacorp.com> Message-ID: Hi, On Fri, Jan 7, 2011 at 2:10 AM, Noah Silverman wrote: > I have a data set with about 30,000 training cases and 103 variable. > > I've trained an SVM (using the e1071 package) for a binary classifier {0,1}. > The accuracy isn't great. > > I used a grid search over the C and G parameters with an RBF kernel to find > the best settings. > > I remember that for least squares, R has a nice stepwise function that will > try combining subsets of variables to find the optimal result. Clearly, > this doesn't exist for SVMs as a built in function. > > As an experiment, I simply grabbed the first 50 variables and repeated the > training/grid search procedure. The results were significantly better. > Since the date is VERY noisy, my guess is that eliminating some of the > variables eliminated some noise that resulted in better results.
> > With a grid of 100 parameter settings (10 for C, 10 for G) and 106 > variables, trying every combination would be prohibitively time consuming. > > Can anyone suggest an approach to seek the ideal subset of variables for my > SVM classifier? Sounds like a job for the types of approaches found in the penalizedSVM package: http://cran.r-project.org/web/packages/penalizedSVM/index.html -steve -- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact From lcn918 at gmail.com Fri Jan 7 08:46:33 2011 From: lcn918 at gmail.com (lcn) Date: Fri, 7 Jan 2011 15:46:33 +0800 Subject: [R] JRI & plot( ) In-Reply-To: <020f01cbadd5$77537ab0$65fa7010$@struq.com> References: <020f01cbadd5$77537ab0$65fa7010$@struq.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From noah at smartmediacorp.com Fri Jan 7 08:52:32 2011 From: noah at smartmediacorp.com (Noah Silverman) Date: Thu, 06 Jan 2011 23:52:32 -0800 Subject: [R] Stepwise SVM Variable selection In-Reply-To: References: <4D26BC83.4050906@smartmediacorp.com> Message-ID: <4D26C640.9000308@smartmediacorp.com> I'll give it a try, Thanks! -N On 1/6/11 11:34 PM, Steve Lianoglou wrote: > Hi, > > On Fri, Jan 7, 2011 at 2:10 AM, Noah Silverman wrote: >> I have a data set with about 30,000 training cases and 103 variable. >> >> I've trained an SVM (using the e1071 package) for a binary classifier {0,1}. >> The accuracy isn't great. >> >> I used a grid search over the C and G parameters with an RBF kernel to find >> the best settings. >> >> I remember that for least squares, R has a nice stepwise function that will >> try combining subsets of variables to find the optimal result. Clearly, >> this doesn't exist for SVMs as a built in function.
>> >> As an experiment, I simply grabbed the first 50 variables and repeated the >> training/grid search procedure. The results were significantly better. >> Since the date is VERY noisy, my guess is that eliminating some of the >> variables eliminated some noise that resulted in better results. >> >> With a grid of 100 parameter settings (10 for C, 10 for G) and 106 >> variables, trying every combination would be prohibitively time consuming. >> >> Can anyone suggest an approach to seek the ideal subset of variables for my >> SVM classifier? > Sounds like a job for the types of approaches found in the penalizedSVM package: > > http://cran.r-project.org/web/packages/penalizedSVM/index.html > > -steve > From dieter.menne at menne-biomed.de Fri Jan 7 09:05:10 2011 From: dieter.menne at menne-biomed.de (Dieter Menne) Date: Fri, 7 Jan 2011 00:05:10 -0800 (PST) Subject: [R] Parsing JSON records to a dataframe In-Reply-To: <1294376886645-3178646.post@n4.nabble.com> References: <1294376886645-3178646.post@n4.nabble.com> Message-ID: <1294387510403-3178753.post@n4.nabble.com> Jeroen Ooms wrote: > > What is the most efficient method of parsing a dataframe-like structure > that has been json encoded in record-based format rather than vector > based. For example a structure like this: > > [ {"name":"joe", "gender":"male", "age":41}, {"name":"anna", > "gender":"female", "age":23} ] > > RJSONIO parses this as a list of lists, which I would then have to apply > as.data.frame to and append them to an existing dataframe, which is > terribly slow. > > unlist is pretty fast. The solution below assumes that you know how your structure is, so it is not very flexible, but it should show you that the conversion to data.frame is not the bottleneck. 
# json
library(RJSONIO)
# [ {"name":"joe", "gender":"male", "age":41},
#   {"name":"anna", "gender":"female", "age":23} ]
n = 300000
d = data.frame(name=rep(c("joe","anna"),n),
               gender=rep(c("male","female"),n),
               age = rep(c("23","41"),n))
dj = toJSON(d)
system.time(d1 <- fromJSON(dj))
#   user  system elapsed
#   4.06    0.26    4.32
system.time(
  dd <- data.frame(
    name = unlist(d1$name),
    gender = unlist(d1$gender),
    age=as.numeric(unlist(d1$age)))
)
#   user  system elapsed
#   1.13    0.05    1.18

--
View this message in context: http://r.789695.n4.nabble.com/Parsing-JSON-records-to-a-dataframe-tp3178646p3178753.html
Sent from the R help mailing list archive at Nabble.com.

From ripley at stats.ox.ac.uk Fri Jan 7 09:08:28 2011
From: ripley at stats.ox.ac.uk (Prof Brian Ripley)
Date: Fri, 7 Jan 2011 08:08:28 +0000 (GMT)
Subject: [R] Accessing data via url
In-Reply-To: <352569.97781.qm@web38406.mail.mud.yahoo.com>
References: <352569.97781.qm@web38406.mail.mud.yahoo.com>
Message-ID:

?read.table says

'file' can also be a complete URL.

This is implemented by url(): see the section on URLs on its help page. You haven't followed the posting guide and told us your OS, and what the section says does depend on the OS.

On Thu, 6 Jan 2011, John Kane wrote:

> # Can anyone suggest why this works
>
> datafilename <- "http://personality-project.org/r/datasets/maps.mixx.epi.bfi.data"
> person.data <- read.table(datafilename,header=TRUE)
>
> # but this does not?
>
> dd <- "https://sites.google.com/site/jrkrideau/home/general-stores/trees.txt"
> treedata <- read.table(dd, header=TRUE)
>
> ===================================================================
>
> Error in file(file, "rt") : cannot open the connection
> In addition: Warning message:
> In file(file, "rt") : unsupported URL scheme
>
> # I can access both through a hyperlink in OOO Calc. t
> # Thanks

--
Brian D.
Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 From ripley at stats.ox.ac.uk Fri Jan 7 09:21:38 2011 From: ripley at stats.ox.ac.uk (Prof Brian Ripley) Date: Fri, 7 Jan 2011 08:21:38 +0000 (GMT) Subject: [R] R packages for R 2.11.1 In-Reply-To: References: <1294375707041-3178633.post@n4.nabble.com> Message-ID: On Thu, 6 Jan 2011, Joshua Wiley wrote: > I would try using the R 2.12.1 packages first, but if that does not On 32-bit Windows this will not work if compiled code is involved: both the compiler and the package layout changed at 2.12.0. However, binary packages for 2.11.x are still being run through the autobuiilder and are available on CRAN if they were built successfully (increasingly many are not). See the summary table at http://cran.r-project.org/bin/windows/contrib/checkSummaryWin.html Nevertheless (as the posting guide makes clear) the R developers do not support obsolete versions of R, so you are advised to update to R 2.12.1. > work, then you can go here: > http://cran.r-project.org/src/contrib/Archive/ > to get older versions of the tar balls. I think you might have to > build them yourself. I kind of doubt anyone is keeping entire > duplicates of old CRAN packages in all forms. This is not that > difficult though. If the packages you want to install do not have > other code, I believe you can actually install them without any > additional software. If there is compiled code (e.g., C or C++), then > I think you'll need RTools as well as a compatible compiler. See the > installation manual for details: > > http://cran.r-project.org/doc/manuals/R-admin.html > > Cheers, > > Josh > > > On Thu, Jan 6, 2011 at 8:48 PM, Raji wrote: >> >> Hi , >> >> ?I am using R 2.11.1 . 
I need to download few packages for the same for >> Windows.But in CRAN i see the latest packages for R 2.12.1 only. Can you >> help me out with the locations where i can find the packages for R 2.11.1 >> Windows zip? > -- > Joshua Wiley > Ph.D. Student, Health Psychology > University of California, Los Angeles > http://www.joshuawiley.com/ -- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 From dieter.menne at menne-biomed.de Fri Jan 7 09:24:19 2011 From: dieter.menne at menne-biomed.de (Dieter Menne) Date: Fri, 7 Jan 2011 00:24:19 -0800 (PST) Subject: [R] Accessing data via url In-Reply-To: <352569.97781.qm@web38406.mail.mud.yahoo.com> References: <352569.97781.qm@web38406.mail.mud.yahoo.com> Message-ID: <1294388659726-3178773.post@n4.nabble.com> John Kane-2 wrote: > > # Can anyone suggest why this works > > datafilename <- > "http://personality-project.org/r/datasets/maps.mixx.epi.bfi.data" > person.data <- read.table(datafilename,header=TRUE) > > # but this does not? 
> > dd <- > "https://sites.google.com/site/jrkrideau/home/general-stores/trees.txt" > treedata <- read.table(dd, header=TRUE) > > =================================================================== > > Error in file(file, "rt") : cannot open the connection > Your original file is no longer there, but when I try RCurl with a png file that is present, I get a certificate error: Dieter -------- library(RCurl) sessionInfo() dd <- "https://sites.google.com/site/jrkrideau/home/general-stores/history.png" x = getBinaryURL(dd) ------------- > sessionInfo() R version 2.12.1 (2010-12-16) Platform: i386-pc-mingw32/i386 (32-bit) locale: [1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252 [3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C [5] LC_TIME=German_Germany.1252 attached base packages: [1] stats graphics grDevices datasets utils methods base other attached packages: [1] RCurl_1.5-0.1 bitops_1.0-4.1 loaded via a namespace (and not attached): [1] tools_2.12.1 > dd <- > "https://sites.google.com/site/jrkrideau/home/general-stores/history.png" > x = getBinaryURL(dd) Error in curlPerform(curl = curl, .opts = opts, .encoding = .encoding) : SSL certificate problem, verify that the CA cert is OK. Details: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed -- View this message in context: http://r.789695.n4.nabble.com/Accessing-data-via-url-tp3178094p3178773.html Sent from the R help mailing list archive at Nabble.com. From ripley at stats.ox.ac.uk Fri Jan 7 09:30:06 2011 From: ripley at stats.ox.ac.uk (Prof Brian Ripley) Date: Fri, 7 Jan 2011 08:30:06 +0000 (GMT) Subject: [R] problems with rJava In-Reply-To: <1294365874095-3178529.post@n4.nabble.com> References: <1294365874095-3178529.post@n4.nabble.com> Message-ID: On Thu, 6 Jan 2011, karamoo wrote: > > Hi All, and Heberto, > > Did you ever resolve your installation problem with rJava? > > I have a new windows 7 machine and can't seem to get it installed correctly. 
> I do have Java installed. I download rJava without proble, then: As you didn't follow the posting guide, we don't know if this is 32- or 64-bit R on 32- or 64-bit Windows. If you are running 64-bit R you need 64-bit Java, and similarly for 32-bit. The message you show indicates that you do not have the appropriate Java installed. As I have pointed out already this week, rJava has its own mailing list, so please follow up there. > >> install.packages("rJava") > Installing package(s) into ?C:\Users\Patrick\Documents/R/win-library/2.12? > (as ?lib? is unspecified) > trying URL > 'http://www.stats.ox.ac.uk/pub/RWin/bin/windows/contrib/2.12/rJava_0.8-8.zip' > Content type 'application/zip' length 637125 bytes (622 Kb) > opened URL > downloaded 622 Kb > > > The downloaded packages are in > C:\Users\Patrick\AppData\Local\Temp\Rtmp9zcM7g\downloaded_packages > > Looks like it downloaded fine to my eyes. But in the next step I get an > error: > >> library(rJava) > Error in utils::readRegistry(key, "HLM", 2) : > Registry key 'Software\JavaSoft\Java Runtime Environment' not found > Error in utils::readRegistry(key, "HLM", 2) : > Registry key 'Software\JavaSoft\Java Development Kit' not found > Error : .onLoad failed in loadNamespace() for 'rJava', details: > call: fun(...) > error: JAVA_HOME cannot be found from the Registry > Error: package/namespace load failed for 'rJava' > > I'm not sure how to check where R is looking for rJava vs where it lives, > this seems to be an issue in other posts, although they have different > errors. > > I'm a novice still at R. I see a lot of posts on this topic, but few are > recent. Help would be much appreciated. > > > Thanks! > > Kara Moore > Evolution and Ecology > University of California, Davis, USA -- Brian D. 
Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 From r.m.krug at gmail.com Fri Jan 7 09:58:59 2011 From: r.m.krug at gmail.com (Rainer M Krug) Date: Fri, 07 Jan 2011 09:58:59 +0100 Subject: [R] algorithm help In-Reply-To: References: Message-ID: <4D26D5D3.70608@gmail.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 01/06/2011 11:57 PM, (Ted Harding) wrote: > On 06-Jan-11 22:16:38, array chip wrote: >> Hi, I am seeking help on designing an algorithm to identify the >> locations of stretches of 1s in a vector of 0s and 1s. Below is >> an simple example: >> >>> dat<-as.data.frame(cbind(a=c(F,F,T,T,T,T,F,F,T,T,F,T,T,T,T,F,F,F,F,T) >> ,b=c(4,12,13,16,18,20,28,30,34,46,47,49,61,73,77,84,87,90,95,97))) >> >>> dat >> a b >> 1 0 4 >> 2 0 12 >> 3 1 13 >> 4 1 16 >> 5 1 18 >> 6 1 20 >> 7 0 28 >> 8 0 30 >> 9 1 34 >> 10 1 46 >> 11 0 47 >> 12 1 49 >> 13 1 61 >> 14 1 73 >> 15 1 77 >> 16 0 84 >> 17 0 87 >> 18 0 90 >> 19 0 95 >> 20 1 97 >> >> In this dataset, "b" is sorted and denotes the location for each >> number in "a". >> So I would like to find the starting & ending locations for each >> stretch of 1s within "a", also counting the number of 1s in each >> stretch as well. >> Hope the results from the algorithm would be: >> >> stretch start end No.of.1s >> 1 13 20 4 >> 2 34 46 2 >> 3 49 77 4 >> 4 97 97 1 >> >> I can imagine using for loops can do the job, but I feel it's not a >> clever way to do this. Is there an efficient algorithm that can do >> this fast? >> >> Thanks for any suggestions. >> John > > The basic information you need can be got using rle() ("run length > encoding"). See '?rle'. 
In your example: > > rle(dat$a) > # Run Length Encoding > # lengths: int [1:8] 2 4 2 2 1 4 4 1 > # values : num [1:8] 0 1 0 1 0 1 0 1 > ## Note: F -> 0, T -> 1 > > The following has a somewhat twisted logic at the end, and may > be flawed, but you can probably adapt it! > > L <- rle(dat$a)$lengths > V <- rle(dat$a)$values > pos <- c(1,cumsum(L)) > V1 <- c(-1,V) > 1+pos[V1==0] > # [1] 3 9 12 20 > ## Positions in the series dat$a where each run of "T" (i.e. 1) > ## starts A different approach would be to use the diff() function: Where > diff(dat$a) [1] 0 1 0 0 0 -1 0 1 0 -1 1 0 0 0 -1 0 0 0 1 is not equal 0, the value is changing from 0 to 1 or one to 0. The indices of the first new value can be found by: > which(diff(dat$a)!=0) + 1 [1] 3 7 9 11 12 16 20 where it is changing from 0 to 1 is at > which(diff(dat$a)==1) + 1 [1] 3 9 12 20 where it is changing from 1 to 0 is at > which(diff(dat$a)==-1) + 1 [1] 7 11 16 By taking into consideration if the first value and the last values are 0 or 1, you can calculate the length. Cheers, Rainer > > Hoping this helps, > Ted. > > -------------------------------------------------------------------- > E-Mail: (Ted Harding) > Fax-to-email: +44 (0)870 094 0861 > Date: 06-Jan-11 Time: 22:57:44 > ------------------------------ XFMail ------------------------------ > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. - -- Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, UCT), Dipl. Phys. 
(Germany) Centre of Excellence for Invasion Biology Natural Sciences Building Office Suite 2039 Stellenbosch University Main Campus, Merriman Avenue Stellenbosch South Africa Tel: +33 - (0)9 53 10 27 44 Cell: +27 - (0)8 39 47 90 42 Fax (SA): +27 - (0)8 65 16 27 82 Fax (D) : +49 - (0)3 21 21 25 22 44 Fax (FR): +33 - (0)9 58 10 27 44 email: Rainer at krugs.de Skype: RMkrug -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk0m1dMACgkQoYgNqgF2egoQbACcCB3iFQ6SKYfL4KVX8AMAN9Gp 1awAn0Z+8KXnOmwCLu61gihc8xZIT++j =O+xA -----END PGP SIGNATURE----- From savicky at praha1.ff.cuni.cz Fri Jan 7 08:59:05 2011 From: savicky at praha1.ff.cuni.cz (Petr Savicky) Date: Fri, 7 Jan 2011 08:59:05 +0100 Subject: [R] Creating a Matrix from a vector with some conditions In-Reply-To: <1294349671070-3178219.post@n4.nabble.com> References: <1294349671070-3178219.post@n4.nabble.com> Message-ID: <20110107075905.GA31502@praha1.ff.cuni.cz> On Thu, Jan 06, 2011 at 01:34:31PM -0800, ADias wrote: > > Hi > > Suppose we have an object with strings: > > A<-c("a","b","c","d") > > Now I do: > > B<-matrix(A,4,4, byrow=F) > > and I get > > a a a a > b b b b > c c c c > d d d d > > But what I really want is: > > a b c d > b c d a > c d a b > d a b c > > How can I do this? Try the following A <- c("a","b","c","d") B <- matrix(A, 5, 4)[1:4, ] # [,1] [,2] [,3] [,4] #[1,] "a" "b" "c" "d" #[2,] "b" "c" "d" "a" #[3,] "c" "d" "a" "b" #[4,] "d" "a" "b" "c" Petr Savicky. 
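
A minimal sketch of why the recycling trick in the reply above works, assuming only base R. Filling a matrix with n+1 rows column by column from a vector of length n shifts each column by one position, so dropping the last row leaves the cyclic table. The index-arithmetic variant and the names n, idx, B1 and B2 are illustrative additions, not part of the thread.

```r
A <- c("a", "b", "c", "d")
n <- length(A)

# Recycling trick from the reply: A is recycled down the columns of an
# (n + 1) x n matrix, which shifts each column by one; then drop the last row.
B1 <- matrix(A, n + 1, n)[1:n, ]

# Same table via explicit index arithmetic: entry (i, j) is A[(i + j - 2) mod n + 1].
idx <- (outer(0:(n - 1), 0:(n - 1), "+") %% n) + 1
B2 <- matrix(A[idx], n, n)

print(B1)
```

The second form may be easier to adapt when the shift per column is something other than one.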
From savicky at praha1.ff.cuni.cz Fri Jan 7 09:37:11 2011
From: savicky at praha1.ff.cuni.cz (Petr Savicky)
Date: Fri, 7 Jan 2011 09:37:11 +0100
Subject: [R] Help with IF operator
In-Reply-To: <1294345293111-3178129.post@n4.nabble.com>
References: <1294345293111-3178129.post@n4.nabble.com>
Message-ID: <20110107083711.GA1622@praha1.ff.cuni.cz>

On Thu, Jan 06, 2011 at 12:21:33PM -0800, ADias wrote:
>
> Hi,
>
> I have a problem with how to do a comparison of values. My script is as
> follows:
>
> repeat{
> cat("How many teams to use? (to end write 0) ")
> nro<-scan(n=1)
> if(nro==0)break
> cat("write the", nro, "teams names \n")
> teams<-readLines(n=nro)
> if (teams[1]==teams[2])next
> else print(teams)
> }
>
> In this example I only compare team 1's name with team 2's name, and if they
> are the same the script starts again. If I had 10 teams, how could I make it
> compare all "nro" team names in order to check whether the same name
> has been written more than once? The idea is, if the same name is written
> more than once it should give an error and start the script again by asking
> for the team names again.
>
> Two more things: With the next function the script starts from the top, I mean
> it starts by asking the number of teams to use. Can I make it go
> directly to asking for the team names?

Consider also using readline(), which reads a single line, and the %in% operator to compare the new name to the previous ones immediately.

nro <- as.numeric(readline("no of teams "))
teams <- rep(NA, times=nro)
for (i in seq(length=nro)) {
    repeat {
        current <- readline(paste("team", i, ""))
        if (current %in% teams) {
            cat("error - repeated name\n")
        } else {
            break
        }
    }
    teams[i] <- current
}

Petr Savicky.
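
If all the names have already been read in one go (for example with readLines(), as in the original script), the whole vector can be checked at once instead of name by name. The following is a minimal non-interactive sketch under that assumption; the helper all_unique() and the sample team names are hypothetical and not from the thread.

```r
# Hypothetical helper: TRUE when no name occurs more than once.
# anyDuplicated() returns 0 when there are no duplicates.
all_unique <- function(teams) anyDuplicated(teams) == 0L

teams <- c("Lions", "Tigers", "Lions", "Bears")

if (!all_unique(teams)) {
    # duplicated() flags the second and later occurrences of each name
    cat("error - repeated name(s):", unique(teams[duplicated(teams)]), "\n")
}
```

In an interactive script, a check like this could sit inside the repeat loop so that only the names are asked for again, not the number of teams.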
From maj at waikato.ac.nz Fri Jan 7 10:38:16 2011
From: maj at waikato.ac.nz (Murray Jorgensen)
Date: Fri, 07 Jan 2011 22:38:16 +1300
Subject: [R] Converting Fortran or C++ etc to R
In-Reply-To:
References: <4D23B526.2020906@stats.waikato.ac.nz>
Message-ID: <4D26DF08.6000408@waikato.ac.nz>

I will wind this thread up with some happy comments. I have indeed succeeded in constructing an R program to do the same thing as my Fortran program for an EM algorithm. I have not done timings yet but it seems to run acceptably fast for my purposes.

The key code to be replaced was the E and the M steps of the algorithm. I decided to try to replace all the loops with matrix operations such as %*%, t(), crossprod(), tcrossprod(). Other operations that I used were of the form A + v where dim(A) = c(a, b) and length(v) = a. Here the vector v operates term by term down columns, recycling for each new column. [ *, - and / also work similarly.]

I was relieved that matrices were as far as I needed to go; I had had visions of having to use tensor products of higher-dimensional arrays. Fortunately it did not come to that.

I didn't actually translate from Fortran to R. The original is itself a translation of my underlying maths, and it was easier to translate the maths into R directly. I preserved the form of my Fortran input and output files so that I will be able to run either version on the same files.

As I mentioned earlier, the main point of doing all this is so that I may try out some variants of the program. I expect this will be much easier to do in R!

Thanks to all who replied.
Murray

--
Dr Murray Jorgensen      http://www.stats.waikato.ac.nz/Staff/maj.html
Department of Statistics, University of Waikato, Hamilton, New Zealand
Email: maj at waikato.ac.nz                    Fax 7 838 4155
Phone +64 7 838 4773 wk    Home +64 7 825 0441    Mobile 021 0200 8350

From research at georgruss.de Fri Jan 7 11:28:46 2011
From: research at georgruss.de (Georg Ruß)
Date: Fri, 7 Jan 2011 11:28:46 +0100
Subject: [R] Stepwise SVM Variable selection
In-Reply-To: <4D26BC83.4050906@smartmediacorp.com>
References: <4D26BC83.4050906@smartmediacorp.com>
Message-ID: <20110107102845.GH19058@greode>

On 06/01/11 23:10:59, Noah Silverman wrote:
> I have a data set with about 30,000 training cases and 103 variable.
> I've trained an SVM (using the e1071 package) for a binary classifier
> {0,1}. The accuracy isn't great. I used a grid search over the C and G
> parameters with an RBF kernel to find the best settings. [...]
>
> Can anyone suggest an approach to seek the ideal subset of variables for
> my SVM classifier?

The standard feature selection stuff (backward/forward etc.) is probably ruled out by the time it takes to compute all the sets and subsets. What you could try is the following:

First, do a cross-validation setup: split up your data set into a training and testing set (ratio 0.9 / 0.1 or so).

Second, train your SVM on the training set (try conservative parameters first).

Third, have your trained SVM classify the test set and compute the classification error.

Fourth, iterate over all variables and do the following:
 a) choose one variable and permute its values (only) in the test set
 b) have your trained SVM (from step 2) classify this test set and measure the classification error
 c) repeat a) and b) a (high) number of times to be significant
 d) go to next variable

Fifth, you can get an impression of the importance that one variable has by comparing the errors generated on the permuted test set for each variable with the non-permuted test set classification error.
If the permutation of one variable drastically increases the classification error, the variable is probably important.

Sixth: repeat the cross-validation / random sampling a number of times to be significant.

This is more of an ad-hoc approach and there are some pitfalls, but the idea is easily explained and can also be carried over to any other regression model with cross-validation. The computational burden in SVM is assumed to be the training and not the prediction step, and you only need a relatively low number of training runs (sixth step) here.

Regards,
Georg.
--
Research Assistant
Otto-von-Guericke-Universität Magdeburg
research at georgruss.de
http://research.georgruss.de

From petr.pikal at precheza.cz Fri Jan 7 11:52:59 2011
From: petr.pikal at precheza.cz (Petr PIKAL)
Date: Fri, 7 Jan 2011 11:52:59 +0100
Subject: [R] Odp: Calculating Returns : (Extremely sorry for earlier incomplete mail)
In-Reply-To: <796439.99888.qm@web121404.mail.ne1.yahoo.com>
References: <796439.99888.qm@web121404.mail.ne1.yahoo.com>
Message-ID:

Hi

Your code is quite complicated and I get an error

spot_returns_table <- lapply(1:nrow(trans), function(z) with(trans[z, ], spot_trans(currency_trans=trans$currency_transacted)))
Error in if (currency_trans == "USD") { : argument is of length zero

It seems to me that you do not know what your code is doing. The warnings arise because the currency_trans value you feed to the spot_trans function has length greater than one, while if() needs a single logical value.

You could use debug to see the values of your variables during the computation, but I believe a better solution is to use more convenient input objects together with *apply, aggregate, or basic math.
rate1
      USD    GBP  EURO   CHF    AUD
1  112.05 171.52 42.71 41.50 109.55
2  112.90 168.27 42.68 41.47 102.52
3  110.85 169.03 41.86 42.84 114.91
4  109.63 169.64 44.71 43.44 122.48
5  108.08 169.29 44.14 43.69 122.12
6  111.23 169.47 44.58 42.30 123.96
7  112.49 170.90 41.07 42.05 100.36
8  108.87 168.69 42.23 41.23 110.19
9  109.33 170.90 44.55 42.76 121.58
10 111.88 169.96 41.12 43.79 103.46

log(rate1[-1,]/rate1[-nrow(rate1),])

Is this what you want?

Regards
Petr

r-help-bounces at r-project.org wrote on 07.01.2011 07:36:28:

> Dear R forum helpers,
>
> I am extremely sorry for the receipt of my incomplete mail yesterday. There
> was connectivity problem at my end and so I chose to send the mail through my
> cell, only to realize today about the way mail has been transmitted. I am
> again sending my complete mail through regular channel and sincerely apologize
> for the inconvenience caused.
>
>
> ## Here is my actual mail
>
>
> Dear R forum helpers,
>
> I have following data
>
> trans <- data.frame(currency = c("EURO", "USD", "USD", "GBP", "USD", "AUD"),
> position_amt = c(10000, 25000, 20000, 15000, 22000, 30000))
>
> date <- c("12/31/2010", "12/30/2010", "12/29/2010", "12/28/2010", "12/27/2010",
> "12/24/2010", "12/23/2010", "12/22/2010", "12/21/2010", "12/20/2010")
> USD <- c(112.05, 112.9, 110.85, 109.63, 108.08, 111.23, 112.49, 108.87, 109.33, 111.88)
> GBP <- c(171.52, 168.27,169.03, 169.64, 169.29, 169.47, 170.9, 168.69, 170.9, 169.96)
> EURO <- c(42.71, 42.68, 41.86, 44.71, 44.14, 44.58, 41.07, 42.23, 44.55, 41.12)
> CHF <- c(41.5, 41.47, 42.84, 43.44, 43.69, 42.3, 42.05, 41.23, 42.76, 43.79)
> AUD <- c(109.55, 102.52, 114.91, 122.48, 122.12, 123.96, 100.36, 110.19,
> 121.58, 103.46)
>
> These are the exchange rates and I am trying calculating the returns. I am
> giving only a small portion of actual data as I can't send this as an
> attachment. I am using function as I want to generalize the code for any portfolio.
> > > # __________________________________________________ > > # My Code > > trans <- read.table('transactions.csv', header=TRUE, sep=",", > na.strings="NA", dec=".", strip.white=TRUE) > # reading as table. > > #currency <- read.table('currency.csv') > > #date <- currency$date > #USD = currency$USD > #GBP = currency$GBP > #EURO = currency$EURO > #CHF = currency$CHF > #AUD = currency$AUD > > # _________________________________________________________ > > # CREATION of Function. I am using function as no of transactions is not constant. > > spot_trans = function(currency_trans) > > { > > if (currency_trans == "USD") > {rate = USD} > > # So if I am dealing with TRANSACTION "USD", I am selecting the USD exchange rates. > > if (currency_trans == "GBP") > {rate = GBP} > > if (currency_trans == "EURO") > {rate = EURO} > > if (currency_trans == "CHF") > {rate = CHF} > > if (currency_trans == "AUD") > {rate = AUD} > > # ________________________________________________ > > # CURRENCY Rate RETURNS i.e. lob(todays rate / yesterday rate) and the data is > in descending "Date" order > > currency_rate_returns = NULL > for (i in 1:(length(rate)-1)) # if there are 10 elements, total no of returns = 9 > > { > currency_rate_returns[i] = log(rate[i]/rate[i+1]) > } > > currency_rate_returns > > return(data.frame(returns = currency_rate_returns)) > > } > > # _______________________________________________ > > spot_returns_table <- lapply(1:nrow(trans), function(z) with(trans[z, ], > spot_trans(currency_trans=trans$currency_transacted))) > > spot_returns_table > > This generates the output as given below with 30 warnings. Also, as there are > six transactions, 6 outputs are generated but the output in all pertains only > to the first transacations i.e. 6 times returns are generated for the first > transaction "EURO" > > > warnings() > Warning messages: > 1: In if (currency_trans == "USD") { ... 
: > the condition has length > 1 and only the first element will be used > 2: In if (currency_trans == "GBP") { ... : > the condition has length > 1 and only the first element will be used > 3: In if (currency_trans == "EURO") { ... : > the condition has length > 1 and only the first element will be used > > .... and so on > > The output is as given below. > > > spot_returns_table > [[1]] > spot_returns > 1 0.0007026584 > 2 0.0193997094 > 3 -0.0658664732 > 4 0.0128307894 > 5 -0.0099189271 > 6 0.0820074000 > 7 -0.0278529410 > 8 -0.0534812850 > 9 0.0801175328 > 10 -0.0710983059 > > [[2]] > spot_returns > 1 0.0007026584 > 2 0.0193997094 > 3 -0.0658664732 > 4 0.0128307894 > 5 -0.0099189271 > 6 0.0820074000 > 7 -0.0278529410 > 8 -0.0534812850 > 9 0.0801175328 > 10 -0.0710983059 > > [[3]] > spot_returns > 1 0.0007026584 > 2 0.0193997094 > 3 -0.0658664732 > 4 0.0128307894 > 5 -0.0099189271 > 6 0.0820074000 > .................... > .................... > > and so on. > > Kindly guide as if there is only one transaction i.e. I am dealing with only > one currency, code runs excellently. > > Thanking in advance and once again apologize for the inconvenience caused. > > Amelia Vettori > > > > > > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
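
A minimal sketch of the "more convenient input objects" idea from the reply above, assuming the rates are held in one data frame with one column per currency. The returns for every currency are computed in a single vectorised step and a transaction's currency is then looked up by name, which removes the chain of if() tests. The names rates, returns and currencies, and this version of spot_trans(), are illustrative and not the poster's code.

```r
# Exchange rates as posted in the thread, newest date first.
rates <- data.frame(
    USD  = c(112.05, 112.9, 110.85, 109.63, 108.08, 111.23, 112.49, 108.87, 109.33, 111.88),
    GBP  = c(171.52, 168.27, 169.03, 169.64, 169.29, 169.47, 170.9, 168.69, 170.9, 169.96),
    EURO = c(42.71, 42.68, 41.86, 44.71, 44.14, 44.58, 41.07, 42.23, 44.55, 41.12),
    CHF  = c(41.5, 41.47, 42.84, 43.44, 43.69, 42.3, 42.05, 41.23, 42.76, 43.79),
    AUD  = c(109.55, 102.52, 114.91, 122.48, 122.12, 123.96, 100.36, 110.19, 121.58, 103.46)
)
n <- nrow(rates)

# Same quantity as the original loop, log(rate[i] / rate[i + 1]),
# computed for all currencies at once: rows 1..(n-1) divided by rows 2..n.
returns <- log(rates[-n, ] / rates[-1, ])

# Hypothetical replacement for the if-chain: pick the column by name.
spot_trans <- function(currency) data.frame(returns = returns[[currency]])

# One entry per transaction, passing a single currency name each time.
currencies <- c("EURO", "USD", "USD", "GBP", "USD", "AUD")
spot_returns_table <- lapply(currencies, spot_trans)
```

Note that with the rows in descending date order, log(rate1[-1,]/rate1[-nrow(rate1),]) divides the older rate by the newer one, so its signs are the reverse of those produced by the original loop.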
From ligges at statistik.tu-dortmund.de Fri Jan 7 12:07:41 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Fri, 07 Jan 2011 12:07:41 +0100 Subject: [R] R packages for R 2.11.1 In-Reply-To: References: <1294375707041-3178633.post@n4.nabble.com> Message-ID: <4D26F3FD.3040501@statistik.tu-dortmund.de> On 07.01.2011 09:21, Prof Brian Ripley wrote: > On Thu, 6 Jan 2011, Joshua Wiley wrote: > >> I would try using the R 2.12.1 packages first, but if that does not > > On 32-bit Windows this will not work if compiled code is involved: both > the compiler and the package layout changed at 2.12.0. > > However, binary packages for 2.11.x are still being run through the > autobuiilder and are available on CRAN if they were built successfully > (increasingly many are not). See the summary table at > http://cran.r-project.org/bin/windows/contrib/checkSummaryWin.html Right, and if they were not built successfully, then the latest version that passed the checks is still available there (which is the one you want to use anyway then). Let me add that just typing install.packages("packagename") should do the trick and will use the 2.11 package repository at your-CRAN-mirror/bin/windows/contrib/2.11 Best, Uwe > Nevertheless (as the posting guide makes clear) the R developers do not > support obsolete versions of R, so you are advised to update to R 2.12.1. > >> work, then you can go here: >> http://cran.r-project.org/src/contrib/Archive/ >> to get older versions of the tar balls. I think you might have to >> build them yourself. I kind of doubt anyone is keeping entire >> duplicates of old CRAN packages in all forms. This is not that >> difficult though. If the packages you want to install do not have >> other code, I believe you can actually install them without any >> additional software. If there is compiled code (e.g., C or C++), then >> I think you'll need RTools as well as a compatible compiler. 
See the >> installation manual for details: >> >> http://cran.r-project.org/doc/manuals/R-admin.html >> >> Cheers, >> >> Josh >> >> >> On Thu, Jan 6, 2011 at 8:48 PM, Raji wrote: >>> >>> Hi , >>> >>> I am using R 2.11.1 . I need to download few packages for the same for >>> Windows.But in CRAN i see the latest packages for R 2.12.1 only. Can you >>> help me out with the locations where i can find the packages for R >>> 2.11.1 >>> Windows zip? > >> -- >> Joshua Wiley >> Ph.D. Student, Health Psychology >> University of California, Los Angeles >> http://www.joshuawiley.com/ > > > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From savicky at praha1.ff.cuni.cz Fri Jan 7 11:10:32 2011 From: savicky at praha1.ff.cuni.cz (Petr Savicky) Date: Fri, 7 Jan 2011 11:10:32 +0100 Subject: [R] Match numeric vector against rows in a matrix? In-Reply-To: <0AF8C77E-FD6B-4CFA-BA18-2430F16F56E9@gmail.com> References: <0AF8C77E-FD6B-4CFA-BA18-2430F16F56E9@gmail.com> Message-ID: <20110107101032.GA14450@praha1.ff.cuni.cz> On Wed, Jan 05, 2011 at 07:16:47PM +0000, Kevin Ummel wrote: > Two posts in one day is not a good day...and this question seems like it should have an obvious answer: > > I have a matrix where rows are unique combinations of 1's and 0's: > > > combs=as.matrix(expand.grid(c(0,1),c(0,1))) > > combs > Var1 Var2 > [1,] 0 0 > [2,] 1 0 > [3,] 0 1 > [4,] 1 1 > > I want a single function that will give the row index containing an exact match with vector x: > > > x=c(0,1) > > The solution needs to be applied many times, so I need something quick -- I was hoping a base function would do it, but I'm drawing a blank. 
If the matrix can have a different number of columns, then the following can also be used

combs <- as.matrix(expand.grid(c(0,1),c(0,1),c(0,1)))
x <- c(0,1,1)
which(rowSums(combs != rep(x, each=nrow(combs))) == 0)
# [1] 7

Petr Savicky.

From vdimitrakas at gmail.com Fri Jan 7 11:37:59 2011
From: vdimitrakas at gmail.com (Vassilis)
Date: Fri, 7 Jan 2011 02:37:59 -0800 (PST)
Subject: [R] weighed mean of a data frame row-by-row
In-Reply-To: <1294321052216-3177436.post@n4.nabble.com>
References: <1294320798093-3177421.post@n4.nabble.com> <1294321052216-3177436.post@n4.nabble.com>
Message-ID: <1294396679563-3178912.post@n4.nabble.com>

Thanks for the help guys! For my purpose I think that rharlow2's answer, i.e. the `rowSums' function, is the most appropriate since it also takes care of the NAs.

Best,
Vassilis
--
View this message in context: http://r.789695.n4.nabble.com/weighed-mean-of-a-data-frame-row-by-row-tp3177421p3178912.html
Sent from the R help mailing list archive at Nabble.com.

From jon.skoien at jrc.ec.europa.eu Fri Jan 7 12:40:10 2011
From: jon.skoien at jrc.ec.europa.eu (Jon Olav Skoien)
Date: Fri, 07 Jan 2011 12:40:10 +0100
Subject: [R] Cross validation for Ordinary Kriging
In-Reply-To: <523289.24057.qm@web30501.mail.mud.yahoo.com>
References: <523289.24057.qm@web30501.mail.mud.yahoo.com>
Message-ID: <4D26FB9A.8050008@jrc.ec.europa.eu>

Pearl,

The error suggests that there is something wrong with x2, and that there is a difference between the row names of the coordinates and the data. If you call str(x2), see if the first element of @coords is different from NULL, as this can cause some problems when cross-validating. If it is, try to figure out why. You can also set the row.names equal to NULL directly:

row.names(x2@coords) = NULL

although I don't think such manipulation of the slots of an object is usually recommended.

Cheers,
Jon

BTW, you will usually get more response to questions about spatial data
handling using the list r-sig-geo
(https://stat.ethz.ch/mailman/listinfo/r-sig-geo)

On 1/6/2011 4:00 PM, pearl may dela cruz wrote:
> Dear ALL,
>
> The last part of my thesis analysis is the cross validation. Right now I am
> having difficulty using the cross validation of gstat. Below are my commands
> with the tsport_ace as the variable:
>
> nfold<- 3
> part<- sample(1:nfold, 69, replace = TRUE)
> sel<- (part != 1)
> m.model<- x2[sel, ]
> m.valid<- x2[-sel, ]
> t<- fit.variogram(v,vgm(0.0437, "Exp", 26, 0))
> cv69<- krige.cv(tsport_ace ~ 1, x2, t, nfold = nrow(x2))
>
> The last line gives an error saying:
> Error in SpatialPointsDataFrame(coordinates(data),
> data.frame(matrix(as.numeric(NA), :
> row.names of data and coords do not match
>
> I don't know what is wrong. The x2 data is a SpatialPointsdataframe that is why
> i did not specify the location (as it will take it from the data). Here is the
> usage of the function krige.cv:
>
> krige.cv(formula, locations, data, model = NULL, beta = NULL, nmax = Inf,
> nmin = 0, maxdist = Inf, nfold = nrow(data), verbose = TRUE, ...)
> I hope you can help me on this. Thanks a lot.
> Best regards,
> Pearl

From marchywka at hotmail.com Fri Jan 7 12:46:40 2011
From: marchywka at hotmail.com (Mike Marchywka)
Date: Fri, 7 Jan 2011 06:46:40 -0500
Subject: [R] Accessing data via url
In-Reply-To: <1294388659726-3178773.post@n4.nabble.com>
References: <352569.97781.qm@web38406.mail.mud.yahoo.com>, <1294388659726-3178773.post@n4.nabble.com>
Message-ID:

> Date: Fri, 7 Jan 2011 00:24:19 -0800
> From: dieter.menne at menne-biomed.de
> To: r-help at r-project.org
> Subject: Re: [R] Accessing data via url
>
>
>
> John Kane-2 wrote:
> >
> > # Can anyone suggest why this works
> >
> > datafilename <-
> > "http://personality-project.org/r/datasets/maps.mixx.epi.bfi.data"
> > person.data <- read.table(datafilename,header=TRUE)
> >
> > # but this does not?
> >
> > dd <-
> > "https://sites.google.com/site/jrkrideau/home/general-stores/trees.txt"
> > treedata <- read.table(dd, header=TRUE)
> >
> > ===================================================================
> >
> > Error in file(file, "rt") : cannot open the connection
> >
>
> Your original file is no longer there, but when I try RCurl with a png file
> that is present, I get a certificate error:
>
> Dieter
>
> --------
> library(RCurl)
> sessionInfo()
> dd <-
> "https://sites.google.com/site/jrkrideau/home/general-stores/history.png"
> x = getBinaryURL(dd)
>
> -------------
> > sessionInfo()
> R version 2.12.1 (2010-12-16)
> Platform: i386-pc-mingw32/i386 (32-bit)
>
> locale:
> [1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252
> [3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
> [5] LC_TIME=German_Germany.1252
>
> attached base packages:
> [1] stats graphics grDevices datasets utils methods base
>
> other attached packages:
> [1] RCurl_1.5-0.1 bitops_1.0-4.1
>
> loaded via a namespace (and not attached):
> [1] tools_2.12.1
> > dd <-
> > "https://sites.google.com/site/jrkrideau/home/general-stores/history.png"
> > x = getBinaryURL(dd)
> Error in curlPerform(curl = curl, .opts = opts, .encoding = .encoding) :
> SSL certificate problem, verify that the CA cert is OK. Details:
> error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify
> failed
>
>

I think I replied to the OP only, using wget, but presumably RCurl has an
option similar to "-k" in the command-line version. Network IO is
unpredictable; a few external tools really can be useful from time to time.

$ wget -O xxx -S -v --no-check-certificate --user-agent="Mozilla5.0" "http://sites.google.com/site/jrkrideau/home/general-stores/trees.txt"
--2011-01-06 16:00:01--  http://sites.google.com/site/jrkrideau/home/general-stores/trees.txt
Resolving sites.google.com (sites.google.com)... 74.125.229.3, 74.125.229.5, 74.125.229.13, ...
Connecting to sites.google.com (sites.google.com)|74.125.229.3|:80... connected.
HTTP request sent, awaiting response...
  HTTP/1.0 404 Not Found
  Content-Type: text/html; charset=utf-8
  Date: Thu, 06 Jan 2011 22:00:05 GMT
  Expires: Thu, 06 Jan 2011 22:00:05 GMT
  Cache-Control: private, max-age=0
  X-Content-Type-Options: nosniff
  X-XSS-Protection: 1; mode=block
  Server: GSE
2011-01-06 16:00:01 ERROR 404: Not Found.

$ wget -O xxx -S -v --no-check-certificate --user-agent="Mozilla5.0" "http://sites.google.com/site/jrkrideau/home/general-stores/history.png"
--2011-01-07 05:43:00--  http://sites.google.com/site/jrkrideau/home/general-stores/history.png
Resolving sites.google.com (sites.google.com)... 74.125.229.11, 74.125.229.6, 74.125.229.14, ...
Connecting to sites.google.com (sites.google.com)|74.125.229.11|:80... connected.
HTTP request sent, awaiting response...
  HTTP/1.0 200 OK
  Content-Type: image/png
  X-Robots-Tag: noarchive
  Cache-Control: no-cache, no-store, max-age=0, must-revalidate
  Pragma: no-cache
  Expires: Fri, 01 Jan 1990 00:00:00 GMT
  Date: Fri, 07 Jan 2011 11:43:04 GMT
  Last-Modified: Wed, 28 Oct 2009 18:58:56 GMT
  ETag: "1256756336889"
  Content-Length: 3817
  X-Content-Type-Options: nosniff
  X-XSS-Protection: 1; mode=block
  Server: GSE
  Connection: Keep-Alive
Length: 3817 (3.7K) [image/png]
Saving to: `xxx'

100%[======================================>] 3,817       --.-K/s   in 0s

2011-01-07 05:43:00 (30.8 MB/s) - `xxx' saved [3817/3817]

$

$ curl -o xxx -k "http://sites.google.com/site/jrkrideau/home/general-stores/history.png"
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  3817  100  3817    0     0  28916      0 --:--:-- --:--:-- --:--:-- 40606

$

From wwwhsd at gmail.com Fri Jan 7 13:07:26 2011
From: wwwhsd at gmail.com (Henrique Dallazuanna)
Date: Fri, 7 Jan 2011 10:07:26 -0200
Subject: [R] Accessing data via url
In-Reply-To: <1294388659726-3178773.post@n4.nabble.com>
References: <352569.97781.qm@web38406.mail.mud.yahoo.com> <1294388659726-3178773.post@n4.nabble.com>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From nsabhyankar at gmail.com Fri Jan 7 13:21:26 2011
From: nsabhyankar at gmail.com (nikhil abhyankar)
Date: Fri, 7 Jan 2011 17:51:26 +0530
Subject: [R] Extracting user specified variables from data frame to use as function arguments
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From a.spiess at uke.uni-hamburg.de Fri Jan 7 12:50:08 2011
From: a.spiess at uke.uni-hamburg.de (A.N. Spiess)
Date: Fri, 7 Jan 2011 03:50:08 -0800 (PST)
Subject: [R] How to join matrices of different row length from a list
In-Reply-To: <1294311405773-3177212.post@n4.nabble.com>
References: <1294311405773-3177212.post@n4.nabble.com>
Message-ID: <1294401008711-3178991.post@n4.nabble.com>

Dear Emma,

there are 'cbind.na', 'rbind.na' and 'data.frame.na' functions in my qpcR
package.

library(qpcR)
matLis <- list(matrix(1:4, 2, 2), matrix(1:6, 3, 2),
               matrix(2:1, 1, 2))
do.call(cbind.na, matLis)

They are essentially the generic functions extended with an internal fill.
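
For readers without qpcR at hand, the idea of an "internal fill" can be sketched in base R. This is only an illustration of padding with NA before binding, not the qpcR implementation, and the helper name `cbind_fill` is made up for this example.

```r
# Pad each matrix with NA rows up to the tallest one, then bind column-wise.
# cbind_fill is a hypothetical helper, not part of qpcR or base R.
cbind_fill <- function(mats) {
  n <- max(vapply(mats, nrow, integer(1)))
  padded <- lapply(mats, function(m) {
    fill <- matrix(NA, nrow = n - nrow(m), ncol = ncol(m))
    rbind(m, fill)   # a zero-row fill leaves the tallest matrix unchanged
  })
  do.call(cbind, padded)
}

matLis <- list(matrix(1:4, 2, 2), matrix(1:6, 3, 2), matrix(2:1, 1, 2))
res <- cbind_fill(matLis)
print(res)   # 3 rows, 6 columns; the shorter matrices are padded with NA
```

The shorter matrices are not recycled; their missing rows simply show up as NA, which is the behaviour described above.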
You might also want to try these examples:

## binding
cbind.na(1, 1:7) # the '1' (= shorter vector) is NOT recycled but filled
cbind.na(1:8, 1:7, 1:5, 1:10) # same with many vectors
rbind.na(1:8, 1:7, 1:5, 1:10) # or in rows
a <- matrix(rnorm(20), ncol = 4) # unequal size matrices
b <- matrix(rnorm(20), ncol = 5)
cbind.na(a, b) # works, in contrast to original cbind
rbind.na(a, b) # works, in contrast to original rbind

## data frame with unequal size vectors
data.frame.na(A = 1:7, B = 1:5, C = letters[1:3],
              D = factor(c(1, 1, 2, 2)))

## convert a list with unequal length list items
## to a data frame
z <- list(a = 1:5, b = letters[1:3], c = matrix(rnorm(20), ncol = 2))
do.call(data.frame.na, z)

--
View this message in context: http://r.789695.n4.nabble.com/How-to-join-matrices-of-different-row-length-from-a-list-tp3177212p3178991.html
Sent from the R help mailing list archive at Nabble.com.

From marchywka at hotmail.com Fri Jan 7 13:08:12 2011
From: marchywka at hotmail.com (Mike Marchywka)
Date: Fri, 7 Jan 2011 07:08:12 -0500
Subject: [R] Waaaayy off topic...Statistical methods, pub bias, scientific validity
In-Reply-To:
References: <4D2644A5.9040303@witthoft.com>, <4D26A0F3.3030902@structuremonitoring.com>,
Message-ID:

> Date: Thu, 6 Jan 2011 23:06:44 -0800
> From: peter.langfelder at gmail.com
> To: r-help at r-project.org
> Subject: Re: [R] Waaaayy off topic...Statistical methods, pub bias, scientific validity
>
> From a purely statistical and maybe somewhat naive point of view,
> published p-values should be corrected for the multiple testing that
> is effectively happening because of the large number of published
> studies. My experience is also that people will often try several
> statistical methods to get the most significant p-value but neglect to
> share that fact with the audience and/or at least attempt to correct
> the p-values for the selection bias.

You see this everywhere in one form or another from medical to financial modelling.
My solution here is simply to publish more raw data in a computer readable form, in this case of course something easy to get with R, so disinterested or adversarial parties can run their own "analysis." I think there was also a push to create a data base for failed drug trials that may contain data of some value later. The value of R with easily available data for a large cross section of users could be to moderate problems like the one cited here. I almost slammed a poster here earlier who wanted a simple rule for "when do I use this test" with something like " when your mom tells you to" since post hoc you do just about everything to assume you messed up and missed something but a priori you hope you have designed a good hypothesis. And at the end of the day, a given p-value is one piece of evidence in the overall objective of learning about some system, not appeasing a sponsor. Personally I'm a big fan of post hoc analysis on biotech data in some cases, especially as more pathway or other theory is published, but it is easy to become deluded if you have a conclusion that you know JUST HAS TO BE RIGHT. Also FWIW, in the few cases I've examined with FDA-sponsor rhetoric, the data I've been able to get tends to make me side with the FDA and I still hate the idea of any regulation or access restrictions but it seems to be the only way to keep sponsors honest to any extent. Your mileage may vary however, take a look at some rather loud disagreement with FDA over earlier DNDN panel results, possibly involving threats against critics. LOL. > > That being said, it would seem that biomedical sciences do make > progress, so some of the published results are presumably correct :) > > Peter > > On Thu, Jan 6, 2011 at 9:13 PM, Spencer Graves > wrote: > > Part of the phenomenon can be explained by the natural censorship in > > what is accepted for publication: Stronger results tend to have less > > difficulty getting published. 
Therefore, given that a result is published, > > it is evident that the estimated magnitude of the effect is in average > > larger than it is in reality, just by the fact that weaker results are less > > likely to be published. A study of the literature on this subject might > > yield an interesting and valuable estimate of the magnitude of this > > selection bias. > > > > > > A more insidious problem, that may not affect the work of Jonah Lehrer, > > is political corruption in the way research is funded, with less public and > > more private funding of research > > (http://portal.unesco.org/education/en/ev.php-URL_ID=21052&URL_DO=DO_TOPIC&URL_SECTION=201.html). > > For example, I've heard claims (which I cannot substantiate right now) that > > cell phone companies allegedly lobbied successfully to block funding for > > researchers they thought were likely to document health problems with their > > products. Related claims have been made by scientists in the US Food and > > Drug Administration that certain therapies were approved on political > > grounds in spite of substantive questions about the validity of the research > > backing the request for approval (e.g., > > www.naturalnews.com/025298_the_FDA_scientists.html). Some of these > > accusations of political corruption may be groundless. However, as private > > funding replaces tax money for basic science, we must expect an increase in > > research results that match the needs of the funding agency while degrading > > the quality of published research. This produces more research that can not > > be replicated -- effects that get smaller upon replication. (My wife and I > > routinely avoid certain therapies recommended by physicians, because the > > physicians get much of their information on recent drugs from the > > pharmaceuticals, who have a vested interest in presenting their products in > > the most positive light.) 
> >

From nik at nikosalexandris.net Fri Jan 7 13:04:03 2011
From: nik at nikosalexandris.net (Nikos Alexandris)
Date: Fri, 7 Jan 2011 14:04:03 +0200
Subject: [R] How to export/save an "mrpp" object?
In-Reply-To:
References: <201012220557.18804.nikos.alexandris@felis.uni-freiburg.de> <201101070503.40394.nik@nikosalexandris.net>
Message-ID: <201101071404.05765.nik@nikosalexandris.net>

Nikos:
> > I finally ran mrpp tests. I think all is fine but one very important issue: I
> > have no idea how to export/save an "mrpp" object. Tried anything I know and
> > searched the archives but found nothing.

David W:
> And what happened when you tried what seems like the obvious:
> save(mrpp_obj, file=)
> # rm(list=ls() ) # Only uncomment if you are ready for your workspace to clear
> load("mrpp_store.Rdata")

Right, "clearing" did the trick.

> > Any ideas? Is really copy-pasting the mrpp results the only way?
>
> Many of us have no idea what such an object is, since you have not
> described the packages and functions used to create it. If you want an
> ASCII version then dput or dump are also available.

Multiresponse Permutation Procedures (MRPP) is implemented in the "vegan"
package. The function mrpp() returns (an object of class "mrpp") something
like:

--%<---
# check class
class ( samples_bitemporal_modis.0001.mrpp )
[1] "mrpp"

# check structure
str ( samples_bitemporal_modis.0001.mrpp )
List of 12
 $ call        : language mrpp(dat = samples_bitemporal_modis.0001[, 1:5], grouping = samples_bitemporal_modis.0001[["Class"]])
 $ delta       : num 0.126
 $ E.delta     : num 0.202
 $ CS          : logi NA
 $ n           : Named int [1:5] 335 307 183 188 27
  ..- attr(*, "names")= chr [1:5] "Urban" "Vegetation" "Bare ground" "Burned" ...
 $ classdelta  : Named num [1:5] 0.1255 0.1045 0.1837 0.0981 0.1743
  ..- attr(*, "names")= chr [1:5] "Urban" "Vegetation" "Bare ground" "Burned" ...
 $ Pvalue      : num 0.001
 $ A           : num 0.378
 $ distance    : chr "euclidean"
 $ weight.type : num 1
 $ boot.deltas : num [1:999] 0.202 0.202 0.202 0.203 0.202 ...
 $ permutations: num 999
 - attr(*, "class")= chr "mrpp"
-->%---

Now I've tried the following:

--%<---
# 1. save(d) it
save ( samples_bitemporal_modis.0001.mrpp , file="exported.mrpp.R" )

# 2. load(ed) it in a new object...
loadedmrpp <- load ( "exported.mrpp.R")

# 3. (tried) to check it...
str ( loadedmrpp )
 chr "samples_bitemporal_modis.0001.mrpp"

# it did not cross my mind immediately to...
get(loadedmrpp)

Call:
mrpp(dat = samples_bitemporal_modis.0001[, 1:5], grouping = samples_bitemporal_modis.0001[["Class"]])

Dissimilarity index: euclidean
Weights for groups: n

Class means and counts:

      Urban  Vegetation Bare ground Burned Water
delta 0.1255 0.1045     0.1837      0.0981 0.1743
n     335    307        183         188    27

Chance corrected within-group agreement A: 0.3778
Based on observed delta 0.1258 and expected delta 0.2022

Significance of delta: 0.001
Based on 999 permutations

# ...or to work on a clean workspace!
-->%---

Thank you David. Cheers, Nikos

From amelia_vettori at yahoo.co.nz Fri Jan 7 13:46:53 2011
From: amelia_vettori at yahoo.co.nz (Amelia Vettori)
Date: Fri, 7 Jan 2011 04:46:53 -0800 (PST)
Subject: [R] Currency return calculations
Message-ID: <561728.67841.qm@web121410.mail.ne1.yahoo.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From wwwhsd at gmail.com Fri Jan 7 13:49:06 2011
From: wwwhsd at gmail.com (Henrique Dallazuanna)
Date: Fri, 7 Jan 2011 10:49:06 -0200
Subject: [R] Match numeric vector against rows in a matrix?
In-Reply-To: <0AF8C77E-FD6B-4CFA-BA18-2430F16F56E9@gmail.com>
References: <0AF8C77E-FD6B-4CFA-BA18-2430F16F56E9@gmail.com>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From petr.pikal at precheza.cz Fri Jan 7 13:59:06 2011
From: petr.pikal at precheza.cz (Petr PIKAL)
Date: Fri, 7 Jan 2011 13:59:06 +0100
Subject: [R] Odp: Currency return calculations
In-Reply-To: <561728.67841.qm@web121410.mail.ne1.yahoo.com>
References: <561728.67841.qm@web121410.mail.ne1.yahoo.com>
Message-ID:

Hi

What is wrong with my suggestion, then?

> rate1
      USD    GBP  EURO   CHF    AUD
1  112.05 171.52 42.71 41.50 109.55
2  112.90 168.27 42.68 41.47 102.52
3  110.85 169.03 41.86 42.84 114.91
4  109.63 169.64 44.71 43.44 122.48
5  108.08 169.29 44.14 43.69 122.12
6  111.23 169.47 44.58 42.30 123.96
7  112.49 170.90 41.07 42.05 100.36
8  108.87 168.69 42.23 41.23 110.19
9  109.33 170.90 44.55 42.76 121.58
10 111.88 169.96 41.12 43.79 103.46
> portfolio<-c("USD", "USD", "CHF", "AUD", "USD")
> log(rate1[-1,portfolio]/rate1[-nrow(rate1),portfolio])
            USD        USD.1          CHF          AUD        USD.2
2   0.007557271  0.007557271 -0.000723153 -0.066323165  0.007557271
3  -0.018324535 -0.018324535  0.032501971  0.114091312 -0.018324535
4  -0.011066876 -0.011066876  0.013908430  0.063798538 -0.011066876
5  -0.014239366 -0.014239366  0.005738567 -0.002943583 -0.014239366
6   0.028728436  0.028728436 -0.032332157  0.014954765  0.028728436
7   0.011264199  0.011264199 -0.005927700 -0.211195211  0.011264199
8  -0.032709819 -0.032709819 -0.019693240  0.093442427 -0.032709819
9   0.004216322  0.004216322  0.036436939  0.098366334  0.004216322
10  0.023056037  0.023056037  0.023802395 -0.161387418  0.023056037
>

As I said, instead of fiddling with several loop/if/function/variable
attempts, it seems to me better to use powerful R indexing and the "whole
object" approach where possible.

Regards
Petr

Amelia Vettori wrote on 07.01.2011 13:46:53:

> Dear sir, I am extremely sorry for messing up the logic asking for help w.r.t.
> my earlier mails
>
> I have tried to explain below what I am looking for.
>
>
> I have a database (say, currency_rates) storing datewise currency exchange
> rates with some base currency XYZ.
>
> currency_rates <- data.frame(date = c("12/31/2010", "12/30/2010", "12/29/2010",
> "12/28/2010", "12/27/2010","12/24/2010", "12/23/2010", "12/22/2010",
> "12/21/2010", "12/20/2010"),
> USD = c(112.05, 112.9, 110.85, 109.63, 108.08, 111.23, 112.49, 108.87, 109.33, 111.88),
> GBP = c(171.52, 168.27,169.03, 169.64, 169.29, 169.47, 170.9, 168.69, 170.9, 169.96),
> EURO = c(42.71, 42.68, 41.86, 44.71, 44.14, 44.58, 41.07, 42.23, 44.55, 41.12),
> CHF = c(41.5, 41.47, 42.84, 43.44, 43.69, 42.3, 42.05, 41.23, 42.76, 43.79),
> AUD = c(109.55, 102.52, 114.91, 122.48, 122.12, 123.96, 100.36, 110.19, 121.58, 103.46))
>
> I have a portfolio consisting of some of these currencies.
>
> At this moment, suppose my portfolio has following currency transactions. i.e
> following is my current portfolio and
> has 2 USD transactions, 2 EURO transactions and a CHF transaction.
>
> portfolio_currency_names = c("USD", "USD", "EURO", "CHF", "EURO", "USD")
>
>
> # ____________________________________
>
> My objective is AS PER THE PORTFOLIO, I need to generate a data.frame giving
> respective currency returns.
>
> Thus, I need to have an output like
>
>     USD     USD    EURO     CHF    EURO     USD
> -0.0076 -0.0076  0.0007  0.0007  0.0007 -0.0076
>  0.0183  0.0183  0.0194 -0.0325  0.0194  0.0183
>  0.0111  0.0111 -0.0659 -0.0139 -0.0659  0.0111
>  0.0142  0.0142  0.0128 -0.0057  0.0128  0.0142
> -0.0287 -0.0287 -0.0099  0.0323 -0.0099 -0.0287
> -0.0113 -0.0113  0.0820  0.0059  0.0820 -0.0113
>  0.0327  0.0327 -0.0279  0.0197 -0.0279  0.0327
> -0.0042 -0.0042 -0.0535 -0.0364 -0.0535 -0.0042
> -0.0231 -0.0231  0.0801 -0.0238  0.0801 -0.0231
>
> Thus, my requirement is to have the dataframe as per the composition of my
> portfolio. Thus, if there are only 2 transactions i.e. if my portfolio
> contains say only CHF and AUD, I need the return calculations done only for CHF and AUD.
>
>
>     CHF     AUD
>  0.0007  0.0663
> -0.0325 -0.1141
> -0.0139 -0.0638
> -0.0057  0.0029
>  0.0323 -0.0150
>  0.0059  0.2112
>  0.0197 -0.0934
> -0.0364 -0.0984
> -0.0238  0.1614
>
> I once again apologize for not asking my requirement properly thereby causing
> not only inconvenience to all of you, but also wasting your valuable time. It's
> not that I wasn't careful while asking for guidance for my requirement, I
> wasn't clear about it. I am sorry for the same once again.
>
> I request you to please help me.
>
> Amelia Vettori
>
>
>

From diegopujoni at gmail.com Fri Jan 7 14:02:31 2011
From: diegopujoni at gmail.com (Diego Pujoni)
Date: Fri, 7 Jan 2011 11:02:31 -0200
Subject: [R] How to make a Cluster of Clusters
In-Reply-To:
References:
Message-ID:

Hi Michael,

I agree with you and I will make this ordination. But I also want to
check a spatial correlation of the variables, so I thought that
comparing the dendrogram of the environmental variables with the
dendrogram of the geographical distances of the lakes would indicate
whether similar lakes are next to each other. But I have just one
geographical coordinate for each lake, and 12 measures of environmental
variables. How can I analyse this?

Thank you very much for the attention

Diego PJ

From jon.skoien at jrc.ec.europa.eu Fri Jan 7 14:05:13 2011
From: jon.skoien at jrc.ec.europa.eu (Jon Olav Skoien)
Date: Fri, 07 Jan 2011 14:05:13 +0100
Subject: [R] Cross validation for Ordinary Kriging
In-Reply-To: <4D26FB9A.8050008@jrc.ec.europa.eu>
References: <523289.24057.qm@web30501.mail.mud.yahoo.com> <4D26FB9A.8050008@jrc.ec.europa.eu>
Message-ID: <4D270F89.7020501@jrc.ec.europa.eu>

On 1/7/2011 12:40 PM, Jon Olav Skoien wrote:
> Pearl,
>
> The error suggests that there is something wrong with x2, and that
> there is a difference between the row names of the coordinates and the
> data. If you call
> str(x2)
> see if the first element of @coords is different from NULL, as this
> can cause some problems when cross-validating.
If it is, try to figure
> out why. You can also set the row.names equal to NULL directly:
> row.names(x2@coords) = NULL
> although I don't think such manipulation of the slots of an object is
> usually recommended.

Pearl,

It seems the problem was caused by a recent change in sp without a
corresponding update of gstat; the maintainer has fixed it and submitted
a new version of gstat to CRAN. So you should be able to use your
original script after downloading the new version, probably available in
a couple of days. In the meantime the suggestion above should still work.

Cheers,
Jon

>
> Cheers,
> Jon
>
> BTW, you will usually get more response to questions about spatial
> data handling using the list r-sig-geo
> (https://stat.ethz.ch/mailman/listinfo/r-sig-geo)
>
>
> On 1/6/2011 4:00 PM, pearl may dela cruz wrote:
>> Dear ALL,
>>
>> The last part of my thesis analysis is the cross validation. Right now I am
>> having difficulty using the cross validation of gstat. Below are my commands
>> with the tsport_ace as the variable:
>>
>> nfold<- 3
>> part<- sample(1:nfold, 69, replace = TRUE)
>> sel<- (part != 1)
>> m.model<- x2[sel, ]
>> m.valid<- x2[-sel, ]
>> t<- fit.variogram(v,vgm(0.0437, "Exp", 26, 0))
>> cv69<- krige.cv(tsport_ace ~ 1, x2, t, nfold = nrow(x2))
>>
>> The last line gives an error saying:
>> Error in SpatialPointsDataFrame(coordinates(data),
>> data.frame(matrix(as.numeric(NA), :
>> row.names of data and coords do not match
>>
>> I don't know what is wrong. The x2 data is a SpatialPointsdataframe that is why
>> i did not specify the location (as it will take it from the data). Here is the
>> usage of the function krige.cv:
>>
>> krige.cv(formula, locations, data, model = NULL, beta = NULL, nmax = Inf,
>> nmin = 0, maxdist = Inf, nfold = nrow(data), verbose = TRUE, ...)
>> I hope you can help me on this. Thanks a lot.
>> Best regards,
>> Pearl

From ligges at statistik.tu-dortmund.de Fri Jan 7 14:10:08 2011
From: ligges at statistik.tu-dortmund.de (Uwe Ligges)
Date: Fri, 07 Jan 2011 14:10:08 +0100
Subject: [R] Dont show zero values in line graph
In-Reply-To: <1294369854437-3178566.post@n4.nabble.com>
References: <1294369854437-3178566.post@n4.nabble.com>
Message-ID: <4D2710B0.1090908@statistik.tu-dortmund.de>

On 07.01.2011 04:10, LCOG1 wrote:
>
> Hey everyone,
> Im getting better at plotting my data but cant for the life of me figure
> out how to show a line graph with missing data that doesnt continue the line
> down to zero then back up to the remaining values.
>
> Consider the following
> x<-c(1:5,0,0,8:10)
> y<-1:10
>
> plot(0,0,xlim=c(0,10), ylim=c(0,10),type="n",main="Dont show the bloody 0
> values!!")
> lines(x~y, col="blue", lwd=2,)
>
> My data is missing the 6th and 7th values and they come in as NA's so i
> change them to 0s but then the plot has these ugly lines that dive toward
> the x axis then back up. I would do bar plots but i need to show multiple
> sets of data on the same and side by side bars doesnt do it for me.
>
> So i need a line graph that starts and stops where 0s or missing values
> exist. Thoughts?

If I understand correctly what you are going to do: Just do not change
the NAs to zero in advance. NAs are not printed.
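
A minimal sketch of that suggestion, reusing the vectors from the question but keeping the two missing values as NA instead of replacing them with 0. The call to pdf(NULL) is an addition made only so the snippet runs without opening a window; the plot title is likewise made up.

```r
x <- c(1:5, NA, NA, 8:10)   # the 6th and 7th values stay missing
y <- 1:10

pdf(NULL)                   # null device; drop this line to see the plot on screen
plot(0, 0, xlim = c(0, 10), ylim = c(0, 10), type = "n",
     main = "Missing values leave a gap")
lines(y, x, col = "blue", lwd = 2)   # the line is interrupted at the NAs
dev.off()
```

The line is drawn in two pieces, one up to the 5th value and one from the 8th onward, with nothing diving toward the axis in between.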

Uwe Ligges

> JR

From jon.skoien at jrc.ec.europa.eu Fri Jan 7 14:15:13 2011
From: jon.skoien at jrc.ec.europa.eu (Jon Olav Skoien)
Date: Fri, 07 Jan 2011 14:15:13 +0100
Subject: [R] Prediction error for Ordinary Kriging
In-Reply-To: <972575.99511.qm@web30508.mail.mud.yahoo.com>
References: <972575.99511.qm@web30508.mail.mud.yahoo.com>
Message-ID: <4D2711E1.2010603@jrc.ec.europa.eu>

Pearl,

You find the prediction error as the var1.var column in your result
object, i.e., y in your script. For plotting:
spplot(y, 2)
or
spplot(y,"var1.var")

Jon

On 1/5/2011 9:28 PM, pearl may dela cruz wrote:
> Hi ALL,
>
> Can you please help me on how to determine the prediction error for ordinary
> kriging? Below are all the commands I used to generate the OK plot:
>
> rsa2<- readShapeSpatial("residentialsa", CRS("+proj=tmerc
> +lat_0=39.66666666666666 +lon_0=-8.131906111111112 +k=1 +x_0=0 +y_0=0
> +ellps=intl +units=m +no_defs"))
> x2<- readShapeSpatial("ptna2", CRS("+proj=tmerc +lat_0=39.66666666666666
> +lon_0=-8.131906111111112 +k=1 +x_0=0 +y_0=0 +ellps=intl +units=m +no_defs"))
> bb<- bbox(rsa2)
> cs<- c(1, 1)
> cc<- bb[, 1] + (cs/2)
> cd<- ceiling(diff(t(bb))/cs)
> rsa2_grd<- GridTopology(cellcentre.offset = cc,cellsize = cs, cells.dim = cd)
> getClass("SpatialGrid")
> p4s<- CRS(proj4string(rsa2))
> x2_SG<- SpatialGrid(rsa2_grd, proj4string = p4s)
> x2_SP<- SpatialPoints(cbind(x2$X, x2$Y))
> v<- variogram(log1p(tsport_ace) ~ 1, x2, cutoff=100, width=9)
> te<- fit.variogram(v,vgm(0.0437, "Exp", 26, 0))
> y<- krige(tsport_ace~1, x2, x2_SG, model = ve.fit)
> spplot(y, 1, col.regions = bpy.colors(100), sp.layout = list("sp.lines",as(rsa2,
> "SpatialLines"),no.clip = TRUE))
>
> I'm looking forward to your response. Thanks.
>
> Best regards,
> Pearl dela Cruz

From eduardo.oliveirahorta at gmail.com Fri Jan 7 14:21:01 2011
From: eduardo.oliveirahorta at gmail.com (Eduardo de Oliveira Horta)
Date: Fri, 7 Jan 2011 11:21:01 -0200
Subject: [R] Cairo pdf canvas size
In-Reply-To:
References:
Message-ID:

Thanks!

On Thu, Jan 6, 2011 at 7:13 PM, Dennis Murphy wrote:
> Hi:
>
> On Thu, Jan 6, 2011 at 5:36 AM, Eduardo de Oliveira Horta wrote:
>>
>> Peter,
>> thank you, that's what I was looking for!
>> David, I forgot to tell you my OS. Sorry... it's Win7. I'm running a
>> RKWard session.
>> And this is strange:
>> > Cairo("example.pdf", type="pdf",width=12,height=12,units="cm",dpi=300)
>> Error: could not find function "Cairo"
>> ... maybe you're not using the Cairo
>> package? http://cran.r-project.org/web/packages/Cairo/Cairo.pdf
>>
>> And Dennis, thanks for the code. It worked, and I'm considering to adopt
>> data frames in the near future. By the way, I'm working with functional time
>> series, so each observation is a function (or a vector representing that
>> function evaluated on a grid) indexed by time. Any insights on how to
>> implement data frames here?
>
> I don't see a real issue. It would be easier to give you concrete
> information if there were an artificial example that mimics your situation,
> but it's not that hard. I'd suggest looking into the zoo package to create
> a series - it can handle both regular (zooreg()) and irregular (zoo())
> series. Basically, a zoo object is a numeric vector with a time index. One
> can create multiple series with a single index, individual series with
> different indices that can be combined into data frames, etc. I've browsed
> through some of the code that accompanies Ramsay, Hooker and Graves' FDA
> book in R and Matlab, and occasionally they use the zoo package as well.
>
> Here's an example, but I expect that someone will show how to convert the
> zoo series to data frames much more efficiently for use in ggplot2...
>
> library(zoo)
> library(ggplot2)
> library(lattice)
> # Generate three daily series with different start times and lengths
> a <- zoo(rnorm(450), as.Date("2005-01-01") + 0:449)
> b <- zoo(rnorm(600, 1, 2), as.Date('2005-06-01') + 0:599)
> d <- zoo(rnorm(300, 2, 1), as.Date('2004-09-01') + 0:299)
>
> # Convert to data frame, make time index a variable and make sure it's a Date object
> A <- as.data.frame(a)
> B <- as.data.frame(b)
> D <- as.data.frame(d)
> A$Date <- as.Date(rownames(A))
> B$Date <- as.Date(rownames(B))
> D$Date <- as.Date(rownames(D))
> # Give all three series the same name
> names(A)[1] <- names(B)[1] <- names(D)[1] <- 'y'
> # Stack the three data frames and create a series ID variable
> comb <- rbind(A, B, D)
> comb$Series <- rep(c('A', 'B', 'D'), c(nrow(A), nrow(B), nrow(D)))
> str(comb)    # make sure that Date is a Date object
>
> # ggplot of the three series
> ggplot(comb, aes(x = Date, y = y, color = Series)) + geom_path()
> # Stacked individual plots (faceted)
> last_plot() + facet_grid(Series ~ .)
>
> # lattice version
> xyplot(y ~ Date, data = comb, groups = Series, type = 'l', col.line = 1:3)
> # Stacked individual series
> xyplot(y ~ Date | Series, data = comb, type = 'l', layout = c(1, 3))
>
> If you need the grid coordinates, use expand.grid() - it can be used when
> creating a data frame, too.
>
> As Bert noted the other night in another thread, one can use xyplot directly
> on zoo objects, but I don't have any direct experience with that yet so will
> defer to others if they wish to contribute. ?xyplot.zoo provides some
> examples.
> > Hope this gives you some idea of what can be done, > Dennis > >> Best regards, >> Eduardo >> On Thu, Jan 6, 2011 at 1:47 AM, Peter Langfelder >> wrote: >>> >>> On Wed, Jan 5, 2011 at 7:35 PM, Eduardo de Oliveira Horta >>> wrote: >>> > Something like this: >>> > >>> > u=seq(from=-pi, to=pi, length=1000) >>> > f=sin(u) >>> > Cairo("example.pdf", type="pdf",width=12,height=12,units="cm",dpi=300) >>> > par(cex.axis=.6,col.axis="grey",ann=FALSE, lwd=.25,bty="n", las=1, >>> > tcl=-.2, >>> > mgp=c(3,.5,0)) >>> > xlim=c(-pi,pi) >>> > ylim=round(c(min(f),max(f))) >>> > plot(u,f,xlim,ylim,type="l",col="firebrick3", axes=FALSE) >>> > axis(side=1, lwd=.25, col="darkgrey", at=seq(from=xlim[1], to=xlim[2], >>> > length=5)) >>> > axis(side=2, lwd=.25, col="darkgrey", at=seq(from=ylim[1], to=ylim[2], >>> > length=5)) >>> > abline(v=seq(from=xlim[1], to=xlim[2], length=5), lwd=.25,lty="dotted", >>> > col="grey") >>> > abline(h=seq(from=ylim[1], to=ylim[2], length=5), lwd=.25,lty="dotted", >>> > col="grey") >>> > dev.off() >>> > >>> > >>> >>> >>> Wow, you must like light colors :) >>> >>> To the point, just set margins, for example >>> >>> par(mar = c(2,2,0.5, 0.5)) >>> >>> (margins are bottom, left, top, right) >>> >>> after the Cairo command. >>> >>> BTW, Cairo doesn't work for me either... but I tried your example by >>> plotting to the screen. >>> >>> Peter >>> >>> >>> >>> >>> ?Notice how the canvas' margins are relatively far from the plotting >>> area. >>> > >>> > Thanks, >>> > >>> > Eduardo >>> > >>> > On Thu, Jan 6, 2011 at 1:00 AM, David Winsemius >>> > wrote: >>> > >>> >> >>> >> On Jan 5, 2011, at 9:38 PM, Eduardo de Oliveira Horta wrote: >>> >> >>> >> ?Hello, >>> >>> >>> >>> I want to save a pdf plot using Cairo, but the canvas of the saved >>> >>> file >>> >>> seems too large when compared to the actual plotted area. >>> >>> >>> >>> Is there a way to control the relation between the canvas size and >>> >>> the >>> >>> size >>> >>> of actual plotting area? 
>>> >>> >>> >>> >>> >> OS?, ?... example? >>> >> >>> >> == >>> >> >>> >> David Winsemius, MD >>> >> West Hartford, CT >>> >> >>> >> >>> > >>> > ? ? ? ?[[alternative HTML version deleted]] >>> > >>> > ______________________________________________ >>> > R-help at r-project.org mailing list >>> > https://stat.ethz.ch/mailman/listinfo/r-help >>> > PLEASE do read the posting guide >>> > http://www.R-project.org/posting-guide.html >>> > and provide commented, minimal, self-contained, reproducible code. >>> > >> > > From mtmorgan at fhcrc.org Fri Jan 7 14:17:55 2011 From: mtmorgan at fhcrc.org (Martin Morgan) Date: Fri, 07 Jan 2011 05:17:55 -0800 Subject: [R] Parsing JSON records to a dataframe In-Reply-To: <1294387510403-3178753.post@n4.nabble.com> References: <1294376886645-3178646.post@n4.nabble.com> <1294387510403-3178753.post@n4.nabble.com> Message-ID: <4D271283.4010401@fhcrc.org> On 01/07/2011 12:05 AM, Dieter Menne wrote: > > > Jeroen Ooms wrote: >> >> What is the most efficient method of parsing a dataframe-like structure >> that has been json encoded in record-based format rather than vector >> based. For example a structure like this: >> >> [ {"name":"joe", "gender":"male", "age":41}, {"name":"anna", >> "gender":"female", "age":23} ] >> >> RJSONIO parses this as a list of lists, which I would then have to apply >> as.data.frame to and append them to an existing dataframe, which is >> terribly slow. >> >> > > unlist is pretty fast. The solution below assumes that you know how your > structure is, so it is not very flexible, but it should show you that the > conversion to data.frame is not the bottleneck. 
> > # json > library(RJSONIO) > # [ {"name":"joe", "gender":"male", "age":41}, > # {"name":"anna", "gender":"female", "age":23} ] > n = 300000 > d = data.frame(name=rep(c("joe","anna"),n), > gender=rep(c("male","female"),n), > age = rep(c("23","41"),n)) > dj = toJSON(d) This doesn't create the required structure > cat(dj) { "name": [ "joe", "anna", "joe", "anna" ], "gender": [ "male", "female", "male", "female" ], "age": [ "23", "41", "23", "41" ] } instead library(rjson) n <- 1000 name <- apply(matrix(sample(letters, n * 5, TRUE), n), 1, paste, collapse="") gender <- sample(c("male", "female"), n, TRUE) age <- ceiling(runif(n, 20, 60)) recs <- sprintf('{"name": "%s", "gender":"%s", "age":%d}', name, gender, age) j <- sprintf("[%s]", paste(recs, collapse=",")) lol <- fromJSON(j) and then with f <- function(lst) function(nm) unlist(lapply(lst, "[[", nm), use.names=FALSE) > oopt <- options(stringsAsFactors=FALSE) # convenience for 'identical' > system.time({ + df0 <- as.data.frame(Map(f(lol), names(lol[[1]]))) + }) user system elapsed 0.006 0.000 0.006 versus for instance > system.time({ + df1 <- do.call(rbind, lapply(lol, data.frame)) + }) user system elapsed 1.497 0.000 1.500 > identical(df0, df1) [1] TRUE Martin > > system.time(d1 <- fromJSON(dj)) > # user system elapsed > # 4.06 0.26 4.32 > > system.time( > dd <- data.frame( > name = unlist(d1$name), > gender = unlist(d1$gender), > age=as.numeric(unlist(d1$age))) > ) > # user system elapsed > # 1.13 0.05 1.18 > > > > -- Computational Biology Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. 
PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

From sovo0815 at gmail.com Fri Jan 7 14:26:08 2011
From: sovo0815 at gmail.com (sovo0815 at gmail.com)
Date: Fri, 7 Jan 2011 14:26:08 +0100 (CET)
Subject: [R] Different LLRs on multinomial logit models in R and SPSS
In-Reply-To: <8D4AC2A1-ABCB-4B51-B057-05A98745C28E@comcast.net>
References: <8D4AC2A1-ABCB-4B51-B057-05A98745C28E@comcast.net>
Message-ID:

On Thu, 6 Jan 2011, David Winsemius wrote:

> On Jan 6, 2011, at 11:23 AM, Sören Vogel wrote:
>
>> Thanks for your replies. I am no mathematician or statistician by far,
>> however, it appears to me that the actual value of any of the two LLs
>> is indeed important when it comes to calculation of
>> Pseudo-R-Squared-s. If Rnagel divides by (some transformation of) the
>> actual value of llnull then any calculation of Rnagel should differ.
>> How come? Or is my function wrong? And if my function is right, how
>> can I calculate an R-Squared independent from the software used?
>
> You have two models in that function, the null one with ".~ 1" and the
> original one and you are getting a ratio on the likelihood scale (which is a
> difference on the log-likelihood or deviance scale).

If this is the case, calculating 'fit' indices for those models must end
up in different fit indices depending on software:

n <- 143
ll1 <- 135.02
ll2 <- 129.8
# Rcs
(Rcs <- 1 - exp( (ll2 - ll1) / n ))
# Rnagel
Rcs / (1 - exp(-ll1/n))

ll3 <- 204.2904
ll4 <- 199.0659
# Rcs
(Rcs <- 1 - exp( (ll4 - ll3) / n ))
# Rnagel
Rcs / (1 - exp(-ll3/n))

The Rcs' are equal, however, the Rnagel's are not. Of course, this is no
question, but I am rather confused. When publishing results I am required
to use fit indices and editors would complain that they differ.
Sören

From spencer.graves at structuremonitoring.com Fri Jan 7 14:26:12 2011
From: spencer.graves at structuremonitoring.com (Spencer Graves)
Date: Fri, 07 Jan 2011 05:26:12 -0800
Subject: [R] Waaaayy off topic...Statistical methods, pub bias, scientific validity
In-Reply-To:
References: <4D2644A5.9040303@witthoft.com>, <4D26A0F3.3030902@structuremonitoring.com>,
Message-ID: <4D271474.3040302@structuremonitoring.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From andreas.nord at zooekol.lu.se Fri Jan 7 14:28:26 2011
From: andreas.nord at zooekol.lu.se (anord)
Date: Fri, 7 Jan 2011 05:28:26 -0800 (PST)
Subject: [R] Problems with glht function for lme object
Message-ID: <1294406906365-3179128.post@n4.nabble.com>

Dear all,

I'm trying to make multiple comparisons for an lme-object. The data is for
an experiment on parental work load in birds, in which adults at different
sites were induced to work at one of three levels ('treat'; H, M, L). The
response is 'feedings', which is a quantitative measure of nest provisioning
per parent per chick per hour. Site is included as a random effect (one
male/female pair per site).

My final model takes the following form:
feedings ~ treat + year + data^2, random = ~1|site,data=feed.df

For this model, I would like to do multiple comparisons on 'treat', using
the multcomp package:
summary(glht(m4.feed,linfct=mcp(treat="Tukey")))

However, this does not work, and I get the below error message.

Error in if (is.null(pkg) | pkg == "nlme") terms(formula(x)) else slot(x, :
  argument is of length zero
Error in factor_contrasts(model) :
  no ‘model.matrix’ method for ‘model’ found!

I suspect this might have quite a straightforward solution, but I'm stuck at
this point. Any help would be most appreciated. Sample data below.
Kind regards,
Andreas Nord
Sweden

==============
feedings   sex  site  treat  year  date^2
1.8877888  M    838   H      2009  81
1.9102787  M    247   H      2009  81
1.4647229  M    674   H      2010  121
1.4160590  M    7009  M      2009  144
1.3106749  M    863   M      2010  196
1.2718121  M    61    M      2009  225
1.2799263  M    729   L      2009  256
1.5829564  M    629   L      2009  256
1.4847251  M    299   L      2010  324
1.2463151  M    569   L      2010  324
2.1694169  F    838   H      2009  81
1.5966899  F    247   H      2009  81
2.4136983  F    674   H      2010  121
1.7784873  F    7009  M      2009  144
1.6681317  F    863   M      2010  196
2.3691275  F    61    M      2009  225
2.0672192  F    729   L      2009  256
1.6389902  F    629   L      2009  256
0.9307536  F    299   L      2010  324
1.6786767  F    569   L      2010  324
==============

-- 
View this message in context: http://r.789695.n4.nabble.com/Problems-with-glht-function-for-lme-object-tp3179128p3179128.html
Sent from the R help mailing list archive at Nabble.com.

From dieter.menne at menne-biomed.de Fri Jan 7 15:23:55 2011
From: dieter.menne at menne-biomed.de (Dieter Menne)
Date: Fri, 7 Jan 2011 06:23:55 -0800 (PST)
Subject: [R] Accessing data via url
In-Reply-To:
References: <352569.97781.qm@web38406.mail.mud.yahoo.com> <1294388659726-3178773.post@n4.nabble.com>
Message-ID: <1294410235849-3179197.post@n4.nabble.com>

Henrique Dallazuanna wrote:
>
> With the ssl.verifypeer = FALSE argument it works:
>
> x = getBinaryURL(dd, ssl.verifypeer = FALSE)
>
>

Thanks, good to know. It's only in the examples of ..., but it looks like a
parameter important enough to be included in the docs of getBinaryURL.
Digging through CURL docs can be daunting.

Dieter

-- 
View this message in context: http://r.789695.n4.nabble.com/Accessing-data-via-url-tp3178094p3179197.html
Sent from the R help mailing list archive at Nabble.com.
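
A minimal sketch of the call discussed in the message above, assuming the
RCurl package is installed. The wrapper name fetch_binary and its
verify_peer argument are hypothetical and exist only for illustration;
getBinaryURL() and the ssl.verifypeer option are the ones named in the
thread. Switching verification off also drops the check that the server is
who it claims to be, so the sketch keeps it as an explicit opt-in.

```r
# Hypothetical helper (not part of RCurl): download a URL as raw bytes.
# verify_peer = FALSE reproduces the workaround from the thread; it turns
# off certificate verification, so use it only for hosts you trust.
fetch_binary <- function(url, verify_peer = TRUE) {
  if (!requireNamespace("RCurl", quietly = TRUE)) {
    stop("the RCurl package is required")
  }
  # ssl.verifypeer is passed through '...' as a curl option
  RCurl::getBinaryURL(url, ssl.verifypeer = verify_peer)
}

# Usage (needs network access; the URL is a placeholder):
# x   <- fetch_binary("https://example.org/data.csv", verify_peer = FALSE)
# txt <- rawToChar(x)   # getBinaryURL() returns a raw vector
```

Defaulting verify_peer to TRUE means the insecure path has to be requested
on purpose at each call site.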
From friendly at yorku.ca Fri Jan 7 15:39:53 2011
From: friendly at yorku.ca (Michael Friendly)
Date: Fri, 07 Jan 2011 09:39:53 -0500
Subject: [R] defining a formula method for a weighted lm()
In-Reply-To: <19750.5408.744169.57312@lynne.math.ethz.ch>
References: <4D0A259B.7010105@yorku.ca> <4D25D2B5.6020005@yorku.ca> <19750.5408.744169.57312@lynne.math.ethz.ch>
Message-ID: <4D2725B9.7030101@yorku.ca>

Thanks, Martin

Now I understand 'standard non-standard evaluation' magic, and the code in
http://developer.r-project.org/model-fitting-functions.txt
explains how this works.

Still, I can't help but think of this as evil-magic, for which some
anti-magic would be extremely useful, so that a simple function like

my.lm <- function(formula, data, subset, weights, ...) {
    lm(formula, data, subset, weights, ...)
}

would work as expected. Oh dear, I think I need some syntactic sugar in
my coffee!

-Michael

On 1/6/2011 2:16 PM, Martin Maechler wrote:
>>>>>> Michael Friendly
>>>>>> on Thu, 06 Jan 2011 09:33:25 -0500 writes:
>
> > No one replied to this, so I'll try again, with a simple example.
> > I calculate a set of log odds ratios, and turn them into a data frame as
> > follows:
> >
> >> library(vcdExtra)
> >> (lor.CM<- loddsratio(CoalMiners))
> > log odds ratios for Wheeze and Breathlessness by Age
> >
> >    25-29    30-34    35-39    40-44    45-49    50-54    55-59    60-64
> > 3.695261 3.398339 3.140658 3.014687 2.782049 2.926395 2.440571 2.637954
> >>
> >> (lor.CM.df<- as.data.frame(lor.CM))
> >   Wheeze Breathlessness   Age      LOR        ASE
> > 1  W:NoW          B:NoB 25-29 3.695261 0.16471778
> > 2  W:NoW          B:NoB 30-34 3.398339 0.07733658
> > 3  W:NoW          B:NoB 35-39 3.140658 0.03341311
> > 4  W:NoW          B:NoB 40-44 3.014687 0.02866111
> > 5  W:NoW          B:NoB 45-49 2.782049 0.01875164
> > 6  W:NoW          B:NoB 50-54 2.926395 0.01585918
> > 7  W:NoW          B:NoB 55-59 2.440571 0.01452057
> > 8  W:NoW          B:NoB 60-64 2.637954 0.02159903
> >
> > Now I want to fit a linear model by WLS, LOR ~ Age, which I can do directly as
> >
> >> lm(LOR ~ as.numeric(Age), weights=1/ASE, data=lor.CM.df)
> >
> > Call:
> > lm(formula = LOR ~ as.numeric(Age), data = lor.CM.df, weights = 1/ASE)
> >
> > Coefficients:
> >     (Intercept)  as.numeric(Age)
> >          3.5850          -0.1376
> >
> > But, I want to do the fitting in my own function, the simplest version is
> >
> > my.lm<- function(formula, data, subset, weights) {
> >     lm(formula, data, subset, weights)
> > }
> >
> > But there is obviously some magic about formula objects and evaluation
> > environments, because I don't understand why this doesn't work.
> >
> >
> >> my.lm(LOR ~ as.numeric(Age), weights=1/ASE, data=lor.CM.df)
> > Error in model.frame.default(formula = formula, data = data, subset =
> > subset, :
> > invalid type (closure) for variable '(weights)'
> >>
>
> Yes, the "magic" has been called "standard non-standard evaluation"
> for a while (since August 2002, to be precise),
> and the http://developer.r-project.org/ web page has had two
> very relevant links since then, namely those mentioned in the
> following two lines there:
> ----------------------------
> # Description of the nonstandard evaluation rules in R 1.5.1 and some suggestions. (updated). Also an R function and docn for making model frames from multiple formulas.
>
> # Notes on model-fitting functions in R, and especially on how to enable all the safety features.
> ----------------------------
>
> For what you want, I think (but haven't tried) the second link, which is
> http://developer.r-project.org/model-fitting-functions.txt
> is still very relevant.
>
> Many many people (package authors) had to use something like
> that or just directly taken the lm function as an example..
> {{ but then probably failed the more subtle points on how to
> program residuals() , predict() , etc functions which you can
> also learn from model-fitting-functions.txt}}
>
>
> > A second question: Age is a factor, and as.numeric(Age) gives me 1:8.
> >
> > What simple expression on lor.CM.df$Age would give me either the lower
> > limits (here: seq(25, 60, by = 5)) or midpoints of these Age intervals
> > (here: seq(27, 62, by = 5))?
>
> With
>
> data(CoalMiners, package = "vcd")
>
> here are some variations :
>
> > (Astr<- dimnames(CoalMiners)[[3]])
> [1] "25-29" "30-34" "35-39" "40-44" "45-49" "50-54" "55-59" "60-64"
> > sapply(lapply(strsplit(Astr, "-"), as.numeric), `[[`,