conflict with RStudio and reading connections

jdlong

27 May, 2011 04:00 PM

Hey guys, I ran into a bug with R not downloading files properly when using RStudio. Can you all see if you can reproduce this?

http://stackoverflow.com/questions/6154497/r-reading-a-binary-file-...

-JD

  1. Support Staff 2 Posted by Josh Paulson on 27 May, 2011 04:25 PM

    JD,

    This seemed to work for me with both RStudio Server and Desktop. You are on server right? Can you check what specific version you are running with the following command:

    RStudio.version()
    

    Thanks,

    Josh

  2. 3 Posted by jdlong on 27 May, 2011 05:46 PM

    RStudio.version()
    [1] "0.94.16"

    I'll grab the nightly and upgrade then report back.

  3. 4 Posted by jdlong on 27 May, 2011 05:54 PM

    Nope, same problem even with 0.94.48.

    Here's an example:

    tst <- function(x){
      p <- gzcon( url( "ftp://ftp.cpc.ncep.noaa.gov/precip/CPC_UNI_PRCP/GAUGE_GLB/V1.0/2005/PRCP_CU_GAUGE_V1.0GLB_0.50deg.lnx.20050101.gz" ) )
      myPoints <- readBin(p, real(), n=1e6, size = 4, endian = "little")
      close(p)
      return(length(myPoints))
    }
    sapply(1:10, tst)

    returns

    sapply(1:10, tst)
    [1] 518400 231982 226839 247598 518400 230460 518400 518400 518400 518400

    and if I run it again I get different results:

    sapply(1:10, tst)
    [1] 225327 232733 223679 232733 247598 233955 218484 226839 518400 226839

    FWIW I'm on Ubuntu 11.04.

    If I run the above from R in the command line, it works fine:

    sapply(1:10, tst)

    [1] 518400 518400 518400 518400 518400 518400 518400 518400 518400 518400

    -JD

  4. 5 Posted by Luciano Selzer on 27 May, 2011 06:37 PM

    Using RStudio 0.94.48 I get this

    sapply(1:10, tst)
    [1] 518400 518400 518400 518400 518400 518400 518400
    [8] 518400 518400 518400

  5. 6 Posted by jdlong on 27 May, 2011 06:47 PM

    Luciano, yes, that's the correct output. Are you using RStudio Server on Ubuntu?

  6. 7 Posted by Luciano Selzer on 27 May, 2011 06:54 PM

    No, RStudio desktop on Windows 7. I can dual boot to Ubuntu Natty. I'll get back with the test.

  7. 8 Posted by Luciano Selzer on 27 May, 2011 07:48 PM

    I can confirm there's a bug. I think it only affects RStudio on Ubuntu, as it works well on Windows.

    Output with RStudio:

    sapply(1:10, tst)
    [1] 239717 439019 237782 219587 221658 233478 342902 411320 233478 429139

    Output with R in terminal

    sapply(1:10, tst)
    [1] 518400 518400 518400 248299 215300 518400 518400 518400 518400 518400

  8. 9 Posted by Luciano Selzer on 27 May, 2011 07:50 PM

    I've just reread my comment. The output in the R terminal is also wrong on occasion. Could this be a timeout issue?

  9. 10 Posted by jdlong on 27 May, 2011 07:58 PM

    Luciano, when I ran this from command-line R on Ubuntu, I consistently got the correct answer. When you say you see the bug in "R Terminal", do you mean on Ubuntu or Win?

  10. Support Staff 11 Posted by Joe Cheng on 27 May, 2011 07:59 PM

    I can repro on Natty server too. Investigating now.

  11. Support Staff 12 Posted by Joe Cheng on 27 May, 2011 08:09 PM

    Weird, every time I run:

    download.file("ftp://ftp.cpc.ncep.noaa.gov/precip/CPC_UNI_PRCP/GAUGE_GLB/V1.0/2005/PRCP_CU_GAUGE_V1.0GLB_0.50deg.lnx.20050101.gz", "foo.gz")

    it gives something like

    downloaded length 149144 != reported length 179058

    but with a different downloaded length every time. This repros with the "internal" download method but not "wget".

    No workaround yet, still digging.
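
    The method comparison described above could be scripted roughly as follows. This is only a sketch: `downloaded_size` is a hypothetical helper (not part of R or RStudio), and the "wget" method assumes wget is installed and on the PATH.

    ```r
    # Hypothetical helper: download `url` with the given method and
    # return the size in bytes of what actually landed on disk.
    downloaded_size <- function(url, method) {
      dest <- tempfile()
      on.exit(unlink(dest))  # remove the temporary file when done
      download.file(url, dest, method = method, mode = "wb", quiet = TRUE)
      file.info(dest)$size
    }

    # Example call (needs network access):
    # u <- "ftp://ftp.cpc.ncep.noaa.gov/precip/CPC_UNI_PRCP/GAUGE_GLB/V1.0/2005/PRCP_CU_GAUGE_V1.0GLB_0.50deg.lnx.20050101.gz"
    # sapply(c("internal", "wget"), function(m) downloaded_size(u, m))
    ```

    If the sizes differ between the two methods, or vary from run to run for one of them, that points at the download method rather than at gzcon() or readBin().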

  12. 13 Posted by jdlong on 27 May, 2011 08:14 PM

    Yep I get the very same thing.

  13. Support Staff 14 Posted by Joe Cheng on 27 May, 2011 08:18 PM

    Hmmm, and only with this particular FTP server. This one works fine, for example:

    download.file("ftp://prism.oregonstate.edu/pub/prism/pacisl/grids/tdmean/Normals/hi_tdmean_1971_2000.01.asc.gz", "bar.gz")
    
  14. 15 Posted by jdlong on 27 May, 2011 08:23 PM

    Could this be an active vs. passive FTP issue?

  15. 16 Posted by jdlong on 27 May, 2011 08:39 PM

    I've found a dependable workaround that tells me very little about the bug, unfortunately.

    For the server which is giving me grief, if I call it through http instead of ftp, everything is fine. So instead of doing this using ftp:

    tst <- function(x){
      p <- gzcon( url( "ftp://ftp.cpc.ncep.noaa.gov/precip/CPC_UNI_PRCP/GAUGE_GLB/V1.0/2005/PRCP_CU_GAUGE_V1.0GLB_0.50deg.lnx.20050101.gz" ) )
      myPoints <- readBin(p, real(), n=1e6, size = 4, endian = "little")
      close(p)
      return(length(myPoints))
    }
    sapply(1:10, tst)

    I do this:

    tst <- function(x){
      p <- gzcon( url( "http://ftp.cpc.ncep.noaa.gov/precip/CPC_UNI_PRCP/GAUGE_GLB/V1.0/200..." ) )
      myPoints <- readBin(p, real(), n=1e6, size = 4, endian = "little")
      close(p)
      return(length(myPoints))
    }
    sapply(1:10, tst)

  16. Josh Paulson closed this discussion on 04 Oct, 2012 01:56 PM.
