Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std::bad_alloc error #71

Closed
RobertMyles opened this issue Jun 14, 2017 · 28 comments
Closed

std::bad_alloc error #71

RobertMyles opened this issue Jun 14, 2017 · 28 comments

Comments

@RobertMyles
Copy link

Migrating from dplyr #2866:

Hi, I'm having an Error in new_result(connection@ptr, statement) : std::bad_alloc error (raised in #2323 -- I've also followed the instructions there and I'm using the dev version of odbc) when I try to use dplyr/dblpyr with a Microsoft SQL Database. I've read that this may be to do with Rcpp, so I've re-installed dplyr, dblpyr and Rcpp without success.

library(DBI)
library(dplyr)
library(dbplyr)
con <- dbConnect(odbc::odbc(),
                 driver = "Microsoft SQL Server",
                 dsn = "tblEmpenhos", 
                 server = "SERVER")

I can connect to the base with no problems:

class(con)

[1] "Microsoft SQL Server"
attr(,"package")
[1] ".GlobalEnv"

And dbListFields(con, "tblEmpenhos") gives me all the correct column names. But using this gives the error:

tbl(con, "tblEmpenhos")
Error in new_result(connection@ptr, statement) : std::bad_alloc

I thought this error had to do with memory allocation, but that can't be right in this case. Is there anything else I could check?

@jimhester
Copy link
Contributor

Do you get the same error with dbReadTable(con, "tblEmpenhos"), also can you provide the output of devtools::session_info()

@RobertMyles
Copy link
Author

Yup. Results:

> dbReadTable(con, "tblEmpenhos")
Error in new_result(connection@ptr, statement) : std::bad_alloc
> devtools::session_info()
Session info ------------------------------------------------------------------------------------------------------------
 setting  value                       
 version  R version 3.3.3 (2017-03-06)
 system   i386, mingw32               
 ui       RStudio (1.0.143)           
 language (EN)                        
 collate  Portuguese_Brazil.1252      
 tz       America/Sao_Paulo           
 date     2017-06-14                  

Packages ----------------------------------------------------------------------------------------------------------------
 package    * version    date       source                         
 assertthat   0.2.0      2017-04-11 CRAN (R 3.3.3)                 
 base       * 3.3.3      2017-03-06 local                          
 bindr        0.1        2016-11-13 CRAN (R 3.3.3)                 
 bindrcpp   * 0.1        2016-12-11 CRAN (R 3.3.3)                 
 bit          1.1-12     2014-04-09 CRAN (R 3.3.2)                 
 bit64        0.9-7      2017-05-08 CRAN (R 3.3.3)                 
 blob         1.0.0.9002 2017-06-14 Github (hadley/blob@615057a)   
 datasets   * 3.3.3      2017-03-06 local                          
 DBI        * 0.6-1      2017-04-01 CRAN (R 3.3.3)                 
 dbplyr     * 1.0.0      2017-06-09 CRAN (R 3.3.3)                 
 devtools     1.13.2     2017-06-02 CRAN (R 3.3.3)                 
 digest       0.6.12     2017-01-27 CRAN (R 3.3.3)                 
 dplyr      * 0.7.0      2017-06-09 CRAN (R 3.3.3)                 
 glue         1.1.0      2017-06-13 CRAN (R 3.3.3)                 
 graphics   * 3.3.3      2017-03-06 local                          
 grDevices  * 3.3.3      2017-03-06 local                          
 hms          0.3        2016-11-22 CRAN (R 3.3.3)                 
 magrittr     1.5        2014-11-22 CRAN (R 3.3.3)                 
 memoise      1.1.0      2017-04-21 CRAN (R 3.3.3)                 
 methods    * 3.3.3      2017-03-06 local                          
 odbc         1.0.1.9000 2017-06-14 Github (rstats-db/odbc@64ebc83)
 purrr        0.2.2.2    2017-05-11 CRAN (R 3.3.3)                 
 R6           2.2.1      2017-05-10 CRAN (R 3.3.3)                 
 Rcpp         0.12.11    2017-05-22 CRAN (R 3.3.3)                 
 rlang        0.1.1      2017-05-18 CRAN (R 3.3.3)                 
 stats      * 3.3.3      2017-03-06 local                          
 tibble       1.3.3      2017-05-28 CRAN (R 3.3.3)                 
 tools        3.3.3      2017-03-06 local                          
 utils      * 3.3.3      2017-03-06 local                          
 withr        1.0.2      2016-06-20 CRAN (R 3.3.3)  

@jimhester
Copy link
Contributor

jimhester commented Jun 14, 2017

Can you return the value of the following, it is possible there is a data type issue with the column you are trying to read.

odbc:::connection_sql_columns(con@ptr, table_name="tblEmpenhos")

@jimhester
Copy link
Contributor

Also could you possibly try using the current CRAN version of odbc and see if the error recapitulates there. Would be useful to know if this is a new bug or not.

@RobertMyles
Copy link
Author

Sure.

  1. :
> odbc:::connection_sql_columns(con@ptr, table_name="tblEmpenhos")
                    name field.type nullable
1            DataInicial   datetime    FALSE
2              DataFinal   datetime    FALSE
3    Cd_Exerc_Empresa_Id    numeric    FALSE
4         Cd_AnoExecucao   smallint    FALSE
5           Cd_Exercicio   smallint    FALSE
6             Nr_Empenho    numeric    FALSE
7             Dt_Empenho  datetime2     TRUE
8             Tp_Empenho      nchar     TRUE
9             Nr_Reserva    numeric     TRUE
10         Cd_Dotacao_Id    numeric     TRUE
11         Administracao   nvarchar     TRUE
12          COD_EMP_PMSP   nvarchar     TRUE
13           NOM_EMP_SOF   nvarchar     TRUE
14              Cd_Orgao      nchar     TRUE
15           Sigla_Orgao   nvarchar     TRUE
16              Ds_Orgao   nvarchar     TRUE
17            Cd_Unidade   nvarchar     TRUE
18            Ds_Unidade   nvarchar     TRUE
19             Cd_Funcao      nchar     TRUE
20             Ds_Funcao   nvarchar     TRUE
21          Cd_SubFuncao      nchar     TRUE
22          Ds_SubFuncao   nvarchar     TRUE
23           Cd_Programa      nchar     TRUE
24           Ds_Programa   nvarchar     TRUE
25  Tp_Projeto_Atividade      nchar     TRUE
26  Cd_Projeto_Atividade      nchar     TRUE
27                    PA   nvarchar     TRUE
28                   PAP   nvarchar     TRUE
29                  PAPA   nvarchar     TRUE
30      ProjetoAtividade      nchar     TRUE
31  Ds_Projeto_Atividade   nvarchar     TRUE
32            Cd_Despesa      nchar     TRUE
33            Ds_Despesa   nvarchar     TRUE
34           Ds_DespesaB   nvarchar     TRUE
35     Categoria_Despesa      nchar     TRUE
36          Ds_Categoria   nvarchar     TRUE
37         Grupo_Despesa      nchar     TRUE
38        Grupo_DespesaB      nchar     TRUE
39              Ds_Grupo   nvarchar     TRUE
40             Ds_GrupoB   nvarchar     TRUE
41         Cd_Modalidade      nchar     TRUE
42         Ds_Modalidade   nvarchar     TRUE
43           Cd_Elemento      nchar     TRUE
44              Cd_Fonte      nchar     TRUE
45              Ds_Fonte   nvarchar     TRUE
46       Cd_Item_Despesa   smallint     TRUE
47       Ds_Item_Despesa   nvarchar     TRUE
48    Cd_SubItem_Despesa   smallint     TRUE
49    Ds_SubItem_Despesa   nvarchar     TRUE
50      Cd_Fornecedor_Id    numeric     TRUE
51                   Cgc   nvarchar     TRUE
52             Tp_Pessoa      nchar     TRUE
53          Razao_Social   nvarchar     TRUE
54          Cd_Historico   smallint     TRUE
55          Ds_Licitacao   nvarchar     TRUE
56            Ds_Destino   nvarchar     TRUE
57      Ds_Local_Entrega   nvarchar     TRUE
58           Cd_Convenio   smallint     TRUE
59           Nr_Processo   nvarchar     TRUE
60               Pz_Dias    numeric     TRUE
61             Cond_Pgto   nvarchar     TRUE
62    Dt_Inicio_Vigencia   datetime     TRUE
63         Ds_Observacao   nvarchar     TRUE
64   Ds_Observacao_Anexo      ntext     TRUE
65          Vl_Empenhado    decimal     TRUE
66            Vl_Anulado    decimal     TRUE
67   Vl_EmpenhadoLiquido    decimal     TRUE
68          Vl_Liquidado    decimal     TRUE
69               Vl_Pago    decimal     TRUE
70               Jan_Emp    decimal     TRUE
71              Jan_Prev    decimal     TRUE
72              Jan_Real    decimal     TRUE
73               Jan_Pag    decimal     TRUE
74               Fev_Emp    decimal     TRUE
75              Fev_Prev    decimal     TRUE
76              Fev_Real    decimal     TRUE
77               Fev_Pag    decimal     TRUE
78               Mar_Emp    decimal     TRUE
79              Mar_Prev    decimal     TRUE
80              Mar_Real    decimal     TRUE
81               Mar_Pag    decimal     TRUE
82               Abr_Emp    decimal     TRUE
83              Abr_Prev    decimal     TRUE
84              Abr_Real    decimal     TRUE
85               Abr_Pag    decimal     TRUE
86               Mai_Emp    decimal     TRUE
87              Mai_Prev    decimal     TRUE
88              Mai_Real    decimal     TRUE
89               Mai_Pag    decimal     TRUE
90               Jun_Emp    decimal     TRUE
91              Jun_Prev    decimal     TRUE
92              Jun_Real    decimal     TRUE
93               Jun_Pag    decimal     TRUE
94               Jul_Emp    decimal     TRUE
95              Jul_Prev    decimal     TRUE
96              Jul_Real    decimal     TRUE
97               Jul_Pag    decimal     TRUE
98               Ago_Emp    decimal     TRUE
99              Ago_Prev    decimal     TRUE
100             Ago_Real    decimal     TRUE
101              Ago_Pag    decimal     TRUE
102              Set_Emp    decimal     TRUE
103             Set_Prev    decimal     TRUE
104             Set_Real    decimal     TRUE
105              Set_Pag    decimal     TRUE
106              Out_Emp    decimal     TRUE
107             Out_Prev    decimal     TRUE
108             Out_Real    decimal     TRUE
109              Out_Pag    decimal     TRUE
110              Nov_Emp    decimal     TRUE
111             Nov_Prev    decimal     TRUE
112             Nov_Real    decimal     TRUE
113              Nov_Pag    decimal     TRUE
114              Dez_Emp    decimal     TRUE
115             Dez_Prev    decimal     TRUE
116             Dez_Real    decimal     TRUE
117              Dez_Pag    decimal     TRUE
118         DataExtracao   datetime     TRUE
119    Cod_Resv_Dota_SOF    numeric     TRUE
120         COD_IDT_COTC    numeric     TRUE

@RobertMyles
Copy link
Author

2):

With the CRAN version, I get:

> class(con)
[1] "OdbcConnection"
attr(,"package")
[1] "odbc"
> odbc::dbReadTable(con, "tblEmpenhos")
Error in new_result(connection@ptr, statement) : std::bad_alloc

@jimhester
Copy link
Contributor

That is a very large table, so it is possible the allocation is actually too large for your memory. Could you try selecting a subset of the columns and see if you get the same error?

dbGetQuery(con, "SELECT DataInicial, DataFinal from tblEmpenhos")

Alternatively it is possible that Dt_Empenho, which is datetime2 is causing the issue, possibly try the query without that column.

@RobertMyles
Copy link
Author

RobertMyles commented Jun 19, 2017

Sure, it is a big table, but I was hoping to use the new version of dplyr with it (hence the issue there). Before, I was using RODBC, and I switched to DBI/dplyr/odbc to avail of the new dplyr database capabilities. It was my understanding that I could do that without actually loading the table into memory...is there something basic I'm doing wrong?
dbGetQuery(con, "SELECT DataInicial, DataFinal from tblEmpenhos") works without any problems, by the way.

@jimhester
Copy link
Contributor

jimhester commented Jun 19, 2017

Running tbl(con, "tblEmpenhos") on the console will invoke the print method for the object, which will retrieve the first 10 rows of the table. Do you get the same bad alloc error when you run

dbGetQuery(con, "SELECT TOP 10 * from tblEmpenhos")

@RobertMyles
Copy link
Author

Yup:

> dbGetQuery(con, "SELECT TOP 6 * from tblEmpenhos")
Error in new_result(connection@ptr, statement) : std::bad_alloc

(Sorry, I only got back to this today because it's a database in work)

@jimhester
Copy link
Contributor

jimhester commented Jun 19, 2017

Please verify if it is a specific column causing the bad allocation or just the total number of tables returned. A helper function could be useful for doing this.

columns <- function(con, table, idx) {
  fields <- dbListFields(con, table)
  fields <- fields[idx]
  sql <-
    paste0(
      "SELECT TOP 10 ",
      paste0(dbQuoteIdentifier(con, fields), collapse = ", "),
      " FROM ", dbQuoteIdentifier(con, table))
  dbGetQuery(con, sql)
}
start <- 1
end <- seq(5, 120, 5)

for (e in end) {
  cat(e, "\n")
  columns(con, "tblEmpenhos", seq(start, e))
}

@RobertMyles
Copy link
Author

I get Error in new_result(connection@ptr, statement) : std::bad_alloc with this helper function. I've used this table before for SQL queries using the RODBC package without any problems.

@jimhester
Copy link
Contributor

Yes, but when does it throw the error, you need to pinpoint why you are getting the bad allocation, either because a specific column is resulting in add a bad allocation or too many columns are being retrieved at once. Add a cat(e, "\n") in the for loop so you can figure out how many columns result in the error, then once you have determined this modify the table parameter until you figure out what columns produce it.

@RobertMyles
Copy link
Author

I think I've found it. I get the error at 64, which from the earlier post above, is:

64 Ds_Observacao_Anexo ntext TRUE

It's the only ntext column in the table, too.

@jimhester
Copy link
Contributor

jimhester commented Jun 19, 2017

Ok that definitely helps, do you get the bad alloc error for any rows in the column? If you run the following command does it always give you a bad_alloc error?

dbGetQuery(con, "SELECT TOP 10 Ds_Observacao_Anexo FROM tblEmpenhos TABLESAMPLE ( 10 ROWS )")

@jimhester
Copy link
Contributor

Sorry my SQL above was incorrect, should be correct now.

@RobertMyles
Copy link
Author

Yeah, I get an error every time I try that code.

@jimhester
Copy link
Contributor

Could you retrieve the data from that column (via RODBC or something else) and paste a sample of it here. I cannot reproduce the error with a simple ntext column, so it may be due to particularities in your data.

@RobertMyles
Copy link
Author

It could be, although I get the data back with RODBC (I've had to use XX on some things and not use all the print out of the factor levels, data privacy etc):


> x$Ds_Observacao_Anexo[1:10]
 [1] Demais condições de acordo com o contrato vigente                                                                               
 [2] Prorrogação do Contrato nXXXXXXXXXXXXX Condições conforme Cláusula Contratual.             
 [3] Prorrogação do Contrato XXXXXXXX- 2016                                                            
 [4] ATA XXXXXXXXXXXXX 10/08/2017.\r\nAV. YERVANT KISSAJIKIAN Nº 416 - VL. CONSTANCIA.
 [5] <NA>                                                                                                                            
 [6] PAGAMENTO ASSISTENTE TECNICO                                                                                              
 [7] PAGAMENTO ASSISTENTE TECNICO                                                                                             
 [8] <NA>                                                                                                                            
 [9] <NA>                                                                                                                            
[10] <NA>                                                                                                                            
342234 Levels: \177PENALIDADES\r\n\r\nCONFORME CONTRATO N°20/ 

Assuming it is an issue with a particular column, is there a way to ignore this while using odbc and the new dplyr to work with the base? (I understand you may not know on the dplyr end, but Hadley sent me here so I'd rather check everything before I go back an reopen the issue) I get this bad allocation error as soon as I try to connect dplyr to the base.

@jimhester
Copy link
Contributor

Yes you can use select to remove the problematic column.

x <- tbl(con, "tblEmpenhos") %>% select(-Ds_Observacao_Anexo)
x

Alternatively you can provide a SQL query of just the columns you are interested in.

x <- tbl(con, sql("SELECT DataInicial, DataFinal from tblEmpenhos"))
x

@jimhester
Copy link
Contributor

Also what are the values of max(nchar(levels(x$Ds_Observacao_Anexo))) and table(Encoding(levels(x$Ds_Observacao_Anexo))).

@RobertMyles
Copy link
Author



> max(nchar(levels(x$Ds_Observacao_Anexo)))
[1] 32768


> table(Encoding(levels(x$Ds_Observacao_Anexo)))

unknown 
 342234 

@jimhester
Copy link
Contributor

I am unfortunately unable to reproduce the bad::alloc error, even if I create a table with ntext field and 1000 strings of 32768 length.

letters <- c(letters, "\u00e4", "\u00f6", "\u00fc")
t1 <- replicate(1000, paste0(letters[sample(length(letters), size=32768, replace=T)], collapse = ""))
dbWriteTable(con, "test", data.frame(a = t1, stringsAsFactors=F), fieldTypes = c(a = "ntext"), overwrite=T)
x <- dbReadTable(con, "test")

Could you return the result of the following, which might indicate how the column definition differs from what I have done above?

dbGetQuery(con, "exec sp_columns tblEmpenhos")

@RobertMyles
Copy link
Author

Is it ok if I send it to the email on your profile? It's very big to paste, and I also don't want to put too much information on it here.

@jimhester
Copy link
Contributor

You can just put the information for the Ds_Observacao_Anexo column, I think it is the only thing relevant to this issue.

@RobertMyles
Copy link
Author

bad_alloc

@jimhester
Copy link
Contributor

Ok, all of that information is identical to what I see.

I do not think I can debug this issue further without the ability to reproduce it locally, I would suggest you use the workarounds suggested above.

If you are able to produce a more minimal example using generated data that exhibits the behavior I can try to fix the issue.

@RobertMyles
Copy link
Author

Ok @jimhester, thanks for all your help.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants