--- title: "Creating a data.frame in C" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Creating a data.frame in C} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) knitr::knit_engines$set(callme = callme:::callme_engine) library(callme) ``` ```{css, echo=FALSE} .callme { background-color: #E3F2FD; } pre.callme span { background-color: #E3F2FD; } ``` ## Introduction `data.frame` objects are mostly `list` objects with some added constraints. In general a `data.frame` consists of * a list (`VECSXP`) * each item in the list is a column * each element in the list must have the same length * a character vector of column names (optional, but always included in practice) * an (optional) character vector of row names * the class `data.frame` ## Example: Create a data.frame in C The following code generates a data.frame within C. The C code is the equivalent of the following R code: ```{r eval=FALSE} data.frame( idx = 1:10, x = (0:9) + 10.0, y = (0:9) + 100.0 ) ``` ```{callme} #include #include //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ // Creating a data.frame within C and returning it to R // // 1. Create individual integer/real/whatever vectors // 2. allocate a VECSXP of the correct size // 3. assign each member into the data.frame // 4. create names and assign them to the data.frame // 5. set the class to "data.frame" // 6. set rownames on the data.frame //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ SEXP create_data_frame_in_c(void) { //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ // Each column of the data.frame gets allocated separately //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ int n = 10; // number of rows SEXP idx_ = PROTECT(allocVector(INTSXP , n)); SEXP x_ = PROTECT(allocVector(REALSXP, n)); SEXP y_ = PROTECT(allocVector(REALSXP, n)); //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ // Assign some dummy values into the columns //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ for (int i = 0; i < n; i++) { INTEGER(idx_)[i] = i + 1; REAL(x_)[i] = i + 10; REAL(y_)[i] = i + 100; } //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ // Allocate a data.frame //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ SEXP df_ = PROTECT(allocVector(VECSXP, 3)); //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ // Add columns to the data.frame //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ SET_VECTOR_ELT(df_, 0, idx_); SET_VECTOR_ELT(df_, 1, x_); SET_VECTOR_ELT(df_, 2, y_); //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ // Treat the VECSXP as a data.frame rather than a list //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ setAttrib(df_, R_ClassSymbol, mkString("data.frame")); //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ // Set the names on the data.frame //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ SEXP names = PROTECT(allocVector(STRSXP, 3)); SET_STRING_ELT(names, 0, mkChar("idx")); SET_STRING_ELT(names, 1, mkChar("x")); SET_STRING_ELT(names, 2, mkChar("y")); setAttrib(df_, R_NamesSymbol, names); //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ // Set the row.names on the data.frame // Use the shortcut as used in .set_row_names() in R // i.e. set rownames to c(NA_integer, -len) and it will // take care of the rest. This is equivalent to rownames(x) <- NULL //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ SEXP rownames = PROTECT(allocVector(INTSXP, 2)); SET_INTEGER_ELT(rownames, 0, NA_INTEGER); SET_INTEGER_ELT(rownames, 1, -n); setAttrib(df_, R_RowNamesSymbol, rownames); UNPROTECT(6); return df_; } ``` ```{r} create_data_frame_in_c() ```