Performance - How to retrieve big matrices from R?

Dec 10, 2012 at 11:31 PM
Edited Dec 10, 2012 at 11:57 PM

Today I started using RdotNet 1.5 and I'm truly impressed!
(Works out of the box. Easy to use. Stable. Powerful.)

Retrieving a big matrix from R (12 columns, 1 million rows) seems a little slow, however.
I tried two different approaches; both take a few minutes to complete.
Writing the matrix to a CSV file from R and reading it back in C# actually performs better.
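For reference, the CSV round trip I compared against looks roughly like this (the temp-file path and the parsing code are illustrative, not my exact implementation):

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Linq;

// Sketch: have R write the matrix to disk, then parse the file in C#.
// "transformedData" is the R matrix; "engine" is the running REngine.
string csvPath = Path.Combine(Path.GetTempPath(), "transformedData.csv");

// R expects forward slashes in paths, so normalize before building the call.
string rPath = csvPath.Replace('\\', '/');
engine.Evaluate("write.csv(transformedData, \"" + rPath + "\", row.names = FALSE)");

// Read the file back, skipping the header row that write.csv emits.
double[][] theResult = File.ReadLines(csvPath)
    .Skip(1)
    .Select(line => line.Split(',')
        .Select(field => double.Parse(field, CultureInfo.InvariantCulture))
        .ToArray())
    .ToArray();
```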

First method: Retrieve entire matrix:

NumericMatrix theMatrix = engine.Evaluate("transformedData").AsNumericMatrix();
if (theMatrix != null)
{
	int cols = theMatrix.ColumnCount;
	int rows = theMatrix.RowCount;
	double[,] theResult = new double[rows, cols];
	theMatrix.CopyTo(theResult, rows, cols);
}

Second method: Retrieve matrix column by column:

// headers holds the column names (populated elsewhere), so headers.Count is the column count.
List<List<double>> theResult = new List<List<double>>();
for (int i = 1; i <= headers.Count; i++) // 1-based: R indexing starts at 1
{
	NumericVector theVector = engine.Evaluate("transformedData[," + i + "]").AsNumeric();
	theResult.Add(new List<double>(theVector));
}

Performance-wise, what is the best way to import such a big matrix?
A different implementation? Writing to disk first?

Thanks for your time!