Missing data values

Sep 27, 2012 at 12:48 PM

In my .NET application I have a double array (for example, dim x() as double), and the application currently uses the value -9999 to indicate missing data in that array. How can I tell R.NET.DLL, or maybe R itself directly, that -9999 should be treated as missing data? It seems like there should be an option in the CreateNumericVector() function to declare this. Any help is greatly appreciated.

Oct 4, 2012 at 1:39 PM
Edited Oct 4, 2012 at 1:40 PM

Hi stewartrn,

Good question. NA in R becomes NaN in C# (for double, anyway). However, double.NaN stays NaN in R rather than becoming NA. Below is a workaround where you can transform the NaN values to NA.

This is indeed a good feature request; NA's are an important aspect of R. Thanks.

 

 

using (REngine engine = REngine.CreateInstance("RDotNet"))
{
    engine.Initialize();
    // .NET Framework array to R vector.
    NumericVector a = engine.CreateNumericVector(new double[] { 30.02, double.NaN, 29.99 });
    engine.SetSymbol("a", a);
    engine.Evaluate("print(a)");
    engine.Evaluate("a[is.nan(a)] <- NA");
    engine.Evaluate("print(a)");
    NumericVector b = engine.Evaluate("c(1.2,2.3, NA, 4.5)").AsNumeric();
    // In the Visual C# debugger, the NA element of b shows up as double.NaN.
}
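
To tie the workaround back to the original question, here is a minimal sketch of handling the -9999 sentinel. It assumes the same R.NET setup as the snippet above and uses only the R.NET calls shown there; the class name and the `Missing` constant are made up for illustration. The sentinel is swapped for double.NaN on the .NET side, then converted to NA in R.

```csharp
using System;
using System.Linq;
using RDotNet;

class MissingValueExample
{
    // Sentinel the application uses for missing data (from the question above).
    const double Missing = -9999.0;

    static void Main()
    {
        double[] x = { 30.02, Missing, 29.99 };

        // Replace the sentinel with double.NaN before handing the array to R.
        double[] cleaned = x.Select(v => v == Missing ? double.NaN : v).ToArray();

        using (REngine engine = REngine.CreateInstance("RDotNet"))
        {
            engine.Initialize();
            NumericVector a = engine.CreateNumericVector(cleaned);
            engine.SetSymbol("a", a);
            // Turn the NaN values into R's NA, as in the workaround above.
            engine.Evaluate("a[is.nan(a)] <- NA");
            engine.Evaluate("print(a)");
        }
    }
}
```

Note that this also turns any genuine NaN in the data into NA. If that matters, another option is to pass the array unchanged and run "a[a == -9999] <- NA" on the R side instead.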

Oct 4, 2012 at 4:38 PM

Hey thanks! I’m moving forward again now! R.Dot.dll is AMAZING – keep up the great work.

Oct 3, 2013 at 4:27 PM

It's been a year since the last post. I wonder if there is a more elegant solution available now?

Oct 4, 2013 at 9:14 AM

Not that I know of. I am working, when I can, on unit tests on a branch, and they touch on missing values. For double, at least, NaN is not a bad equivalent to R's NA, but for 32-bit integers there is not even that, so there is scope for unaware users to make mistakes if they do not know R.NET well enough.

There is not a lot to do about this: the languages and their basic types just have a fundamental difference. I thought about what could be done using nullable values, but it is doubtful whether this is worth it, given the usability drawbacks in most circumstances.
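
For illustration, a nullable-based conversion could look roughly like the sketch below. The helper names are hypothetical and not part of R.NET. The integer overload assumes an NA arrives in .NET as int.MinValue, which is how R itself represents NA_integer_; whether R.NET passes that value through unchanged is an assumption here.

```csharp
using System;
using System.Linq;

static class NullableConversion
{
    // Hypothetical helper: NaN (how R's NA arrives for doubles) becomes null.
    // A genuine NaN is mapped to null as well; this sketch does not distinguish them.
    public static double?[] ToNullable(double[] values)
    {
        return values.Select(v => double.IsNaN(v) ? (double?)null : v).ToArray();
    }

    // Hypothetical helper: assumes an NA integer arrives as int.MinValue.
    public static int?[] ToNullable(int[] values)
    {
        return values.Select(v => v == int.MinValue ? (int?)null : v).ToArray();
    }
}

class Program
{
    static void Main()
    {
        double?[] d = NullableConversion.ToNullable(new[] { 1.2, 2.3, double.NaN, 4.5 });
        Console.WriteLine(d[2].HasValue); // prints "False"

        int?[] i = NullableConversion.ToNullable(new[] { 1, int.MinValue, 3 });
        Console.WriteLine(i[1].HasValue); // prints "False"
    }
}
```

The usability cost shows up right away: every consumer of a double?[] or int?[] has to check HasValue (or use ??) before doing arithmetic.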

Suggestions are always welcome, of course.

Oct 4, 2013 at 11:49 PM

Is it no good to have the R vectors loadable from (and into) both int? and int arrays?
double.NaN is OK (does this imply +/- infinity as well?), but I can see that on the integer side, having no good way to represent R's NA could be an issue.

How about having the R.NET data types (NumericVector, etc.) hold a "doubleR" or "intR" array, which CAN be loaded straight into a double, double?, int, or int? array, but where one can also examine each element and see the real R value?

So, after a load with:
      NumericVector b = engine.Evaluate("c(1.2,2.3, NA, 4.5)").AsNumeric();
you could write:
 double[] d = b.ToArray();      // NAs become 0 or -9999 or double.NaN
or:
 double?[] d = b.ToArray();     // NAs become null

 int[] i = b.ToArray();         // NAs become 0 or -9999

 int?[] i = b.ToArray();        // NAs become null

BUT you could also examine each element:
  foreach (var x in b)
  {
      if (x.IsNA) do_something();
  }



Any thoughts, or am I barking up the wrong tree?
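
As a rough sketch of what the proposed "doubleR" element could look like, the struct below wraps a double and exposes IsNA. The type `RDouble` and its members are hypothetical, not part of R.NET. It relies on the fact that R stores NA_real_ as a NaN whose low 32 bits hold the value 1954, and it assumes those bits survive the copy from R into .NET.

```csharp
using System;

// Hypothetical wrapper along the lines of the proposed "doubleR"; not part of R.NET.
struct RDouble
{
    // R marks NA_real_ as a NaN whose low 32 bits hold 1954.
    private const uint NaLowWord = 1954;

    // A value carrying R's NA bit pattern (0x7A2 == 1954 in the low word).
    public static readonly RDouble NA =
        new RDouble(BitConverter.Int64BitsToDouble(0x7FF00000000007A2));

    private readonly double value;

    public RDouble(double value)
    {
        this.value = value;
    }

    // True only for R's NA, not for an ordinary NaN.
    public bool IsNA
    {
        get
        {
            long bits = BitConverter.DoubleToInt64Bits(value);
            return double.IsNaN(value) && (uint)(bits & 0xFFFFFFFF) == NaLowWord;
        }
    }

    // True for a NaN that is not R's NA.
    public bool IsNaN
    {
        get { return double.IsNaN(value) && !IsNA; }
    }

    // The raw double; an NA comes out as a NaN.
    public double Value
    {
        get { return value; }
    }
}

class Program
{
    static void Main()
    {
        RDouble[] b = { new RDouble(1.2), RDouble.NA, new RDouble(double.NaN) };
        foreach (RDouble x in b)
        {
            Console.WriteLine(x.IsNA ? "NA" : "not NA");
        }
    }
}
```

Checking only the low word mirrors how R tests for NA internally, so the check still holds if the NaN's quiet bit gets set along the way.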