Azure // R.Net

Nov 10, 2014 at 9:38 PM
Hello,

I can get R.Net to work just fine with any of my desktop applications provided I have the application itself installed with the requisite registry entries. A couple questions I'm hoping someone has answers to:
1) Is it possible to simply have the required R binaries in the current directory of the application and avoid using the registry?

2) I want to use R.Net via a WorkerRole in Azure - is there any way to do this?

Thanks in advance!

jbt
Developer
Nov 10, 2014 at 10:15 PM
Hi,

It is possible to run R.NET on machines without R locally installed; I have used this to run on Windows compute clusters, where the R binaries used were on a remote shared drive. It is also feasible on Linux.

REngine.Initialize has the optional parameters rPath and rHome that can be used to specify where to look for R binaries and packages. The Windows registry is searched only if these are not provided explicitly.

I do not know enough about Azure to address the second question; would be interested to see someone else's answer.

J-M
Nov 11, 2014 at 4:01 PM
Thanks for the note. Given the version I'm using (recently installed R.Net 1.5.16 via NuGet and then R 3.1.2 itself; Windows 8; Visual Studio Ultimate 2013), I don't have a static method on REngine called Initialize and the method available from the instance I get doesn't have the aforementioned parameters. Here's my code - when run, the process exits with code (0x2) when execution reaches the "GetInstance()" method below:

var curDirectory = Directory.GetCurrentDirectory();
REngine.SetEnvironmentVariables(curDirectory, curDirectory);
System.Environment.SetEnvironmentVariable("R_HOME", curDirectory);
var curPath = System.Environment.GetEnvironmentVariable("PATH");
if (null != curPath)
    if (!curPath.Contains(curDirectory))
        System.Environment.SetEnvironmentVariable("PATH", System.Environment.GetEnvironmentVariable("PATH") + ";" + curDirectory);
_rEng = REngine.GetInstance();

Perhaps I'm missing a binary - the 14 files I'm copying to the current working directory live here: C:\Program Files\R\R-3.1.2\bin. Do I need to move other pieces?

Thanks in advance.

jbt
Nov 11, 2014 at 8:49 PM
Update : got it to work with the following:
            var curDirectory = Directory.GetCurrentDirectory() + "\\rengine";
            System.Environment.SetEnvironmentVariable("R_HOME", curDirectory);
            var curDirectoryWithBin = curDirectory + "\\bin";
            REngine.SetEnvironmentVariables(curDirectoryWithBin, curDirectory);
            var rPath = curDirectory + "\\bin\\R.dll";
            _rEng = REngine.GetInstance(rPath, true, null, null);
Note (for the good of the order) : the first parameter to REngine.SetEnvironmentVariables is not the "system path" but rather the path to R's bin directory (I missed that or assumed otherwise in the documentation).
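
For anyone repeating this, here is a minimal sketch of a layout check to run before handing the paths to R.NET. It assumes the same private layout as above (an "rengine" folder under the application directory, with R.dll in its "bin" subfolder). RLayout and its members are hypothetical names for illustration and are not part of R.NET.

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of R.NET): resolves the private R layout
// described above and reports what is missing before R is loaded.
public static class RLayout
{
    // Folder used as R_HOME (assumed name: "rengine").
    public static string Home(string baseDir) { return Path.Combine(baseDir, "rengine"); }

    // R's bin directory, which is what the first SetEnvironmentVariables argument expects.
    public static string Bin(string baseDir) { return Path.Combine(Home(baseDir), "bin"); }

    // Full path to R.dll, as passed to GetInstance in the post above.
    public static string Dll(string baseDir) { return Path.Combine(Bin(baseDir), "R.dll"); }

    // Returns null when the layout looks complete, otherwise a short description of the gap.
    public static string Check(string baseDir)
    {
        if (!Directory.Exists(Home(baseDir))) return "R_HOME folder not found: " + Home(baseDir);
        if (!Directory.Exists(Bin(baseDir))) return "bin folder not found: " + Bin(baseDir);
        if (!File.Exists(Dll(baseDir))) return "R.dll not found: " + Dll(baseDir);
        return null;
    }
}
```

If Check returns null, RLayout.Bin and RLayout.Home would go to REngine.SetEnvironmentVariables and RLayout.Dll to REngine.GetInstance, exactly as in the working snippet. A non-null result is easier to act on than a process that exits with code 0x2.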

Now working on the Azure piece and will post an update today or tomorrow if there's progress.

Best,

jbt
Nov 12, 2014 at 1:22 AM
2) Since the role is just something hosted in a local VM under IIS, you'll run into problems with R's global state being shared in the VM that is spun up. If you have a couple of work items queued, then setting mfrow, creating a variable in the global environment, or anything similar in the first work item will be visible to the requests that come in after it. Also, if the VM spins down, you lose any saved state, so independent requests that assume the backend state is preserved could break. That was an issue for my use case. Finally, if the R process throws an abort() or access violation, that VM is hosed and you don't really have a clean way of detecting it (executing various R functions is something I'm doing with my version).

What I've done is create a WCF IPC layer that pushes the R process out of the IIS host and, as such, allows for dynamic per-user or per-session R instances, so that the user's state is preserved in a live session but everyone is isolated from one another. I also have some "malfunction" tracking that allows me to start a new instance for the user if the previous one is hosed.
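
A rough sketch of the bookkeeping side of that idea, not the actual fork: one worker per session, replaced when it stops looking healthy. SessionRegistry is a hypothetical name, and the three delegates are assumptions standing in for whatever starts, probes, and shuts down an out-of-process R worker (for example a WCF client proxy).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: keeps one worker per session and replaces it when it faults.
// TWorker is whatever handle represents an isolated R process (an assumption here).
public sealed class SessionRegistry<TWorker>
{
    private readonly object _gate = new object();
    private readonly Dictionary<string, TWorker> _workers = new Dictionary<string, TWorker>();
    private readonly Func<string, TWorker> _start;
    private readonly Func<TWorker, bool> _isHealthy;
    private readonly Action<TWorker> _stop;

    public SessionRegistry(Func<string, TWorker> start, Func<TWorker, bool> isHealthy, Action<TWorker> stop)
    {
        _start = start;
        _isHealthy = isHealthy;
        _stop = stop;
    }

    // Returns the session's worker, starting a fresh one if none exists
    // or if the existing one no longer passes the health probe.
    public TWorker GetOrStart(string sessionId)
    {
        lock (_gate)
        {
            TWorker worker;
            if (_workers.TryGetValue(sessionId, out worker))
            {
                if (_isHealthy(worker))
                    return worker;
                _stop(worker); // discard the faulted worker; its R state is lost
            }
            worker = _start(sessionId);
            _workers[sessionId] = worker;
            return worker;
        }
    }
}
```

Because each session maps to its own worker, a global change such as setting mfrow stays inside that session, and a crashed worker only costs that one user their state.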

Good luck.

Blue Skies,
Ritch
Nov 14, 2014 at 4:15 PM
Ritch,

Thanks for the note. My need for R in Azure is purely to augment statistical functionality not present in Math.Net; that, and the occasional individual work request where I might seek to produce a graphical representation of some data set. In this case, do you suspect I'll run into problems? I have a series of satellite systems outside the cloud I can farm work to in any case, so I can likely get around any cloud-based unpleasantness relative to R.

Have a great day.

jbt
Nov 14, 2014 at 7:37 PM
Getting the plots out of R is probably the hardest part, otherwise you could get away with something like the RServe and the RServe CLI. For us, having per-session R-instances spin up dynamically was pretty critical, and modifying R.Net to do that was easier than trying to mash up RServe with .Net.

If you just want statistical support, say ANOVA, ARMA, ARIMA, or anything non-niche, IronPython might be a less painful route.

I have a private fork that I've been working on for a web-based healthcare analytics application that allows for executing R code via a browser and returning SVG plots. It's stable enough, but my rewrite broke a bunch of features in R.Net that aren't priorities for us, as well as most of the integration tests in the tests projects. I'm getting back off of other feature work and will be fixing those things next week. If you're serious about your Azure feature, let's talk; that could make for an interesting collaboration. I work 100% remote for my company, so online collaboration isn't an issue. Again, if you are serious.


Blue Skies,
Ritch
Developer
Nov 14, 2014 at 9:00 PM
Reading this thread and taking stock of other discussions, I think I'll start a discussion thread asking for options on how to handle multiple parallel R sessions and support web-based requests. It will take me a bit of time to write it down, but I thought I'd flag this.

BTW, Ritch, I would add you as developer on the project if I could, but I do not have the credentials to do so. I'll keep on merging your contributions.
Nov 15, 2014 at 2:11 AM
I appreciate that. I don't think I'm missing anything not having write access.

Honestly, I'd rather get off CodePlex and go to GitHub as a first choice, or Bitbucket as a second. The PR and review mechanism is so much better on those sites.
Nov 15, 2014 at 2:16 AM
2nd for git.
Developer
Nov 15, 2014 at 4:12 AM
I've been continuing with Mercurial, as the creator's intent a year ago was to move to GitHub anyway, but it is indeed time to just do it, and now. I'll keep you posted.
Nov 15, 2014 at 4:49 PM
It takes one minute to move the repo and history. Not much longer to move the documentation.
Nov 16, 2014 at 12:04 AM
Skyguy,

How can I reach you for an introductory email/call? My work is best explained verbally; you can reach me at taylor.b.jason@gmail.com.

Cheers,

jbt
Nov 16, 2014 at 4:49 AM
I'm on Central time; you can hit me up via video on gchat after 10AM, as I have a 9AM MMM. Hopefully R.Net will get a Gitter room going soon.
Nov 18, 2014 at 4:56 PM
Morning!

As/if needed, I can (per earlier comments) go with IronPython - though I'd love to continue to leverage the graphical capabilities of R in the cloud. That said - I've had run-time errors getting the worker in the air. While I can easily and happily load RDotNet (v 1.5.16) in a C# form application, I tried accessing the same common DLL (where I do the loading process) from a WorkerRoleWithQBQueue solution to no avail. A few points:
  • When I tried installing the RDotNet package via the NuGet Package Manager, all it would offer was v 1.5.5. Curious.
  • While the worker role solution compiles, I get a run-time error :
    An unhandled exception of type 'System.TypeLoadException' occurred in Microsoft.WindowsAzure.ServiceRuntime.dll
    Additional information: Could not load type 'Common.RStatistics' from assembly 'Common, Version=1.0.5424.25643, Culture=neutral, PublicKeyToken=null'.
For reference, both the form application and the worker role call the same method:
var statistics = RStatistics.Load();

The common class both applications leverage:
using System;
using System.Diagnostics;
using System.IO;
using RDotNet;

namespace Common
{
    // Loads a private copy of R from an "rengine" folder under the current directory.
    public static class RStatistics
    {
        private static REngine _rEng;

        public static bool Load()
        {
            var load = true;

            try
            {
                // Already initialized: nothing to do.
                if (null != _rEng)
                    return true;
                var curDirectory = Directory.GetCurrentDirectory() + "\\rengine";
                System.Environment.SetEnvironmentVariable("R_HOME", curDirectory);
                var curDirectoryWithBin = curDirectory + "\\bin";
                // First argument is R's bin directory, second is R_HOME.
                REngine.SetEnvironmentVariables(curDirectoryWithBin, curDirectory);
                var rPath = curDirectory + "\\bin\\R.dll";

                if (!Directory.Exists(curDirectory))
                {
                    //get files from cloud
                    Trace.WriteLine("here");
                }

                _rEng = REngine.GetInstance(rPath, true, null, null);
            }
            catch (Exception ex)
            {
                Trace.WriteLine("Are all R DLLs present in the current directory? : " + ex.ToString());
                load = false;
            }

            if (!load)
                _rEng = null;

            return load;
        }

        public static void Unload()
        {
            if (null != _rEng)
            {
                _rEng.Dispose();
                // Clear the reference so a later Load() does not report a disposed engine as loaded.
                _rEng = null;
            }
        }
    }
}
Nov 18, 2014 at 5:25 PM
Update : On a hunch ('cause today feels like a synthetic Monday), I checked the specific versions of the modules the debugger was linking; somehow Visual Studio was finding and using an old version of the library in question. I've run into this lovely situation before, where a referenced DLL managed via another solution doesn't get copied, or does and then somehow gets overridden by a version living in approot. If anyone knows how this happens, I'd love some visibility.

That said, the worker role now seems able to access the R wrapper.
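
For visibility into which copy actually gets loaded, a small diagnostic along these lines may help. It is only a sketch: LoadedAssemblies is a hypothetical helper name, and it relies solely on standard reflection calls (AppDomain.CurrentDomain.GetAssemblies, Assembly.Location) to list the name, version, and on-disk path of every loaded assembly whose name starts with a given prefix.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical diagnostic helper: reports where each matching assembly was loaded from.
public static class LoadedAssemblies
{
    // Returns one "name version <- path" line per loaded assembly
    // whose simple name starts with namePrefix (case-insensitive).
    public static List<string> Describe(string namePrefix)
    {
        var lines = new List<string>();
        foreach (Assembly asm in AppDomain.CurrentDomain.GetAssemblies())
        {
            AssemblyName name = asm.GetName();
            if (!name.Name.StartsWith(namePrefix, StringComparison.OrdinalIgnoreCase))
                continue;
            // Dynamic assemblies have no file on disk, so Location is not usable for them.
            string path = asm.IsDynamic ? "(dynamic)" : asm.Location;
            lines.Add(name.Name + " " + name.Version + " <- " + path);
        }
        return lines;
    }
}
```

Writing LoadedAssemblies.Describe("Common") and LoadedAssemblies.Describe("RDotNet") to the trace at role start-up would show whether the DLL came from approot or from somewhere else, and which version it was.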

jbt
Nov 18, 2014 at 8:46 PM
Glad you could fix it. Let me know if you have any other issues.
Nov 19, 2014 at 10:14 PM
So - would love to toss the fruits of a plot into an array of bytes - any idea how to do this? (perhaps the Cairo module?) I'm comfortably dumping to a temp file (jpg) and moving along from there so if this isn't possible, I have an alternative solution.

Thanks,

Jason
Nov 20, 2014 at 4:07 AM
I've worked with the R.Net graphics library to write an SVG device that I serialize to my clients. I'm adding a PNG device. There were a couple of serious kinks I had to work out, including what I think is a bug with R's graphics engine (that's not worth fixing since there's a decent workaround).

You can do this purely in R by saving to a file as you suggest, but if you allow for dynamic R code execution from users, then that involves injecting and/or modifying the R payload before it's sent to the engine.
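
A minimal sketch of that file-based route, assuming base R's png() and dev.off() and a temp file that is read back into a byte array. PlotCapture is a hypothetical helper, not part of R.NET; the evaluate delegate is an assumption standing in for whatever runs R code (with R.NET that might be s => engine.Evaluate(s)). The wrapper uses newlines rather than semicolons between statements, and tryCatch so the device is closed even if the plot code fails.

```csharp
using System;
using System.IO;

// Hypothetical helper: wraps plot code so R renders to a temp PNG, then returns the bytes.
public static class PlotCapture
{
    // Builds the R payload: open a PNG device, run the plot code, always close the device.
    public static string WrapForPng(string plotCode, string pngPath, int width, int height)
    {
        string rPath = pngPath.Replace('\\', '/'); // R accepts forward slashes on Windows
        return "png(filename = \"" + rPath + "\", width = " + width + ", height = " + height + ")\n"
             + "tryCatch({\n"
             + plotCode + "\n"
             + "}, finally = dev.off())";
    }

    // Runs the wrapped code through the supplied evaluator and reads the image back.
    public static byte[] Render(Action<string> evaluate, string plotCode, int width, int height)
    {
        string pngPath = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N") + ".png");
        try
        {
            evaluate(WrapForPng(plotCode, pngPath, width, height));
            return File.ReadAllBytes(pngPath);
        }
        finally
        {
            // Remove the temp file whether or not rendering succeeded.
            if (File.Exists(pngPath))
                File.Delete(pngPath);
        }
    }
}
```

This is the "injecting" step in its simplest form: user code is placed between the device open and close. Plots that need an explicit print() (lattice, ggplot2) would still need one inside the wrapped code, and untrusted user code would need far more care than this sketch takes.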