To ASP.NET or not to ASP.NET?

Sep 8, 2014 at 4:22 PM
Hi,
I'm new to R and R.Net and I'm approaching both with one single objective: from our web-app, call some relatively simple R commands, get some graphs out of it and send them to the client.

In more detail: the web-app is a Silverlight application that uses CSLA for the business logic layer; it's public-facing and needs to remain reliable.

I've tried to figure out what the issues with thread safety are and whether they have been resolved, and rapidly got lost. There are lots of warnings against trying to use R (and/or R.Net) from an ASP application, and some reassuring news in both the ISSUES section and elsewhere. So, all in all, I'm uncertain of the current situation.

Hence, the short question is: would it be safe to use R.Net with ASP/IIS?

We have plenty of concurrent users, but I think we can afford to allow just one thread to process calls to R. That is: user A calls some R package and, while this is running, user B can trigger another call, but this second call will only execute after the first has finished. It's not ideal, but we can afford to write code to support this strategy and see how it goes.
What we can't afford is restricting the whole application to a single thread just because calls to R would otherwise fail. The way I understand ASP/.Net/IIS, IIS handles the threads and may create and dispose of them as requests come in; what I don't understand is whether this behaviour will be a problem when calling R via R.Net (and/or calling R in other ways such as D-Com).

I imagine the current project members understand the current limitations, and I'm happy to help with testing if our scenario is expected to be fully supported. Otherwise, I'm afraid I'll find it difficult to justify spending testing time on something that might never work for us.

In all cases, thanks for this very interesting project.
Sergio
Sep 9, 2014 at 3:55 PM
The current version of R.Net was working for me in the general sense under ASP.Net/IIS. We did have a need to be able to stop/start/restart the R engine, so I've forked it and pulled the R engine into a separate process using WCF for IPC.
> I've tried to figure out what the issues with thread safety are and whether they have been resolved.
There's a lock around sensitive parts of the R.Net code. It should mitigate the threading issues, but the R engine itself is single-threaded, so there's never going to be a pure non-blocking multithreaded way to access a single instance of R. You'll also need to be concerned about sharing data between users. If you load a variable, say x, into the process, it'll be available to other users, even if it should be private. My out-of-process solution addresses this by allowing for per-user R instances.
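To make the data-sharing concern concrete, here is a minimal sketch of the bookkeeping behind "one R instance per user". It is written in Java only to keep it short and runnable; it is not the code from the fork. `FakeRSession` and `PerUserSessions` are hypothetical names, and the session is just a variable table standing in for a real R process with its own global environment.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for one R instance: it owns its own variable table,
// the way each R process owns its own global environment.
final class FakeRSession {
    private final Map<String, Object> globals = new HashMap<>();

    synchronized void assign(String name, Object value) { globals.put(name, value); }

    synchronized Object lookup(String name) { return globals.get(name); }
}

// Hypothetical registry that hands each user a private session,
// created lazily on first use.
final class PerUserSessions {
    private final Map<String, FakeRSession> byUser = new ConcurrentHashMap<>();

    FakeRSession forUser(String userId) {
        // computeIfAbsent is atomic here, so two concurrent requests from the
        // same user end up with the same session object.
        return byUser.computeIfAbsent(userId, id -> new FakeRSession());
    }

    public static void main(String[] args) {
        PerUserSessions sessions = new PerUserSessions();
        sessions.forUser("alice").assign("x", 42);
        // prints "alice sees x = 42"
        System.out.println("alice sees x = " + sessions.forUser("alice").lookup("x"));
        // prints "bob sees x = null"
        System.out.println("bob sees x = " + sessions.forUser("bob").lookup("x"));
    }
}
```

With a single shared session, the second lookup would return alice's value, which is exactly the leak described above.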

FWIW, it's DCOM.
Sep 10, 2014 at 11:47 AM
skyguy,
Thanks! This sounds great. It seems that you are talking about your own (local? couldn't see an obvious fork in here) customisation, and one that closely follows what I've been trying to set up as well.

I've started implementing a testing solution that includes:
A Silverlight app with all the CSLA plumbing to simulate the full application (done).
A WCF service that will use R.Net to process requests.

The plan was the following:
1) See if I can restrict the WCF service to one thread. At this time it seems like the answer is no (not in IIS6, at least). See here for my question on stackoverflow.
2) Put my own lock block(s) in the service itself so that I'll be sure that only one thread can talk to R at any given time (seems to be the way forward).
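The idea in step 2 can be sketched as follows. This is a minimal illustration in Java, chosen only to keep it short and runnable; the same shape maps onto a C# lock block inside the WCF service. `SingleRGate` and `maxConcurrentCallers` are hypothetical names, and the "R call" is simulated with a short sleep rather than a real call into R.Net.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical gate: every call that touches the R engine goes through run(),
// so at most one request thread is inside the engine at any time.
final class SingleRGate {
    // Fair lock: waiting requests are served roughly in arrival order.
    private final ReentrantLock lock = new ReentrantLock(true);

    <T> T run(Callable<T> rCall) throws Exception {
        lock.lock();               // user B waits here while user A's call runs
        try {
            return rCall.call();   // stand-in for the real call into R
        } finally {
            lock.unlock();         // released even if the call throws
        }
    }

    // Demo: 8 simulated requests on 4 threads. Returns the highest number of
    // callers that were ever inside the gate at the same moment.
    static int maxConcurrentCallers() throws Exception {
        SingleRGate gate = new SingleRGate();
        AtomicInteger inside = new AtomicInteger();
        AtomicInteger maxSeen = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < 8; i++) {
            Callable<Void> request = () -> gate.run(() -> {
                int now = inside.incrementAndGet();
                maxSeen.accumulateAndGet(now, Math::max);
                Thread.sleep(10);  // pretend R is busy
                inside.decrementAndGet();
                return null;
            });
            pool.submit(request);
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return maxSeen.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("max concurrent callers: " + maxConcurrentCallers());
    }
}
```

The request threads still arrive concurrently; only the section that talks to R is serialized, so the rest of the application keeps running multithreaded.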

From your reply I'm not sure whether we're duplicating our efforts and/or whether I'm just adding another unnecessary layer. Any clarification would be highly appreciated.
Thanks again,
Sergio
Sep 11, 2014 at 3:12 PM
> Thanks! This sounds great. It seems that you are talking about your own (local? couldn't see an obvious fork in here) customisation, and one that closely follows what I've been trying to set up as well.
It's my erroneously named "BugFix" fork. It's not up-to-date atm, but it's not that old either.
> The plan was the following:
> 1) See if I can restrict the WCF service to one thread. At this time it seems like the answer is no (not in IIS6, at least). See here for my question on stackoverflow.
> 2) Put my own lock block(s) in the service itself so that I'll be sure that only one thread can talk to R at any given time (seems to be the way forward).
That's a bit complicated. Since the R instance is single-threaded, you do have automatic blocking at a certain point in the stack, so I'm not certain how much you need it in other places. I had issues with creating the external process that hosted the IPC service (WCF), and there are some locks in my code around that logic.
> I've started implementing a testing solution that includes:
> A WCF service that will use R.Net to process requests.
You can do it that way, and depending on your needs it might be better. I think that's the way I should have done it, but, lesson learned. For our use case, we just need console and graphics output. We do not need the C#<->R type mappings. I chose to break into the code around the SymbolicExpression level assuming that since R provides a flat C-style API, R.Net imported the API at a flat level and then built an object model on top of it.
The current R.Net does not do that. It builds an object model, interspersing API calls and pointer arithmetic throughout its synthetic object model and engine management code. A big chunk of my work was pushing all the pointer arithmetic and direct API calls to a single layer that resides in the R process container. From that, I built a WCF service (three actually) for engine, object, and text/graphics output activities. I modified the existing R.Net object model to use these WCF services transparently.

What I currently have is stable, but I've ignored some R.Net features like being cross-platform, and I haven't tested a bunch of the C# data mappings (like DataFrame). I'm continuing to work on it, and will keep that BugFix branch updated. Internally we're discussing releasing it as a NuGet package, "Distributed.R.Net" or something similar. Not certain at this point, but we don't want the source tree in our project.
Sep 11, 2014 at 3:13 PM
Oh, and one of the missing features is "xmldoc" comments. Those aren't coming back in my branch. ever.
Sep 12, 2014 at 2:50 PM
Thanks again.
In the meantime I've managed the following:
  • A WCF RESTful service that includes all the code that talks with R; it will have one endpoint for each different function that we may need. This project (as well as the Silverlight client host) runs directly on IIS, not the VS Development Server, as this allows me to test security and stability in a configuration that is almost identical to what will run in production.
  • A testing app that follows our paradigm: Silverlight client, business layer via CSLA, and CSLA command objects that access the WCF service.
  • Two "Test" commands implemented: the first receives data, the second a plot.
The whole thing is stable, unless I stop debugging while the service is executing something. If I do, the next time the service tries to access the R.Net engine the IIS worker process starts saturating a CPU and never returns. I can't blame it: in these conditions the necessary cleanup isn't performed, but the main process remains active because it's running via IIS, and clicking Stop in the debugger just detaches the debugger. To work around this problem I can just recycle the worker process, and when the automatic recycling happens the box goes quiet again in all cases.
I still need to test what happens when requests arrive concurrently, but I'm fairly sure that I can adjust either code or IIS settings to find something that will work for us.

Hence, I'm going to clean up the whole thing, add more tests and tell the rest of the team that we should be able to pull this one off. In the end I'll have a good testing environment where I can develop and try out all the commands we may want to implement, all by using a service that can be deployed on any machine I want, so that even if it malfunctions the problem will not propagate to the main application (the only issue will be with the R-related functions). I also haven't touched the R.Net code itself, nor R, so in terms of licensing I think we should be in the clear.

Will post the news here once I have more tests done.
Thanks skyguy and all the devs in the project!
Sergio