Here is an interview with Louis Bajuk-Yorgan from TIBCO. TIBCO, which was the leading commercial vendor of S-PLUS, the precursor of the R language, makes a commercial enterprise version of R called TIBCO Enterprise Runtime for R (TERR). Louis also presented recently at useR! 2014: http://user2014.stat.ucla.edu/abstracts/talks/54_Bajuk-Yorgan.pdf
DecisionStats (DS)- How is TERR different from Revolution Analytics or Oracle R? How is it similar?
Louis Bajuk-Yorgan (Lou)- TERR is unique, in that it is the only commercially-developed alternative R interpreter. Unlike other vendors, who modify and extend the open source R engine, we developed TERR from the ground up, leveraging our 20+ years of experience with the closely-related S-PLUS engine.
Because of this, we were able to architect TERR to be faster and more scalable, and to handle memory much more efficiently, than the open source R engine. Other vendors are constrained by the limitations of the open source R engine, especially around memory management.
Another important difference is that TERR can be licensed to customers and partners for tight integration into their software, which delivers a better experience for their customers. Other vendors typically integrate loosely with open source R, keeping R at arm’s length to protect their IP from the risk of contamination by R’s GPL license. They often force customers to download, install and configure R separately, making for a much more difficult customer experience.
Finally, TIBCO provides full support for the TERR engine, giving large enterprise customers the confidence to use it in their production environments. TERR is integrated in several TIBCO products, including Spotfire and StreamBase, enabling customers to take models developed in TERR and quickly integrate them into BI and real-time applications.
DS- How much of R is TERR compatible with?
Lou- We regularly test TERR with a wide variety of R packages, and extend TERR's R coverage over time. We are currently compatible with ~1800 CRAN packages, as well as many Bioconductor packages. The full list of compatible CRAN packages is available at the TERR Community site at tibcommunity.com
DS- Describe TIBCO Cloud Compute Grid. What are its applications for data science?
Lou- TIBCO Cloud Compute Grid leverages the TIBCO GridServer architecture, which has been used by major Wall Street firms to run massively parallel applications across tens of thousands of individual nodes. TIBCO CCG brings this robust platform to the cloud, enabling anyone to run massively parallel jobs on their Amazon EC2 account. The platform is ideal for Big Computation types of jobs, such as Monte Carlo simulation and risk calculations. More information can be found at the TIBCO Cloud Marketplace at https://marketplace.cloud.tibco.com/
DS- What advantages does TIBCO’s rich history with the S project give it for the R project?
Lou- Our 20+ years of experience with S-PLUS gave us a unique knowledge of the commercial applications of the S/R language, deep experience with architecting, extending and maintaining a commercial S language engine, strong ties to the R community and a rich trove of algorithms we could apply in developing the TERR engine.
DS- Describe some benchmarks of TERR against open source R.
Lou- While the speed of individual operations will vary, overall TERR is roughly 2-10x faster than open source R when applied to small data sets, but 10-100x faster when applied to larger data sets. This is because TERR’s efficient memory management enables it to handle larger data more reliably, and stay more linear in performance as data sizes increase.
DS- TERR is not open source. Why is that?
Lou- While open sourcing TERR is an option we continue to consider, we’ve decided to initially focus our energy and time on building the best S/R language engine possible. Running a successful, vibrant open source project well is a significant undertaking, and if we choose to do so, we will invest accordingly.
Instead, for now we’ve decided to make a Developer Edition of TERR freely available, so that the R community at large can still benefit from our work on TERR. The Developer Edition is available at tap.tibco.com
DS- How is TIBCO as a company to work for, for potential data scientists?
Lou- Great! I’ve worked in this space for nearly 18 years in large part because I get the opportunity to work with customers in many different industries (such as Life Sciences, Financial Services, Energy, Consumer Packaged Goods, etc.), who are trying to solve valuable and interesting problems.
We have an entire team of data scientists, called the Industry Analytics Group, who work on these sorts of problems for our customers, and we are always looking for more Data Scientists to join that team.
DS- How is TIBCO giving back to the R community globally? What are its plans for the community?
Lou- As mentioned above, we make a free Developer Edition of TERR available. In addition, we’ve been sponsors of useR for several years, we contribute feedback to the R Core team as we develop TERR, and we often open source packages that we develop for TERR so that they can be used with open source R as well. This has included packages ported from S-PLUS (such as sjdbc) and new packages (such as tibbrConnector).
DS- As a sixth-time attendee of useR, describe the evolution of the R ecosystem as you have observed it.
Lou- It has been fascinating to see how the R community has grown and evolved over the years. The useR conference at UCLA this year was the largest ever (700+ attendees), with more commercial sponsors than ever before (including enterprise heavyweights like TIBCO, Teradata and Oracle, smaller analytic vendors like RStudio, Revolution and Alteryx, and new companies like plot.ly). What really struck me, however, was the nature of the attendees. There were far more attendees from commercial companies this year, many of whom were R users. More so than in the past, there were many people who simply wanted to learn about R.
Lou Bajuk-Yorgan leads Predictive Analytics product strategy at TIBCO Spotfire, including the development of the new TIBCO Enterprise Runtime for R. With a background in Physics and Atmospheric Sciences, Lou was a Research Scientist at NASA JPL before focusing on analytics and BI software 16 years ago. An avid cyclist, runner and gamer, Lou frequently speaks and tweets (@LouBajuk) about the importance of Predictive Analytics for the most valuable business challenges.