Tag Archives: Microsoft Excel
I was interviewed about moving on from Excel to R in Human Resources (HR); the interview is at http://www.hrtecheurope.com/blog/?p=5345
“There is a lot of data out there and it’s stored in different formats. Spreadsheets have their uses but they’re limited in what they can do. The spreadsheet is bad when getting over 5000 or 10000 rows – it slows down. It’s just not designed for that. It was designed for much higher levels of interaction.
In the business world we really don’t need to know every row of data, we need to summarise it, we need to visualise it and put it into a powerpoint to show to colleagues or clients.”
And a more recent interview with my fellow IIML mate, and editor at Analytics India Magazine
AIM: Which R packages do you use the most and which ones are your favorites?
AO: I use R Commander and Rattle a lot, and I use the dependent packages. I use car for regression, and forecast for time series, and many packages for specific graphs. I have not mastered ggplot though but I do use it sometimes. Overall I am waiting for Hadley Wickham to come up with an updated book to his ecosystem of packages as they are very formidable, completely comprehensive and easy to use in my opinion, so much I can get by the occasional copy and paste code.
A surprising review at R-Bloggers.com / Intelligent Trading
The good news is that many of the large companies do not view R as a threat, but as a beneficial tool to assist their own software capabilities.
After assisting and helping R users navigate through the dense forest of various GUI interface choices (in order to get R up and running), Mr. Ohri continues to handhold users through step by step approaches (with detailed screen captures) to run R from various simple to more advanced platforms (e.g. CLOUD, EC2) in order to gather, explore, and process data, with detailed illustrations on how to use R’s powerful graphing capabilities on the back-end.
Do you want to write a review too? You can visit the site here
- What does R do? Bring people together, of course! (r-bloggers.com)
- Book Review: R for Business Analytics, A Ohri (r-bloggers.com)
I have not really been posting or writing anything worthwhile on the website for some time, as I am still busy writing "R for Business Analytics", which I hope to get out before year end. However, while doing research for that, I came across many types of graphs, and what struck me is that the actual usage of some kinds of graphs is very different in business analytics as compared to statistical computing.
The criteria for the top ten graphs are as follows-
1) Usage- The order in which they appear is not strictly in terms of desirability but of actual frequency of usage. So a frequently used graph like a box plot would be recommended above, say, a violin plot.
2) Adequacy- Data visualization paradigms change over time, but the need remains to convey the maximum information accurately in a minimum of space, without overwhelming the reader or distorting perception of the data.
3) Ease of creation- A simpler graph created by a single function is preferable to writing 4-5 lines of code to create an elaborate graph.
4) Aesthetics- Aesthetics is relative, and in addition studies have shown that visual perception varies across cultures and geographies. However, beauty is universally appreciated, and a pretty graph is often preferred over a not so pretty graph. Here being pretty means visual appeal that does not compromise the perceptual inference drawn from graphical analysis.
So when do we use a bar chart versus a line graph versus a pie chart? When is a mosaic plot more handy, and when should histograms be used with density plots? The list tries to capture most of these practicalities.
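As one illustration of the histogram question above, here is a minimal sketch of a histogram with a density curve drawn over it, using only base R graphics. The data are simulated and the variable names are my own assumptions, not taken from any real dataset.

```r
# Simulated data for illustration only (hypothetical monthly sales figures)
set.seed(42)
sales <- rnorm(500, mean = 100, sd = 15)

# freq = FALSE puts the histogram on the density scale,
# so the density curve and the bars share the same y axis
hist(sales, freq = FALSE, breaks = 20,
     main = "Histogram with density overlay (simulated data)",
     xlab = "Sales")

# Kernel density estimate drawn on top of the histogram
dens <- density(sales)
lines(dens, lwd = 2)
```

The histogram shows the counts in each bin while the curve gives a smoothed view of the same distribution, which is why the two are often shown together.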
Let me elaborate on some specific graphs-
1) Pie Chart- The pie chart is not really used much in statistical computing, and indeed it is considered a misleading form of data visualization, especially the skewed or two dimensional charts. However, when it comes to evaluating market share at a particular instant, a pie chart is simple to understand. At most two pie charts are needed for comparing two different snapshots, but three or more pie charts on the same data at different points in time is definitely a bad case.
In R you can create a pie chart by just using pie(dataset$variable)
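A minimal sketch of the pie() call for a market share snapshot follows. The vendor names and figures are hypothetical and only there to make the example runnable; pie() itself expects a vector of non-negative numbers.

```r
# Hypothetical market-share figures (illustrative only, not real data)
share <- c(VendorA = 45, VendorB = 30, VendorC = 15, Others = 10)

# Build labels that show the percentage next to each name
pct <- round(100 * share / sum(share))
slice_labels <- paste0(names(share), " (", pct, "%)")

# Draw the pie chart with base graphics
pie(share, labels = slice_labels, main = "Market share (hypothetical)")
```

Printing the percentage in each label helps offset the difficulty of judging slice sizes by eye.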
As per official documentation, pie charts are not recommended at all.
Pie charts are a very bad way of displaying information. The eye is good at judging linear measures and bad at judging relative areas. A bar chart or dot chart is a preferable way of displaying this type of data.
Cleveland (1985), page 264: “Data that can be shown by pie charts always can be shown by a dot chart. This means that judgements of position along a common scale can be made instead of the less accurate angle judgements.” This statement is based on the empirical investigations of Cleveland and McGill as well as investigations by perceptual psychologists.
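To see what the documentation is recommending, the same kind of data can be drawn as a dot chart or a bar chart with base R. This is only a sketch; the market share numbers are again hypothetical.

```r
# Hypothetical market-share figures (illustrative only, not real data)
share <- c(VendorA = 45, VendorB = 30, VendorC = 15, Others = 10)

# Sorting makes the ranking readable along a common scale
ordered_share <- sort(share)

# Dot chart: positions along one axis instead of angles
dotchart(ordered_share, xlab = "Market share (%)",
         main = "Dot chart (hypothetical data)")

# Horizontal bar chart as a second alternative
barplot(ordered_share, horiz = TRUE, las = 1,
        xlab = "Market share (%)",
        main = "Bar chart (hypothetical data)")
```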
Despite this, pie charts are frequently used, because the metric they most often convey is market share, and market share remains an important analytical metric for business.
The pie3D( ) function in the plotrix package provides 3D exploded pie charts. An exploded pie chart remains a very commonly used (or misused) chart.
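A short sketch of an exploded 3D pie with pie3D() is below. It assumes the plotrix package is installed (it is not part of base R), and the data are hypothetical; the call is skipped with a message if the package is missing.

```r
# Hypothetical market-share figures (illustrative only, not real data)
share <- c(VendorA = 45, VendorB = 30, VendorC = 15, Others = 10)

# plotrix is an add-on package, so check for it before calling pie3D()
if (requireNamespace("plotrix", quietly = TRUE)) {
  # explode controls how far the slices are pulled apart
  plotrix::pie3D(share, labels = names(share), explode = 0.1,
                 main = "Exploded 3D pie (hypothetical data)")
} else {
  message("Install plotrix with install.packages('plotrix') to run this example")
}
```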
we see some rules for using Pie charts.
From the R Graph Gallery (a slightly outdated but still very comprehensive graphical repository)
par(bg="gray")
pie(rep(1,24), col=rainbow(24), radius=0.9)
title(main="Color Wheel", cex.main=1.4, font.main=3)
title(xlab="(test)", cex.lab=0.8, font.lab=3)
(Note that adding a grey background is quite easy in the basic graphics device as well, without using an advanced graphical package)
- Handling Small Data Percentages in a Microsoft Excel Pie Chart (brighthub.com)
- Pie-Packing by Mario Klingemann: More fascinating pie chart art (lovestats.wordpress.com)
Just got a PR email from Michael Zeller, CEO, Zementis, announcing that Zementis (ADAPA) and Revolution Analytics have just partnered up.
Is this something substantial or just time-sharing http://bi.cbronline.com/news/sas-ceo-says-cep-open-source-and-cloud-bi-have-limited-appeal or a Barney Partnership (http://www.dbms2.com/2008/05/08/database-blades-are-not-what-they-used-to-be/)
Summary- That's cloud computing scoring of models on EC2 (Zementis) partnering with the actual modeling software in R (Revolution Analytics RevoDeployR)
See previous interviews with the Zementis team, including Dr Zeller, at http://decisionstats.com/2009/02/03/interview-michael-zeller-ceozementis/ , http://decisionstats.com/2009/05/07/interview-ron-ramos-zementis/ and http://decisionstats.com/2009/10/05/interview-michael-zellerceo-zementis-on-pmml/
and with the Revolution guys at http://decisionstats.com/2010/08/03/q-a-with-david-smith-revolution-analytics/
- Revolution R Enterprise 4.2 now available (revolutionanalytics.com)
- Enterprise Startup Spotlight: Revolution Analytics, Taking on SAS, SPSS (readwriteweb.com)
- Gartner predicts business intelligence revolution (v3.co.uk)
Close to the launch of JMP 9 with its R integration comes the announcement of the release of JMP Genomics 5. The product brief is available here http://jmp.com/software/genomics/pdf/103112_jmpg5_prodbrief.pdf and it has an interesting mix of features. If you want to try out the features you can see http://jmp.com/software/license.shtml
Here is some of the "new" stuff I spotted in this release-
- Perform enrichment analysis using functional information from Ingenuity Pathways Analysis.+
- New bar chart track allows summarization of reads or intensities.
- New color map track displays heat plots of information for individual subjects.
- Use a variety of continuous measures for summarization.
- Using a common identifier, compare list membership for up to five groups and display overlaps with Venn diagrams.
- Filter or shade segments by mean intensity, with an option to display segment mean intensity and set a reference value for shading.
- Adjust intensities or counts for experimental samples using paired or grouped control samples.
- Screen paired DNA and RNA intensities for allele-specific expression.
- Standardize using a shifting factor and perform log2 transformation after standardization.
- Use kernel density information in loess and quantile normalization.
- Depict partition tree information graphically for standard models with new Tree Viewer
- Predictive modeling for survival analysis with Harrell’s assessment method and integration with Cross-Validation Model Comparison.
That’s right- that is incorporating the work of our favorite professor from R Project himself- http://biostat.mc.vanderbilt.edu/wiki/Main/FrankHarrell
Apparently Prof Frank E was quite a SAS coder himself (see http://biostat.mc.vanderbilt.edu/wiki/Main/SasMacros)
Back to JMP Genomics 5-
The JMP software platform provides:
• New integration capabilities let R users leverage JMP’s interactive graphics to display analytic results.
• Tools for R programmers to build and package user interfaces that let them share customized R analytics with a broader audience.
• A new add-in infrastructure that simplifies the integration of external analytics into JMP.
+ For people in life sciences who like new stats software you can also download a trial version of IPA here at http://www.ingenuity.com/products/IPA/Free-Trial-Software.html
Read about the rest of the new features here http://jmp.com/software/genomics/pdf/103112_jmpg5_prodbrief.pdf
- JMP 9 releasing on Oct 12 (r-bloggers.com)
- New JMP Software Version Extends Analytic Options (eon.businesswire.com)
- Dan Ariely Headlines JMP Analytics Conference (eon.businesswire.com)
- Whole Genome Sequencing of Japanese Individual Reveals Wealth of Undiscovered Genetic Variation (prweb.com)
- Blog – Ozzy Osbourne’s Genome (technologyreview.com)
- SAS Continues to Expand Analytics Options with Additional R Integration (eon.businesswire.com)
- Human Genome Sciences Invites Investors to Listen to Webcast of Presentation at JMP Securities Healthcare Conference (eon.businesswire.com)
- SAS, JMP Mix Simulation and Analytics to Foster Innovation (eon.businesswire.com)
- Using JMP 9 and R together (r-bloggers.com)
- Japanese flower has the biggest genome in the world [Mad Genomics] (io9.com)
- JMP Customer Herzenberg Lab Wins Computerworld Honor (eon.businesswire.com)
An announcement from Zementis and Predixion Software about using cloud computing for scoring models using PMML. Note that R has a PMML package as well, which is used by Rattle, the data mining GUI, for exporting models.
ALISO VIEJO, Calif., Oct 19, 2010 (BUSINESS WIRE) — Predixion Software today introduced Predixion PMML Connexion(TM), an interface that provides Predixion Insight(TM), the company’s low-cost, self-service in the cloud predictive analytics solution, direct and seamless access to SAS, SPSS (IBM) and other predictive models for use by Predixion Insight customers. Predixion PMML Connexion enables companies to leverage their significant investments in legacy predictive analytics solutions at a fraction of the cost of conventional licensing and maintenance fees.
The announcement was made at the Predictive Analytics World conference in Washington, D.C. where Predixion also announced a strategic partnership with Zementis, Inc., a market leader in PMML-based solutions. Zementis is exhibiting in Booth #P2.
The Predictive Model Markup Language (PMML) standard allows for true interoperability, offering a mature standard for moving predictive models seamlessly between platforms. Predixion has fully integrated this PMML functionality into Predixion Insight, meaning Predixion Insight users can now effortlessly import PMML-based predictive models, enabling information workers to score the models in the cloud from anywhere and publish reports using Microsoft Excel(R) and SharePoint(R). In addition, models can also be written back into SAS, SPSS and other platforms for a truly collaborative, interoperable solution.
“Predixion’s investment in this PMML interface makes perfect business sense as the lion’s share of the models in existence today are created by the SAS and SPSS platforms, creating compelling opportunity to leverage existing investments in predictive and statistical models on a low-cost cloud predictive analytics platform that can be fed with enterprise, line of business and cloud-based data,” said Mike Ferguson, CEO of Intelligent Business Strategies, a leading analyst and consulting firm specializing in the areas of business intelligence and enterprise business integration. “In this economy, Predixion’s low-cost, self-service predictive analytics solutions might be welcome relief to IT organizations chartered with quickly adding additional applications while at the same time cutting costs and staffing.”
“We are pleased to be partnering with Zementis, truly a PMML market leader and innovator,” said Predixion CEO Simon Arkell. “To allow any SAS or SPSS customer to immediately score any of their predictive models in the cloud from within Predixion Insight, compare those models to those created by Predixion Insight, and share the results within Excel and Sharepoint is an exciting step forward for the industry. SAS and SPSS customers are fed up with the high prices they must pay for their business users just to access reports generated by highly skilled PhDs who are burdened by performing routine tasks and thus have become a massive bottleneck. That frustration is now a thing of the past because any information worker can now unlock the power of predictive analytics without relying on experts — for a fraction of the cost and from anywhere they can connect to the cloud,” Arkell said.
Dr. Michael Zeller, Zementis CEO, added, “Our mission is to significantly shorten the time-to-market for predictive models in any industry. We are excited to be contributing to Predixion’s self-service, cloud-based predictive analytics solution set.”
About Predixion Software
Predixion Software develops and markets collaborative predictive analytics solutions in the public and private cloud. Predixion enables self-service predictive analytics, allowing customers to use and analyze large amounts of data to make actionable decisions, all within the familiar environment of Excel and PowerPivot. Predixion customers are achieving immediate results across a multitude of industries including: retail, finance, healthcare, marketing, telecommunications and insurance/risk management.
Predixion Software is headquartered in Aliso Viejo, California with development offices in Redmond, Washington. The company has venture capital backing from established investors including DFJ Frontier, Miramar Venture Partners and Palomar Ventures. For more information please contact us at 949-330-6540, or visit us at www.predixionsoftware.com.
Zementis, Inc. is a leading software company focused on the operational deployment and integration of predictive analytics and data mining solutions. Its ADAPA(R) decision engine successfully bridges the gap between science and engineering. ADAPA(R) was designed from the ground up to benefit from open standards and to significantly shorten the time-to-market for predictive models in any industry. For more information, please visit www.zementis.com.
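On the R side of this, a rough sketch of exporting a model to PMML with the pmml package mentioned earlier could look like the following. It assumes the pmml and XML packages are installed; the model, the output file name and the use of the built-in mtcars data are my own choices for illustration, and none of this is part of the press release.

```r
# Fit a small linear model on a built-in dataset (illustrative choice)
fit <- lm(mpg ~ wt + hp, data = mtcars)

# pmml and XML are add-on packages, so check for them first
if (requireNamespace("pmml", quietly = TRUE) &&
    requireNamespace("XML", quietly = TRUE)) {
  # Convert the fitted model to a PMML (XML) representation
  model_xml <- pmml::pmml(fit)

  # Write the PMML document to a temporary file (hypothetical file name)
  out_file <- file.path(tempdir(), "mpg_model.pmml")
  XML::saveXML(model_xml, file = out_file)
} else {
  message("Install the pmml and XML packages to run this example")
}
```

The resulting file is the kind of artifact that a PMML-based scoring engine is meant to consume, though whether a given engine accepts a given model type is something to check against that product's own documentation.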
- Event: Predictive analytics with R, PMML and ADAPA (r-bloggers.com)
- Lyzasoft Integrates Low-cost Predictive Analytics into its Data Analytics and Collaboration Platform (eon.businesswire.com)
- Predixion Software Introduces Self-Service, Cloud-Based Predictive Analytics Solution (eon.businesswire.com)
- SAS Rolls Out Predictive Analytics for Business Users (nytimes.com)
- Rattle Re-Introduced (r-bloggers.com)
- Predixion Software Finalizes Series A Financing – Raises $5 Million (eon.businesswire.com)
- Taking R to the Limit: Large Datasets; Predictive modeling with PMML and ADAPA (r-bloggers.com)
- Interview Dean Abbott Abbott Analytics (r-bloggers.com)