Tag Archives: new software
Here is an interview with Mike Boyarski, Director of Product Marketing at Jaspersoft, the largest BI community, with over 14 million downloads, nearly 230,000 registered members, over 175,000 production deployments, and 14,000 customers across 100 countries.
I just checked out this new software for making PMML models. It is called Augustus and was created by the Open Data Group (http://opendatagroup.com/), which is headed by Robert Grossman, who was the first proponent of using R on Amazon EC2.
Someone like Zementis (http://adapasupport.zementis.com/) could probably use this to further test, enhance or benchmark models on EC2. They did have a joint webinar with Revolution Analytics recently.
- Augustus v 0.4.3.1 has been released
- Added a guide (pdf) for including Augustus in the Windows System Properties.
- Updated the install documentation.
- Augustus 2010.II (Summer) release is available. This is v 0.4.2.0. More information is here.
- Added performance discussion concerning the optional cyclic garbage collection.
See Recent News for more details and all recent news.
Augustus is a PMML 4-compliant scoring engine that works with segmented models. Augustus is designed for use with statistical and data mining models. The new release provides Baseline, Tree and Naive-Bayes producers and consumers.
There is also a version for use with PMML 3 models. It is able to produce and consume models with 10,000s of segments and conforms to a PMML draft RFC for segmented models and ensembles of models. It supports Baseline, Regression, Tree and Naive-Bayes.
Augustus is written in Python and is freely available under the GNU General Public License, version 2.
See the page Which version is right for me for more details regarding the different versions.
Predictive Model Markup Language (PMML) is an XML mark up language to describe statistical and data mining models. PMML describes the inputs to data mining models, the transformations used to prepare data for data mining, and the parameters which define the models themselves. It is used for a wide variety of applications, including applications in finance, e-business, direct marketing, manufacturing, and defense. PMML is often used so that systems which create statistical and data mining models (“PMML Producers”) can easily inter-operate with systems which deploy PMML models for scoring or other operational purposes (“PMML Consumers”).
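To make the description above a bit more concrete, here is a minimal sketch of what reading a PMML file can look like from Python. The tiny document below is hand-written for illustration (it is not Augustus output), the field and model names are made up, and the helper `describe_model` is my own hypothetical function rather than part of Augustus or any PMML library.

```python
import xml.etree.ElementTree as ET

# Namespace used by PMML 4.0 documents.
PMML_NS = "http://www.dmg.org/PMML-4_0"

# A hand-written toy model: two declared fields and a two-node tree.
PMML_TEXT = """
<PMML version="4.0" xmlns="http://www.dmg.org/PMML-4_0">
  <Header description="Hand-written illustrative model"/>
  <DataDictionary numberOfFields="2">
    <DataField name="age" optype="continuous" dataType="double"/>
    <DataField name="risk" optype="categorical" dataType="string"/>
  </DataDictionary>
  <TreeModel modelName="toy_tree" functionName="classification">
    <MiningSchema>
      <MiningField name="age"/>
      <MiningField name="risk" usageType="predicted"/>
    </MiningSchema>
    <Node score="low">
      <True/>
      <Node score="high">
        <SimplePredicate field="age" operator="greaterThan" value="60"/>
      </Node>
    </Node>
  </TreeModel>
</PMML>
"""


def describe_model(pmml_text):
    """Return the PMML version, declared data fields, and model element names."""
    root = ET.fromstring(pmml_text)
    ns = {"p": PMML_NS}
    fields = [
        (field.get("name"), field.get("dataType"))
        for field in root.findall("p:DataDictionary/p:DataField", ns)
    ]
    # Top-level model elements all end in "Model" (TreeModel, RegressionModel, ...).
    models = [
        child.tag.split("}")[1] for child in root if child.tag.endswith("Model")
    ]
    return {"version": root.get("version"), "fields": fields, "models": models}


summary = describe_model(PMML_TEXT)
print(summary)
```

The point is only that the inputs (the data dictionary) and the model parameters (the tree nodes) live side by side in one XML file, which is what lets a producer and a consumer be different programs.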
For information regarding using Augustus with Change Detection and Health and Status Monitoring, please see change-detection.
Open Data Group provides management consulting services, outsourced analytical services, analytic staffing, and expert witnesses broadly related to data and analytics. It has experience with customer data, supplier data, financial and trading data, and data from internal business processes.
It has staff in Chicago and San Francisco and clients throughout the U.S. Open Data Group began operations in 2002.
The above example contains plots generated in R of scoring results from Augustus. Each point on the graph represents a use of the scoring engine and a chart is an aggregation of multiple Augustus runs. A Baseline (Change Detection) model was used to score data with multiple segments.
Augustus is typically used to construct models and score data with models. Augustus includes a dedicated application for creating, or producing, predictive models rendered as PMML-compliant files. Scoring is accomplished by consuming PMML-compliant files describing an appropriate model. Augustus provides a dedicated application for scoring data with four classes of models, Baseline (Change Detection) Models, Tree Models, Regression Models and Naive Bayes Models. The typical model development and use cycle with Augustus is as follows:
- Identify suitable data with which to construct a new model.
- Provide a model schema which prescribes the requirements for the model.
- Run the Augustus producer to obtain a new model.
- Run the Augustus consumer on new data to effect scoring.
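The cycle above can be sketched in a few lines of plain Python. This is a toy stand-in and not the Augustus API: the function names `produce_baseline` and `consume` are hypothetical, the "model" is just a dict holding a mean and standard deviation, and scoring is a simple z-value, which I am assuming here as one plausible kind of baseline test.

```python
from statistics import mean, stdev


def produce_baseline(training_values):
    """'Producer' step: summarize historical data into a small model."""
    return {"mean": mean(training_values), "stdev": stdev(training_values)}


def consume(model, new_values):
    """'Consumer' step: score new observations as z-values against the model."""
    return [(x - model["mean"]) / model["stdev"] for x in new_values]


# Step 1: identify suitable data (made-up numbers).
training = [10.0, 12.0, 11.0, 13.0, 9.0]

# Steps 2 and 3: the "schema" here is implicit (a single numeric field);
# running the producer yields the model.
model = produce_baseline(training)

# Step 4: run the consumer on new data.
scores = consume(model, [11.0, 20.0])
print(model, scores)
```

In real Augustus the model would be written out as a PMML file between the two steps, so the producer and consumer can run at different times or on different machines.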
Separate consumer and producer applications are supplied for Baseline (Change Detection) models, Tree models, Regression models and Naive Bayes models. The producer and consumer applications require configuration with XML-formatted files. The specifications of the configuration files and model schema are detailed below. The consumers provide for some configurability of the output, but users will often provide additional post-processing to render the output according to their needs. A variety of mechanisms exist for transmitting data, but users may need to provide their own preprocessing to accommodate their particular data source.
In addition to the producer and consumer applications, Augustus is conceptually structured and provided with libraries which are relevant to the development and use of Predictive Models. Broadly speaking, these consist of components that address the use of PMML and components that are specific to Augustus.
Augustus can accommodate a post-processing step. While not necessary, it is often useful to:
- Re-normalize the scoring results or perform an additional transformation.
- Supplement the results with global meta-data such as timestamps.
- Format the results.
- Select certain interesting values from the results.
- Restructure the data for use with other applications.
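A post-processing step along these lines might look like the sketch below. It assumes, purely for illustration, that the scoring results arrive as a list of dicts with `segment` and `score` keys; that shape, the `post_process` name and the threshold are my own inventions and not something Augustus defines.

```python
import json
from datetime import datetime, timezone


def post_process(results, threshold=0.5, run_time=None):
    """Re-normalize scores to [0, 1], keep the interesting ones, add a
    timestamp, and restructure everything as JSON for other applications."""
    if not results:
        return "[]"
    run_time = run_time or datetime.now(timezone.utc)
    raw = [r["score"] for r in results]
    low, high = min(raw), max(raw)
    span = (high - low) or 1.0  # avoid dividing by zero when all scores match
    selected = []
    for r in results:
        normalized = (r["score"] - low) / span
        if normalized < threshold:
            continue  # select only the interesting values
        selected.append(
            {
                "segment": r["segment"],
                "score": round(normalized, 4),
                "scored_at": run_time.isoformat(),
            }
        )
    return json.dumps(selected, indent=2)


# Made-up scoring results for three segments.
raw_results = [
    {"segment": "A", "score": 2.0},
    {"segment": "B", "score": 4.0},
    {"segment": "C", "score": 6.0},
]
fixed_time = datetime(2010, 10, 1, tzinfo=timezone.utc)
report = post_process(raw_results, threshold=0.5, run_time=fixed_time)
print(report)
```

Passing the timestamp in as an argument keeps the function easy to test; in a live pipeline you would simply leave it out and take the current time.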
- Revolution R, PMML and ADAPA: Webinar April 13 (revolutionanalytics.com)
- Predicting R models with PMML: Revolution R Enterprise and ADAPA (revolutionanalytics.com)
- In case you missed it: March Roundup (revolutionanalytics.com)
Close on the heels of the launch of JMP 9 with its R integration comes the release of JMP Genomics 5. The product brief is available at http://jmp.com/software/genomics/pdf/103112_jmpg5_prodbrief.pdf and it has an interesting mix of features. If you want to try out the features, see http://jmp.com/software/license.shtml
Here is some of the “new” stuff I spotted in this release:
- Perform enrichment analysis using functional information from Ingenuity Pathways Analysis.+
- New bar chart track allows summarization of reads or intensities.
- New color map track displays heat plots of information for individual subjects.
- Use a variety of continuous measures for summarization.
- Using a common identifier, compare list membership for up to five groups and display overlaps with Venn diagrams.
- Filter or shade segments by mean intensity, with an option to display segment mean intensity and set a reference value for shading.
- Adjust intensities or counts for experimental samples using paired or grouped control samples.
- Screen paired DNA and RNA intensities for allele-specific expression.
- Standardize using a shifting factor and perform log2 transformation after standardization.
- Use kernel density information in loess and quantile normalization.
- Depict partition tree information graphically for standard models with new Tree Viewer.
- Predictive modeling for survival analysis with Harrell’s assessment method and integration with Cross-Validation Model Comparison.
That’s right: this incorporates the work of our favorite professor from the R Project himself, Frank Harrell (http://biostat.mc.vanderbilt.edu/wiki/Main/FrankHarrell).
Apparently Prof. Frank E. Harrell was quite a SAS coder himself (see http://biostat.mc.vanderbilt.edu/wiki/Main/SasMacros)
Back to JMP Genomics 5.
The JMP software platform provides:
• New integration capabilities let R users leverage JMP’s interactive graphics to display analytic results.
• Tools for R programmers to build and package user interfaces that let them share customized R analytics with a broader audience.
• A new add-in infrastructure that simplifies the integration of external analytics into JMP.
+ For people in life sciences who like new stats software, a trial version of IPA can also be downloaded at http://www.ingenuity.com/products/IPA/Free-Trial-Software.html
Read about the rest of the new features here: http://jmp.com/software/genomics/pdf/103112_jmpg5_prodbrief.pdf
I am just now testing the Karmic Koala, which is due for launch next week. There is a significant amount of browser-based operating system in it: it seems as if a cloud OS and Firefox have been integrated. See the following screenshots.
Note the ability to send email from the toolbar itself. Also, the system sped up considerably after the upgrade was installed. The striking change was in design and folder structure (it seems that Canonical analyzed its usage data to decide which design features to keep and which to drop).
Here are more views, and yes, the website http://www.ubuntu.com just went down for upgrade maintenance to cope with next week’s heavy rush of downloads.
Caveat: this is just the beta version, but five days before launch a beta version generally stays faithful to the final design.
Visit http://www.ubuntu.com for a better look. I believe dual booting Windows 7 and Koala is supported, which lets you try Ubuntu for fun and keep Windows for your original work.