Home » Posts tagged 'student'
Tag Archives: student
I really liked the initiatives at JMP/Academic. Not only they offer the software bundled with a textbook, which is both good common sense as well as business sense given how fast students can get confused
(Rant 1 Bundling with textbooks is something I think is Revolution Analytics should think of doing instead of just offering the academic version for free downloading- it would be interesting to see the penetration of R academic market with Revolution’s version and the open source version with the existing strategy)
Major publishers of introductory statistics textbooks offer a 12-month license to JMP Student Edition, a streamlined version of JMP, with their textbooks.
and a glance through this http://www.jmp.com/academic/pdf/jmp_se_comparison.pdf shows it is a credible and not extremely whittled down version which would be just dishonest.
And I loved this Reference Card at http://www.jmp.com/academic/pdf/jmp10_se_quick_guide.pdf
Oracle, SAP- Hana, Revolution Analytics and even SAS/STAT itself can make more reference cards like this- elegant solutions for students and new learners!
More- creative-rants Honestly why do corporate sites use PDFs anymore when they can use Instapaper , or any of these SlideShare/Scribd formats to show information in a better way without diverting the user from the main webpage.
But I digress, back to JMP
Resources for Faculty Using JMP® Student Edition
Faculty who select a JMP Student Edition bundle for their courses may be eligible for additional resources, including course materials and training.
Special JMP® Student Edition for AP Statistics
JMP Student Edition is available in a convenient five-year license for qualified Advanced Placement statistics programs.
Try and have a look yourself at http://www.jmp.com/academic/student.shtml
Here is an interview with Prof Rob J Hyndman who has created many time series forecasting methods and authored books as well as R packages on the same.
Probably the biggest impact I’ve had is in helping the Australian government forecast the national health budget. In 2001 and 2002, they had underestimated health expenditure by nearly $1 billion in each year which is a lot of money to have to find, even for a national government. I was invited to assist them in developing a new forecasting method, which I did. The new method has forecast errors of the order of plus or minus $50 million which is much more manageable. The method I developed for them was the basis of the ETS models discussed in my 2008 book on exponential smoothing (www.exponentialsmoothing.net)
Here is an interview with one of the younger researchers and rock stars of the R Project, John Myles White, co-author of Machine Learning for Hackers.
Ajay- What inspired you guys to write Machine Learning for Hackers. What has been the public response to the book. Are you planning to write a second edition or a next book?
John-We decided to write Machine Learning for Hackers because there were so many people interested in learning more about Machine Learning who found the standard textbooks a little difficult to understand, either because they lacked the mathematical background expected of readers or because it wasn’t clear how to translate the mathematical definitions in those books into usable programs. Most Machine Learning books are written for audiences who will not only be using Machine Learning techniques in their applied work, but also actively inventing new Machine Learning algorithms. The amount of information needed to do both can be daunting, because, as one friend pointed out, it’s similar to insisting that everyone learn how to build a compiler before they can start to program. For most people, it’s better to let them try out programming and get a taste for it before you teach them about the nuts and bolts of compiler design. If they like programming, they can delve into the details later.
Ajay- What are the key things that a potential reader can learn from this book?
John- We cover most of the nuts and bolts of introductory statistics in our book: summary statistics, regression and classification using linear and logistic regression, PCA and k-Nearest Neighbors. We also cover topics that are less well known, but are as important: density plots vs. histograms, regularization, cross-validation, MDS, social network analysis and SVM’s. I hope a reader walks away from the book having a feel for what different basic algorithms do and why they work for some problems and not others. I also hope we do just a little to shift a future generation of modeling culture towards regularization and cross-validation.
Ajay- Describe your journey as a science student up till your Phd. What are you current research interests and what initiatives have you done with them?
John-As an undergraduate I studied math and neuroscience. I then took some time off and came back to do a Ph.D. in psychology, focusing on mathematical modeling of both the brain and behavior. There’s a rich tradition of machine learning and statistics in psychology, so I got increasingly interested in ML methods during my years as a grad student. I’m about to finish my Ph.D. this year. My research interests all fall under one heading: decision theory. I want to understand both how people make decisions (which is what psychology teaches us) and how they should make decisions (which is what statistics and ML teach us). My thesis is focused on how people make decisions when there are both short-term and long-term consequences to be considered. For non-psychologists, the classic example is probably the explore-exploit dilemma. I’ve been working to import more of the main ideas from stats and ML into psychology for modeling how real people handle that trade-off. For psychologists, the classic example is the Marshmallow experiment. Most of my research work has focused on the latter: what makes us patient and how can we measure patience?
Ajay- How can academia and private sector solve the shortage of trained data scientists (assuming there is one)?
John- There’s definitely a shortage of trained data scientists: most companies are finding it difficult to hire someone with the real chops needed to do useful work with Big Data. The skill set required to be useful at a company like Facebook or Twitter is much more advanced than many people realize, so I think it will be some time until there are undergraduates coming out with the right stuff. But there’s huge demand, so I’m sure the market will clear sooner or later.
(TIL he has played in several rock bands!)
ACM, the Association for Computing Machinery www.acm.org, is the world’s largest educational and scientific computing society, uniting computing educators, researchers and professionals to inspire dialogue, share resources and address the field’s challenges. )
- the volume of data that is available is growing at a rate we have never seen before. Cisco has measured an 8 fold increase in the volume of IP traffic over the last 5 years and predicts that we will reach the zettabyte of data over IP in 2016
- more of the data is becoming publicly available. This isn’t only on social networks such as Facebook and twitter, but joins a more general trend involving open research initiatives and open government programs
- the desired time to get meaningful results is going down dramatically. In the past 5 years we have seen the half life of data on Facebook, defined as the amount of time that half of the public reactions to any given post (likes, shares., comments) take place, go from about 12 hours to under 3 hours currently
- our access to the net is always on via mobile device. You are always connected.
- the CPU and GPU capabilities of mobile devices is huge (an iPhone has 10 times the compute power of a Cray-1 and more graphics capabilities than early SGI workstations)
- thought leadership by tracking content that your readership is interested in via TrendSpottr you can be seen as a thought leader on the subject by being one of the first to share trending content on a given subject. I personally do this on my Facebook page (http://www.facebook.com/alain.chesnais) and have seen my klout score go up dramatically as a result
- brand marketing to be able to know when something is trending about your brand and take advantage of it as it happens.
- competitive analysis to see what is being said about two competing elements. For instance, searching TrendSpottr for “Obama OR Romney” gives you a very good understanding about how social networks are reacting to each politician. You can also do searches like “$aapl OR $msft OR $goog” to get a sense of what is the current buzz for certain hi tech stocks.
- understanding your impact in real time to be able to see which of the content that you are posting is trending the most on social media so that you can highlight it on your main page. So if all of your content is hosted on common domain name (ourbrand.com), searching for ourbrand.com will show you the most active of your site’s content. That can easily be set up by putting a TrendSpottr widget on your front page
Ajay- What are some of the privacy guidelines that you keep in mind- given the fact that you collect individual information but also have government agencies as potential users.
Prior to his election as ACM president, Chesnais was vice president from July 2008 – June 2010 as well as secretary/treasurer from July 2006 – June 2008. He also served as president of ACM SIGGRAPH from July 2002 – June 2005 and as SIG Governing Board Chair from July 2000 – June 2002.
As a French citizen now residing in Canada, he has more than 20 years of management experience in the software industry. He joined the local SIGGRAPH Chapter in Paris some 20 years ago as a volunteer and has continued his involvement with ACM in a variety of leadership capacities since then.
TrendSpottr is a real-time viral search and predictive analytics service that identifies the most timely and trending information for any topic or keyword. Our core technology analyzes real-time data streams and spots emerging trends at their earliest acceleration point — hours or days before they have become “popular” and reached mainstream awareness.
TrendSpottr serves as a predictive early warning system for news and media organizations, brands, government agencies and Fortune 500 companies and helps them to identify emerging news, events and issues that have high viral potential and market impact. TrendSpottr has partnered with HootSuite, DataSift and other leading social and big data companies.
One of those cool conferences that is on my bucket list- this time in Hungary (That’s a nice place)
But I am especially interested in seeing how far Radoop has come along !
Disclaimer- Rapid Miner has been a Decisionstats.com sponsor for many years. It is also a very cool software but I like the R Extension facility even more!
and not very expensive too compared to other User Conferences in Europe!-
Information about Registration
- Early Bird registration until July 20th, 2012.
- Normal registration from July 21st, 2012 until August 13th, 2012.
- Latest registration from August 14th, 2012 until August 24th, 2012.
- Students have to provide a valid Student ID during registration.
- The Dinner is included in the All Days and in the Conference packages.
- All prices below are net prices. Value added tax (VAT) has to be added if applicable.
Prices for Regular Visitors
Days and Event
Early Bird Rate
(Training / Development 1)
|190 Euro||230 Euro||280 Euro|
|Wednesday + Thursday
|290 Euro||350 Euro||420 Euro|
(Training / Development 2 and Exam)
|190 Euro||230 Euro||280 Euro|
|610 Euro||740 Euro||900 Euro|
Prices for Authors and Students
In case of students, please note that you will have to provide a valid student ID during registration.
Days and Event
Early Bird Rate
(Training / Development 1)
|90 Euro||110 Euro||140 Euro|
|Wednesday + Thursday
|140 Euro||170 Euro||210 Euro|
(Training / Development 2 and Exam)
|90 Euro||110 Euro||140 Euro|
|290 Euro||350 Euro||450 Euro|
09:00 – 10:30
Ingo Mierswa; Rapid-I
NeurophRM: Integration of the Neuroph framework into RapidMiner
|To be announced (Invited Talk)
To be announced
Extending RapidMiner with Recommender Systems Algorithms
Implementation of User Based Collaborative Filtering in RapidMiner
|Parallel Training / Workshop Session
10:30 – 12:30
Nearest-Neighbor and Clustering based Anomaly Detection Algorithms for RapidMiner
Customers’ LifeStyle Targeting on Big Data using Rapid Miner
Robust GPGPU Plugin Development for RapidMiner
Image Mining Extension – Year After
Incorporating R Plots into RapidMiner Reports
An Octave Extension for RapidMiner
12:30 – 13:30
13:30 – 15:00
|Parallel Training / Workshop Session
Application of RapidMiner in Steel Industry Research and Development
A Comparison of Data-driven Models for Forecast River Flow
Portfolio Optimization Using Local Linear Regression Ensembles in Rapid Miner
Processing Data Streams with the RapidMiner Streams-Plugin
Automated Creation of Corpuses for the Needs of Sentiment Analysis
News from the Rapid-I Labs
This short session demonstrates the latest developments from the Rapid-I lab and will let you how you can build powerful analysis processes and routines by using those RapidMiner tools.
15:00 – 17:00
|Book Presentation and Game Show
Data Mining for the Masses: A New Textbook on Data Mining for Everyone
Matthew North presents his new book “Data Mining for the Masses” introducing data mining to a broader audience and making use of RapidMiner for practical data mining problems.
Get some Coffee for free – Writing Operators with RapidMiner Beans
Meta-Modeling Execution Times of RapidMiner operators
Social Event (Conference Dinner)
Social Event (Visit of Bar District)
This is a short introductory training course for users who are not yet familiar with RapidMiner or only have a few experiences with RapidMiner so far. The topics of this training session include
- Basic Usage
- User Interface
- Creating and handling RapidMiner repositories
- Starting a new RapidMiner project
- Operators and processes
- Loading data from flat files
- Storing data, processes, and results
- Predictive Models
- Linear Regression
- Naïve Bayes
- Decision Trees
- Basic Data Transformations
- Changing names and roles
- Handling missing values
- Changing value types by discretization and dichotimization
- Normalization and standardization
- Filtering examples and attributes
- Scoring and Model Evaluation
- Applying models
- Splitting data
- Evaluation methods
- Performance criteria
- Visualizing Model Performance
This is a short introductory training course for users who already know some basic concepts of RapidMiner and data mining and have already used the software before, for example in the first training on Tuesday. The topics of this training session include
- Advanced Data Handling
- Balancing data
- Joins and Aggregations
- Detection and removal of outliers
- Dimensionality reduction
- Control process execution
- Remember process results
- Recall process results
- Using branches and conditions
- Exception handling
- Definition of macros
- Usage of macros
- Definition of log values
- Clearing log tables
- Transforming log tables to data
Want to exchange ideas with the developers of RapidMiner? Or learn more tricks for developing own operators and extensions? During our development workshops on Tuesday and Friday, we will build small groups of developers each working on a small development project around RapidMiner. Beginners will get a comprehensive overview of the architecture of RapidMiner before making the first steps and learn how to write own operators. Advanced developers will form groups with our experienced developers, identify shortcomings of RapidMiner and develop a new extension which might be presented during the conference already. Unfinished work can be continued in the second workshop on Friday before results might be published on the Marketplace or can be taken home as a starting point for new custom operators.
Now that you have read the basics here at http://www.decisionstats.com/how-to-learn-to-be-a-hacker-easily/ (please do read this before reading the below)
Here is a list of tutorials that you should study (in order of ease)
1) LEARN BASICS – enough to get you a job maybe if that’s all you wanted.
2) READ SOME MORE-
Lena’s Reverse Engineering Tutorial-“Use Google.com for finding the Tutorial“
Lena’s Reverse Engineering tutorial. It includes 36 parts of individual cracking techniques and will teach you the basics of protection bypassing
01. Olly + assembler + patching a basic reverseme
02. Keyfiling the reverseme + assembler
03. Basic nag removal + header problems
04. Basic + aesthetic patching
05. Comparing on changes in cond jumps, animate over/in, breakpoints
06. “The plain stupid patching method”, searching for textstrings
07. Intermediate level patching, Kanal in PEiD
08. Debugging with W32Dasm, RVA, VA and offset, using LordPE as a hexeditor
09. Explaining the Visual Basic concept, introduction to SmartCheck and configuration
10. Continued reversing techniques in VB, use of decompilers and a basic anti-anti-trick
11. Intermediate patching using Olly’s “pane window”
12. Guiding a program by multiple patching.
13. The use of API’s in software, avoiding doublechecking tricks
14. More difficult schemes and an introduction to inline patching
15. How to study behaviour in the code, continued inlining using a pointer
16. Reversing using resources
17. Insights and practice in basic (self)keygenning
18. Diversion code, encryption/decryption, selfmodifying code and polymorphism
19. Debugger detected and anti-anti-techniques
20. Packers and protectors : an introduction
21. Imports rebuilding
22. API Redirection
23. Stolen bytes
24. Patching at runtime using loaders from lena151 original
25. Continued patching at runtime & unpacking armadillo standard protection
26. Machine specific loaders, unpacking & debugging armadillo
27. tElock + advanced patching
28. Bypassing & killing server checks
29. Killing & inlining a more difficult server check
30. SFX, Run Trace & more advanced string searching
31. Delphi in Olly & DeDe
32. Author tricks, HIEW & approaches in inline patching
33. The FPU, integrity checks & loader versus patcher
34. Reversing techniques in packed software & a S&R loader for ASProtect
35. Inlining inside polymorphic code
If you want more free training – hang around this website
OWASP Cheat Sheet Series
- OWASP Top Ten Cheat Sheet
- Authentication Cheat Sheet
- Cross-Site Request Forgery (CSRF) Prevention Cheat Sheet
- Transport Layer Protection Cheat Sheet
- Cryptographic Storage Cheat Sheet
- Input Validation Cheat Sheet
- XSS Prevention Cheat Sheet
- DOM based XSS Prevention Cheat Sheet
- Forgot Password Cheat Sheet
- Query Parameterization Cheat Sheet
- SQL Injection Prevention Cheat Sheet
- Session Management Cheat Sheet
- HTML5 Security Cheat Sheet
- Web Service Security Cheat Sheet
- Application Security Architecture Cheat Sheet
- Logging Cheat Sheet
- JAAS Cheat Sheet
Draft OWASP Cheat Sheets
- Access Control Cheat Sheet
- REST Security Cheat Sheet
- Abridged XSS Prevention Cheat Sheet
- PHP Security Cheat Sheet
- Password Storage Cheat Sheet
- Secure Coding Cheat Sheet
- Threat Modeling Cheat Sheet
- Clickjacking Cheat Sheet
- Virtual Patching Cheat Sheet
- Secure SDLC Cheat Sheet
3) SPEND SOME MONEY on TRAINING
Module 1 – The x86 environment
- System Architecture
- Windows Memory Management
- Introduction to Assembly
- The stack
Module 2 – The exploit developer environment
- Setting up the exploit developer lab
- Using debuggers and debugger plugins to gather primitives
Module 3 – Saved Return Pointer Overwrite
- Saved return pointer overwrites
- Stack cookies
Module 4 – Abusing Structured Exception Handlers
- Abusing exception handler overwrites
- Bypassing Safeseh
Module 5 – Pointer smashing
- Function pointers
- Data/object pointers
- vtable/virtual functions
Module 6 – Off-by-one and integer overflows
- Integer overflows
Module 7 – Limited buffers
- Limited buffers, shellcode splitting
Module 8 – Reliability++ & reusability++
- Finding and avoiding bad characters
- Creative ways to deal with character set limitations
Module 9 – Fun with Unicode
- Exploiting Unicode based overflows
- Writing venetian alignment code
- Creating and Using venetian shellcode
Module 10 – Heap Spraying Fundamentals
- Heap Management and behaviour
- Heap Spraying for Internet Explorer 6 and 7
Module 11 – Egg Hunters
- Using and tweaking Egg hunters
- Custom egghunters
- Using Omelet egghunters
- Egghunters in a WoW64 environment
Module 12 – Shellcoding
- Building custom shellcode from scratch
- Understanding existing shellcode
- Writing portable shellcode
- Bypassing Antivirus
Module 13 – Metasploit Exploit Modules
- Writing exploits for the Metasploit Framework
- Porting exploits to the Metasploit Framework
Module 14 – ASLR
- Bypassing ASLR
Module 15 – W^X
- Bypassing NX/DEP
- Return Oriented Programming / Code Reuse (ROP) )
Module 16 – Advanced Heap Spraying
- Heap Feng Shui & heaplib
- Precise heap spraying in modern browsers (IE8 & IE9, Firefox 13)
Module 17 – Use After Free
- Exploiting Use-After-Free conditions
Module 18 – Windows 8
- Windows 8 Memory Protections and Bypass
ALSO GET CERTIFIED http://www.offensive-security.com/information-security-training/penetration-testing-with-backtrack/ ($950 cost)
the syllabus is here at
4) HANG AROUND OTHER HACKERS
or The Noir Hat Conferences-
or read this website
5) GET A DEGREE
Yes it is possible
The Johns Hopkins University Information Security Institute (JHUISI) is the University’s focal point for research and education in information security, assurance and privacy.
The Information Security Institute is now accepting applications for the Department of Defense’s Information Assurance Scholarship Program (IASP). This scholarship includes full tuition, a living stipend, books and health insurance. In return each student recipient must work for a DoD agency at a competitive salary for six months for every semester funded. The scholarship is open to American citizens only.
MASTER OF SCIENCE IN SECURITY INFORMATICS PROGRAM
The flagship educational experience offered by Johns Hopkins University in the area of information security and assurance is represented by the Master of Science in Security Informatics degree. Over thirty courses are available in support of this unique and innovative graduate program.
Disclaimer- I havent done any of these things- This is just a curated list from Quora so I am open to feedback.
You use this at your own risk of conscience ,local legal jurisdictions and your own legal liability.
A thing that strikes me when I was a student of statistics is that most theories of sampling, testing of hypothesis and modeling were built in an age where data was predominantly insufficient, computation was inherently manual and results of tests aimed at large enough differences.
I look now at the explosion of data, at the cloud computing enabled processing power on demand, and competitive dynamics of businesses to venture out my opinion-
1) We now have large , even excess data than we had before for statisticians a generation ago.
2) We now have extremely powerful computing devices, provided we can process our algorithms in parallel.
3) Even a slight uptick in modeling efficiency or mild uptick in business insight can provide huge monetary savings.
Call it High Performance Analytics or Big Data or Cloud Computing- are we sure statisticians are creating enough mathematical theory or are we just taking it easy in our statistics classrooms only to be subjected to something completely different when we hit the analytics workplace.
Do we need more theorists as well? Is there ANY incentive for corporations with private R and D research teams to share their latest cutting edge theoretical work outside their corporate silo.