Tag Archives: search engine
The noted Diamonds dataset in the ggplot2 package of R is actually culled from the website http://www.diamondse.info/diamond-prices.asp
However, it has only ~55000 diamonds, while the whole Diamonds search engine has almost ten times that number. Using iMacros, a Google Chrome plugin, we can scrape that data (or almost any data). The iMacros Chrome plugin is available at https://chrome.google.com/webstore/detail/cplklnmnlbnpmjogncfgfijoopmnlemp while notes on coding are at http://wiki.imacros.net
iMacros makes coding as easy as recording a macro, and the code is automatically generated for whatever actions you perform. You can set parameters to extract only specific parts of the website, and the code can be run in a loop (of 9999 times!)
Here is the iMacros code. Note that you need to navigate to the website http://www.diamondse.info/diamond-prices.asp before running it.
VERSION BUILD=5100505 RECORDER=CR
SET !EXTRACT_TEST_POPUP NO
SET !ERRORIGNORE YES
TAG POS=6 TYPE=TABLE ATTR=TXT:* EXTRACT=TXT
TAG POS=1 TYPE=DIV ATTR=CLASS:paginate_enabled_next
SAVEAS TYPE=EXTRACT FOLDER=* FILE=test+3
And voila, all the diamonds you need to analyze!
The resulting data is delimited text, so it can be read and cleaned with the standard delimited-file import and data munging tools in SAS or R.
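As a rough illustration of that reading step, here is a minimal sketch in Python (swapped in for SAS or R purely for illustration). The column names and the two sample rows are hypothetical placeholders, not real scraped values; the actual layout depends on the table the macro extracted.

```python
import csv
import io

# Hypothetical sample of a scraped extract. The real column names, order,
# and delimiter depend on the table that the macro actually captured.
SAMPLE = """shape,carat,cut,color,clarity,price
Round,0.31,Ideal,E,VS2,544
Princess,1.02,Very Good,G,SI1,4120
"""

def read_diamonds(text, delimiter=","):
    """Parse delimited text into a list of dicts, converting numeric fields."""
    rows = []
    for row in csv.DictReader(io.StringIO(text), delimiter=delimiter):
        row["carat"] = float(row["carat"])
        row["price"] = int(row["price"])
        rows.append(row)
    return rows

diamonds = read_diamonds(SAMPLE)
mean_price = sum(d["price"] for d in diamonds) / len(diamonds)
print(len(diamonds), mean_price)
```

For a real extract, the file contents would be passed in place of SAMPLE, and the numeric conversions adjusted to whichever columns are present.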
More on iMacros, from the plugin's description:
Automate your web browser. Record and replay repetitious work
If you encounter any problems with iMacros for Chrome, please let us know in our Chrome user forum at http://forum.iopus.com/viewforum.php?f=21 Our forum is also the best place for new feature suggestions :-) ---- iMacros was designed to automate the most repetitious tasks on the web. If there’s an activity you have to do repeatedly, just record it in iMacros. The next time you need to do it, the entire macro will run at the click of a button! With iMacros, you can quickly and easily fill out web forms, remember passwords, create a webmail notifier, and more. You can keep the macros on your computer for your own use, use them within bookmark sync / Xmarks or share them with others by embedding them on your homepage, blog, company Intranet or any social bookmarking service as bookmarklet. The uses are limited only by your imagination! Popular uses are as web macro recorder, form filler on steroids and highly-secure password manager (256-bit AES encryption).
So I tried to get by without search engines, relying only on social sharing, but for a small blog like mine almost 75% of traffic comes via search engines.
Maybe the ratio of traffic from search to social will change in the future, but I now have enough data to conclude that search is the ONLY statistically significant driver of traffic (for a small blog).
If you are a blogger, you should definitely give the tools at Google Webmaster a go.
URL | Googlebot type | Fetch Status | Fetch date
http://decisionstats.com/ | Web | Denied by robots.txt | 1/19/12 8:25 PM
http://decisionstats.com/ | Web | Success; URL and linked pages submitted to index | 12/27/11 9:55 PM
Also, from Google Analytics I see that denying search traffic does not increase direct/referral traffic in any meaningful way.
So my hypothesis, that some direct traffic was being miscounted as search traffic because of Chrome and toolbar search, was wrong :)
Also, Google seems to drop URLs quite quickly (within 18 hours), and I will test the rebound in SERPs in a few hours. I was using meta tags, blocking via robots.txt, and removal via Webmasters (the combination of the three may have helped).
To my surprise, search traffic declined to 5-10, but it did not become 0. I wonder why that happens (I even got a few Google queries per day), given that I was blocking “/” in robots.txt.
Net net: the numbers below show that, as of now, in a non-SOPA, non-social world, search engines remain the webmaster's only true friend (till they come up with another Panda or whatever update ;) )
I just used the really handy tools at Google Webmaster, clicked Remove URL, and submitted http://www.decisionstats.com
and I also modified my robots.txt file to
Just to make sure, I also added this meta tag to the right margin of my blog:
<meta name="robots" content="noindex">
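To see what such a block means for a crawler, here is a small sketch using Python's standard urllib.robotparser. The robots.txt contents shown are an assumption (a blanket rule blocking “/” for every user agent), not the exact file used on the blog.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt contents: block every crawler from the whole site.
# This is an illustrative guess, not the blog's actual file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

blocked = not is_allowed(ROBOTS_TXT, "Googlebot", "http://decisionstats.com/")
print(blocked)
```

Note that robots.txt only asks well-behaved crawlers not to fetch pages; the noindex meta tag and the URL removal request are separate mechanisms, which is presumably why all three were combined here.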
Now, for the last six months of 2011, as per Analytics, search engines were really generous to me, giving almost 170K page views.
Source | Visits | Pages/Visit
1. google | 58,788 | 2.14
2. (direct) | 10,832 | 2.24
3. linkedin.com | 2,038 | 2.50
4. google.com | 1,823 | 2.15
5. bing | 1,007 | 2.04
6. reddit.com | 749 | 1.93
7. yahoo | 740 | 2.25
8. google.co.in | 576 | 2.13
9. search | 572 | 2.07
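The table above can be turned into page views (visits times pages per visit) with a few lines of Python. This is only a sketch: the split into search and non-search sources is my assumption (sources listed without a domain are treated as organic search, while google.com and google.co.in are treated as referrals), and only the nine listed sources are counted.

```python
# (source, visits, pages per visit) copied from the Analytics table above.
TRAFFIC = [
    ("google", 58788, 2.14),
    ("(direct)", 10832, 2.24),
    ("linkedin.com", 2038, 2.50),
    ("google.com", 1823, 2.15),
    ("bing", 1007, 2.04),
    ("reddit.com", 749, 1.93),
    ("yahoo", 740, 2.25),
    ("google.co.in", 576, 2.13),
    ("search", 572, 2.07),
]

# Assumption: these rows are organic search; every other row is not.
SEARCH_SOURCES = {"google", "bing", "yahoo", "search"}

def total_page_views(rows):
    """Estimate page views as visits * pages per visit, summed over rows."""
    return sum(visits * pages for _source, visits, pages in rows)

def search_visit_share(rows, search_sources):
    """Fraction of visits that came from the sources labelled as search."""
    total = sum(visits for _source, visits, _pages in rows)
    search = sum(visits for source, visits, _pages in rows
                 if source in search_sources)
    return search / total

views = total_page_views(TRAFFIC)
share = search_visit_share(TRAFFIC, SEARCH_SOURCES)
print(round(views), round(share, 3))
```

Under these assumptions, the nine listed sources together account for roughly 167 thousand page views, and a bit under 80% of the listed visits come from the sources labelled as search.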
I do like to experiment though, and I wonder if search engines just:
1) Make people too lazy to bookmark or type the whole website name in Chrome/Opera toolbars
2) Help disguise sources of traffic by encrypting search terms
3) Help disguise corporate traffic watchers and aggregators
So I am giving all spiders a leave for Q1 2012. I am interested in seeing the impact of this on my traffic, and I suspect that the curves will not be as linear as I think.
Is search engine optimization overrated? Let the data decide... :)
I am also interested in seeing how social sharing can impact traffic in the absence of search engine interaction effects, and whether it is possible to retain a bigger chunk of traffic by reducing SEO efforts and increasing social efforts!
I use the simple-tags plugin in WordPress for automatically creating and posting tags. I am hoping this makes the site easier to navigate. Given that I had not been a very efficient tagger before, this plugin can be really useful for anyone who needs to create tags for more than 100 (or 1000) posts, especially on WordPress-based blog aggregators.
The plugin is available here -
Simple Tags is the successor of Simple Tagging Plugin. This is THE perfect tool to manage your WP terms for any taxonomy.
It was written with this philosophy: best performance, more security, and a lot of new functions.
This plugin is developed on WordPress 3.3, with the constant WP_DEBUG set to TRUE.
- Tags suggestion from Yahoo! Term Extraction API, OpenCalais, Alchemy, Zemanta, Tag The Net, Local DB with AJAX request
- Compatible with TinyMCE, FCKeditor, WYMeditor and QuickTags
- tags management (rename, delete, merge, search and add tags, edit tags ID)
- Edit mass tags (more than 50 posts once)
- Auto link tags in post content
- Auto tags !
- Type-ahead input tags / Autocompletion Ajax
- Click tags
- Possibility to tag pages (not only posts) and include them inside the tags results
- Easy configuration ! (in WP admin)
Ajay: You can also combine this plugin with an RSS auto-post blog aggregator (read instructions here) to create SEO-optimized blog aggregation or curation.