Spinn3r 2.3.1
We just pushed Spinn3r 2.3.1. If you depend on changes in this release, you should grab the new reference client.
A number of small fit-and-finish fixes went into this release. The more important changes include:
- New post:title element in the permalink API. When non-null, this is the authoritative title element from the RSS feed for crawled content. This gets us a bit further towards a grand unified API for indexing the blogosphere.
- New post:body element which will include authoritative feed content in the next push of our crawler (we're just testing it now).
- The internal hashcodes for sources and feeds are included in the API and reference client for advanced API usage and debugging.
- The source.register mechanism now allows clients to specify publisher_type for new sources. We're going to work on a new API to allow customers to flag the publisher type of existing sources as well.
- A number of extensions are now present in the Spinn3r admin console for debugging, including:
  - the ability to view the raw HTML source for a given permalink or feed
  - the ability to view the cached HTML rendered in your local browser
- The Spinn3r admin console now graphs publisher types (mainstream weblogs, news feeds, etc.).
- All Spinn3r robots can now be identified by reverse DNS. This is documented in our robot FAQ:
How do I verify that the robot visiting my website is Spinn3r?
First, it will have a User-Agent of:
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1; aggregator:Tailrank (Spinn3r 2.3); http://spinn3r.com/robot) Gecko/20021130
Second, we support robot DNS verification.
When you have an HTTP log entry which has our user agent, just perform a reverse DNS lookup on the raw IP address, then a forward lookup on the hostname it returns to confirm it maps back to the same IP.
For example:
%shell% nslookup 64.34.195.138
Non-authoritative answer:
138.195.34.64.in-addr.arpa name = robot32.spinn3r.com.
%shell% nslookup robot32.spinn3r.com
Non-authoritative answer:
Name: robot32.spinn3r.com
Address: 64.34.195.138
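If you would rather script this check than run nslookup by hand, here is a minimal Python sketch of the same two lookups. The function name is_spinn3r_robot and its parameters are made up for illustration and are not part of the Spinn3r reference client. It also assumes that every robot hostname ends in .spinn3r.com, as robot32.spinn3r.com does in the example above.

```python
import socket


def is_spinn3r_robot(ip, reverse_lookup=None, forward_lookup=None,
                     domain="spinn3r.com"):
    """Forward-confirmed reverse DNS check (hypothetical helper).

    Returns True only if `ip` reverse-resolves to a host under `domain`
    AND that host resolves forward to a set of addresses containing `ip`.
    The two lookup callables can be swapped out, e.g. for offline testing.
    """
    if reverse_lookup is None:
        # gethostbyaddr returns (hostname, aliases, addresses)
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        # gethostbyname_ex returns (hostname, aliases, addresses)
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]

    try:
        hostname = reverse_lookup(ip).rstrip(".").lower()
    except OSError:
        return False  # no PTR record, or the lookup failed

    # Assumption: robot hosts live under *.spinn3r.com. Requiring the
    # leading dot rejects names such as "notspinn3r.com".
    if not hostname.endswith("." + domain):
        return False

    try:
        addresses = forward_lookup(hostname)
    except OSError:
        return False  # hostname does not resolve

    # The forward lookup must point back at the IP we started with.
    return ip in addresses
```

Calling is_spinn3r_robot("64.34.195.138") with the default resolvers performs live DNS queries. The forward lookup matters because whoever controls an IP block can set its reverse DNS entry to any name they like, so the reverse lookup alone is not proof.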
.... and did I mention we're hiring?