Thursday, August 30, 2012

Greasemonkey API Usage -- August 2012

Back in November 2009, I analyzed the API usage, and a few other aspects, of all the scripts on userscripts.org, then 36,141 scripts.  I was directly discussing some of the topics that were already bubbling around the back of our minds, for how to carry Greasemonkey into the future with us.  The short version is: web browsers and web apps are getting so much more powerful, why do we need these one-off Greasemonkey APIs with cross-browser problems?

Now that Greasemonkey 1.0 is out, we've made big steps in that direction, and I've repeated the analysis.  I downloaded (with permission from the site owner) every single active script on userscripts.org, now 82,084 scripts.

First up, which API calls are made, and how common are they?
[Chart: API Usage by Number of Scripts]

Not much has changed.  By far the most common case is a script that doesn't call any special APIs (57.94%).  Then, GM_getValue/GM_setValue are still right up there.

The biggest change is that unsafeWindow usage has jumped from 6th to 2nd place (12.44% to 17.65% of scripts; 1,527 scripts also mention wrappedJSObject, not on the chart).  Authors want to interact with the page in ways that the security sandbox (which protects these APIs) prevents, so they explicitly jump out of the sandbox, bringing vulnerabilities with them.  Of the 14,484 scripts that reference unsafeWindow, 6,494 use no other Greasemonkey APIs, and thus are served well by moving towards a model where there is no sandboxing (nor APIs, unless you ask for them).  Another 838 scripts only use GM_log and/or GM_addStyle, which can easily be replaced with console.log() or the compatibility shim layer.  It gets hard to analyze other calls in more detail, but I see a lot of get/set value calls, which (assuming you run on only one domain) can also be well served by DOM Storage.
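
As a rough sketch of how get/set value calls might map onto DOM Storage for a single-domain script: the helper below wraps any Storage-like object (in a page, window.localStorage). The name makeValueStore and the 'myscript.' key prefix are made up for illustration and are not part of Greasemonkey or its compatibility shim.

```javascript
// Hypothetical helper, not a Greasemonkey API: builds getValue/setValue
// on top of any object with getItem/setItem (e.g. window.localStorage).
function makeValueStore(storage, prefix) {
  prefix = prefix || 'myscript.';
  return {
    // Store any JSON-serializable value under a prefixed key.
    setValue: function (name, value) {
      storage.setItem(prefix + name, JSON.stringify(value));
    },
    // Read a value back; fall back to defaultValue if missing or unparsable.
    getValue: function (name, defaultValue) {
      var raw = storage.getItem(prefix + name);
      if (raw === null || raw === undefined) return defaultValue;
      try {
        return JSON.parse(raw);
      } catch (e) {
        return defaultValue;
      }
    }
  };
}

// In a user script that runs on one domain, usage might look like:
//   var store = makeValueStore(window.localStorage, 'myscript.');
//   store.setValue('visits', store.getValue('visits', 0) + 1);
```

Unlike GM_setValue, values stored this way are visible to the page itself and are scoped to the page's origin, which is why this only suits scripts that run on a single domain.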

Moreover, those 47,557 scripts that don't use any of the special APIs are still saddled with the sandbox and its pitfalls, known and unknown before Greasemonkey 1.0.  Plenty of newer browser features don't work in the security sandbox because its entire point is to separate the content scope (where these features work) from the script scope (with its smaller set of privileged features).  A huge part of the design changes in Greasemonkey 1.0 is to make the default behavior what this majority of scripts needs and wants: to run as close as possible to a regular script in a regular web page, without surprises like missing values and broken features.

So do scripts that use get/set value or xmlhttpRequest really need the cross-domain behavior they provide?
[Chart: API Usage Cross-domain Analysis]
(Note: the left-most set of bars is "@include *" and the rightmost is ">5" -- the labels are missing from the graph and I'm not sure why.)
Mostly: no, and this hasn't changed much since 2009.  The vast majority of scripts using get/set value (71.86%) only ever execute on a single domain, and thus can use DOM Storage with no ill effects.

The XHR usage to two domains is lower mostly because I fixed my analysis a bit (i.e. not counting an @include of *.example.com and an XHR to www.example.com as two domains, and not counting XHRs to userscripts.org, assumed to be update checkers, a feature that Greasemonkey now provides).  However, a combined 44.25% of scripts that call XHR (with a string literal that I could pull a domain name out of, not a variable set somewhere else) either call to or run on two or all domains, and thus really use the cross-domain power of GM_xmlhttpRequest.
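
To illustrate the kind of domain merging described above, here is a minimal sketch. It is not the analysis script used for these numbers; the function names are hypothetical, and treating the last two labels of a host name as its domain is a naive assumption that mishandles suffixes such as co.uk.

```javascript
// Pull the host part out of a URL or an @include pattern, or null if none.
function hostOf(urlOrPattern) {
  var m = /^[a-z*]+:\/\/([^\/:]+)/i.exec(urlOrPattern);
  return m ? m[1] : null;
}

// Naive base domain: drop wildcard labels, keep the last two labels.
function baseDomain(host) {
  var labels = host.toLowerCase().split('.').filter(function (label) {
    return label && label !== '*';
  });
  return labels.slice(-2).join('.');
}

// Count distinct base domains across @include patterns and XHR URLs.
// Entries with no recognizable host (like a bare "*") are skipped.
function countDomains(urlsAndPatterns) {
  var seen = {};
  urlsAndPatterns.forEach(function (entry) {
    var host = hostOf(entry);
    if (host) seen[baseDomain(host)] = true;
  });
  return Object.keys(seen).length;
}
```

With this rule, an @include of http://*.example.com/* plus an XHR to http://www.example.com/api counts as one domain rather than two.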

Finally, a bit more detail about metadata imperatives.  This graph is for all imperatives used in at least 1% of scripts, regardless of what they are.
[Chart: Greasemonkey Metadata Imperative Usage]
Most of what has changed since 2009 is the analysis, including more values.  Note that almost every script (99.37%) specifies @name, and we see a power law trail off in usage.  The commonly used, but unsupported in Greasemonkey, ones are @author, @homepage, @license/@copyright, @date, and @history.

Check the raw data to see hundreds more @things, generally all unsupported values.  And there I pasted only those used at least ten times; there are hundreds more used fewer times.

To those that are interested: the script that I used to generate these numbers is available for inspection, in case it perhaps contains a serious bug. The raw data that I generated with it, and the charts above, are also available to check.

Tuesday, August 28, 2012

Greasemonkey 1.0 + jQuery: Broken, with Workaround

One of the big changes behind Greasemonkey 1.0 was moving towards the goal of not forcing the security sandbox (and all its pitfalls) upon script authors.  This is the entire reason for @grant, and specifically the @grant none setting.  In the @grant none case, the script does not get the traditional security sandbox (with XPCNativeWrappers), but rather a very thin sandbox that exclusively acts as a private scope, to hold variables for the script that don't interact with the page.

The idea was, if you want to set something in the page, you just do window.foo = 'bar', and if you don't it's just a normal var foo = 'bar'.  But there's a problem.

If you @require jQuery, it implicitly does a window.$ = window.jQuery = ...,  which exports the copy of jQuery that your script is loading into the page.  If they're different versions, there is a very real possibility of completely breaking the page.

This is Greasemonkey issue 1614, which is open and being tracked for a fix.  In the meantime, you can insert a one line fix into your script, at the top level (not inside any functions):

this.$ = this.jQuery = jQuery.noConflict(true);

This line just calls the standard jQuery noConflict() method, so that this loaded version of jQuery doesn't conflict with anything already in the page.  It's already there, and this is exactly what it's for!  The line also saves a local (in the script) reference to the version of jQuery that you want to use.
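
For readers who want to see why this works, here is a simplified model of a library that exports itself globally and then offers a noConflict() method. This is an illustration only, not jQuery's source: loadLibrary, the plain object standing in for the page's window, and the version labels are all made up.

```javascript
// Simplified stand-in for a library that exports itself onto a window-like
// object, remembering whatever was there before so it can be restored.
function loadLibrary(win, version) {
  var previous$ = win.$;
  var previousJQuery = win.jQuery;
  var lib = {
    version: version,
    // Give the globals back to their previous owner; return this copy.
    noConflict: function (deep) {
      if (win.$ === lib) win.$ = previous$;
      if (deep && win.jQuery === lib) win.jQuery = previousJQuery;
      return lib;
    }
  };
  win.$ = win.jQuery = lib;
  return lib;
}

var page = {};                    // stands in for the content page's window
loadLibrary(page, 'page-copy');   // the page loads its own copy first
loadLibrary(page, 'script-copy'); // the script's @require overwrites it

// The workaround: restore the page's copy, keep a private reference.
var mine = page.jQuery.noConflict(true);
```

After the last line, page.$ and page.jQuery point at the page's own copy again, while mine holds the copy the script loaded.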

This should let your script keep working, and also keep the page from breaking.  It's only lightly tested so far; let us know in the comments if it helps you.

Friday, August 24, 2012

Greasemonkey 1.0 Release

After more than seven years, Greasemonkey has finally grown to version 1.0.

Back in August of 2005 (almost exactly seven years ago now), Greasemonkey introduced wrappers intended to plug security holes.  As a result the common pitfalls were born.  Ever since then, in order to write a user script that would function properly in Greasemonkey, authors were required to either get lucky and not trip over one of these pitfalls, or get lucky and figure out that they exist -- and how to work around them.

As of today, all you have to know is "@grant none".  If you specify this setting in your metadata, then none of these security wrappers are put around your script. And you aren't granted access to any of the Greasemonkey APIs that a normal page wouldn't have.  Almost anything you can do in a script in the page itself should work in a "@grant none" user script.  There is still a sandbox; this isolates your variable scope from the page itself.  Without this it is extremely easy to break the page.  In order to explicitly read/write to/from the content scope, just reference properties of the window object.  (I.E. "x = 10" assigns to a variable x only in the user script's private scope.  "window.x = 10" assigns to the variable x in the content page's scope, which it can see.)
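
As an illustration, the metadata blocks of two scripts might look like this. The script names and the @include pattern are placeholders; the second block assumes a script that still wants the stored-value APIs and so lists them explicitly instead of using none.

```javascript
// ==UserScript==
// @name        Example without special APIs (placeholder name)
// @include     http://www.example.com/*
// @grant       none
// ==/UserScript==

// A different script, which asks for specific APIs instead:

// ==UserScript==
// @name        Example with stored values (placeholder name)
// @include     http://www.example.com/*
// @grant       GM_getValue
// @grant       GM_setValue
// ==/UserScript==
```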


The entire list of bugs handled in this release is also available via the 1.0 milestone on GitHub. A number of issues listed there only affected Greasemonkey during the development of version 1.0, so they aren't listed as changes below.

As always, if you notice problems, it's best to log an issue at GitHub or let us know at the greasemonkey-dev mailing list (and be clear that your issues are with this version).

Enhancements since Greasemonkey 0.9.x:
  • New metadata, @grant, specifies which special APIs a user script will have access to.  Specifying @grant none means no special API access, and thus no security restrictions.  Then, everything you're used to doing in JavaScript in a web page (including but not limited to jQuery) should just work.  For legacy scripts (which have no @grant line at all), Greasemonkey will try to guess what @grant lines you should have.  See http://wiki.greasespot.net/@grant for more detail. (#1425, #1427, #1558)
  • The toolbar button is colorful (in the enabled state) on Mac OS X. (#1597)
  • The metadata @unwrap has been removed, as being unwrapped is now the default.  The wrapper will still be applied to scripts that have a "return" statement outside of any function, but this may be removed in the future, so make sure your scripts (and requires) don't do this; authors may manually add an anonymous function wrapper around the script for the exact same behavior. (#1568, #1592)
  • Scripts that @run-at document-start have a valid document object to modify, E.G. for adding <style> tags; but still before any part of the document is loaded. (#1565)
  • GM_xmlhttpRequest() accepts a timeout option.  (#1561)
  • GM_getResourceURL() works with a special protocol handler.  (This is more efficient/faster than the data: URI encoding used previously.)  For example, specify images and styles with URLs to your @resources.
  • The standard Firefox web developer console works for console.log() et al.  (#1564)
  • Automatic updates work correctly with scripts installed from userscripts.org (but still note the require secure updates setting). (#1555)
  • Require at least Firefox 14.0 (no more Firefox 3 compatibility).  (#1426, #1522)
  • Error reporting is much more consistent and obvious than in the past. (#1404, #1592)
  • The alert() workaround (see http://bugzil.la/647727) is not applied for Firefox versions that do not exhibit this bug.  (#1318, #1350)
Bug fixes since Greasemonkey 0.9.x:
  • When downloading a script not encoded in UTF-8, display an error message to the user (rather than just failing). (#1588)
  • The "show script" button in the install dialog is disabled until the download of the script file is complete. (#1586)
  • Scripts with missing or broken "==UserScript==" metadata will work.  (#1562)

Thursday, August 09, 2012

Beta: Greasemonkey Release 1.0beta7

The entire list of bugs handled (and some still pending) in this release is also available via the 1.0 milestone on GitHub. This is only a beta release so you'll need to head to the all versions page to find it.

After more than seven years, Greasemonkey is finally graduating to version 1.0!  We're taking the major version number bump as an opportunity to reconsider some big ideas.  As of right now we believe there are appropriate detections and modes to make everything continue to work as always, but we're laying the groundwork to really break backwards compatibility, perhaps in a 1.1 release.

Keep an eye on this blog for posts dedicated to more detail on these topics.  For now each is briefly mentioned in the changelog below.

We desperately want feedback on this beta release, especially from script authors.  If you are using it and notice problems of any kind or have any other feedback for us, it's best to log an issue at GitHub or mail us at greasemonkey-dev (and be clear about which version is under discussion).

Enhancements since Greasemonkey 0.9.x:
  • New metadata, @grant, specifies which special APIs a user script will have access to.  Specifying @grant none means no special API access, and thus no security restrictions.  Then, everything you're used to doing in JavaScript in a web page (including but not limited to jQuery) should just work.  For legacy scripts (which have no @grant line at all), Greasemonkey will try to guess what @grant lines you should have.  See http://wiki.greasespot.net/@grant for more detail. (#1425, #1427, #1558)
  • The metadata @unwrap has been removed, as being unwrapped is now the default.  The wrapper will still be applied to scripts that have a "return" statement outside of any function, but this may be removed in the future, so make sure your scripts (and requires) don't do this; authors may manually add an anonymous function wrapper around the script for the exact same behavior. (#1568, #1592)
  • Require at least Firefox 14.0 (no more Firefox 3 compatibility).  (#1426)
  • Scripts that @run-at document-start have a valid document object to modify, E.G. for adding style tags; but still before any part of the document is loaded. (#1565)
  • GM_xmlhttpRequest() accepts a timeout option.  (#1561)
  • GM_getResourceURL() works with a special protocol handler.  (This is more efficient/faster than the data: URI encoding used previously.)  For example, specify images and styles with URLs to your @resources.
  • The standard Firefox web developer console works for console.log() et al.  (#1564)
  • Error reporting is much more consistent and obvious than in the past. (#1404, #1592)
Bug fixes since Greasemonkey 0.9.x:
  • Scripts with missing or broken "==UserScript==" metadata will work.  (#1562)
  • The alert() workaround (see http://bugzil.la/647727) is not applied for Firefox versions that do not exhibit this bug.  (#1318, #1350)
  • When downloading a script not encoded in UTF-8, display an error message to the user (rather than just failing). (#1588)
  • The "show script" button in the install dialog is disabled until the download of the script file is complete. (#1586)