Tuesday, November 9, 2010

3elenium Grid?

Hi, this is a worklog of a spike into using Selenium, Selenium RC and Grid for distributed load testing. Yes, I know that "real load" is best generated in other ways, and that'll happen too. Given how easy Selenium is reported to be, I thought it was worth the experiment.

My goals
  • assess how well Selenium may work with the current 3akai interface.
  • see if repeatable scenarios based on the 6k users generated by KERN-1306, and assorted content, can be distributed on an ad-hoc basis for peer use during 'free form' bug bashes.

some links
Selenium http://seleniumhq.org/ 
a FAQ on Selenium Grid: http://selenium-grid.seleniumhq.org/faq.html 
a spin off and its quickstart guide: http://saucelabs.com/docs/quickstart
The Sakai 3 demo build: http://3akai.sakaiproject.org/dev/index.html
The official QA box: http://sakai3-demo.uits.indiana.edu:8080/dev/index.html

worklog -

fire up nakamura head on Santoku, using run_production.sh
download selenium IDE and install into my firefox
start recording a simple test: failed login on my local HEAD build.
after a fencepost error of some kind on the first run the test executes nicely.
however to my joy running the test against the canonical QA server fails.
what gives?

local server login fail, which is a successful Selenium test (firebug screenshot):

same test against the canonical QA server: the login fails as expected, but the Selenium test also fails:

OK so what's the difference in those strings? Grr. nothing a restart of the IDE didn't, for some reason, fix. delightful mystery for another day. After a fair amount of bashing I found I could trigger this state by switching tabs or clicking in another browser window, and after doing that restarting the IDE was the only way to recover.

However, if you don't trigger this condition it's really easy in the IDE to switch between different domains (such as my local build, the Indiana build and the sakaiproject demo build).

My takeaway on this is that I have to be pretty careful when running multiple-domain tests in the IDE. I'll give it a run on a simpler machine setup at some point.

Stability and Reproducibility via external scripts?

Let's see about this Selenium RC server. The idea here is to have a control box somewhere driving browsers on other boxes to perform the tests. The remote control script may be from the IDE ( or hand written, natch ) and loaded up into the remote control box. This box then connects to the Selenium RC process running on the worker-boxes.
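To make the control-box/worker-box split a bit more concrete, here is a minimal sketch of how a control script could address several RC workers. It assumes RC accepts commands as plain HTTP requests against /selenium-server/driver/ with a `cmd` parameter and numbered arguments, which is how the Ruby client appears to talk to the server. The worker host names and the `rc_command_url` helper are made up for illustration; in practice the client library builds these requests for you.

```ruby
require 'uri'

# Hypothetical worker boxes, each assumed to run a Selenium RC server on port 4444.
WORKERS = ['worker1.example.org', 'worker2.example.org']

# Build the URL for one RC command (illustrative helper, not part of any library).
# Arguments travel as numbered parameters (1, 2, ...); sessionId is added
# once the worker has handed back a session.
def rc_command_url(host, cmd, args = [], session_id = nil, port = 4444)
  params = [['cmd', cmd]]
  args.each_with_index { |arg, i| params << [(i + 1).to_s, arg] }
  params << ['sessionId', session_id] if session_id
  "http://#{host}:#{port}/selenium-server/driver/?#{URI.encode_www_form(params)}"
end

# The control box issues the same command against every worker.
WORKERS.each do |host|
  puts rc_command_url(host, 'getNewBrowserSession',
                      ['*googlechrome', 'http://sakai3-demo.uits.indiana.edu:8080/dev/'])
end
```

Each worker would answer with something like "OK,<session id>", and that id is what gets passed back as `session_id` on later commands.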

For this quick spike I picked Ruby from among the plentiful IDE options. I picked Ruby because I found a fair number of Ruby scripts in the Sakai3 testscripts directory and thought it would be best to support whatever previous choices led to that. Another benefit - I don't know Ruby so I might as well learn by fire!

I had two central wrinkles. The first was in modifying the generated Ruby code (the Selenium IDE offers several export options) to allow it to find a browser on OS X:

    @selenium = Selenium::Client::Driver.new \
      :host => "localhost",
      :port => 4444,
      :browser => "*googlechrome /Applications/Google Chrome.app/Contents/MacOS/Google Chrome",
      :url => "http://sakai3-demo.uits.indiana.edu:8080/dev/",
      :timeout_in_second => 60
I also chose Chrome just for the heck of it.

When I run this simple login test at full speed it always fails. The IDE has a slider which you can use to slow down the interaction, but the setting you dial in does not get exported. The test failure is most likely due to client-side JS overhead time. So after some poking around I found a doc for the Ruby Selenium client and added a line that sets the execution speed to 1000 ms (it shows up as setSpeed[1000, ] in the RC log below) to the setup method.

Then it was time to start up the RC ( remote control ) server. I just followed the directions and set it up on this dev box. For this spike I just fired off java on the command line, passing in the location of the server jar file. Later I'll set up a couple more remotes in CZWX HQ and do some parallel testing, but this is fine for today.

Starting the Ruby script...
bash-3.2$ ./loginTestCase.rb
Loaded suite ./loginTestCase
Finished in 19.757175 seconds.

1 tests, 2 assertions, 0 failures, 0 errors
and reviewing the Selenium RC log...
13:44:36.603 INFO - Command request: setSpeed[1000, ] on session null
13:44:36.603 INFO - Got result: OK on session null
13:44:36.617 INFO - Command request: getNewBrowserSession[*googlechrome /Applications/Google Chrome.app/Contents/MacOS/Google Chrome, http://sakai3-demo.uits.indiana.edu:8080/dev/, , ] on session null
13:44:36.617 INFO - creating new remote session
13:44:36.618 INFO - Allocated session 540c35ae9cbd4d18a611d524aefabe3e for http://sakai3-demo.uits.indiana.edu:8080/dev/, launching...
13:44:36.618 INFO - Launching Google Chrome...
13:44:40.585 INFO - Got result: OK,540c35ae9cbd4d18a611d524aefabe3e on session 540c35ae9cbd4d18a611d524aefabe3e
13:44:40.588 INFO - Command request: setTimeout[300000, ] on session 540c35ae9cbd4d18a611d524aefabe3e
13:44:41.594 INFO - Got result: OK on session 540c35ae9cbd4d18a611d524aefabe3e
13:44:41.597 INFO - Command request: open[/dev/index.html, ] on session 540c35ae9cbd4d18a611d524aefabe3e
13:44:50.797 INFO - Got result: OK on session 540c35ae9cbd4d18a611d524aefabe3e
13:44:50.799 INFO - Command request: waitForPageToLoad[300000, ] on session 540c35ae9cbd4d18a611d524aefabe3e
13:44:51.819 INFO - Got result: OK on session 540c35ae9cbd4d18a611d524aefabe3e
13:44:51.822 INFO - Command request: type[username, CaseyD] on session 540c35ae9cbd4d18a611d524aefabe3e
13:44:52.846 INFO - Got result: OK on session 540c35ae9cbd4d18a611d524aefabe3e
13:44:52.849 INFO - Command request: type[password, wooga] on session 540c35ae9cbd4d18a611d524aefabe3e
13:44:53.855 INFO - Got result: OK on session 540c35ae9cbd4d18a611d524aefabe3e
13:44:53.858 INFO - Command request: click[loginbutton, ] on session 540c35ae9cbd4d18a611d524aefabe3e
13:44:54.868 INFO - Got result: OK on session 540c35ae9cbd4d18a611d524aefabe3e
13:44:54.871 INFO - Command request: isTextPresent[The username or password you entered is incorrect!, ] on session 540c35ae9cbd4d18a611d524aefabe3e
13:44:55.880 INFO - Got result: OK,true on session 540c35ae9cbd4d18a611d524aefabe3e
13:44:55.883 INFO - Command request: testComplete[, ] on session 540c35ae9cbd4d18a611d524aefabe3e
13:44:55.883 INFO - Killing Google Chrome...
13:44:56.355 WARN - Google Chrome seems to have ended on its own.
13:44:56.356 INFO - Got result: OK on session 540c35ae9cbd4d18a611d524aefabe3e
and everything is cool.

At this point I like the fact that the test is running through an actual accursed browser, with all the foibles which can be introduced on the client side. It also allows me to create interaction timing scenarios similar to what real users will experience.

I'm also struck by the fact that it's fragile as hell. Because it's driving the UI :)

Here's a movie - it's astoundingly google-compressed at full screen so you'll probably want to just run it in here ;)


So that's nice. And I'm sure I can push this a bit forward into more comprehensive tests via recording and coding. Notice the timestamps in the logs. I'll have to look into what reporting is provided by Selenium.
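Until proper reporting turns up, the timestamps can be mined directly. The following is a rough sketch, not a Selenium feature: it pairs each "Command request" line with the next "Got result" line and reports the elapsed milliseconds. The helper names (`to_millis`, `command_timings`) are invented here, and the sample lines are copied from the RC log above.

```ruby
# Two request/result pairs copied from the RC log above.
SAMPLE_LOG = <<~EOS
  13:44:41.597 INFO - Command request: open[/dev/index.html, ] on session 540c35ae9cbd4d18a611d524aefabe3e
  13:44:50.797 INFO - Got result: OK on session 540c35ae9cbd4d18a611d524aefabe3e
  13:44:53.858 INFO - Command request: click[loginbutton, ] on session 540c35ae9cbd4d18a611d524aefabe3e
  13:44:54.868 INFO - Got result: OK on session 540c35ae9cbd4d18a611d524aefabe3e
EOS

STAMP = /^(\d\d:\d\d:\d\d\.\d{3}) INFO - /

# Convert "HH:MM:SS.mmm" to milliseconds since midnight.
def to_millis(stamp)
  h, m, s, ms = stamp.split(/[:.]/).map(&:to_i)
  ((h * 60 + m) * 60 + s) * 1000 + ms
end

# Return [command, elapsed_ms] pairs; lines that are neither a request
# nor a result (e.g. "Launching Google Chrome...") are skipped.
def command_timings(log)
  timings = []
  pending = nil
  log.each_line do |line|
    if (m = line.match(/#{STAMP}Command request: (\w+)\[/))
      pending = [m[2], to_millis(m[1])]
    elsif pending && (m = line.match(/#{STAMP}Got result:/))
      timings << [pending[0], to_millis(m[1]) - pending[1]]
      pending = nil
    end
  end
  timings
end

command_timings(SAMPLE_LOG).each { |cmd, ms| puts "#{cmd}: #{ms} ms" }
```

On the sample this prints 9200 ms for open and 1010 ms for click. Most commands in the full log land at about one second, which is presumably the setSpeed delay rather than server time, so that would need subtracting before reading anything into the numbers. The sketch also ignores runs that cross midnight.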

To get some real Selenium load on Sakai 3 I'll have to find some collaborators. A minimal configuration would be folks running a set of scripts during bug bashes on some spare machines. A step up would be a coordinated grid of spare machines.

Grid setup?

KERN-1306 provided a set of tools to populate a Sakai 3 instance with a range of userids, tags, and a big set of content. The goal is to stuff a mess of users in and build tests against a loaded system.

The result is a set of users with random login ids with random system dictionary tags and a big pool of messages between them. Oh and megabytes of data in the repository. Chunky monkey!

Gridding the Selenium tests using the 6k users won't be possible unless there is a canonical set of netids. Current practices don't create a 'the' set of users, instead creating a different cohort on each reload. That can be managed as long as the accompanying automated test scripts take steps to do so.

A possible effort would be to have someone generate the canonical set (users, tags, file IDs), store it somewhere on the net and have the gridded tests pull the canonical user sets down. This allows repeatability of load testing against the bugblast or the QA servers, and distributivity of load testing. Coherent messaging and chat tests could be developed.
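As a sketch of what the gridded tests might do with such a canonical set: parse the file, then deal the users out round-robin so every worker gets a disjoint subset that is the same on every run. The file format (a header line, then netid,password per line), the sample users and the helper names are all assumptions made for illustration; nothing like this exists yet.

```ruby
# Hypothetical canonical file contents: a header line, then netid,password.
CANONICAL_USERS = <<~EOS
  netid,password
  user0001,secret1
  user0002,secret2
  user0003,secret3
  user0004,secret4
  user0005,secret5
EOS

# Turn the text into an array of { netid:, password: } hashes.
def load_users(text)
  text.lines.drop(1).map(&:strip).reject(&:empty?).map do |line|
    netid, password = line.split(',', 2)
    { netid: netid, password: password }
  end
end

# Deal users round-robin: worker 0 gets users 0, n, 2n, ...; worker 1 gets 1, n+1, ...
def users_for_worker(users, worker_index, worker_count)
  users.each_with_index
       .select { |_user, i| i % worker_count == worker_index }
       .map(&:first)
end

users = load_users(CANONICAL_USERS)
2.times do |w|
  puts "worker #{w}: #{users_for_worker(users, w, 2).map { |u| u[:netid] }.join(', ')}"
end
```

In a real run the text would be downloaded rather than embedded, for example with Net::HTTP.get from wherever the canonical set ends up living, and each worker would be told its index and the worker count when it starts.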

OK! onward - I'll try the grid tomorrow in house, and then see about coding up a framework to pull "canonical" userNetID (etc) files from the net.
