Ruby’s each, select and reject methods

Often, when test automation people come over to ruby, they bring constructs from their previous language – “you can write fortran in any language” – missing out on the expressiveness that ruby can give you. A great example of this is in array manipulation. Some common scenarios:

  • you need to iterate over each element (eg: clicking each radio button on a page)
  • you need to select elements that match certain criteria (eg: getting the text value for every other row in a table)
  • you need to reject elements that match certain criteria (eg: if there is a disabled text field, ignore it)
  • you need to transform each element in a particular way (eg: you need to convert a list of names to upper case)

Ruby provides expressive and pretty ways of doing the above. What we’re going to do next is look at the sort of code that ruby-n00bs often write to deal with the above, then contrast it with ‘the ruby way’ of doing the same thing. Hopefully, you’ll agree that the ruby way is considerably cleaner, more expressive and cuts out lots of needless boilerplate code. So…

Iterating over each element in an array

In the old school world, the normal way to iterate over an array is to use a for loop. To figure out how many times to iterate you’d get the length of the array. The for loop gives you an index, which you’d then use to access the next element in the array. Here’s an example in ruby that prints off each element of an array:


a = [1, 2, 3, 4, 5]

for i in 0..(a.size - 1)
  puts a[i]
end

Not very expressive, is it? Now for the ruby equivalent:


a = [1, 2, 3, 4, 5]
a.each {|number| puts number}

Now, it’s fairly uncommon to see the above mistake, even in ruby-n00b code. Learning the ‘each’ method seems to be a rite-of-passage that almost everyone goes through.
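The usual excuse for hanging on to the for loop is "but I need the index". If that really is the case, ruby has 'each_with_index' for it. A minimal sketch, reusing the array from above (the output format is just for illustration):

```ruby
a = [1, 2, 3, 4, 5]

# each_with_index yields each element together with its zero-based position
a.each_with_index { |number, index| puts "#{index}: #{number}" }
```

No size calculation, no off-by-one risk, and the index is there only when you ask for it.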

Transforming each element of an array

So, as we mentioned, almost every ruby programmer knows about ‘each’ and uses it. Annoyingly, n00bs often use it incorrectly to transform each element of an array. The following is an example where an array of lower case words is transformed into an array of upper case words:


lower_case = ["hi", "these", "are", "some", "words"]

upper_case = []
lower_case.each do |word|
  upper_case << word.upcase
end

puts upper_case.inspect

#=> ["HI", "THESE", "ARE", "SOME", "WORDS"]


Every element of the array is looped through (correct), the transformation is done (‘word.upcase’ – correct); the mistake comes when adding that element to a new array. Ruby has a method that does all this for you; it’s called ‘collect’.


lower_case = ["hi", "these", "are", "some", "words"]

upper_case = lower_case.collect { |word| word.upcase }

puts upper_case.inspect

#=> ["HI", "THESE", "ARE", "SOME", "WORDS"]


What collect does is iterate over the array, ‘collect’ the result of the block (in this case, the block argument converted to upper case) and store each result in a new array. It’s much shorter, but the main thing is that it’s more expressive. Here’s an example of where you could use it. Say you had a class that represented a page that you’re testing, and say that it contains a method that returns the text of every link on the page. Here’s the old school way:


class MyPage
  def links_text
    text_array = []
    @browser.links(:xpath, "//a").each do |link|
      text_array << link.text
    end
    text_array
  end
end

If you change it to use ‘collect’ instead of ‘each’, you’ll have the following instead:


class MyPage
  def links_text
    @browser.links(:xpath, "//a").collect {|link| link.text}
  end
end

Much nicer! Note that you can use ‘map’ instead of ‘collect’ if you like – one is an alias for the other.
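To make the alias point concrete, here is a small sketch using the same word list as before. It also shows the '&:upcase' shorthand, which is just a terser way of writing a block that calls one method on each element. The variable names are mine, for illustration only:

```ruby
lower_case = ["hi", "these", "are", "some", "words"]

# map is an alias for collect, so this behaves exactly like the collect version
with_map = lower_case.map { |word| word.upcase }

# &:upcase is shorthand for { |word| word.upcase }
with_shorthand = lower_case.map(&:upcase)

puts with_map.inspect
puts with_map == with_shorthand

# the original array is left untouched
puts lower_case.inspect
```

Whether you prefer the explicit block or the shorthand is mostly a matter of taste; both return a new array.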

Selecting only elements that meet some criteria

Another case of ‘each’ misuse. . . It’s a common scenario to want to select only certain items from an array – specifically elements that meet certain criteria. An example: given an array containing the numbers 1 to 10, I want to get all the even numbers. Well, one way to describe that criterion is:
element % 2 == 0
If there is no remainder when the element is divided by 2, the element is an even number. So now, let’s look at a ruby-n00b way of getting those elements:


all_numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
even_numbers = []

all_numbers.each do |number|
  even_numbers << number if number%2 == 0
end

puts even_numbers.inspect

#=> [2, 4, 6, 8, 10]


What’s happening? We’re looping through ‘each’ element, and performing our check. ‘If’ the element meets the criteria, we add it to an array called ‘even_numbers’. It works, but it’s long-winded, and not very expressive. Here’s the ruby way:


all_numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

even_numbers = all_numbers.select {|number| number%2 == 0}

#=> [2, 4, 6, 8, 10]


Much better. No need to create a new array before performing the check, no boilerplate add-the-element-to-the-new-array code, no dirty ‘if’s; just nice expressive code.
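Two small variations, sketched under my own assumptions: 'even?' is a built-in Integer method that says the same thing as the modulo check, and the "every other row" scenario from the top of the post can be handled by selecting on position rather than value. The 'row_text' array is a made-up stand-in for text pulled out of a table:

```ruby
all_numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Integer#even? expresses the same check as number % 2 == 0
even_numbers = all_numbers.select { |number| number.even? }

# "every other" element: select on the index instead of the value
row_text = ["row 0", "row 1", "row 2", "row 3", "row 4"]
every_other = row_text.select.with_index { |_text, index| index.even? }

puts even_numbers.inspect
puts every_other.inspect
```

Calling 'select' without a block returns an enumerator, which is what lets 'with_index' be chained on.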

An example of when you’d want to use it? Say you have a table with a bunch of rows and you want a method to return only the rows that have a certain background color. Here’s n00b-style code:


class MyPage
  def blue_rows
    my_blue_rows = []
    @browser.rows(:xpath, "//tr").each do |row|
      my_blue_rows << row if row.attribute("bgcolor") == "blue"
    end
    my_blue_rows
  end
end

Again, lots of fluff, hard to tell at first glance what’s going on. Here’s the same thing but this time using the ‘select’ method:


class MyPage
  def blue_rows
    @browser.rows(:xpath, "//tr").select {|row| row.attribute("bgcolor") == "blue"}
  end
end

Much nicer. Expressive code. No guff.

Rejecting elements that meet certain criteria

Sometimes you want all the elements in an array apart from those which meet certain criteria. It’s almost identical to ‘select’, just. . . the opposite! Instead of selecting items which meet the supplied criteria, ‘reject’ will reject items which meet the criteria. This time, instead of selecting even numbers, we want to reject them, thus getting an array of odd numbers (a bit contrived, I know; we could just select the elements where element%2==1, but this is a tutorial).


all_numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
odd_numbers = []

all_numbers.each do |number|
  odd_numbers << number unless number%2 == 0
end

puts odd_numbers.inspect

#=> [1, 3, 5, 7, 9]


This time, we don’t add the number ‘if’ it meets the criteria; instead we add it ‘unless’ it meets the criteria. Again, it’s horrible code. The ruby way:


all_numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

odd_numbers = all_numbers.reject {|number| number%2 == 0}

#=> [1, 3, 5, 7, 9]


So when would you want to use it? With a similar example to what we have above, this time we want all the rows that haven’t got a red background. Old school code:


class MyPage
  def non_red_rows
    my_non_red_rows = []
    @browser.rows(:xpath, "//tr").each do |row|
      my_non_red_rows << row unless row.attribute("bgcolor") == "red"
    end
    my_non_red_rows
  end
end

Long winded, ugly, hard to see what’s going on. Now, we’ll change it to use ruby’s ‘reject’ method:


class MyPage
  def non_red_rows
    @browser.rows(:xpath, "//tr").reject {|row| row.attribute("bgcolor") == "red"}
  end
end

Hard to argue against, right? It’s short, expressive and to the point.
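If you want to play with the row examples without a browser to hand, here is a sketch that swaps the page for a plain array. The 'Row' struct and its data are hypothetical stand-ins, not part of any browser library. It also shows 'partition', which hands back both the matching and the non-matching elements in one go:

```ruby
# Stand-in for a table row; a real page object would get these from the browser
Row = Struct.new(:text, :bgcolor)

rows = [
  Row.new("first",  "blue"),
  Row.new("second", "red"),
  Row.new("third",  "blue"),
  Row.new("fourth", "white")
]

blue_rows    = rows.select { |row| row.bgcolor == "blue" }
non_red_rows = rows.reject { |row| row.bgcolor == "red" }

# partition returns two arrays: [elements that matched, elements that did not]
red_rows, other_rows = rows.partition { |row| row.bgcolor == "red" }

puts blue_rows.map { |row| row.text }.inspect
puts non_red_rows.map { |row| row.text }.inspect
puts red_rows.map { |row| row.text }.inspect
```

Note that 'other_rows' and 'non_red_rows' end up holding the same elements, which is exactly the select/reject relationship described above.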

Summary: ruby provides nice methods for array manipulation. ‘Each’, ‘select’ and ‘reject’ are only a few of those methods, but they’re the most frequently used (or should be!). I hope this helps make your code shorter, more expressive and easier to maintain.
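One last thing worth knowing: because these methods each return a new array, they chain together nicely. A quick sketch with made-up data, rejecting blank names and then upper-casing what is left:

```ruby
names = ["alice", "", "bob", "carol", ""]

# reject the blanks first, then transform the survivors
shouting = names.reject { |name| name.empty? }.collect { |name| name.upcase }

puts shouting.inspect
```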

Counting strings in a file: Ruby vs Windows Command shell

This is not the usual material that I put up, but I’d like to immortalize an event that demonstrated yet again the beauty of Ruby for basic file manipulation, especially in contrast to doing the same in a Windows command shell. Here goes:

“Nat, I need a script that displays a count of the number of instances of a string in a file. The output must be a number and nothing else.”
“No worries, that won’t take 2 seconds.”
“Stop right there – I don’t want any of your ruby nonsense – it must be a batch file.”
“Hmmm… Can the batch file call a ruby script?”
“No.”
“Err… ok… I’ll see what I can do.”

So off I went trawling google, stackoverflow, random blogs, and websites which can’t have seen hits since 1995. One hour, some frustration, and several cups of tea later, this is what I came up with:

findstr /C:"search string" "c:\my\file.txt" | find /C /V "nonsense"

And that, ladies and gentlemen, works! Let me explain what’s going on… The script uses 2 commands: findstr and find. findstr is used for finding strings in files, and find is also used for finding strings in files. It of course makes perfect sense to have two commands that do the same thing – the very definition of the word “intuitive”. In the above example, findstr returns the lines from the file that contain the search string. These lines are piped to find, which then displays the number of lines that don’t contain a particular string, in the above case: "nonsense". Assuming none of them do, that returns a number: the count of lines that matched (strictly speaking, a line containing the search string twice is only counted once). It’s the only way I could find to get find, findstr or a combination of the two to return a-number-and-only-a-number for the instances of a string in a file. I would love to see this improved – leave a comment if you know a better way to do it.

To demonstrate to myself why doing the above in DOS is crazy, I wrote the same line in ruby:

File.read("c:/my/file.txt").scan(/search string/).count

It doesn’t take much explanation: It opens a file, reads it, scans it for a search string and then returns the number of instances it found.
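For the curious, here is a slightly longer sketch that can be run anywhere. It writes a throwaway file (the contents are made up) and counts the string two ways: total occurrences, which is what the ruby one-liner gives you, and lines containing the string, which is what the batch pipeline gives you:

```ruby
require "tempfile"

# Throwaway file so the sketch runs anywhere; the contents are made up
file = Tempfile.new("sample")
file.write("search string here\nnothing on this line\nsearch string and search string again\n")
file.close

contents = File.read(file.path)

# Total occurrences of the string anywhere in the file
occurrences = contents.scan(/search string/).count

# Lines that contain the string at least once
matching_lines = contents.lines.count { |line| line.include?("search string") }

puts occurrences
puts matching_lines

file.unlink
```

With this sample data the two numbers differ (3 occurrences on 2 lines), so it is worth being clear about which one you were actually asked for.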

Now. Can we all start using the right tool for the right job please? I know it may involve a bit of learning, but that never hurt anyone. That is all.

Finding the balance between hacky and over-engineered UI test automation frameworks

There are very few real requirements for a UI test automation framework:

  1. It should provide accurate test results
  2. It should provide accurate test results every time you ask it for results
  3. It should make it easy to write tests
  4. It should require little maintenance – time should be spent writing tests and analyzing results, not on coding the framework
  5. It should be easy to tweak in order to deal with last minute changes to the app being tested.

Projects rarely have the time or patience to deal with “we can’t run the tests just now, we’ve got a framework issue”, so frameworks tend to get built alongside the tests; and unless you’re careful, frameworks written under these (quite common) conditions normally die in one of 2 ways:

  1. Due to time constraints, any changes that have to be made to the test framework tend to be band-aids/hacks. “oh,-didn’t-we-tell-you-about-[insert-new-feature-that-will-break-lots-of-tests]-oh-and-can-you-kick-off-a-run-in-5-minutes?-Just-make-it-work!”, etc. That’s just the nature of the job. But, as the many dead UI frameworks that litter IT shops will attest, there are only so many band-aids you can stick onto a framework before it collapses under its own weight. Eventually, a change comes along that can’t be fixed just by “adding another band-aid” – a big refactor is required to deal with the new feature, which in turn causes other framework instability problems. Test runs become unreliable, resulting in the framework being abandoned.
  2. The other way frameworks die is when the test automation team are given time and money and are told to come back with a test automation framework… they have lots of time, so they spend lots of it on making things super-abstract, modeling business entities, writing test parsers etc. The tests that are written using the framework are all ‘semantic’, but they can’t deal with those “oh,-didn’t-we-tell-you…” changes to the app being tested. The super-abstracted nature of the framework makes it difficult to “just make it work” – there’s no one place to stick the band-aid, it needs to be spread across the framework. Many files need updating, the beautiful (but ultimately useless) business model is broken, and major refactors are required to ‘fix’ the model. During this time the tests can’t run. The framework ends up on a shelf gathering dust.

Like most things, a middle ground needs to be found:

  • A framework should be flexible and simple enough to be able to deal with last minute changes in the application under test. But, small chunks of time should then be given to allow small refactors of the framework to deal with the change ‘properly’ so that the quick hack can be removed. This way, the framework stays lean and can deal with new changes on a whim.
  • The framework shouldn’t be over-engineered – simplicity is key. Abstraction for abstraction’s sake is an utter waste of time. Modeling business entities in classes usually isn’t required, and when it is, only small elements of the model are usually needed for testing purposes. Doubtless, often it makes sense to model fundamental things like users, but rarely have I needed to keep track of more than the username, password and a few other simple fields. Keep business model classes simple – that way they’ll deal with application changes without much work.
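As a rough idea of how small a "business model" class can be, here is a sketch of a user with only the fields a test is likely to need. The field names and values are hypothetical; add to it only when a test actually demands it:

```ruby
# Hypothetical minimal model: only the fields the tests actually use
User = Struct.new(:username, :password, :email)

user = User.new("test_user_1", "s3cret", "test_user_1@example.com")

puts user.username
```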

Hacks for hacks’ sake aren’t good. Abstractions for abstraction’s sake aren’t good either. Write what needs to be written, don’t write what doesn’t need to be written, keep things simple, and tidy up after yourself when things get hacky.

Test Case Interdependency

One of the most common ways of structuring a series of test cases is to make one test case dependent on the outcome of another. For example, Test Case ‘A’ verifies the functionality surrounding the ability to create an account. Test Case ‘B’ verifies functionality surrounding account deletion, but instead of stating that the required data is an account in a particular state, it states that the account generated by test case ‘A’ should be the one to test for deletion. The mistake cascades through the test cycle: in execution of the test suite, if test case ‘A’ fails then test case ‘B’ cannot be executed and so it is marked as ‘failed’.

This test case interdependency causes problems for automation. It’s also a bad thing to do in general. Why?

In the above example, when it comes down to it, test case ‘B’ is not dependent on test case ‘A’ at all. If ‘B’ is testing deletion, it should test deletion. Deletion is dependent on an account, not necessarily a specific one (i.e. the one generated by test case ‘A’). OK, the account to test deletion against may need to be in a specific state (e.g. not already marked for deletion, etc…) but that’s not the same as dictating a specific account number.

As well as being, er, “philosophically” wrong, interdependency of test cases leads to testers incorrectly failing tests. Marking test case ‘B’ as failed just because test case ‘A’ failed puts incorrect data in the test report. Why? Marking a test as failed when it hasn’t been executed is wrong, no matter what the reason is. The tester executing test case ‘B’ should have picked one of the (possibly) large number of valid accounts to use instead of being limited to test case ‘A’s account. That way, the ‘delete’ functionality can be tested even if the ‘create’ functionality is broken.

How is this a problem for automation? Well, an automated test should be just that: an automatic version of a manual test. Hard-wiring data into automated tests is common (and sold as a ‘feature’ of many packages), but makes the tests very fragile. If the data doesn’t exist (due to other tests failing), some tests won’t be able to run even though there may be plenty of valid data to use!

An easy fix is to make a slight modification to your tests: change them to be dependent on data in a particular state rather than specific data. Subtle difference with a large impact on test case management and execution. You’ll still be testing the same functionality, but the tests are much less interdependent. You’ll be able to execute all your tests (not just a subset) and your automated tests will be much more reliable and maintainable.
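In automation terms, the difference can be sketched like this. The 'Account' struct, the account numbers and the states are all made up for illustration; in a real suite the list would come from your test environment:

```ruby
# Hypothetical account record; the fields and states are made up for illustration
Account = Struct.new(:number, :state)

accounts = [
  Account.new(1001, :marked_for_deletion),
  Account.new(1002, :active),
  Account.new(1003, :active)
]

# Fragile: hard-wired to the account that test case 'A' was supposed to create.
# Here 'A' failed, so account 9999 does not exist and the lookup finds nothing.
account_from_test_a = accounts.find { |account| account.number == 9999 }

# Robust: any account in the required state will do
deletable = accounts.find { |account| account.state == :active }

puts account_from_test_a.inspect
puts deletable.number
```

The first lookup comes back empty and the deletion test would be stuck; the second finds a perfectly good account and the test can carry on regardless of what happened to test case 'A'.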