Idiomatically Scala
This is in response to a well-intentioned article on DZone: 10 amazing Scala collection functions
Scala adoption is growing, and it's growing exponentially
Source: Indeed.com job trends
This is a Good Thing™ - It means that ever more people are being exposed to pure functional programming with strong syntactic support from the language, more people are writing Scala and - yes - more people are writing about their experience and their discoveries.
It also means that more such blog posts are being written by people who still have one foot in Java, whilst they continue to discover all those juicy new Scala patterns, techniques, and idioms. The previously mentioned DZone article is one such example. The message in the article is clear:
"Hey, look at all this cool stuff! Concepts and ideas that are simple and easy for people can be made simple in the code too!"
The thing that's obvious to a seasoned Scala developer is... All those examples, although an obvious breath of fresh air to a Java developer, can be made far simpler still.
The Code
example 1:
case class Book(title: String, pages: Int)
val books = Seq( Book("Future of Scala developers", 85),
                 Book("Parallel algorithms", 240),
                 Book("Object Oriented Programming", 130),
                 Book("Mobile Development", 495) )
//Book(Mobile Development,495)
books.maxBy(book => book.pages)
//Book(Future of Scala developers,85)
books.minBy(book => book.pages)
I personally don't like the indentation on that Seq: it's a style that lends itself to quickly exceeding the 80-character limit (and I don't care what resolution your screen is, it's still a good limit for reviewing code on your mobile, or doing side-by-side diffs). The maxBy and minBy examples are also somewhat verbose. Let's try it again:
case class Book(title: String, pages: Int)
val books = Seq(
  Book("Future of Scala developers", 85),
  Book("Parallel algorithms", 240),
  Book("Object Oriented Programming", 130),
  Book("Mobile Development", 495)
)
books.maxBy(_.pages)
books.minBy(_.pages)
The underscore notation here de-emphasises "book", and your eye is naturally drawn to the fact that it's the "pages" property that's important here. Consistent indentation and vertical whitespace also help separate concerns - making the code far easier to visually scan.
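The placeholder is only shorthand for a one-argument function literal, so both spellings pick the same element. A minimal sketch, repeating the Book data so that it runs on its own; the sortBy line is an extra illustration that does not appear in the DZone article:

```scala
case class Book(title: String, pages: Int)

val books = Seq(
  Book("Future of Scala developers", 85),
  Book("Parallel algorithms", 240),
  Book("Object Oriented Programming", 130),
  Book("Mobile Development", 495)
)

// The explicit lambda and the placeholder form are equivalent functions.
val longestExplicit = books.maxBy(book => book.pages)
val longestPlaceholder = books.maxBy(_.pages)

// The placeholder works anywhere a one-argument function is expected
// (sortBy is an illustrative addition, not part of the original examples).
val titlesByLength = books.sortBy(_.pages).map(_.title)
```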
example 2:
val numbers = Seq(1,2,3,4,5,6,7,8,9,10) numbers.filter(n => n % 2 == 0)
books.filter(book => book.pages >= 120)
This isn't even valid Scala; I'm going to blame the DZone formatter for losing a line break here. Let's try that again!
val numbers = Seq(1,2,3,4,5,6,7,8,9,10)
numbers.filter(_ % 2 == 0)
books.filter(_.pages >= 120)
Again... The underscore allows the essential complexity to be drawn to the fore of your attention, the accidental complexity gets tucked away, where it belongs!
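One caveat worth keeping in mind: each underscore stands for a separate parameter, so the shorthand only fits when every argument is used exactly once. A small sketch of where it applies and where a named parameter is still needed (the reduce and squaring lines are illustrative additions, not taken from the DZone article):

```scala
val numbers = Seq(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

// One underscore, one parameter: these two filters are equivalent.
val evensExplicit = numbers.filter(n => n % 2 == 0)
val evensPlaceholder = numbers.filter(_ % 2 == 0)

// Two underscores mean two parameters, which is what reduce expects.
val total = numbers.reduce(_ + _)

// Squaring uses the same value twice, so it needs a named parameter.
val squares = numbers.map(n => n * n)
```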
example 3:
val num1 = Seq(1, 2, 3, 4, 5, 6)
val num2 = Seq(4, 5, 6, 7, 8, 9)
num1.diff(num2)
num1.intersect(num2)
num1.union(num2)
Fantastic! Operators! Pure functions straight from algebra and set theory; this is exactly what functional programming is all about.
Except... That's not how we write operators. You'd never see this:
someInt.+(someOtherInt)
And we're talking about set theory here, so let's use the proper type, instead of starting with a sequence and then quickly trying to sweep the problems that causes under the carpet via the "distinct" operation:
val num1 = Set(1, 2, 3, 4, 5, 6)
val num2 = Set(4, 5, 6, 7, 8, 9)
num1 diff num2
num1 intersect num2
num1 union num2
I could have used "--" instead of "diff", and there are almost certainly libs out there that would let me use ∩ and ∪ (Scala supports unicode names), but I'm trying not to shake things up too much.
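For the curious, the standard library already ships symbolic aliases for these operations on Set. A quick sketch of how they line up with the named methods (the value names are made up for this illustration):

```scala
val num1 = Set(1, 2, 3, 4, 5, 6)
val num2 = Set(4, 5, 6, 7, 8, 9)

// Symbolic spellings of the same three set operations.
val difference = num1 -- num2   // same elements as num1 diff num2
val intersection = num1 & num2  // same elements as num1 intersect num2
val combined = num1 | num2      // same elements as num1 union num2
```

Whether the symbols read better than the words is a matter of taste; the point is only that both are ordinary infix method calls.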
example 4:
val numbers = Seq(1,2,3,4,5,6)
numbers.map(n => n * 2)
val chars = Seq('a', 'b', 'c', 'd')
chars.map(ch => ch.toUpper)
Let's use the underscore again. Actually... Let's not! The first example can be reduced even further using what's known as the point-free style. As for the second example, there's a far better (and more familiar) type that can be used for representing a sequence of chars:
val numbers = Seq(1,2,3,4,5,6)
numbers.map(2*)
val chars = "abcd"
chars.map(_.toUpper)
Point-free is probably going a little too far for most use-cases, but there's little excuse for not using a String. Scala treats Strings as a collection type, fully kitted-out with the same operations that you'd see on any other sequence.
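To back up the claim about Strings, here is a short sketch of a few sequence operations applied directly to one. It also shows the placeholder spelling of the doubling function, which sidesteps postfix-operator syntax altogether; depending on your compiler version, the postfix form may need scala.language.postfixOps enabled, so treat that as something to check rather than a given.

```scala
val numbers = Seq(1, 2, 3, 4, 5, 6)

// Placeholder form of the doubling function; no postfix syntax involved.
val doubled = numbers.map(2 * _)

// A String supports the usual collection operations directly.
val chars = "abcd"
val upper = chars.map(_.toUpper)       // Char => Char keeps it a String
val withoutC = chars.filter(_ != 'c')
val backwards = chars.reverse
```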
example 5:
val abcd = Seq('a', 'b', 'c', 'd')
abcd.flatMap(ch => List(ch.toUpper, ch))
If I didn't have at least one example showing how maps, flatMaps and filters can (and usually should) be re-written using a for-comprehension, my fellow Scala programmers would likely disown me. So here it is:
for {
  ch <- "abcd"
  ch2 <- List(ch.toUpper, ch)
} yield ch2
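As a bonus, the output value will also be a String, which is nice! (That holds for the collections library before Scala 2.13; from 2.13 onwards the result is an IndexedSeq[Char], and mkString turns it back into a String.)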
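If the for-comprehension looks like magic, it helps to see roughly what the compiler turns it into. The sketch below uses a List of chars rather than a String so that the result type is the same across Scala versions; the value names are invented for the illustration.

```scala
val letters = List('a', 'b', 'c', 'd')

// The comprehension form.
val viaFor = for {
  ch  <- letters
  ch2 <- List(ch.toUpper, ch)
} yield ch2

// Roughly what it desugars to: every generator but the last becomes
// a flatMap, and the last generator plus the yield becomes a map.
val viaFlatMap = letters.flatMap(ch => List(ch.toUpper, ch).map(ch2 => ch2))

// mkString gets back to a String whatever the collection type.
val asString = viaFor.mkString
```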
example 6:
def theLongest(s: String): String = {
  s.split("[0-9]")
    .filter(str => str.exists(ch => ch.isUpper))
    .maxBy(str => str.length)
}
At least we finally get Strings, but what about whitespace characters, or control characters? This isn't exactly robust, so let's rewrite it to state what we really mean...
First, a couple of helper methods:
def chunk(s: String): (String, String) = {
  val wanted = s.takeWhile(_.isLetter)
  val remainder = s.drop(wanted.length).dropWhile(c => !c.isLetter)
  wanted -> remainder
}
def acceptible(s: String): Boolean =
  s.exists(_.isUpper) && s.exists(_.isLower)
"chunk" allows us, with repeated application, to break off fragments of the string containing only letters, then return that fragment and the remainder of the string as a tuple / pair.
"acceptible" is a predicate to check that a String matches our requirements.
Let's use them!
import scala.annotation.tailrec
@tailrec def doIt(str: String, best: String = ""): String = str match {
  case "" => best
  case s =>
    val (candidate, remain) = chunk(s)
    if (candidate.length > best.length && acceptible(candidate))
      doIt(remain, candidate)
    else
      doIt(remain, best)
}
This is a recursive function: it repeatedly calls "chunk" to split the string, checks the result against the best match so far, then calls itself with the remainder, until there is nothing left, at which point it returns the best match found.
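To make the repeated splitting concrete, here is a hand-unrolled trace of "chunk" on a sample input. The helper is repeated so that the snippet runs on its own, and the input string is made up for the illustration:

```scala
def chunk(s: String): (String, String) = {
  val wanted = s.takeWhile(_.isLetter)
  val remainder = s.drop(wanted.length).dropWhile(c => !c.isLetter)
  wanted -> remainder
}

// Each call peels off one run of letters and skips the separators after it.
val (first, rest1) = chunk("abc123Def 45gh")  // ("abc", "Def 45gh")
val (second, rest2) = chunk(rest1)            // ("Def", "gh")
val (third, rest3) = chunk(rest2)             // ("gh", "")
```

Once the remainder is empty there is nothing left to inspect, which is exactly the base case of the recursion.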
The main benefit here is that it's more efficient. A tail-recursive function is compiled to a simple jump in the underlying bytecode (no risk of stack overflow), and the structure lets us perform a quick check that each candidate substring is longer than the best match so far before even checking it for upper/lower case characters. There's also no risk of an exception: if nothing matches, the function returns an empty String.
Yes, it's longer, but it's also faster and more correct, plus I got to demonstrate "drop", "dropWhile" and "takeWhile".
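The "more correct" part is easy to check side by side. The sketch below repeats both versions so that it runs on its own; the sample inputs are invented, and Try is used only to observe the failure of the original without crashing the snippet.

```scala
import scala.annotation.tailrec
import scala.util.Try

// Original version from the DZone article, reproduced for comparison.
def theLongest(s: String): String =
  s.split("[0-9]")
    .filter(str => str.exists(ch => ch.isUpper))
    .maxBy(str => str.length)

def chunk(s: String): (String, String) = {
  val wanted = s.takeWhile(_.isLetter)
  val remainder = s.drop(wanted.length).dropWhile(c => !c.isLetter)
  wanted -> remainder
}

def acceptible(s: String): Boolean =
  s.exists(_.isUpper) && s.exists(_.isLower)

@tailrec def doIt(str: String, best: String = ""): String = str match {
  case "" => best
  case s =>
    val (candidate, remain) = chunk(s)
    if (candidate.length > best.length && acceptible(candidate))
      doIt(remain, candidate)
    else
      doIt(remain, best)
}

// Hypothetical input mixing digits and a space as separators.
val mixed = "abc1DEFg22hIjklm 3nopqrstuv"
val original = theLongest(mixed)  // splits on digits only, so the space survives
val rewritten = doIt(mixed)       // splits on any non-letter

// Input with no upper-case letter at all.
val originalNoMatch = Try(theLongest("abc123"))  // maxBy on an empty array fails
val rewrittenNoMatch = doIt("abc123")            // falls back to ""
```

With these inputs the original returns a fragment that still carries its trailing space and fails outright when nothing qualifies, while the rewrite returns the bare letters and an empty String respectively.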
Kevin Wright is a Scala zealot of the worst kind. You should see his quora and twitter responses. He makes the Scala community seem like trolls.
If you want to encourage Scala adoption don't write nit-picky articles like this. Clarity is not exactly Scala's strong point.