Monthly Archives: April 2014

A few hints about Scala sequences

This blog post is inspired by a cool talk “The Dark Side of Scala” given by Tomek Nurkiewicz on Scalar conference. I’m going to focus on one particular problem he mentioned – confusing Scala sequence types like Seq, IndexedSeq, Traversable, List, Vector and others.
As non-sequential types like Map or Set are pretty straighforward to use, let’s put them aside for now.
All the information gathered in this article are mostly a summary of different discussions I found on the Internet, especially these two threads:

http://stackoverflow.com/questions/11702798/scala-guidelines-on-return-type-when-prefer-seq-iterable-traversable
http://stackoverflow.com/questions/6928327/when-should-i-choose-vector-in-scala

My point is to put it all in a form of simple Q&A which can serve as a cheat sheet for most non-exotic cases.

Q: What type should my API accept as input?
Answer: As general as possible. In most cases this will be Traversable <- Seq <- List.
Explanation: We want our API consumers to be able to call our code without needing them to convert types. If our function takes a Traversable, the caller can put almost any type of collection. This is usually sufficient if we just map(), fold(), head(), tail(), drop(), find(), filter() or groupBy(). In case you want to use length(), make it a Seq. If you need to efficiently prepend with ::, use a List.

Q: What type should my API return?
Answer: As specific as possible. Typically you want to return a List, Vector, Stack or Queue.
Explanation: This will not put any constraints on your API consumers but will allow them to eventually process returned data in optimal way and worry less about conversions.

Q: Should I use List or Vector?
Answer: You most probably want a Vector, unless your algorithm can be expressed only using ::, head and tail, then use a List.
Explanation: Some people compare List vs Vector in Scala to LinkedList vs ArrayList in Java. This is partially OK, because:

  • Scala Vector is a collection with good random access (like java.util.ArrayList)
  • Scala List is indeed a linked list with very fast append-first operation (::), but the links are unidirectional (for bi-directional use DoubleLinkedList).

However, Scala Vector has a very effective iteration implementation and, comparing to List is faster in traversing linearly (which is weird?), so unless you are planning to do quite some appending, Vector is a better choice. Additionally, Lists don’t work well with parallel algorithms, it’s hard to break them down into pieces and put together in an efficient way.

Q: What about other traits like IndexedSeq, LinearSeq, GenTraversable, TraversableOnce, Iterable or IterableLike?
Answer
: In many cases you don’t need to refer specifically to these types.
Explanation: Most of these types reveal some additional information about underlying implementation which may be important if your code is really performance-critical. Iterable may be familiar from Java world and usable when you really need to use an iterator with state (which is not really a functional apporach). I encourage you to not dig into other types unless you are not satisfied with your current performance and want to squeeze out some more.

 

Advertisements