Scala Tips for updateStateByKey

The previous section on updateStateByKey used a few Scala methods that are worth brushing up on.

Specifically:

val updateFunc = (values: Seq[Int], state: Option[Int]) => {
  val currentCount = values.foldLeft(0)(_ + _)
  val previousCount = state.getOrElse(0)
  Some(currentCount + previousCount)
}

A few methods we need to discuss here:

values.foldLeft(0)(_+_)

In fact, Scala collections (such as List) provide the following fold* methods:

fold(Initial:T)(func)

foldLeft(Initial:T)(func)

foldRight(Initial:T)(func)

Tip: the fold methods above are very similar to the reduce method, except that the fold methods take an initial value as an argument, while reduce does not.
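The difference between the fold methods and reduce can be sketched as follows. This is a minimal illustration, not part of the streaming job: the object name FoldDemo and all value names are made up for this example. It shows that the variants agree for addition, that the direction matters for a non-associative operation such as subtraction, and what happens on an empty collection.

```scala
object FoldDemo {
  val nums = List(1, 2, 3, 4)

  // For an associative operation such as +, all variants agree.
  val sumFold      = nums.fold(0)(_ + _)       // 10
  val sumFoldLeft  = nums.foldLeft(0)(_ + _)   // 10
  val sumFoldRight = nums.foldRight(0)(_ + _)  // 10
  val sumReduce    = nums.reduce(_ + _)        // 10

  // For a non-associative operation, the direction matters.
  val leftDiff  = nums.foldLeft(0)(_ - _)   // ((((0 - 1) - 2) - 3) - 4) = -10
  val rightDiff = nums.foldRight(0)(_ - _)  // 1 - (2 - (3 - (4 - 0))) = -2

  // foldLeft may build a result of a different type than the elements.
  val digits = nums.foldLeft("")((acc, n) => acc + n)  // "1234"

  // On an empty collection, a fold simply returns the initial value.
  val emptySum = List.empty[Int].foldLeft(0)(_ + _)  // 0

  // reduce has no initial value and throws on an empty collection;
  // reduceOption returns None instead.
  val emptyReduce = List.empty[Int].reduceOption(_ + _)  // None
}
```

The empty-collection case is the practical reason to prefer a fold with an initial value whenever the input sequence might contain no elements.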

state:Option[Int]

This means that the argument state holds an integer but may also be empty (None). In Scala, this is expressed as Option[Int].

If a variable in Scala may have no value, it should be declared as Option[T] (T is a placeholder for a data type, such as Int or String).

An Option[T] is either a Some[T], which wraps a value, or the None object, which represents a missing value.

To compute with a value of type Option[Int] without handling None separately, use the Option method getOrElse(default), which returns the wrapped value if there is one and the default otherwise. For example:

state.getOrElse(0)
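As a small sketch of how getOrElse behaves for both cases of an Option[Int]: the object name OptionDemo, the value names, and the describe helper are hypothetical and exist only for this illustration.

```scala
object OptionDemo {
  val present: Option[Int] = Some(7)
  val missing: Option[Int] = None

  // getOrElse unwraps a Some and falls back to the default for None.
  val a = present.getOrElse(0)  // 7
  val b = missing.getOrElse(0)  // 0

  // Pattern matching is a more explicit way to handle both cases.
  def describe(state: Option[Int]): String = state match {
    case Some(n) => s"previous count is $n"
    case None    => "no previous count"
  }
}
```

getOrElse is the shorter choice when a sensible default exists, as 0 does for a running count; pattern matching is useful when the two cases need different handling.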

Let's test updateFunc. The version below returns the plain sum instead of wrapping it in Some, so the result is a bare integer:

val updateFunc = (values: Seq[Int], state: Option[Int]) => {
  val currentCount = values.foldLeft(0)(_ + _)
  val previousCount = state.getOrElse(0)
  currentCount + previousCount
}

Then create a sequence x and a value y of type Option[Int] set to None, and call updateFunc:

val x = Seq(1, 2, 3, 4, 5)
val y: Option[Int] = None
updateFunc(x, y)

/*Output
15
*/
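For comparison, here is a sketch of the same test using the original version of updateFunc, the one that wraps its result in Some. It simulates a few successive batches by hand, without Spark, by feeding each result back in as the next state. The object name UpdateFuncDemo and the value names first, second and idle are made up for this illustration.

```scala
object UpdateFuncDemo {
  // Original version: the new state is returned wrapped in Some.
  val updateFunc = (values: Seq[Int], state: Option[Int]) => {
    val currentCount = values.foldLeft(0)(_ + _)
    val previousCount = state.getOrElse(0)
    Some(currentCount + previousCount)
  }

  // First batch: no previous state yet, so the count starts from 0.
  val first = updateFunc(Seq(1, 2, 3, 4, 5), None)  // Some(15)

  // Second batch: the previous result is passed back in as the state.
  val second = updateFunc(Seq(10), first)  // Some(25)

  // A batch with no new values leaves the running count unchanged.
  val idle = updateFunc(Seq.empty[Int], second)  // Some(25)
}
```

Because the result is an Option, it can be passed straight back in as the state argument of the next call.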

Now replace foldLeft(0)(_+_) with fold(0)(_+_), foldRight(0)(_+_) and reduce(_+_) in turn, and test each one to reinforce your understanding.
