Scala Tips for updateStateByKey
The previous section on updateStateByKey relied on a few Scala methods that are worth brushing up on. The ones to discuss here are:
values.foldLeft(0)(_+_)
Scala collections (such as List) provide the following fold* methods:
fold(initial: T)(func)
foldLeft(initial: T)(func)
foldRight(initial: T)(func)
Tip: the fold methods above are very similar to the reduce method, except that they take an initial value as an argument, while reduce does not.
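To make the comparison concrete, here is a minimal sketch that applies all four methods to a small list. The names nums, viaFold and so on are made up for illustration; only the collection methods themselves come from the Scala standard library.

```scala
// Hypothetical sample data for illustration.
val nums = List(1, 2, 3, 4)

// With addition, all four methods give the same sum.
val viaFold      = nums.fold(0)(_ + _)       // 10
val viaFoldLeft  = nums.foldLeft(0)(_ + _)   // 10
val viaFoldRight = nums.foldRight(0)(_ + _)  // 10
val viaReduce    = nums.reduce(_ + _)        // 10, no initial value

// With subtraction, the direction of traversal becomes visible.
val leftDiff  = nums.foldLeft(0)(_ - _)   // (((0 - 1) - 2) - 3) - 4 = -10
val rightDiff = nums.foldRight(0)(_ - _)  // 1 - (2 - (3 - (4 - 0))) = -2

// foldLeft may also build a result of a different type than the elements.
val joined = nums.foldLeft("")((acc, n) => acc + n.toString)  // "1234"
```

For a plain sum the choice between the methods does not change the result, which is why values.foldLeft(0)(_+_) could be written in several equivalent ways.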
state:Option[Int]
This means the argument state holds an integer that may be absent. In Scala, this type is written Option[Int].
A value that may be missing is declared as Option[T], where T is a placeholder for a type such as Int or String.
An Option[T] is either a Some wrapping a value of type T, or the None object, which represents a missing value.
To compute with a value of type Option[Int] without tripping over None, call its getOrElse(default) method, which returns the wrapped value, or the given default if the Option is None. For example:
state.getOrElse(0)
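The behaviour of Option and getOrElse can be sketched as follows. The names present, missing and describe are hypothetical and exist only for this illustration.

```scala
// Two hypothetical Option[Int] values: one holding 5, one empty.
val present: Option[Int] = Some(5)
val missing: Option[Int] = None

// getOrElse unwraps a Some and falls back to the default for None.
val fromPresent = present.getOrElse(0)  // 5
val fromMissing = missing.getOrElse(0)  // 0

// Pattern matching is another common way to handle both cases.
def describe(state: Option[Int]): String = state match {
  case Some(n) => s"count so far: $n"
  case None    => "no previous state"
}
```

Calling describe(present) returns "count so far: 5", while describe(missing) returns "no previous state".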
Let's test the function updateFunc:
Create a list x and a variable y of type Option[Int] set to None, then call updateFunc with them.
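The test described above might look like the sketch below. The body of updateFunc is not shown in this section, so the definition here is an assumed reconstruction built from the two expressions discussed earlier, values.foldLeft(0)(_+_) and state.getOrElse(0). The real function may differ.

```scala
// Assumed shape of updateFunc: add the new values to the previous state.
// This is a reconstruction for illustration, not the original definition.
def updateFunc(values: Seq[Int], state: Option[Int]): Option[Int] = {
  val newCount = values.foldLeft(0)(_ + _)  // sum of the values in this batch
  val previous = state.getOrElse(0)         // 0 when there is no state yet
  Some(newCount + previous)
}

val x = List(1, 2, 3)
val y: Option[Int] = None

// First call: no previous state, so the result is just the sum of x.
val first = updateFunc(x, y)       // Some(6)

// Second call: feed the previous result back in as the state.
val second = updateFunc(x, first)  // Some(12)
```

Feeding the result of one call into the next mimics how the state for a key is carried from one batch to the next.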
Now replace foldLeft(0)(_+_) with fold(0)(_+_), foldRight(0)(_+_) and reduce(_+_) in turn, and test each version to reinforce your understanding.
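One case worth trying in this exercise is an empty sequence of values. As far as the Spark Streaming guide describes it, the update function runs for every existing key in each batch whether or not new data arrived, so values can be empty. The sketch below uses hypothetical names and plain Scala collections only; no Spark is needed to run it.

```scala
import scala.util.Try

// A batch in which no new values arrived for the key.
val noValues = Seq.empty[Int]

// The fold variants return the initial value on empty input.
val foldEmpty      = noValues.fold(0)(_ + _)       // 0
val foldLeftEmpty  = noValues.foldLeft(0)(_ + _)   // 0
val foldRightEmpty = noValues.foldRight(0)(_ + _)  // 0

// reduce has no initial value and throws on empty input;
// Try captures the exception so this sketch keeps running.
val reduceEmpty = Try(noValues.reduce(_ + _))      // a Failure

// reduceOption returns None instead of throwing, and combines with getOrElse.
val safeReduce = noValues.reduceOption(_ + _).getOrElse(0)  // 0
```

This is one reason to prefer a fold with an initial value, or reduceOption with getOrElse, when the input may be empty.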