map(func)

Return a new DStream by passing each element of the source DStream through a function func.
// Pair each word with 1, then sum the counts per word
val wordCounts = words.map(x => (x, 1)).reduceByKey(_ + _)
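The one-to-one behaviour of map can be tried without a Spark cluster. The following is a minimal sketch that assumes a plain Scala Seq as a stand-in for the contents of one batch; MapSketch, pairWithOne, and countWords are hypothetical names, and groupBy plus sum only imitates what reduceByKey(_ + _) does on a real DStream.

```scala
object MapSketch {
  // map: exactly one output element per input element.
  def pairWithOne(words: Seq[String]): Seq[(String, Int)] =
    words.map(w => (w, 1))

  // Plain-collection stand-in (an assumption) for reduceByKey(_ + _):
  // group the pairs by word and sum the 1s in each group.
  def countWords(words: Seq[String]): Map[String, Int] =
    pairWithOne(words)
      .groupBy(_._1)
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) }

  def main(args: Array[String]): Unit = {
    val words = Seq("a", "b", "a")
    println(pairWithOne(words))
    println(countWords(words))
  }
}
```

On a DStream the same function is applied to the elements of every batch, so the result is a new DStream of pairs rather than a single collection.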

flatMap(func)

Similar to map, but each input item can be mapped to 0 or more output items.
import org.apache.spark.storage.StorageLevel

// Split each line (for example, a sentence) into words separated by spaces
val lines = ssc.socketTextStream("localhost", 29999, StorageLevel.MEMORY_AND_DISK_SER)
val words = lines.flatMap(_.split(" "))

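The difference between map and flatMap is easiest to see side by side. This is a small sketch on a plain Scala Seq (an assumption for illustration, not a DStream); FlatMapSketch and its helpers are hypothetical names. Empty tokens are filtered out so that an empty line really does produce zero output items.

```scala
object FlatMapSketch {
  // Split a line on spaces and drop empty tokens, so "" yields no words.
  def tokens(line: String): Seq[String] =
    line.split(" ").filter(_.nonEmpty).toSeq

  // map keeps one output per input line: a sequence of word lists.
  def mapped(lines: Seq[String]): Seq[Seq[String]] =
    lines.map(tokens)

  // flatMap concatenates the per-line results into one flat sequence.
  def flattened(lines: Seq[String]): Seq[String] =
    lines.flatMap(tokens)

  def main(args: Array[String]): Unit = {
    val lines = Seq("to be or", "", "not to be")
    println(mapped(lines).length)    // same number of items as input lines
    println(flattened(lines).length) // total number of words
  }
}
```

With three input lines, mapped returns three items (one of them empty), while flattened returns the six words directly.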
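The available StorageLevel choices are listed below in PySpark notation; the arguments are (useDisk, useMemory, useOffHeap, deserialized, replication):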
DISK_ONLY = StorageLevel(True, False, False, False, 1)
DISK_ONLY_2 = StorageLevel(True, False, False, False, 2)
MEMORY_AND_DISK = StorageLevel(True, True, False, False, 1)
MEMORY_AND_DISK_2 = StorageLevel(True, True, False, False, 2)
MEMORY_AND_DISK_SER = StorageLevel(True, True, False, False, 1)
MEMORY_AND_DISK_SER_2 = StorageLevel(True, True, False, False, 2)
MEMORY_ONLY = StorageLevel(False, True, False, False, 1)
MEMORY_ONLY_2 = StorageLevel(False, True, False, False, 2)
MEMORY_ONLY_SER = StorageLevel(False, True, False, False, 1)
MEMORY_ONLY_SER_2 = StorageLevel(False, True, False, False, 2)
OFF_HEAP = StorageLevel(True, True, True, False, 1)

(A _2 suffix means each RDD partition is replicated on 2 nodes.)
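To make the flag tuples easier to read, here is a rough sketch that decodes them into a short description. Level and describe are hypothetical helpers written for this page, not part of Spark; the field order is assumed to be (useDisk, useMemory, useOffHeap, deserialized, replication), matching the listing above.

```scala
object StorageLevelSketch {
  // Hypothetical stand-in for the five flags; this is not Spark's StorageLevel class.
  final case class Level(
    useDisk: Boolean,
    useMemory: Boolean,
    useOffHeap: Boolean,
    deserialized: Boolean,
    replication: Int
  )

  // Summarise where the data may live, its form, and the number of replicas.
  def describe(level: Level): String = {
    val stores = Seq(
      if (level.useMemory) Some("memory") else None,
      if (level.useDisk) Some("disk") else None,
      if (level.useOffHeap) Some("off-heap") else None
    ).flatten.mkString(" + ")
    val form = if (level.deserialized) "deserialized" else "serialized"
    s"$stores, $form, ${level.replication}x"
  }

  def main(args: Array[String]): Unit = {
    // Same flag values as MEMORY_AND_DISK_2 and DISK_ONLY in the listing.
    println(describe(Level(true, true, false, false, 2)))
    println(describe(Level(true, false, false, false, 1)))
  }
}
```

Decoding the MEMORY_AND_DISK_2 flags this way gives "memory + disk, serialized, 2x", which matches the note that the _2 variants keep two replicas.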