I wrote this socket streaming server and client to supply the streaming data I need for functional testing of my Spark Network Streaming applications. It is a Python script, so it can be modified quickly and easily to emit whatever streaming data is needed.
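As a rough illustration of the idea, here is a minimal sketch of a socket server that streams newline-terminated text to one client, plus a small client for checking it. This is not the original script: the function names (serve_lines, start_server, read_all), the sample data, and the single-client design are all assumptions made for the example.

```python
import socket
import threading
import time


def serve_lines(lines, host="127.0.0.1", port=0, delay=0.0, ready=None):
    """Accept one client and send it each line, newline-terminated."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        server.bind((host, port))  # port=0 lets the OS pick a free port
        server.listen(1)
        if ready is not None:
            ready(server.getsockname()[1])  # report the port actually bound
        conn, _ = server.accept()
        with conn:
            for line in lines:
                conn.sendall((line + "\n").encode("utf-8"))
                time.sleep(delay)  # pace the stream; 0 sends as fast as possible


def start_server(lines, delay=0.0):
    """Run serve_lines in a background thread; return (thread, port)."""
    port_ready = threading.Event()
    port_box = []

    def on_ready(port):
        port_box.append(port)
        port_ready.set()

    thread = threading.Thread(
        target=serve_lines,
        args=(lines,),
        kwargs={"delay": delay, "ready": on_ready},
        daemon=True,
    )
    thread.start()
    port_ready.wait(timeout=5)
    return thread, port_box[0]


def read_all(host, port):
    """Test client: read until the server closes the connection."""
    chunks = []
    with socket.create_connection((host, port)) as client:
        while True:
            data = client.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8").splitlines()
```

To change what is emitted, replace the list passed to start_server with any iterable of strings. A Spark Streaming application under test would presumably take the place of read_all and connect to the same host and port, for example through StreamingContext.socketTextStream.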
Develop a Spark Streaming application that gets tweets from Twitter and saves them to Hive, and build the needed Twitter util jar with sbt assembly
We created this content in the spirit of contributing public knowledge about open source software, specifically Apache Spark, which we use to develop our own Spark applications in Python and Scala.
This video presentation covers creating a Spark Streaming application in Scala that retrieves tweets from Twitter and saves them to a Hive data warehouse table.
Register at developer.twitter.com to create the Twitter credentials that the code needs to log in to Twitter: a consumer key, consumer secret, access token, and access token secret.
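One way to keep these four credentials out of the source code is to read them from environment variables. The sketch below assumes that approach; the environment variable names and the load_twitter_credentials helper are hypothetical and are not taken from the video.

```python
import os

# Hypothetical environment variable names; use whatever fits your setup.
CREDENTIAL_VARS = {
    "consumer_key": "TWITTER_CONSUMER_KEY",
    "consumer_secret": "TWITTER_CONSUMER_SECRET",
    "access_token": "TWITTER_ACCESS_TOKEN",
    "access_token_secret": "TWITTER_ACCESS_TOKEN_SECRET",
}


def load_twitter_credentials(env=None):
    """Return the four credentials as a dict, failing early if any is missing."""
    env = os.environ if env is None else env
    missing = [var for var in CREDENTIAL_VARS.values() if not env.get(var)]
    if missing:
        raise KeyError("missing Twitter credentials: " + ", ".join(missing))
    return {name: env[var] for name, var in CREDENTIAL_VARS.items()}
```

Failing at startup with the names of the missing variables is usually easier to diagnose than an authentication error raised later from inside the streaming job.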
This content covers: setting up the right version of the JDK; setting up Anaconda Python 3.6 in a virtual environment; setting up and configuring Jupyter Notebook to run Python and Scala code; and setting up Eclipse with the Scala plug-in as the IDE for Spark application development in Scala.