Sunday, January 7, 2018
Get total number of tweets in twitter using apache spark scala
I am new to Apache Spark and want to find out the total number of tweets posted across the world on Twitter in every 10-second span of time. I wrote a little snippet that counts hashtags on Twitter, but what I need is the total count of tweets.
Please help me resolve this issue.
import java.io._
import org.apache.spark.streaming.{Seconds, StreamingContext}
import StreamingContext._
import org.apache.spark.SparkContext._
import org.apache.spark.streaming.twitter._

object TwitterPopularTags {
  def main(args: Array[String]) {
    val (master, filters) = (args(0), args.slice(5, args.length))

    // Twitter authentication credentials
    System.setProperty("twitter4j.oauth.consumerKey", "xxxx")
    System.setProperty("twitter4j.oauth.consumerSecret", "xxxx")
    System.setProperty("twitter4j.oauth.accessToken", "xxxx")
    System.setProperty("twitter4j.oauth.accessTokenSecret", "xxxx")

    // 10-second batch interval
    val ssc = new StreamingContext(master, "TwitterPopularTags", Seconds(10),
      System.getenv("SPARK_HOME"), StreamingContext.jarOfClass(this.getClass))
    val tweets = TwitterUtils.createStream(ssc, None)
    val statuses = tweets.map(status => status.getText())
    val words = statuses.flatMap(status => status.split(" "))
    val hashTags = words.filter(word => word.startsWith("#"))

    // Count each hashtag over the last 100 seconds, updated every 10 seconds
    val tagCounts = hashTags.window(Seconds(100), Seconds(10)).countByValue()
    tagCounts.print()

    // Nothing runs until the streaming context is started
    ssc.start()
    ssc.awaitTermination()
  }
}
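The DStream tweets contains the tweets received in each batch. tweets.count() gives the total count for each batch, so with the 10-second batch interval above, tweets.count().print() prints the number of tweets received every 10 seconds.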
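The count can be attached to the stream from the question along the following lines. This is a minimal sketch, assuming Spark Streaming is on the classpath and that tweets is the DStream returned by TwitterUtils.createStream. The object and method names (TweetCounts, perBatch, perWindow) are illustrative and not part of the original post, and the 100-second window simply mirrors the window used for the hashtag counts.

```scala
import org.apache.spark.streaming.Seconds
import org.apache.spark.streaming.dstream.DStream

// Illustrative helper; the object and method names are assumptions,
// not part of the original post or of the Spark API.
object TweetCounts {

  // One Long per batch: the number of elements received in that batch
  // interval (10 seconds in the question).
  def perBatch[T](tweets: DStream[T]): DStream[Long] =
    tweets.count()

  // One Long per slide: the number of elements seen in the last 100 seconds,
  // recomputed every 10 seconds. Both durations must be multiples of the
  // batch interval of the StreamingContext.
  def perWindow[T](tweets: DStream[T]): DStream[Long] =
    tweets.window(Seconds(100), Seconds(10)).count()
}
```

In the question's main method this would be called before ssc.start(), for example TweetCounts.perBatch(tweets).print().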
scala twitter apache-spark