I've found that the following code works well for viewing the standard 1% sample of the Twitter firehose in the Python shell:

    from __future__ import print_function  # print() also works on Python 2

    import sys
    import tweepy

    # Fill in your own application credentials
    consumer_key = ""
    consumer_secret = ""
    access_key = ""
    access_secret = ""

    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_key, access_secret)
    api = tweepy.API(auth)

    class CustomStreamListener(tweepy.StreamListener):
        def on_status(self, status):
            print(status.text)

        def on_error(self, status_code):
            print('Encountered error with status code:', status_code, file=sys.stderr)
            return True  # Don't kill the stream

        def on_timeout(self):
            print('Timeout...', file=sys.stderr)
            return True  # Don't kill the stream

    sapi = tweepy.streaming.Stream(auth, CustomStreamListener())
    sapi.filter(track=['manchester united'])

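How can I add a filter so that I only parse tweets from a certain location? I've seen people add GPS to other Twitter-related Python code, but I can't find anything specific to sapi in the Tweepy module.
Any ideas?
Thanks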
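The streaming API doesn't allow filtering by location AND keyword at the same time.
Bounding boxes do not act as filters for other filter parameters. For example, track=twitter&locations=-122.75,36.8,-121.75,37.8 would match any tweet containing the term Twitter (even a non-geo tweet) OR any tweet coming from the San Francisco area.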
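To make the OR behaviour concrete, here is a minimal sketch that models it in plain Python, with no Twitter connection. The helper names `in_bounding_box` and `stream_would_match` are hypothetical, and the keyword check is a simple case-insensitive substring test, which is a simplification of how the real track parameter matches terms. The bounding box is assumed to be in the same west, south, east, north order as the locations value above.

```python
def in_bounding_box(point, bbox):
    """Return True if a (longitude, latitude) point lies inside bbox.

    bbox is [west_lon, south_lat, east_lon, north_lat].
    A tweet without a location is passed as point=None.
    """
    if point is None:
        return False
    lon, lat = point
    west, south, east, north = bbox
    return west <= lon <= east and south <= lat <= north


def stream_would_match(text, point, keyword, bbox):
    """Model the OR semantics: keyword match OR inside the box."""
    return keyword.lower() in text.lower() or in_bounding_box(point, bbox)


SAN_FRANCISCO = [-122.75, 36.8, -121.75, 37.8]

# A non-geo tweet that mentions the keyword still matches.
print(stream_would_match("I love Twitter", None, "twitter", SAN_FRANCISCO))
# A tweet from inside the box matches even without the keyword.
print(stream_would_match("Nice weather", (-122.42, 37.77), "twitter", SAN_FRANCISCO))
# Outside the box and no keyword: no match.
print(stream_would_match("Nice weather", (-0.13, 51.51), "twitter", SAN_FRANCISCO))
```

Because either condition alone is enough, combining track and locations in one request widens the stream instead of narrowing it.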
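What you can do is ask the streaming API for either the keyword tweets or the located tweets, and then filter the resulting stream in your own app by looking at each tweet.
If you modify your code as follows, it will capture tweets from the United Kingdom and then filter them so that only those containing "manchester united" are shown: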

    from __future__ import print_function  # print() also works on Python 2

    import sys
    import tweepy

    # Fill in your own application credentials
    consumer_key = ""
    consumer_secret = ""
    access_key = ""
    access_secret = ""

    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_key, access_secret)
    api = tweepy.API(auth)

    class CustomStreamListener(tweepy.StreamListener):
        def on_status(self, status):
            # Keyword filter applied client-side to the located tweets
            if 'manchester united' in status.text.lower():
                print(status.text)

        def on_error(self, status_code):
            print('Encountered error with status code:', status_code, file=sys.stderr)
            return True  # Don't kill the stream

        def on_timeout(self):
            print('Timeout...', file=sys.stderr)
            return True  # Don't kill the stream

    sapi = tweepy.streaming.Stream(auth, CustomStreamListener())
    # Bounding box: south-west corner (lon, lat), then north-east corner (lon, lat)
    sapi.filter(locations=[-6.38, 49.87, 1.77, 55.81])
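
The same idea works the other way round: request the keyword from the stream and check the location yourself. The sketch below is only an illustration of that check, not part of Tweepy. The helper names `point_from_coordinates` and `keep_tweet` are hypothetical, and it assumes the tweet's coordinates field is a GeoJSON-style Point dict with longitude first, as in Twitter's tweet JSON. Tweets without an exact point are dropped here, which may discard tweets that only carry a place.

```python
UK_BOX = [-6.38, 49.87, 1.77, 55.81]  # same box as in the code above


def point_from_coordinates(coordinates):
    """Extract (lon, lat) from a GeoJSON-style Point dict, or return None."""
    if not coordinates or coordinates.get("type") != "Point":
        return None
    lon, lat = coordinates["coordinates"]
    return lon, lat


def keep_tweet(text, coordinates, keyword, bbox):
    """Client-side AND: keyword present AND exact point inside bbox."""
    if keyword.lower() not in text.lower():
        return False
    point = point_from_coordinates(coordinates)
    if point is None:
        return False  # no exact point, so the location cannot be confirmed
    lon, lat = point
    west, south, east, north = bbox
    return west <= lon <= east and south <= lat <= north


# Sample data made up for illustration only
manchester = {"type": "Point", "coordinates": [-2.24, 53.48]}
new_york = {"type": "Point", "coordinates": [-74.0, 40.71]}

print(keep_tweet("Manchester United won!", manchester, "manchester united", UK_BOX))
print(keep_tweet("Manchester United won!", new_york, "manchester united", UK_BOX))
print(keep_tweet("Manchester United won!", None, "manchester united", UK_BOX))
```

Inside on_status, a call such as keep_tweet(status.text, status.coordinates, ...) could replace the keyword test, assuming status.coordinates holds that dict or None.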