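I am interested in using Selenium with R. I have looked through the WebDriver (Selenium 2) API documentation. Has any work been done on an R implementation? How would I approach this? The documentation describes running a Selenium server and says the API can be queried using JavaScript. Any help would be much appreciated.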
Selenium can be accessed through the JsonWireProtocol.
First, start the Selenium server from the command line:
    java -jar selenium-server-standalone-2.25.0.jar
A new Firefox browser can then be opened as follows:
    library(RCurl)
    library(RJSONIO)
    library(XML)

    baseURL <- "http://localhost:4444/wd/hub/"

    # Request a new session that drives Firefox
    server <- list(desiredCapabilities = list(browserName = 'firefox', javascriptEnabled = TRUE))
    getURL(paste0(baseURL, "session"),
           customrequest = "POST",
           httpheader = c('Content-Type' = 'application/json;charset=UTF-8'),
           postfields = toJSON(server))

    # List the active sessions and take the ID of the first one
    serverDetails <- fromJSON(rawToChar(getURLContent('http://localhost:4444/wd/hub/sessions', binary = TRUE)))
    serverId <- serverDetails$value[[1]]$id
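Every step in this answer repeats the same request pattern, so it may help to wrap it once. The following is only a minimal sketch: `wd_url` and `wd_request` are hypothetical helper names, not part of RCurl or RJSONIO, and the sketch assumes the server answers with a JSON body as in the calls shown here.

```r
# Hypothetical helpers (not part of RCurl or RJSONIO) wrapping the
# request pattern used throughout this answer.

# Build an endpoint URL from the hub base URL and path segments.
wd_url <- function(base_url, ...) {
  paste0(sub("/+$", "", base_url), "/", paste(c(...), collapse = "/"))
}

# Send a JSON request to the Selenium server and decode the JSON reply.
# Needs the RCurl and RJSONIO packages and a running server.
wd_request <- function(method, url, body = NULL) {
  args <- list(url,
               customrequest = method,
               httpheader = c('Content-Type' = 'application/json;charset=UTF-8'),
               binary = TRUE)
  if (!is.null(body)) args$postfields <- RJSONIO::toJSON(body)
  txt <- rawToChar(do.call(RCurl::getURLContent, args))
  # Assumption: an empty reply is treated as "no result"
  if (nzchar(txt)) RJSONIO::fromJSON(txt) else NULL
}
```

With these in place, a call such as `wd_request("POST", wd_url(baseURL, "session", serverId, "url"), list(url = "http://www.google.com"))` would stand in for the longer form.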
Navigate to Google:
    getURL(paste0(baseURL, "session/", serverId, "/url"),
           customrequest = "POST",
           httpheader = c('Content-Type' = 'application/json;charset=UTF-8'),
           postfields = toJSON(list(url = "http://www.google.com")))
Get the ID of the search box:
    # Locate the element by XPath; the reply holds the element ID used by the server
    elementDetails <- fromJSON(rawToChar(getURLContent(
      paste0(baseURL, "session/", serverId, "/element"),
      customrequest = "POST",
      httpheader = c('Content-Type' = 'application/json;charset=UTF-8'),
      postfields = toJSON(list(using = "xpath", value = "//*[@id=\"gbqfq\"]")),
      binary = TRUE)))
    elementId <- elementDetails$value
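The XPath above only matches on the element's `id` attribute, and the JsonWireProtocol also lists an `id` locator strategy that expresses this directly. As a rough sketch, a small helper can build and check the locator body before it is sent; `wd_strategies` and `wd_locator` are hypothetical names introduced here for illustration.

```r
# Locator strategies listed in the JsonWireProtocol documentation.
wd_strategies <- c("class name", "css selector", "id", "name",
                   "link text", "partial link text", "tag name", "xpath")

# Hypothetical helper: build the body for POST /session/:sessionId/element,
# rejecting strategies the protocol does not define.
wd_locator <- function(using, value) {
  if (!using %in% wd_strategies) {
    stop("unknown locator strategy: ", using)
  }
  list(using = using, value = value)
}

# Same element as the XPath lookup, addressed by its id
searchBox <- wd_locator("id", "gbqfq")
```

The resulting list can be passed to `toJSON()` and used as `postfields` in place of the hand-written XPath body.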
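Search for something:
    # "\uE009" is the Control key: Ctrl+A selects any existing text, the second
    # "\uE009" releases Control, '\b' deletes the selection, then the query is typed
    rawToChar(getURLContent(
      paste0(baseURL, "session/", serverId, "/element/", elementId, "/value"),
      customrequest = "POST",
      httpheader = c('Content-Type' = 'application/json;charset=UTF-8'),
      postfields = toJSON(list(value = list("\uE009", "a", "\uE009", '\b', 'Selenium api in R'))),
      binary = TRUE))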
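The key sequence sent above is easier to read with names attached to the special key codes. This is a sketch only: `wd_keys` and `wd_replace_text` are hypothetical names, the code points are those the JsonWireProtocol assigns to special keys, and the sketch swaps in the protocol's Backspace code (U+E003) where the call above uses `'\b'`.

```r
# A few special key codes from the JsonWireProtocol key table.
wd_keys <- list(
  null      = intToUtf8(0xE000),  # releases all held modifier keys
  backspace = intToUtf8(0xE003),
  enter     = intToUtf8(0xE007),
  control   = intToUtf8(0xE009)
)

# Hypothetical helper: key sequence that selects all existing text,
# deletes it, and then types `text`.
wd_replace_text <- function(text) {
  list(wd_keys$control, "a",   # hold Control and press "a" (select all)
       wd_keys$control,        # sending Control again releases it
       wd_keys$backspace,      # delete the selection
       text)
}

keysToSend <- wd_replace_text("Selenium api in R")
```

`list(value = keysToSend)` could then be serialised with `toJSON()` for the `/value` request.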
Return the HTML of the search results:
    googData <- fromJSON(rawToChar(getURLContent(
      paste0(baseURL, "session/", serverId, "/source"),
      customrequest = "GET",
      httpheader = c('Content-Type' = 'application/json;charset=UTF-8'),
      binary = TRUE)))
Get the suggested links:
    # Parse the page source and pull the href of every element with class "l"
    gxml <- htmlParse(googData$value)
    urls <- unname(xpathSApply(gxml, "//*[@class='l']/@href"))
Close the session:
    getURL(paste0(baseURL, "session/", serverId),
           customrequest = "DELETE",
           httpheader = c('Content-Type' = 'application/json;charset=UTF-8'))
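If any step between opening and closing the session fails, the browser is left running. One way to guard against that, shown here purely as a sketch, is to register the cleanup with `on.exit()`. `with_session` is a hypothetical helper, and its three arguments are functions you would supply yourself, for example wrappers around the POST and DELETE calls in this answer.

```r
# Hypothetical helper: open a session, run `action` with its ID, and
# close the session even if `action` raises an error.
with_session <- function(open_session, close_session, action) {
  id <- open_session()
  on.exit(close_session(id), add = TRUE)
  action(id)
}

# Stand-in functions so the sketch runs without a Selenium server
closed <- character(0)
result <- with_session(
  open_session  = function() "demo-session",
  close_session = function(id) closed <<- c(closed, id),
  action        = function(id) paste("working in", id)
)
```

In real use, `open_session` would return `serverId` and `close_session` would issue the DELETE request.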