Parsing JSON with R

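I am fairly new to R, but the more I use it, the more I see how powerful it really is compared with SAS or SPSS. In my opinion, one of its major benefits is the ability to fetch and analyze data from the web. I imagine this is possible (and maybe even straightforward), but I am looking to parse JSON data that is publicly available on the web. I am not a programmer by any stretch of the imagination, so any help and instruction you can provide would be greatly appreciated. Even if you just point me to a basic working example, I can probably work through it.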

RJSONIO from Omegahat is another package which provides facilities for reading and writing data in JSON format.

rjson does not use S4/S3 methods and so is not readily extensible, but it is still useful. Unfortunately, it does not use vectorized operations and so is too slow for non-trivial data. Similarly, reading JSON data into R is somewhat slow and does not scale to large data, should this be an issue.

Update (new Package 2013-12-03):

jsonlite: This package is a fork of the RJSONIO package. It builds on the parser from RJSONIO but implements a different mapping between R objects and JSON strings. The C code in this package is mostly from the RJSONIO Package, the R code has been rewritten from scratch. In addition to drop-in replacements for fromJSON and toJSON, the package has functions to serialize objects. Furthermore, the package contains a lot of unit tests to make sure that all edge cases are encoded and decoded consistently for use with dynamic data in systems and applications.

Here is the missing example

library(rjson)
url <- 'http://someurl/data.json'
document <- fromJSON(file=url, method='C')
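Since the URL above is only a placeholder, here is a minimal self-contained sketch of the same call on an inline string. The JSON text and its field names are made up for illustration; the only assumption is that the rjson package is installed.

```r
library(rjson)

# Made-up JSON text standing in for the contents of a remote file
json_text <- '{"name": "example", "values": [1, 2, 3], "nested": {"ok": true}}'

# Here the JSON text is passed directly instead of using file=
document <- fromJSON(json_text)

str(document)        # the JSON object becomes a named list
document$name        # the string field
document$values      # the array of numbers becomes a vector
document$nested$ok   # nested objects become nested lists
```

Once this works, the string can be swapped for the file= argument pointing at a real URL or a local file.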

For the record, rjson and RJSONIO do convert the JSON into R objects (nested lists), but they don't really parse it into a table per se. For instance, I receive ugly MongoDB data in JSON format, convert it with rjson or RJSONIO, then use unlist and tons of manual correction to actually get it into a usable matrix.
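When every record carries the same scalar fields, the manual step can sometimes be shortened. The following is only a sketch under that assumption, with made-up records and hypothetical field names (id, name, score); real MongoDB exports with missing or nested fields will need more work.

```r
library(rjson)

# Made-up records; every record has the same three scalar fields
json_text <- '[
  {"id": 1, "name": "alpha", "score": 9.5},
  {"id": 2, "name": "beta",  "score": 7.0}
]'

# rjson returns an unnamed list with one named list per record
records <- rjson::fromJSON(json_text)

# Turn each record into a one-row data.frame, then stack the rows
rows <- lapply(records, function(rec) as.data.frame(rec, stringsAsFactors = FALSE))
result <- do.call(rbind, rows)

print(result)
```

The rbind step fails if the records do not share the same field names, which is exactly where the manual correction comes in.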

The jsonlite package is easy to use and tries to convert JSON into data frames.

Example:

library(jsonlite)


# URL of a Stack Exchange API endpoint that lists Stack Overflow badges
url <- 'https://api.stackexchange.com/2.2/badges?order=desc&sort=rank&site=stackoverflow'


# read the URL; arrays of records in the response are converted to data.frames
document <- fromJSON(txt=url)
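To see what the result looks like without a network connection, here is a small sketch that runs the same function on an inline string. The JSON below is made up and only mimics a response that wraps its records in an "items" array; the field names are assumptions for illustration.

```r
library(jsonlite)

# Made-up response: a wrapper object holding an array of records
json_text <- '{
  "items": [
    {"name": "gold-badge",   "rank": "gold",   "award_count": 10},
    {"name": "silver-badge", "rank": "silver", "award_count": 25}
  ],
  "has_more": false
}'

document <- fromJSON(json_text)

class(document)         # the top level is a list
class(document$items)   # the array of records becomes a data.frame
document$items$name     # columns are accessed as usual
```

In other words, the data frame usually sits one level down, inside the element that holds the array of records.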

The fromJSON() functions in RJSONIO, rjson and jsonlite don't return a simple 2D data.frame for complex nested JSON objects.

To overcome this, you can use tidyjson. It takes in JSON and always returns a data.frame. It is currently not available on CRAN; you can get it here: https://github.com/sailthru/tidyjson

Update: tidyjson is now available on CRAN; you can install it directly using install.packages("tidyjson")
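A rough sketch of how this can look, assuming tidyjson is installed. The JSON and the field names (name, stats, score) are made up, and the verbs used are the ones described in the package README, so check them against the version you have.

```r
library(tidyjson)

# Made-up nested JSON: each record holds a nested "stats" object
json_text <- '[
  {"name": "alpha", "stats": {"score": 9.5}},
  {"name": "beta",  "stats": {"score": 7.0}}
]'

doc  <- as.tbl_json(json_text)   # wrap the JSON text
rows <- gather_array(doc)        # one row per array element

# Pull named values into columns; several strings form a path into nested objects
result <- spread_values(
  rows,
  name  = jstring("name"),
  score = jnumber("stats", "score")
)

result
```

Each value is picked out explicitly by its path, so the nesting never leaks into the resulting table.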

Try the code below, which uses RJSONIO, in the console:

library(RJSONIO)
library(RCurl)

# download the raw JSON text
json_file <- getURL("https://raw.githubusercontent.com/isrini/SI_IS607/master/books.json")

# parse the JSON text into an R object
json_file2 <- RJSONIO::fromJSON(json_file)

head(json_file2)