xml - How to recognize data format - scraping in R


I am trying to use R to get data from an open data source in the Netherlands. The source is here.

When I open it in a browser (at least Chrome), I am presented with XML code. So I thought I could use the RCurl package to parse it, and then use XPath to extract the specific nodes I seek.

However, when I try to parse it, I run into problems. It does not seem to be straight XML, but has JSON in it.

How can I extract information from this data source? I am not looking for a full solution, just guidance in the right direction.

If I try:

library(RCurl)
library(XML)
url <- "http://www.kiesbeter.nl/open-data/api/care/careproviders/?apikey=18a2b2b0-d232-4f48-8d10-5fc10ff04b17"
html <- getURL(url)
doc <- htmlParse(html, asText = TRUE)

it seems that doc is still in JSON format. I cannot seem to use getNodeSet(doc, "//careproviders"). However, if I use fromJSON first, the result is in an awkward list format.

So my question is how I can treat this data so that I can get information out of the dataset (e.g. the care providers). And how do I recognize which format the data is in?

Use

html <- getURL(url, httpheader = c(Accept = "text/xml"))

This sets the Accept header so that curl asks the service for XML as the content type.
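Once the response really is XML, XPath extraction should work as originally intended. The following is only a minimal sketch on a made-up document: the element names careprovider and name are hypothetical, and the real feed may use different names, nesting, or namespaces. It uses xmlParse() rather than htmlParse(), on the assumption that a strict XML parse is wanted.

```r
library(XML)  # provides xmlParse(), xpathSApply(), xmlValue()

# Hypothetical sample standing in for the service's XML response;
# the real element names are an assumption and may differ.
sample_xml <- paste0(
  "<careproviders>",
  "<careprovider><name>Alpha Clinic</name></careprovider>",
  "<careprovider><name>Beta Hospital</name></careprovider>",
  "</careproviders>"
)

# Parse the string as XML (asText = TRUE: treat the input as content, not a file name)
doc <- xmlParse(sample_xml, asText = TRUE)

# Collect the text of every <name> element nested in a <careprovider>
provider_names <- xpathSApply(doc, "//careprovider/name", xmlValue)
print(provider_names)
```

With the real service, sample_xml would be replaced by the string returned from getURL() with the Accept header set.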

A little clarification. The service provides both XML and JSON data formats, with JSON as the default. The browser sends text/xml (among others) in the Accept header of its request, so the service returns XML. curl (by default) doesn't send it, so the service returns JSON, its default type.
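As for recognizing the format, one rough approach is to look at the first non-whitespace character of the response body. The helper below, detect_format, is a hypothetical name and only a heuristic sketch in base R; the Content-Type header of the response is the more reliable signal when it is available.

```r
# Hypothetical helper: guess the format of a response body from its
# first non-whitespace character. A heuristic, not a validator.
detect_format <- function(txt) {
  trimmed <- sub("^[[:space:]]+", "", txt)  # drop leading whitespace
  first <- substr(trimmed, 1, 1)
  if (first == "<") {
    "xml"       # XML (and HTML) documents start with a tag or declaration
  } else if (first %in% c("{", "[")) {
    "json"      # JSON objects and arrays start with a brace or bracket
  } else {
    "unknown"
  }
}

print(detect_format("  <careproviders/>"))
print(detect_format("{\"careproviders\": []}"))
```

Calling detect_format(html) on the string returned by getURL() would then indicate whether to hand it to an XML parser or to fromJSON.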

