Thanks for taking interest in this.
I don't have the slightest idea whether this can be accomplished, and if so, how; so I'd appreciate any guidance that allows me to start in any direction (other than the one I have at hand now: looking them up by hand!). I'm familiar with R and some VB, but if it's possible in any other language, I'll give it a try.
What I've tried:
I have used phantomjs with the RSelenium package. Details on how to set up phantomjs can be found at http://cran.r-project.org/web/packages/RSelenium/vignettes/RSelenium-saucelabs.html#id2a
phantomjs can be driven directly, without the need for a Selenium Server. It should be a lot quicker for the task you outline due to its headless nature.
The first part of your question can be achieved as follows:
    appURL <- "http://web.sivicos.gov.co:8080/consultas/consultas/consreg_encabcum.jsp"
    library(RSelenium)
    pJS <- phantom()
    remDr <- remoteDriver(browserName = "phantom")
    remDr$open()
    remDr$navigate(appURL)

    # Get the third list item of the select box (MEDICAMENTOS)
    webElem <- remDr$findElement("css", "select[name='grupo'] option:nth-child(3)")
    webElem$clickElement() # select this element

    # Send text to the input with name="expediente"
    webElem <- remDr$findElement("css", "input[name='expediente']")
    webElem$sendKeysToElement(list(2203))

    # Click the Buscar button
    remDr$findElement("id", "INPUT2")$clickElement()
Now the form has been filled in and the Buscar button clicked. The data is in an iframe named datos. Iframes need to be switched to:
    # switch to datos iframe
    remDr$switchToFrame(remDr$findElement("css", "iframe[name='datos']"))
    remDr$findElement("css", "a")$clickElement() # click the link given in the iframe

    # get the resulting data (getPageSource() returns a list; take its first element)
    appData <- remDr$getPageSource()[[1]]

    # close phantom js
    pJS$stop()
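Since the question mentions looking records up by hand, the steps above could be wrapped in a function and applied to several record numbers. This is only a sketch: `lookup_expediente` and `lookup_many` are hypothetical names, not part of RSelenium, and it assumes the page behaves the same way for every record number, that navigating back to the form returns focus to the top-level page, and that no extra waiting is needed for the iframe to load. It has not been run against the live site.

```r
# Hypothetical helper: repeat the form-fill and iframe steps for one record.
# `remDr` is an already opened RSelenium remoteDriver; `appURL` is the form URL.
lookup_expediente <- function(remDr, expediente, appURL) {
  # Assumption: loading the form again puts focus back on the top-level page
  remDr$navigate(appURL)
  # Third option of the select box (MEDICAMENTOS), as in the walkthrough
  remDr$findElement("css", "select[name='grupo'] option:nth-child(3)")$clickElement()
  # Type the record number as text into the expediente input
  remDr$findElement("css", "input[name='expediente']")$sendKeysToElement(
    list(as.character(expediente))
  )
  # Click the Buscar button
  remDr$findElement("id", "INPUT2")$clickElement()
  # Switch to the datos iframe and follow the link it contains
  remDr$switchToFrame(remDr$findElement("css", "iframe[name='datos']"))
  remDr$findElement("css", "a")$clickElement()
  # Return the page source as a single string
  remDr$getPageSource()[[1]]
}

# Hypothetical helper: look up several records, keeping NA where a lookup fails.
lookup_many <- function(remDr, expedientes, appURL) {
  pages <- lapply(expedientes, function(x) {
    tryCatch(lookup_expediente(remDr, x, appURL),
             error = function(e) NA_character_)
  })
  names(pages) <- as.character(expedientes)
  pages
}
```

With an open driver, `pages <- lookup_many(remDr, c(2203, 2204), appURL)` would return a named list of page sources (2204 is just a made-up second number). The rest of this answer continues with the single page stored in appData.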
The data for the iframe is now contained in appData. As an example, we look at the third table using the simple extraction function readHTMLTable:
    library(XML) # readHTMLTable() comes from the XML package
    readHTMLTable(appData, which = 3)
                          V1     V2      V3              V4       V5                      V6
    1 Presentacion Comercial   <NA>    <NA>            <NA>     <NA>                    <NA>
    2             Expediente Consec Termino Unidad / Medida Cantidad             Descripcion
    3              000002203     01    0176              ml    60,00  FRASCO AMBAR POR 60 ML
    4              000002203     02    0176              ml   120,00 FRASCO AMBAR POR 120 ML
    5              000002203     03    0176              ml    90,00  FRASCO AMBAR POR 90 ML
              V7     V8            V9
    1       <NA>   <NA>          <NA>
    2 Fecha insc Estado Fecha Inactiv
    3 2007/01/30 Activo
    4 2007/01/30 Activo
    5 2012/03/15 Activo
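In the output above, row 1 holds the table title and row 2 holds the real column names, so the result probably needs a little tidying before use. The sketch below shows one way to do that in base R. To stay self-contained it re-creates the table by hand from the printed values instead of calling the site; `promote_header` is a hypothetical helper, and the layout (title row, then header row) is assumed to be the same for other records.

```r
# Table re-typed by hand from the printed output (not fetched from the site)
tab <- data.frame(
  V1 = c("Presentacion Comercial", "Expediente", "000002203", "000002203", "000002203"),
  V2 = c(NA, "Consec", "01", "02", "03"),
  V3 = c(NA, "Termino", "0176", "0176", "0176"),
  V4 = c(NA, "Unidad / Medida", "ml", "ml", "ml"),
  V5 = c(NA, "Cantidad", "60,00", "120,00", "90,00"),
  V6 = c(NA, "Descripcion", "FRASCO AMBAR POR 60 ML",
         "FRASCO AMBAR POR 120 ML", "FRASCO AMBAR POR 90 ML"),
  V7 = c(NA, "Fecha insc", "2007/01/30", "2007/01/30", "2012/03/15"),
  V8 = c(NA, "Estado", "Activo", "Activo", "Activo"),
  V9 = c(NA, "Fecha Inactiv", "", "", ""),
  stringsAsFactors = FALSE
)

# Hypothetical helper: use row `header_row` as column names and drop
# that row together with everything above it.
promote_header <- function(df, header_row = 2) {
  out <- df[-seq_len(header_row), , drop = FALSE]
  # as.character() also copes with factor columns
  names(out) <- unname(vapply(df[header_row, ], as.character, character(1)))
  rownames(out) <- NULL
  out
}

clean <- promote_header(tab)

# Quantities use a decimal comma ("60,00"); convert before as.numeric()
clean$Cantidad <- as.numeric(sub(",", ".", clean$Cantidad, fixed = TRUE))
# Dates are printed as yyyy/mm/dd
clean[["Fecha insc"]] <- as.Date(clean[["Fecha insc"]], format = "%Y/%m/%d")
```

On real data, `promote_header(readHTMLTable(appData, which = 3))` would be the equivalent call, assuming the table has the same shape.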