I'm writing my first data scraper using Excel and VBA. I'm stuck trying to go to the next page of a website. The source code looks as follows:
<li><a href="#" onclick="changePage(2); return false;">Page 2 of 24</a></li>
This is the VBA code I have but does not seem to work:
For Each l In ie.Document.getElementsByTagName("a") If l.href = "#" And l.onclick = "changePage(2); return false;" Then l.Item(2).Click Exit For End If Next l
When I run the code I don't get any errors, but it doesn't seem to go to page 2. Keep in mind that there are more pages after page 2. My idea is replace "2" with a variable later and increase that variable by one. But I need to get it to work first.
Thanks to whoever can help.
[Edit: I now have a solution and the code has been replaced. -RDH]
First I want to mention that if the data retrieved in this manner is used for commercial purposes or anything other than personal use then it violates 2 sections of the Kelley Blue Book (kbb.com) Terms of Service.
FYI: Sites that collect, update, and maintain data like BlueBook or the MLS take their data very seriously, and they don't like people scraping it. I was speaking to an old classmate of mine who has her degree in Computer Science and is now a real estate agent, and I mentioned to her about how cool it is to be able scrape housing data off of MLS and she nearly flipped out on me. Just saying: people were paid to create that data and people make their lives using that data. 'Nuff said. I was able to get the problem code running by creating a web page on my own server that had the same format you were looking for since I get a different version of the bluebook.com site since I am in Canada. I get redirected to kbb.com.
+++ The real problem +++
The problem is that hrefs with an # symbol are actually the full URL with the # attached to the end, and when you check the onClick event it actually contains the full function declariation, so you have to only search for partial strings.
' A good idea to declare the proper datatypes ' because IHTMLElement has the click event but IHTMLAnchorElements don't Dim l As IHTMLElement Dim htmlanchors As IHTMLElementCollection ' ... Set htmlanchors = ie.Document.getElementsByTagName("a") ' Look through all the anchor tags on the page For Each l In htmlanchors ' Check to see the Href contains a # and the onclick event has specific code If InStr(l.href, "#") And InStr(l.onclick, "changePage(3); return false;") Then ' Click the current anchor link l.Click Exit For End If Next l
Have you tried
.FireEvent ("onclick") Or .FireEvent ("onmouseover") .FireEvent ("onmousedown") .FireEvent("onmouseup")
in place of
Rick – below is my entire code. I’m basically trying to scrape www.the bluebook.com.
Sub ScrapeData() Dim ie As InternetExplorer Dim ele As Object Dim RowCount As Long Dim myWebsite As String, mySearch1 As String, mySearch2 As String, mySearch3 As String Dim Document As HTMLDocument myWebsite = Range("Website").Value mySearch1 = Range("search1").Value mySearch2 = Range("search2").Value mySearch3 = Range("search3").Value Set mySheet = Sheets("Sheet1") Range("A6").Value = "Company" Range("B6").Value = "Address" Range("C6").Value = "Contact" RowCount = 7 Set ie = New InternetExplorer ie.Visible = True With ie .Visible = True .navigate (myWebsite) Do While .Busy Or .readyState <> 4 DoEvents Loop ie.Document.getElementById("search").Value = mySearch1 ie.Document.getElementById("selRegion").Value = mySearch2 ie.Document.getElementsByClassName("searchBtn")(0).Click Do While .Busy Or _ .readyState <> 4 DoEvents Loop For Each ele In .Document.all Select Case ele.className Case "result_title" RowCount = RowCount + 1 Case "cname" mySheet.Range("A" & RowCount) = ele.innerText Case "addy_wrapper" mySheet.Range("B" & RowCount) = ele.innerText End Select Next ele End With 'THIS IS THE CODE THAT IS NOT WORKING For Each l In ie.Document.getElementsByTagName("a") If l.href = "#" And l.onclick = "changePage(3); return false;" Then l.Item(3).Click Exit For End If Next l Set ie = Nothing End Sub
©2020 All rights reserved.