This is a follow-up to my popular and technically challenging question, "HTML injection into someone else's website?".
To recap: I'd like to demo my technology to customers without actually modifying their live website (e.g. demoing the idea of Stackoverflow financial bounties, without modifying the live site). Essentially, I'm trying to create a server-side version of Greasemonkey.
I've implemented the mirror as follows:
1. The browser sends a request to http://myserver.com/forward?uri=[remote]
2. The server opens a connection to [remote], pulls down the data and returns its body/headers to the request from #1.
I chose this syntax because I needed to handle requests from multiple domains (meaning, if stackoverflow.com links to meta.stackoverflow.com, I need to handle both domains from the same forwarding server).
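To make the forwarding syntax concrete, here is a minimal sketch of how a remote URL might be packed into the uri parameter and unpacked again on the server. The helper names toForwardUrl and fromForwardUrl are hypothetical, and the http/https-only check is an assumption of this sketch rather than something the scheme requires.

```javascript
// Forwarding endpoint as described above; the helper names are hypothetical.
const FORWARD_ENDPOINT = "http://myserver.com/forward";

// Wrap a remote URL so the browser requests it through the forwarding server.
function toForwardUrl(remote) {
  return FORWARD_ENDPOINT + "?uri=" + encodeURIComponent(remote);
}

// Recover the remote URL from a forwarded request URL (server side).
function fromForwardUrl(requestUrl) {
  const remote = new URL(requestUrl).searchParams.get("uri");
  if (remote === null) {
    throw new Error("missing uri parameter");
  }
  const parsed = new URL(remote); // throws on malformed input
  // Assumption: only http(s) targets are mirrored.
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new Error("unsupported protocol: " + parsed.protocol);
  }
  return parsed.href;
}
```

Percent-encoding the whole remote URL keeps its own query string from being confused with the forwarding server's parameters, which matters once pages on several domains link to each other.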
I have managed to rewrite links in the HTML and CSS files so they are relative to my server. The final hurdle is rewriting URLs referenced by Javascript files.
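For reference, the CSS half of that rewriting can be sketched roughly as follows. This is a simplified illustration, not my exact implementation: rewriteCssUrls and toForwardUrl are hypothetical names, and the regex only handles plain url(...) references (it ignores @import "file.css" strings and escaped characters).

```javascript
// Hypothetical helper: wrap a remote URL in the forwarding syntax.
function toForwardUrl(remote) {
  return "http://myserver.com/forward?uri=" + encodeURIComponent(remote);
}

// Rewrite every url(...) reference in a stylesheet.
// baseUrl is the remote URL the stylesheet was fetched from, so that
// relative references resolve against the original site.
function rewriteCssUrls(css, baseUrl) {
  const pattern = /url\(\s*(['"]?)([^'")]+)\1\s*\)/g;
  return css.replace(pattern, (match, quote, ref) => {
    if (ref.trim().startsWith("data:")) {
      return match; // inline data needs no network request
    }
    const absolute = new URL(ref.trim(), baseUrl).href;
    return "url(" + quote + toForwardUrl(absolute) + quote + ")";
  });
}
```

Resolving against the stylesheet's own remote URL first is what lets a relative reference like ../img/bg.png end up pointing at the right remote file.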
What is the best way to programmatically rewrite URLs referenced by someone else's Javascript code? Is this even technically doable?
I'll give you an example of the technical hurdle I am facing. Take http://www.honda.com/ for example. They embed a Flash element on the page, but instead of embedding <object> directly, they use Javascript to document.write() the <object> tag containing the URL.
Ideally we want to intercept DOM changes before they render, so the browser does not request URLs before we have a chance to rewrite them.
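For the document.write() case specifically, one possible approach is to wrap the function before any of the site's scripts run. The sketch below is only an illustration under several assumptions: interceptDocumentWrite, rewriteMarkup and toForwardUrl are hypothetical names, only absolute http(s) URLs inside quoted attribute values are rewritten, and markup split across several write() calls would slip through. It does nothing for elements created through innerHTML or createElement.

```javascript
const FORWARD_ENDPOINT = "http://myserver.com/forward";

// Hypothetical helper: wrap a remote URL in the forwarding syntax.
function toForwardUrl(remote) {
  return FORWARD_ENDPOINT + "?uri=" + encodeURIComponent(remote);
}

// Rewrite absolute http(s) URLs that appear as quoted values in a fragment.
function rewriteMarkup(html) {
  return html.replace(/(["'])(https?:\/\/[^"']+)\1/g, (match, quote, url) => {
    if (url.startsWith(FORWARD_ENDPOINT)) {
      return match; // already points at the forwarding server
    }
    return quote + toForwardUrl(url) + quote;
  });
}

// Replace doc.write with a wrapper that rewrites each chunk first.
function interceptDocumentWrite(doc) {
  const originalWrite = doc.write.bind(doc);
  doc.write = (...chunks) =>
    originalWrite(...chunks.map((chunk) => rewriteMarkup(String(chunk))));
}

// In the browser this would have to run before any third-party script:
// interceptDocumentWrite(document);
```

Taking the document as a parameter keeps the wrapper testable outside a browser; the catch is that the forwarding server would have to inject this script at the very top of every mirrored page.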
A server-side solution will not work. Even if I can rewrite all DOM URLs, I've seen an example where an embedded Flash application references URLs stored in Javascript variables. There is no programmatic way to detect that these variables represent URLs because the Flash application is opaque.
Next, I plan on trying a client-side solution. I will load the original website in one frame, and manipulate its contents using Javascript in a second (hidden) frame. I hope to be able to inject new DOM elements (to demo my product) without having to rewrite the existing elements.
Very challenging and interesting task. I would start by saving the Javascript files on my server and referencing them from the HTML served. Then I would find the URLs in the files (using a regex or something) and replace them with the wanted values. I know it is not very fast or very dynamic, but I believe it would be easier to implement.
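A rough sketch of what I mean by the regex step, assuming the forwarding syntax from the question. The function names rewriteScriptUrls and toForwardUrl are made up for illustration, and the pattern only catches complete absolute URLs written as string literals; anything assembled at runtime by concatenation is left alone.

```javascript
// Hypothetical helper: wrap a remote URL in the question's forwarding syntax.
function toForwardUrl(remote) {
  return "http://myserver.com/forward?uri=" + encodeURIComponent(remote);
}

// Replace absolute http(s) URLs that appear as whole string literals
// in a saved script file.
function rewriteScriptUrls(source) {
  const literal = /(["'])(https?:\/\/[^"'\s]+)\1/g;
  return source.replace(literal, (match, quote, url) =>
    quote + toForwardUrl(url) + quote);
}

// Example:
// rewriteScriptUrls('var movie = "http://www.honda.com/flash/home.swf";');
```

A line such as `var u = "http://" + host + "/x";` passes through unchanged, so this would probably need to be combined with some patching by hand.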
Hope I helped!
Answering my own question.
After much research, I find this technique works best: https://stackoverflow.com/a/23231268/14731
In other words, there doesn't seem to be a general algorithm to rewrite links. Patching them by hand isn't as much work as you'd expect.