Executing Javascript from Python

I have HTML webpages that I am crawling using xpath. The etree.tostring of a certain node gives me this string:

<script>
<!--
function escramble_758(){
  var a,b,c
  a='+1 '
  b='84-'
  a+='425-'
  b+='7450'
  c='9'
  document.write(a+c+b)
}
escramble_758()
//-->
</script>

I just need the output of escramble_758(). I can write a regex to figure out the whole thing, but I want my code to remain tidy. What is the best alternative?

I have been going through several libraries, but I didn't see an exact solution. Most of them try to emulate a browser, which makes things painfully slow.

Edit: An example would be great (barebones will do).

Answers:

Answer

You can also use Js2Py, which is written in pure Python and can both execute JavaScript and translate it to Python. It supports virtually all of JavaScript, including labels, getters, setters and other rarely used features.

import js2py

js = """
function escramble_758(){
var a,b,c
a='+1 '
b='84-'
a+='425-'
b+='7450'
c='9'
document.write(a+c+b)
}
escramble_758()
""".replace("document.write", "return ")

result = js2py.eval_js(js)  # executing JavaScript and converting the result to python string 

Advantages of Js2Py include portability and extremely easy integration with Python (since the JavaScript is basically translated to Python).

To install:

pip install js2py
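
Note that the string from etree.tostring in the question still carries the <script> tags and the <!-- ... //--> comment guards, which a bare JavaScript engine may not accept. A minimal sketch for stripping them with the standard library first, assuming the node has already been decoded to a str (extract_js is a hypothetical helper name, not part of Js2Py):

```python
import re

raw = """<script>
<!--
function escramble_758(){
  var a,b,c
  a='+1 '
  b='84-'
  a+='425-'
  b+='7450'
  c='9'
  document.write(a+c+b)
}
escramble_758()
//-->
</script>"""


def extract_js(script_html):
    """Return the JavaScript inside a <script> element, without comment guards."""
    # Drop the opening <script ...> tag and the closing </script> tag.
    body = re.sub(r"(?is)^\s*<script[^>]*>|</script>\s*$", "", script_html)
    # Drop lines that consist only of the legacy guards "<!--" or "//-->".
    body = re.sub(r"(?m)^[ \t]*(?:<!--|//[ \t]*-->)[ \t]*$", "", body)
    return body.strip()


js = extract_js(raw)
print(js)
```

The cleaned string can then take the place of the hard-coded js literal above.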

Answer

One more solution, since PyV8 seems to be unmaintained and dependent on an old version of libv8.

PyMiniRacer is a wrapper around the V8 engine. It works with the new version and is actively maintained.

pip install py-mini-racer

from py_mini_racer import py_mini_racer
ctx = py_mini_racer.MiniRacer()
ctx.eval("""
function escramble_758(){
    var a,b,c
    a='+1 '
    b='84-'
    a+='425-'
    b+='7450'
    c='9'
    return a+c+b;
}
""")
ctx.call("escramble_758")

And yes, you have to replace document.write with return, as others suggested.
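
If you would rather not edit the script by hand, the replacement can be done on the string before it goes to the engine. A rough sketch using only the standard library (write_to_return is a hypothetical helper name; the regex only adds tolerance for whitespace around the dot and parenthesis):

```python
import re


def write_to_return(js_source):
    """Rewrite document.write(...) calls into return (...) statements."""
    return re.sub(r"\bdocument\s*\.\s*write\s*\(", "return (", js_source)


js = """
function escramble_758(){
    var a,b,c
    a='+1 '
    b='84-'
    a+='425-'
    b+='7450'
    c='9'
    document.write(a+c+b)
}
"""

patched = write_to_return(js)
print(patched)
```

This trick is only safe when document.write is the last statement the function executes and is called once. A function that writes several times would return after the first call.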

Answer

quickjs should be the best option now that QuickJS has come out. Just pip install quickjs and you are ready to go.

Modified from the example in the README:

from quickjs import Function

js = """
function escramble_758(){
var a,b,c
a='+1 '
b='84-'
a+='425-'
b+='7450'
c='9'
document.write(a+c+b)
}
"""

escramble_758 = Function('escramble_758', js.replace("document.write", "return "))

print(escramble_758())

https://github.com/PetterS/quickjs

Answer

You can use a js2py context to execute your JS code and capture the output of document.write with a mock document object (your_script below stands for the JavaScript source taken from the page):

import js2py

js = """
var output;
document = {
    write: function(value){
        output = value;
    }
}
""" + your_script

context = js2py.EvalJs()
context.execute(js)
print(context.output)
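
One caveat: the mock above keeps only the last value written. If the page script calls document.write more than once, an accumulating mock may be closer to what a browser does. A small sketch of building such a source string, using only plain Python (MOCK_DOCUMENT_JS and with_mock_document are hypothetical names chosen for illustration):

```python
MOCK_DOCUMENT_JS = """
var output = '';
var document = {
    write: function (value) {
        // Append instead of overwrite so several writes are all kept.
        output += value;
    }
};
"""


def with_mock_document(script):
    """Prepend the mock document definition to a JavaScript source string."""
    return MOCK_DOCUMENT_JS + "\n" + script


combined = with_mock_document("document.write('+1 '); document.write('425');")
print(combined)
```

The combined string should then be usable with context.execute(...) exactly as in the snippet above, with the result read from context.output.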

Answer

Using PyV8, I can do this. However, I have to replace document.write with return because there's no DOM and therefore no document.

import PyV8
ctx = PyV8.JSContext()
ctx.enter()

js = """
function escramble_758(){
var a,b,c
a='+1 '
b='84-'
a+='425-'
b+='7450'
c='9'
document.write(a+c+b)
}
escramble_758()
"""

print(ctx.eval(js.replace("document.write", "return ")))

Or you could create a mock document object:

class MockDocument(object):

    def __init__(self):
        self.value = ''

    def write(self, *args):
        self.value += ''.join(str(i) for i in args)


class Global(PyV8.JSClass):
    def __init__(self):
        self.document = MockDocument()

scope = Global()
ctx = PyV8.JSContext(scope)
ctx.enter()
ctx.eval(js)
print(scope.document.value)

Answer

You can use requests-html, which will download and use Chromium underneath.

from requests_html import HTML

html = HTML(html="<a href='http://www.example.com/'>")

script = """
function escramble_758(){
    var a,b,c
    a='+1 '
    b='84-'
    a+='425-'
    b+='7450'
    c='9'
    return a+c+b;
}
"""

val = html.render(script=script, reload=False)
print(val)
# +1 425-984-7450

