lala
Level VIII

What url you need to log in to and how you can get cookies and download data through JSL login.

Hello everyone!

This site requires cookies to download the complete data. How do I get cookies and download data through JSL login?

Thanks!

VBA

Sub Post()
Dim User_agent, Post_data, Response_Text, username, password, cookie, json
' Credentials are read from worksheet cells D1 and F1
username = [d1]: password = [f1]
User_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36"
Post_data = "return_url=https%3A%2F%2Fwww.jisilu.cn%2Fdata%2Fcbnew%2F&user_name=" & jslencode(username) & "&password=" & jslencode(password) & "&aes=1&auto_login=0"
With CreateObject("WinHttp.WinHttpRequest.5.1")
    .Open "get", "https://www.jisilu.cn/login/", False
    .setrequestheader "User-Agent", User_agent
    .send
    .Open "post", "https://www.jisilu.cn/webapi/account/login_process/", False
    .setrequestheader "User-Agent", User_agent
    .setrequestheader "Content-Type", "application/x-www-form-urlencoded; charset=UTF-8"
    .send (Post_data)
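    ' Keep the text that follows the second "Set-Cookie: " header as the session cookie
    cookie = Split(.getallresponseheaders, "Set-Cookie: ")(2)
    ' Request the data page (the ID comes from cell B1), sending the cookie back
    .Open "get", "https://www.jisilu.cn/data/cbnew/detail_hist/" & ([b1] & ""), False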
    .setrequestheader "User-Agent", User_agent
    .setrequestheader "cookie", cookie
    .send "fprice=&tprice=&curr_iss_amt=&volume=&svolume=&premium_rt=&ytm_rt=&rating_cd=&is_search=N&market_cd%5B%5D=shmb&market_cd%5B%5D=shkc&market_cd%5B%5D=szmb&market_cd%5B%5D=szcy&btype=&listed=Y&qflag=N&sw_cd=&bond_ids=&rp=50&page=1"
    Response_Text = .responsetext
    Set json = JsonConverter.ParseJson(Response_Text)
End With
End Sub
2 ACCEPTED SOLUTIONS

Craige_Hales
Super User

Re: What url you need to log in to and how you can get cookies and download data through JSL login.

This is beyond the scope of this forum. Perhaps the site provides an API for what you want to do.

 

I think the site uses javascript to help verify you are properly logged in.  JMP does not run the javascript.

The captcha may be part of the problem, and may require human interaction.

If the user name or password contain special characters then it will need some sort of encoding.

The 413 error code might mean something else is wrong.
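The point about special characters can be illustrated with a minimal Python sketch. It uses the standard library's `urllib.parse.urlencode` to percent-encode form fields before they go into a POST body. The field names follow the VBA post data; the sample credentials are made up for illustration.

```python
from urllib.parse import urlencode

# Hypothetical credentials containing characters that are unsafe in a form body
fields = {"user_name": "a b&c", "password": "p@ss=1"}

# urlencode percent-encodes each value and joins the pairs with "&"
post_data = urlencode(fields)
print(post_data)
```

Without the encoding, the `&` and `=` inside the values would be read as field separators, and the server would see a truncated user name or password.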

 

Craige


Craige_Hales
Super User

Re: What url you need to log in to and how you can get cookies and download data through JSL login.

And, if you still want to look at it, 1F8B is the gzip signature (the response is compressed because the Accept-Encoding request header listed gzip, among others). You could use

blob=Char To Blob(  "~1F~8B~08~00~00~00~00~00~00~03~ABVJ~CEOIU~B2214~D6Q~CA-NW~B2Rz~BEg~D7~D3~0D~13~9FOY~F1~ACc~FB~D3~09~BD~CF:~A6=~ED_~FCd~F7~12%~1D~A5~94~C4~92D%~ABj~A5~E4~C4~82~92~E4~0C 3~AF4'~A7~B6~16~00x&~D1~03E~00~00~00",  "ascii~hex" );
write(blobtochar(Gzip Uncompress(blob)))
{"code":413,"msg":"缺少用户名或口令","data":{"captcha":null}}

Google Translate: "Missing username or password"
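For anyone doing the same check outside JMP, here is a minimal Python sketch of the idea: look for the two-byte gzip signature (1F 8B) and decompress only when it is present. The function name `decode_body` is hypothetical, and the sample bytes are produced locally by compressing the JSON text shown above rather than taken from the site.

```python
import gzip
import json

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def decode_body(raw):
    """Decompress the body if it carries the gzip signature, then parse it as JSON."""
    if raw[:2] == GZIP_MAGIC:
        raw = gzip.decompress(raw)
    return json.loads(raw.decode("utf-8"))


# Build a compressed sample locally from the JSON text in the post above
sample_json = '{"code": 413, "msg": "缺少用户名或口令", "data": {"captcha": null}}'
compressed = gzip.compress(sample_json.encode("utf-8"))

decoded = decode_body(compressed)                  # gzip path
plain = decode_body(sample_json.encode("utf-8"))   # already-plain path
print(decoded["code"], decoded["msg"])
```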

 

But I really think using Selenium might be a better answer. Browser Scripting with Python Selenium shows how to script login and page through some data, loading it into a data table.

Craige


14 REPLIES
Craige_Hales
Super User

Re: What url you need to log in to and how you can get cookies and download data through JSL login.

I'm not a VB user, but it looks like this should work in JMP using three New HTTP Request calls.

The three sections in the VB code end with a ".send".

 

The first section goes to a login page. I'm not sure why that is necessary. It doesn't give you the cookie, but the site might require you to visit it first.

 

The second section sends the post_data to a URL that will give you a cookie that means you are logged in.

JMP's HTTP Request may persist that cookie for you, or you might need to tell it to write the cookie to a file and then read it back from the file.

The (2) indexes the array that Split returns: it picks the text that follows the second "Set-Cookie: " in the response headers.
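As a rough Python analogue of that `Split(...)(2)` call, the sketch below splits a made-up header block the same way. The header names and cookie values are invented for illustration and are not the site's real cookies.

```python
# Made-up response headers; the real site's cookie names will differ
raw_headers = (
    "Content-Type: application/json\r\n"
    "Set-Cookie: first=1; path=/\r\n"
    "Set-Cookie: second=2; path=/\r\n"
    "Date: Wed, 27 Apr 2022 13:00:00 GMT\r\n"
)

# Index 2 is everything after the second "Set-Cookie: " marker,
# including cookie attributes and any headers that follow it
third_piece = raw_headers.split("Set-Cookie: ")[2]

# Only the name=value pair before the first ";" is needed in a Cookie header
name_value = third_piece.split(";", 1)[0]
print(repr(third_piece))
print(name_value)
```

The split keeps trailing attributes and later headers, so picking a piece by position is fragile if the server changes the order or number of cookies.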

 

The third section uses the URL to get some data and supplies the cookie as your credential. JMP may do that for you.

// get a cookie
try(deletefile("$temp/cookie.txt"));
s = New HTTP Request(
    URL( "http://httpbin.org/cookies/set/aaa/bbb" ),
    Method( "GET" ),
    Headers( {"Accept: application/json", "User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:99.0) Gecko/20100101 Firefox/99.0"} ),
    cookiefile("$temp/cookie.txt")
);
data = s << Send;

show(loadtextfile("$temp/cookie.txt"));



// send a cookie
s = New HTTP Request(
    URL( "http://httpbin.org/cookies" ),
    Method( "GET" ),
    Headers( {
        "Accept: application/json",
        "Cookie: PHPSESSID=298zf09hf012fh2; csrftoken=u32t4o3tb3gg43; _gat=1",
        "User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:99.0) Gecko/20100101 Firefox/99.0"
    } )
);
data = s << Send;
Show( data );  // note the aaa cookie was persisted
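For comparison, the log-in-then-fetch flow can be sketched in Python with a cookie jar that stores and resends cookies automatically. This is only an illustration under stated assumptions: the tiny local server, its `/login` and `/data` paths, and the `session=abc123` cookie are all hypothetical stand-ins so the example runs without touching the real site.

```python
import http.cookiejar
import threading
import urllib.parse
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Stand-in server: POST sets a cookie, GET echoes the Cookie header back."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)  # consume the form body
        self.send_response(200)
        self.send_header("Set-Cookie", "session=abc123; Path=/")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_GET(self):
        body = (self.headers.get("Cookie") or "no cookie").encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_with_session(base_url, username, password):
    """Log in with a form POST, then fetch data with the stored cookie."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({}),  # bypass any proxy for the local demo
        urllib.request.HTTPCookieProcessor(jar),
    )
    form = urllib.parse.urlencode({"user_name": username, "password": password})
    opener.open(base_url + "/login", data=form.encode()).read()  # cookie is stored
    return opener.open(base_url + "/data").read().decode()  # cookie is sent back


server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
result = fetch_with_session("http://127.0.0.1:%d" % server.server_port, "lala", "secret")
server.shutdown()
print(result)
```

The cookie jar plays the role that the cookie file plays in the JSL example: the second request never sets a Cookie header by hand.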
Craige
lala
Level VIII

Re: What url you need to log in to and how you can get cookies and download data through JSL login.

 

I have tried many times but there is no way to directly log in and download the full data.

I found some Python code on the Internet:

 

 

2022-04-27_21-39-09.png

lala
Level VIII

Re: What url you need to log in to and how you can get cookies and download data through JSL login.

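import re

import scrapy
from scrapy import FormRequest, Request

# config (credentials) and JslItem come from the original project and are not shown here.

class AllcontentSpider(scrapy.Spider):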
    name = 'allcontent'

    headers = {
        'Host': 'www.jisilu.cn', 'Connection': 'keep-alive', 'Pragma': 'no-cache',
        'Cache-Control': 'no-cache', 'Accept': 'application/json,text/javascript,*/*;q=0.01',
        'Origin': 'https://www.jisilu.cn', 'X-Requested-With': 'XMLHttpRequest',
        'User-Agent': 'Mozilla/5.0(WindowsNT6.1;WOW64)AppleWebKit/537.36(KHTML,likeGecko)Chrome/67.0.3396.99Safari/537.36',
        'Content-Type': 'application/x-www-form-urlencoded;charset=UTF-8',
        'Referer': 'https://www.jisilu.cn/login/',
        'Accept-Encoding': 'gzip,deflate,br',
        'Accept-Language': 'zh,en;q=0.9,en-US;q=0.8'
    }

    def start_requests(self):
        login_url = 'https://www.jisilu.cn/login/'
        headers = {
            'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
            'Accept-Encoding': 'gzip,deflate,br', 'Accept-Language': 'zh,en;q=0.9,en-US;q=0.8',
            'Cache-Control': 'no-cache', 'Connection': 'keep-alive',
            'Host': 'www.jisilu.cn', 'Pragma': 'no-cache', 'Referer': 'https://www.jisilu.cn/',
            'Upgrade-Insecure-Requests': '1',
            'User-Agent': 'Mozilla/5.0(WindowsNT6.1;WOW64)AppleWebKit/537.36(KHTML,likeGecko)Chrome/67.0.3396.99Safari/537.36'}

        yield Request(url=login_url, headers=headers, callback=self.login,dont_filter=True)

    def login(self, response):
        url = 'https://www.jisilu.cn/account/ajax/login_process/'
        data = {
            'return_url': 'https://www.jisilu.cn/',
            'user_name': config.username,
            'password': config.password,
            'net_auto_login': '1',
            '_post_type': 'ajax',
        }

        yield FormRequest(
            url=url,
            headers=self.headers,
            formdata=data,
            callback=self.parse,
            dont_filter=True
        )

    def parse(self, response):
        for i in range(1,3726):
            focus_url = 'https://www.jisilu.cn/home/explore/sort_type-new__day-0__page-{}'.format(i)
            yield Request(url=focus_url, headers=self.headers, callback=self.parse_page,dont_filter=True)

    def parse_page(self, response):
        nodes = response.xpath('//div[@class="aw-question-list"]/div')
        for node in nodes:
            each_url=node.xpath('.//h4/a/@href').extract_first()
            yield Request(url=each_url,headers=self.headers,callback=self.parse_item,dont_filter=True)

    def parse_item(self,response):
        item = JslItem()
        title = response.xpath('//div[@class="aw-mod-head"]/h1/text()').extract_first()
        s = response.xpath('//div[@class="aw-question-detail-txt markitup-box"]').xpath('string(.)').extract_first()
        ret = re.findall(r'(.*?)\.donate_user_avatar', s, re.S)
lala
Level VIII

Re: What url you need to log in to and how you can get cookies and download data through JSL login.

This does not work:


s = New HTTP Request( URL( "https://www.jisilu.cn/login" ), Method( "get" ) );
data = s << Send;
username = "lala";
password = "jmp";
jj = "return_url=http://www.jisilu.cn/&user_name=" || username || "&password=" || password || "&net_auto_login=1&_post_type=ajax";
h = [=> ];
h["Content-Type"] = "application/json";
h["User-Agent"] = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36";
h["Content-Type"] = "application/x-www-form-urlencoded; charset=UTF-8";
s = New HTTP Request(
    URL( "https://www.jisilu.cn/webapi/account/login_process/" ),
    Method( "POST" ),
    JSON( jj ),
    Headers( h )
);
data1 = s << Send;

 

Craige_Hales
Super User

Re: What url you need to log in to and how you can get cookies and download data through JSL login.

What did it do? Anything in the log window?

 

Show(data);

Show(s<<getResponseHeaders);

Show(s<<getWarningHeaders);

Show(s<<Get Status Message);

Show(s<<Get Last URL);

 

If you have Python code that works, why not use it?

Use the Python code to make a .csv file and import that into JMP.

You can call it from JMP, either with the Python interface or with RunProgram.

Same thing with a VB program to make a .csv file and import it.
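The CSV hand-off can be sketched as follows. This is a minimal example that assumes the downloaded data is a JSON array of flat objects; the function name `json_rows_to_csv`, the field names, the values, and the output file name are all made up for illustration.

```python
import csv
import json
import os
import tempfile


def json_rows_to_csv(json_text, csv_path):
    """Write a JSON array of flat objects to a CSV file and return the row count."""
    rows = json.loads(json_text)
    # Collect every key so rows with missing fields still line up under one header
    fieldnames = sorted({key for row in rows for key in row})
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


# Made-up sample standing in for the text a logged-in request would return
sample = '[{"bond_id": "110001", "price": 101.5}, {"bond_id": "110002", "price": 99.8}]'
csv_path = os.path.join(tempfile.gettempdir(), "jisilu_sample.csv")
row_count = json_rows_to_csv(sample, csv_path)
print(row_count, csv_path)
```

The resulting file can then be opened in JMP like any other .csv file.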

Craige
lala
Level VIII

Re: What url you need to log in to and how you can get cookies and download data through JSL login.

  • show:   A user name or password is missing

2022-04-28_11-39-12.png

Craige_Hales
Super User

Re: What url you need to log in to and how you can get cookies and download data through JSL login.

Browser Scripting with Python Selenium 

It won't be easy, but it is possible, at least until you hit a captcha that must be solved.

Craige
lala
Level VIII

Re: What url you need to log in to and how you can get cookies and download data through JSL login.

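Please forgive my persistence.

I kept working on this problem.

I found that this approach can log in through a script, with no login-failure message.

But I still don't know how to parse the content that comes back.

Thanks!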

2022-05-24_14-47-10.png