Solved: What url you need to log in to and how you can get cookies and download data through JSL login.

lala · Jun 9, 2023 9:57 AM

Hello everyone!

This site requires cookies to download the complete data. How do I get cookies and download data through JSL login?

Thanks!

VBA

Sub Post()
' Log in to jisilu.cn, capture a cookie from the login reply, then request the data.
Dim User_agent, Post_data, Response_Text, username, password, cookie, json
username = [d1]: password = [f1]
User_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36"
' jslencode is a helper function that is not shown in this post.
Post_data = "return_url=https%3A%2F%2Fwww.jisilu.cn%2Fdata%2Fcbnew%2F&user_name=" & jslencode(username) & "&password=" & jslencode(password) & "&aes=1&auto_login=0"
With CreateObject("WinHttp.WinHttpRequest.5.1")
    ' Step 1: visit the login page.
    .Open "get", "https://www.jisilu.cn/login/", False
    .setrequestheader "User-Agent", User_agent
    .send
    ' Step 2: post the credentials.
    .Open "post", "https://www.jisilu.cn/webapi/account/login_process/", False
    .setrequestheader "User-Agent", User_agent
    .setrequestheader "Content-Type", "application/x-www-form-urlencoded; charset=UTF-8"
    .send (Post_data)
    ' Keep the text that follows the second "Set-Cookie: " marker in the response headers.
    cookie = Split(.getallresponseheaders, "Set-Cookie: ")(2)
    ' Step 3: request the data for the bond id in cell B1, sending the cookie back.
    .Open "get", "https://www.jisilu.cn/data/cbnew/detail_hist/" & ([b1] & ""), False
    .setrequestheader "User-Agent", User_agent
    .setrequestheader "cookie", cookie
    .send "fprice=&tprice=&curr_iss_amt=&volume=&svolume=&premium_rt=&ytm_rt=&rating_cd=&is_search=N&market_cd%5B%5D=shmb&market_cd%5B%5D=shkc&market_cd%5B%5D=szmb&market_cd%5B%5D=szcy&btype=&listed=Y&qflag=N&sw_cd=&bond_ids=&rp=50&page=1"
    Response_Text = .responsetext
    ' JsonConverter comes from the VBA-JSON library, which must be added to the project.
    Set json = JsonConverter.ParseJson(Response_Text)
End With
End Sub

Craige_Hales · Apr 28, 2022 08:55 AM

This is beyond the scope of this forum. Perhaps the site provides an API for what you want to do.

I think the site uses javascript to help verify you are properly logged in. JMP does not run the javascript.

The captcha may be part of the problem, and may require human interaction.

If the user name or password contain special characters then it will need some sort of encoding.
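As a rough illustration of that encoding step, here is a minimal Python sketch (not JSL) that percent-encodes each form field before the body is posted. The field names are copied from the VBA code in the question; the sample credentials are made up. The `aes=1` field and the `jslencode` helper in the VBA hint that the site may expect the values to be transformed in some further way, which this sketch does not attempt.

```python
from urllib.parse import urlencode


def build_login_body(username: str, password: str) -> str:
    """Return an application/x-www-form-urlencoded body with every value escaped."""
    fields = {
        "return_url": "https://www.jisilu.cn/data/cbnew/",
        "user_name": username,
        "password": password,
        "aes": "1",
        "auto_login": "0",
    }
    # urlencode escapes reserved characters such as &, = and +, and turns
    # spaces into +, so a value can never be mistaken for a field separator.
    return urlencode(fields)


# Made-up credentials that contain characters needing escapes.
body = build_login_body("la la", "p&ss=w+rd")
print(body)
```

Without the escaping, the `&` and `=` inside the password would be read by the server as the start of another field.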

The 413 error code might mean something else is wrong.

Craige

Craige_Hales · May 24, 2022 6:13 AM

And, if you still want to look at it, 1F8B is the gzip signature (the request's Accept-Encoding header listed gzip, among others). You could use:

blob=Char To Blob(  "~1F~8B~08~00~00~00~00~00~00~03~ABVJ~CEOIU~B2214~D6Q~CA-NW~B2Rz~BEg~D7~D3~0D~13~9FOY~F1~ACc~FB~D3~09~BD~CF:~A6=~ED_~FCd~F7~12%~1D~A5~94~C4~92D%~ABj~A5~E4~C4~82~92~E4~0C 3~AF4'~A7~B6~16~00x&~D1~03E~00~00~00",  "ascii~hex" );
write(blobtochar(Gzip Uncompress(blob)))

{"code":413,"msg":"缺少用户名或口令","data":{"captcha":null}}

Google Translate: "Missing username or password"
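The same check can be sketched in Python, assuming the raw response bytes are already in hand: look for the two-byte gzip signature, decompress if it is present, then parse the JSON. The sample bytes are produced locally by compressing the JSON text shown above; they are a stand-in, not a captured server response.

```python
import gzip
import json

GZIP_MAGIC = b"\x1f\x8b"  # the 1F8B signature at the start of a gzip stream


def decode_body(raw: bytes) -> dict:
    """Decompress the body if it starts with the gzip signature, then parse JSON."""
    if raw[:2] == GZIP_MAGIC:
        raw = gzip.decompress(raw)
    return json.loads(raw.decode("utf-8"))


# Stand-in for a compressed reply: the JSON from this post, gzipped locally.
sample = gzip.compress(
    '{"code":413,"msg":"缺少用户名或口令","data":{"captcha":null}}'.encode("utf-8")
)
reply = decode_body(sample)
print(reply["code"], reply["msg"])
```

Note that the 413 here is a field inside the JSON body, so it is worth reading `msg` as well as the HTTP status when a login attempt fails.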

But I really think using Selenium might be a better answer. Browser Scripting with Python Selenium shows how to script login and page through some data, loading it into a data table.

Craige

Craige_Hales · Apr 21, 2022 12:54 PM

I'm not a VB user, but it looks like this should work in JMP using three New HTTP Request calls.

The three sections in the VB code end with a ".send".

The first section goes to a login page. I'm not sure why that is necessary; it doesn't give you the cookie, but the site might require you to visit it first.

The second section sends the post_data to a URL that will give you a cookie that means you are logged in.

JMP's HTTP Request may persist that cookie for you. Or you might need to tell it to write the cookie to a file and then read it back out of the file.

The (2) picks the third piece of the response headers after they are split on "Set-Cookie: ", which is the text that follows the second Set-Cookie header.
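For comparison, a small Python sketch of a more explicit way to do that step: read every Set-Cookie line from a raw header block and build the value for a Cookie request header from the name=value pairs only. The header text and cookie names below are invented for the example; they are not the site's real cookies.

```python
from http.cookies import SimpleCookie


def cookie_header_from_response(raw_headers: str) -> str:
    """Collect name=value pairs from all Set-Cookie lines into one Cookie header value."""
    pairs = []
    for line in raw_headers.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "set-cookie":
            parsed = SimpleCookie()
            parsed.load(value.strip())
            # Attributes such as path or HttpOnly are dropped; only the
            # cookie's own name and value are sent back to the server.
            for morsel in parsed.values():
                pairs.append(f"{morsel.key}={morsel.value}")
    return "; ".join(pairs)


# Invented response headers with two cookies.
raw = (
    "Content-Type: application/json\r\n"
    "Set-Cookie: sid=abc123; path=/; HttpOnly\r\n"
    "Set-Cookie: token=xyz; path=/\r\n"
)
print(cookie_header_from_response(raw))
```

Unlike taking a fixed position in the split text, this does not depend on how many Set-Cookie headers the server happens to send or on their order.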

The third section uses the URL to get some data and supplies the cookie as your credential. JMP may do that for you.

// get a cookie
try(deletefile("$temp/cookie.txt"));
s = New HTTP Request(
    URL( "http://httpbin.org/cookies/set/aaa/bbb" ),
    Method( "GET" ),
    Headers( {"Accept: application/json", "User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:99.0) Gecko/20100101 Firefox/99.0"} ),
    cookiefile("$temp/cookie.txt")
);
data = s << Send;

show(loadtextfile("$temp/cookie.txt"));



// send a cookie
s = New HTTP Request(
    URL( "http://httpbin.org/cookies" ),
    Method( "GET" ),
    Headers( {
    "Accept: application/json", 
    "Cookie: PHPSESSID=298zf09hf012fh2; csrftoken=u32t4o3tb3gg43; _gat=1", 
    "User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:99.0) Gecko/20100101 Firefox/99.0"
    } )
);
data = s << Send;
Show( data );  // note the aaa cookie was persisted
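If a Python route is acceptable, the three-step flow described above can be sketched with the standard library's cookie jar, which stores cookies from each reply and sends them back on later requests. This is only a sketch: the URLs and field names are taken from the VBA code in the question, `make_session` and `login_and_fetch` are hypothetical helper names, and it is an untested assumption that the site accepts a plain (unencrypted) password this way.

```python
import http.cookiejar
import urllib.parse
import urllib.request

USER_AGENT = "Mozilla/5.0"  # placeholder; the posts above send a full browser string


def make_session():
    """Return (opener, jar); the jar keeps every cookie the opener receives."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    opener.addheaders = [("User-Agent", USER_AGENT)]
    return opener, jar


def login_and_fetch(opener, username, password, data_url):
    """Run the three requests in order and return the raw bytes of the last reply."""
    # 1. Visit the login page first, in case the site expects it.
    opener.open("https://www.jisilu.cn/login/", timeout=30).close()
    # 2. Post the credentials; any cookie in the reply lands in the jar.
    body = urllib.parse.urlencode({
        "return_url": "https://www.jisilu.cn/data/cbnew/",
        "user_name": username,
        "password": password,
        "aes": "1",
        "auto_login": "0",
    }).encode("ascii")
    opener.open("https://www.jisilu.cn/webapi/account/login_process/",
                data=body, timeout=30).close()
    # 3. Request the data; the opener attaches the stored cookies automatically.
    with opener.open(data_url, timeout=30) as response:
        return response.read()


opener, jar = make_session()
# raw = login_and_fetch(opener, username, password, data_url)  # needs network and a real account
```

The jar plays the same role as the cookie file in the JSL example: nothing has to be copied between requests by hand.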

Craige

lala · Apr 27, 2022 6:53 AM

I have tried many times but there is no way to directly log in and download the full data.

I found some Python code on the Internet:

lala · Apr 27, 2022 09:44 AM

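import re

import scrapy
from scrapy import FormRequest, Request

# config and JslItem come from the original project and are not shown in this post.


class AllcontentSpider(scrapy.Spider):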
    name = 'allcontent'

    headers = {
        'Host': 'www.jisilu.cn', 'Connection': 'keep-alive', 'Pragma': 'no-cache',
        'Cache-Control': 'no-cache', 'Accept': 'application/json,text/javascript,*/*;q=0.01',
        'Origin': 'https://www.jisilu.cn', 'X-Requested-With': 'XMLHttpRequest',
        'User-Agent': 'Mozilla/5.0(WindowsNT6.1;WOW64)AppleWebKit/537.36(KHTML,likeGecko)Chrome/67.0.3396.99Safari/537.36',
        'Content-Type': 'application/x-www-form-urlencoded;charset=UTF-8',
        'Referer': 'https://www.jisilu.cn/login/',
        'Accept-Encoding': 'gzip,deflate,br',
        'Accept-Language': 'zh,en;q=0.9,en-US;q=0.8'
    }

    def start_requests(self):
        login_url = 'https://www.jisilu.cn/login/'
        headers = {
            'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
            'Accept-Encoding': 'gzip,deflate,br', 'Accept-Language': 'zh,en;q=0.9,en-US;q=0.8',
            'Cache-Control': 'no-cache', 'Connection': 'keep-alive',
            'Host': 'www.jisilu.cn', 'Pragma': 'no-cache', 'Referer': 'https://www.jisilu.cn/',
            'Upgrade-Insecure-Requests': '1',
            'User-Agent': 'Mozilla/5.0(WindowsNT6.1;WOW64)AppleWebKit/537.36(KHTML,likeGecko)Chrome/67.0.3396.99Safari/537.36'}

        yield Request(url=login_url, headers=headers, callback=self.login,dont_filter=True)

    def login(self, response):
        url = 'https://www.jisilu.cn/account/ajax/login_process/'
        data = {
            'return_url': 'https://www.jisilu.cn/',
            'user_name': config.username,
            'password': config.password,
            'net_auto_login': '1',
            '_post_type': 'ajax',
        }

        yield FormRequest(
            url=url,
            headers=self.headers,
            formdata=data,
            callback=self.parse,
            dont_filter=True
        )

    def parse(self, response):
        for i in range(1,3726):
            focus_url = 'https://www.jisilu.cn/home/explore/sort_type-new__day-0__page-{}'.format(i)
            yield Request(url=focus_url, headers=self.headers, callback=self.parse_page,dont_filter=True)

    def parse_page(self, response):
        nodes = response.xpath('//div[@class="aw-question-list"]/div')
        for node in nodes:
            each_url=node.xpath('.//h4/a/@href').extract_first()
            yield Request(url=each_url,headers=self.headers,callback=self.parse_item,dont_filter=True)

    def parse_item(self,response):
        item = JslItem()
        title = response.xpath('//div[@class="aw-mod-head"]/h1/text()').extract_first()
        s = response.xpath('//div[@class="aw-question-detail-txt markitup-box"]').xpath('string(.)').extract_first()
        ret = re.findall(r'(.*?)\.donate_user_avatar', s, re.S)

lala · Apr 27, 2022 09:47 AM

It does not work:


s = New HTTP Request( URL( "https://www.jisilu.cn/login" ), Method( "get" ) );
data = s << Send;
username = "lala";
password = "jmp";
jj = "return_url=http://www.jisilu.cn/&user_name=" || username || "&password=" || password || "&net_auto_login=1&_post_type=ajax";
h = [=> ];
h["Content-Type"] = "application/json";
h["User-Agent"] = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36";
h["Content-Type"] = "application/x-www-form-urlencoded; charset=UTF-8";

s = New HTTP Request( URL( "https://www.jisilu.cn/webapi/account/login_process/" ), Method( "POST" ), JSON( jj ), Headers( h ) );
data1 = s << Send;

Craige_Hales · Apr 27, 2022 11:25 AM

What did it do? Anything in the log window?

Show(data);

Show(s<<getResponseHeaders);

Show(s<<getWarningHeaders);

Show(s<<Get Status Message);

Show(s<<Get Last URL);

If you have Python code that works, why not use it?

Use the Python code to make a .csv file and import that into JMP.

You can call it from JMP, either with the Python interface or with RunProgram.

Same thing with a VB program to make a .csv file and import it.
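A minimal sketch of the Python side of that hand-off, assuming the downloaded rows are already available as a list of dicts. The column names in the usage comment are invented for the example, since the real response layout is not shown in this thread.

```python
import csv


def write_rows_csv(rows, path):
    """Write a list of dicts to a CSV file, using the first row's keys as the header."""
    if not rows:
        raise ValueError("no rows to write")
    fieldnames = list(rows[0].keys())
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return path


# Example call with made-up columns:
# write_rows_csv([{"bond_id": "000001", "price": 101.5}], "bonds.csv")
```

JMP can then load the result with Open() on the .csv path.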

Craige

lala · Apr 28, 2022 05:59 AM

show: A user name or password is missing

Craige_Hales · May 8, 2022 10:10 PM

Browser Scripting with Python Selenium

It won't be easy, but it is possible, at least until you hit a captcha that must be solved.

Craige

lala · May 24, 2022 02:52 AM

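Please forgive my persistence.

I kept working on this problem.

I found that this approach can log in through the script, with no login-failure message.

But I still don't know how to parse the contents of the response.

Thanks!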
