<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: For large amounts of data, is it faster to use python to process JSON asynchronously into structured data? in Discussions</title>
    <link>https://community.jmp.com/t5/Discussions/For-large-amounts-of-data-is-it-faster-to-use-python-to-process/m-p/795926#M97235</link>
    <description>&lt;P&gt;Are you pulling in more data every minute?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Let's say you currently have 5000 downloaded JSON files. You should process those only once into a JMP table, database, or other store. After that, you just keep parsing the new files and appending the new data to wherever you are storing it.&lt;/P&gt;</description>
    <pubDate>Fri, 06 Sep 2024 05:21:53 GMT</pubDate>
    <dc:creator>jthi</dc:creator>
    <dc:date>2024-09-06T05:21:53Z</dc:date>
    <item>
      <title>For large amounts of data, is it faster to use python to process JSON asynchronously into structured data?</title>
      <link>https://community.jmp.com/t5/Discussions/For-large-amounts-of-data-is-it-faster-to-use-python-to-process/m-p/795615#M97215</link>
      <description>&lt;P&gt;JSON files with the following structure are easy to handle with JMP's JSL. The format is fixed, and only the 13 columns inside [] are extracted. (Only two files are listed here.)&lt;/P&gt;&lt;P&gt;JSON1&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;{"ZJB":4271175,"ZJS":-3443749,"trend":[["09:30",0,-444931,0,-444931,0,1,0,21100,0,0,0,444931],["09:33",2,1433022,1433022,0,2,0,67100,0,0,1433022,0,0],["09:34",3,-316128,0,-316128,0,1,0,14800,0,0,0,316128],["09:45",4,318570,318570,0,1,0,15000,0,0,318570,0,0],["09:52",5,403965,403965,0,1,0,19100,0,0,403965,0,0],["10:03",7,-345725,328755,-674480,1,1,15500,31800,328755,0,0,674480],["10:25",8,419440,419440,0,1,0,19600,0,419440,0,0,0],["10:32",9,-623500,0,-623500,0,1,0,29000,0,0,0,623500],["10:40",10,353925,353925,0,1,0,16500,0,0,353925,0,0],["13:52",11,-1065500,0,-1065500,0,1,0,50000,0,0,0,1065500],["14:17",12,332436,332436,0,1,0,15600,0,332436,0,0,0],["14:25",13,-319214,0,-319214,0,1,0,15000,0,0,319214,0],["14:54",14,681065,681065,0,1,0,31900,0,0,681065,0,0]],"active":1080631,"passive":3190547,"Active":-319214,"Passive":-3124539,"AvgPrice":21.32,"AvgPrice":21.3,"time2":1725535598,"ttag":0.004174999999999929,"errcode":"0"}&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;JSON2&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;{"ZJB":1913404,"ZJS":-4366449,"trend":[["09:30",0,-730500,0,-730500,0,1,0,50000,0,0,0,730500],["09:34",1,402408,402408,0,1,0,27600,0,0,402408,0,0],["09:52",2,-442380,0,-442380,0,1,0,30300,0,0,0,442380],["10:51",3,-314545,0,-314545,0,1,0,21500,0,0,0,314545],["11:17",4,-339184,0,-339184,0,1,0,23200,0,0,0,339184],["13:06",5,-438600,0,-438600,0,1,0,30000,0,0,0,438600],["13:27",6,-337491,0,-337491,0,1,0,23100,0,0,337491,0],["13:47",7,-323676,0,-323676,0,1,0,22200,0,0,0,323676],["13:49",8,-447299,0,-447299,0,1,0,30700,0,0,0,447299],["14:00",9,-630448,0,-630448,0,1,0,43300,0,0,630448,0],["14:27",11,344796,707124,-362328,1,1,48400,24800,707124,0,0,362328],["14:31",12,320426,320426,0,1,0,21902,0,320426,0,0,0],["14:32",13,483449,483449,0,1,0,33000,0,0,483449,0,0]],"active":1027550,"passive":885857,"Active":-967939,"Passive":-3398512,"AvgPrice":14.62,"AvgPrice":14.6,"time2":1725535597,"ttag":0.0024069999999999925,"errcode":"0"}&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;But is python's asynchronous processing faster when there are many such files?&lt;BR /&gt;I don't know how python would handle this JSON, so I asked ChatGPT for an answer, and I'm also asking the community experts. Thank you very much!&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;Assume the files are in the C:\8 directory; I want to call python from JSL and concatenate all the files into one JMP table.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;Each row should also carry an additional column with its source file name.&lt;/P&gt;</description>
      <pubDate>Thu, 05 Sep 2024 12:26:23 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/For-large-amounts-of-data-is-it-faster-to-use-python-to-process/m-p/795615#M97215</guid>
      <dc:creator>lala</dc:creator>
      <dc:date>2024-09-05T12:26:23Z</dc:date>
    </item>
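Because the format above is fixed, the core extraction in the question can be done with the stdlib `json` module alone. A minimal sketch; the `flatten` helper, the truncated `sample` string, and the `file_0001` label are illustrative, not from the thread:

```python
import json

# A sample in the same fixed format as the files above, truncated to two rows
sample = ('{"ZJB": 4271175, "trend": ['
          '["09:30",0,-444931,0,-444931,0,1,0,21100,0,0,0,444931],'
          '["09:33",2,1433022,1433022,0,2,0,67100,0,0,1433022,0,0]'
          '], "errcode": "0"}')

def flatten(text, label):
    """Return the 13-column 'trend' rows with the source file name appended."""
    data = json.loads(text)
    return [row + [label] for row in data.get('trend', [])]

rows = flatten(sample, 'file_0001')  # the label stands in for the JSON file name
print(len(rows), rows[0][0], rows[0][-1])
```

Each output row then has 14 columns: the 13 fixed columns plus the file-name tag the question asks for.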
    <item>
      <title>Re: For large amounts of data, is it faster to use python to process JSON asynchronously into structured data?</title>
      <link>https://community.jmp.com/t5/Discussions/For-large-amounts-of-data-is-it-faster-to-use-python-to-process/m-p/795626#M97216</link>
      <description>&lt;P&gt;ChatGPT&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;JSL&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;// Define the Python code block (JSL multi-line string literal)
pythonCode = "\[
import os
import json
import asyncio
import aiofiles  # third-party package: pip install aiofiles

# Directory containing the downloaded JSON files
directory = 'C:/8/'

# Read one file and return its 'trend' rows, tagged with the file name
async def process_file(file_path, label):
    async with aiofiles.open(file_path, mode='r', encoding='utf-8') as file:
        content = await file.read()
    data = json.loads(content)
    rows = []
    for entry in data.get('trend', []):
        entry.append(label)  # add the source file name as an extra column
        rows.append(entry)
    return rows

# Schedule every file at once so the reads actually overlap
async def process_all_files():
    tasks = []
    for filename in os.listdir(directory):
        if filename.endswith('.json'):
            file_path = os.path.join(directory, filename)
            label = os.path.splitext(filename)[0]  # file name without extension
            tasks.append(process_file(file_path, label))
    all_data = []
    for rows in await asyncio.gather(*tasks):
        all_data.extend(rows)
    return all_data

# Run the asynchronous tasks and collect the combined rows
data = asyncio.run(process_all_files())
]\";

// Submit the Python code to JMP's Python environment
Python Submit( pythonCode );

// Retrieve the processed data from Python
results = Python Get( data );&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 05 Sep 2024 12:30:58 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/For-large-amounts-of-data-is-it-faster-to-use-python-to-process/m-p/795626#M97216</guid>
      <dc:creator>lala</dc:creator>
      <dc:date>2024-09-05T12:30:58Z</dc:date>
    </item>
    <item>
      <title>Re: For large amounts of data, is it faster to use python to process JSON asynchronously into structured data?</title>
      <link>https://community.jmp.com/t5/Discussions/For-large-amounts-of-data-is-it-faster-to-use-python-to-process/m-p/795650#M97220</link>
      <description>&lt;P&gt;Someone from JMP might know a more technical answer; I just have some questions for you:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;What do you consider a large file?&lt;/LI&gt;
&lt;LI&gt;How many files do you have?&lt;/LI&gt;
&lt;LI&gt;Have you tried with different file sizes?&lt;/LI&gt;
&lt;LI&gt;Have you tried different methods in JMP?&lt;BR /&gt;
&lt;UL&gt;
&lt;LI&gt;Open single file&lt;/LI&gt;
&lt;LI&gt;Load as text&lt;/LI&gt;
&lt;LI&gt;Multiple File Import&lt;/LI&gt;
&lt;LI&gt;Python integration&lt;/LI&gt;
&lt;/UL&gt;
&lt;/LI&gt;
&lt;LI&gt;Have you tried Python without the JMP integration?&lt;/LI&gt;
&lt;/UL&gt;</description>
      <pubDate>Thu, 05 Sep 2024 14:17:37 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/For-large-amounts-of-data-is-it-faster-to-use-python-to-process/m-p/795650#M97220</guid>
      <dc:creator>jthi</dc:creator>
      <dc:date>2024-09-05T14:17:37Z</dc:date>
    </item>
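One practical note behind these questions: `json.loads` is CPU-bound, so asyncio mainly overlaps the file reads, not the parsing itself. A stdlib-only sketch of overlapping the I/O-heavy part with a thread pool; the in-memory `files` dict and `parse_one` helper are stand-ins for real reads from C:\8:

```python
import json
from concurrent.futures import ThreadPoolExecutor

# Stand-ins for file contents; in practice each value would be read from disk
files = {
    'a.json': '{"trend": [["09:30", 1], ["09:31", 2]]}',
    'b.json': '{"trend": [["09:30", 3]]}',
}

def parse_one(item):
    """Parse one file's text and tag its trend rows with the file name."""
    name, text = item
    return [row + [name] for row in json.loads(text).get('trend', [])]

# Threads overlap the (I/O-heavy) reads; json.loads itself stays single-core
with ThreadPoolExecutor(max_workers=4) as pool:
    all_rows = [r for rows in pool.map(parse_one, files.items()) for r in rows]

print(len(all_rows))
```

Timing this against a plain loop, at realistic file sizes, would answer the thread's question more directly than guessing.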
    <item>
      <title>Re: For large amounts of data, is it faster to use python to process JSON asynchronously into structured data?</title>
      <link>https://community.jmp.com/t5/Discussions/For-large-amounts-of-data-is-it-faster-to-use-python-to-process/m-p/795664#M97221</link>
      <description>&lt;P&gt;Thanks, jthi!&lt;/P&gt;&lt;P&gt;I am currently dealing with more than 5000 such JSON files, and the key is processing them in a short time.&lt;BR /&gt;I only recently started using python's asyncio for concurrent downloads (via ChatGPT) and found it faster than JMP's concurrent download; these JSON files are downloaded that way.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;So I now use python asyncio first to quickly download the raw JSON files to the computer.&lt;BR /&gt;I also wanted to try python's asynchronous handling of the JSON, but save the results directly to JMP tables.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;I just wanted to try it first and didn't think much about anything else.&lt;BR /&gt;I hope experienced experts can offer guidance. Thank you very much!&lt;/P&gt;</description>
      <pubDate>Thu, 05 Sep 2024 14:35:10 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/For-large-amounts-of-data-is-it-faster-to-use-python-to-process/m-p/795664#M97221</guid>
      <dc:creator>lala</dc:creator>
      <dc:date>2024-09-05T14:35:10Z</dc:date>
    </item>
    <item>
      <title>Re: For large amounts of data, is it faster to use python to process JSON asynchronously into structured data?</title>
      <link>https://community.jmp.com/t5/Discussions/For-large-amounts-of-data-is-it-faster-to-use-python-to-process/m-p/795675#M97222</link>
      <description>&lt;P&gt;&lt;SPAN class=""&gt;I only know a little about JSL.&lt;/SPAN&gt;&lt;SPAN class=""&gt;So I'm not familiar with how JSL and python can handle data better and faster in memory.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN class=""&gt;Thanks!&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 05 Sep 2024 14:43:20 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/For-large-amounts-of-data-is-it-faster-to-use-python-to-process/m-p/795675#M97222</guid>
      <dc:creator>lala</dc:creator>
      <dc:date>2024-09-05T14:43:20Z</dc:date>
    </item>
    <item>
      <title>Re: For large amounts of data, is it faster to use python to process JSON asynchronously into structured data?</title>
      <link>https://community.jmp.com/t5/Discussions/For-large-amounts-of-data-is-it-faster-to-use-python-to-process/m-p/795677#M97223</link>
      <description>&lt;P&gt;Maybe this could be a good time to start thinking a bit more about this?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;More questions (and a few I asked earlier):&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;What is a large file?&lt;/LI&gt;
&lt;LI&gt;What is "short time" for processing?&lt;/LI&gt;
&lt;LI&gt;Is the issue getting the data from JSON to JMP or getting the data downloaded?&lt;/LI&gt;
&lt;LI&gt;Have you tried loading the JSON using JMP, for example with Multiple File Import?&lt;/LI&gt;
&lt;LI&gt;Do you always have batches of 5000+ files?&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;And going a bit further:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Maybe the data should be stored to a database on a schedule? (At that point this stops being a JMP question; JMP can load the data from the database if needed.)&lt;/LI&gt;
&lt;/UL&gt;</description>
      <pubDate>Thu, 05 Sep 2024 14:54:01 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/For-large-amounts-of-data-is-it-faster-to-use-python-to-process/m-p/795677#M97223</guid>
      <dc:creator>jthi</dc:creator>
      <dc:date>2024-09-05T14:54:01Z</dc:date>
    </item>
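The scheduled-database idea above can be prototyped with the stdlib `sqlite3` module. A minimal sketch; the `trend` table, its three columns, and the sample rows are hypothetical, and a real schema would carry all 13 columns plus the file name:

```python
import sqlite3

conn = sqlite3.connect(':memory:')  # use a file path for persistent storage
conn.execute('CREATE TABLE trend (time TEXT, value INTEGER, source TEXT)')

# Rows as produced by the JSON flattening step (values hypothetical)
rows = [('09:30', -444931, 'file_0001'), ('09:33', 1433022, 'file_0001')]
conn.executemany('INSERT INTO trend VALUES (?, ?, ?)', rows)
conn.commit()

# JMP (or anything else) can later query just the slice it needs
total = conn.execute('SELECT COUNT(*) FROM trend').fetchone()[0]
print(total)
```

SQLite needs no server, so the ingest script can append rows each minute while JMP reads summaries on demand.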
    <item>
      <title>Re: For large amounts of data, is it faster to use python to process JSON asynchronously into structured data?</title>
      <link>https://community.jmp.com/t5/Discussions/For-large-amounts-of-data-is-it-faster-to-use-python-to-process/m-p/795815#M97233</link>
      <description>&lt;P&gt;Thank you, experts, for looking at this from a higher level.&lt;BR /&gt;To explain further: new data arrives every minute, and the key is to run calculations right after downloading.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;So I'm still processing one batch at a time.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;I found that going from JSON to tables takes a long time, so I want to try asynchronous processing in python.&lt;BR /&gt;I hope the experts can give specific guidance. Thanks!&lt;/P&gt;</description>
      <pubDate>Thu, 05 Sep 2024 23:09:42 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/For-large-amounts-of-data-is-it-faster-to-use-python-to-process/m-p/795815#M97233</guid>
      <dc:creator>lala</dc:creator>
      <dc:date>2024-09-05T23:09:42Z</dc:date>
    </item>
    <item>
      <title>Re: For large amounts of data, is it faster to use python to process JSON asynchronously into structured data?</title>
      <link>https://community.jmp.com/t5/Discussions/For-large-amounts-of-data-is-it-faster-to-use-python-to-process/m-p/795926#M97235</link>
      <description>&lt;P&gt;Are you pulling in more data every minute?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Let's say you currently have 5000 downloaded JSON files. You should process those only once into a JMP table, database, or other store. After that, you just keep parsing the new files and appending the new data to wherever you are storing it.&lt;/P&gt;</description>
      <pubDate>Fri, 06 Sep 2024 05:21:53 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/For-large-amounts-of-data-is-it-faster-to-use-python-to-process/m-p/795926#M97235</guid>
      <dc:creator>jthi</dc:creator>
      <dc:date>2024-09-06T05:21:53Z</dc:date>
    </item>
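The process-once-then-incrementally advice above amounts to remembering which files a previous run already handled. A stdlib-only sketch; the `processed_demo.json` state file, `new_files` helper, and file names are hypothetical:

```python
import json
import tempfile
from pathlib import Path

# Hypothetical location for the list of already-processed file names
state_file = Path(tempfile.gettempdir()) / 'processed_demo.json'
state_file.unlink(missing_ok=True)  # start clean for this demo

def new_files(all_names):
    """Return only the names not seen on a previous run, and record them."""
    seen = set(json.loads(state_file.read_text())) if state_file.exists() else set()
    fresh = [n for n in all_names if n not in seen]
    state_file.write_text(json.dumps(sorted(seen | set(fresh))))
    return fresh

first = new_files(['a.json', 'b.json'])              # both are new on the first run
second = new_files(['a.json', 'b.json', 'c.json'])   # only c.json is new now
print(first, second)
```

Each minute's run then parses only `new_files(...)`, so the 5000 historical files are never reprocessed.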
    <item>
      <title>Re: For large amounts of data, is it faster to use python to process JSON asynchronously into structured data?</title>
      <link>https://community.jmp.com/t5/Discussions/For-large-amounts-of-data-is-it-faster-to-use-python-to-process/m-p/795932#M97237</link>
      <description>&lt;P class=""&gt;&lt;SPAN class=""&gt;Thank the experts for their patient follow-up.&lt;/SPAN&gt;&lt;/P&gt;&lt;P class=""&gt;&lt;SPAN class=""&gt;I look like this:&lt;/SPAN&gt;&lt;/P&gt;&lt;P class=""&gt;&lt;SPAN class=""&gt;Processing 5000 JSON files per minute.&lt;/SPAN&gt;&lt;SPAN class=""&gt;Centralize the calculations in JMP tables, save only a few summarized results, and use another JMP file (which can be easily handled in JSL).&lt;/SPAN&gt;&lt;/P&gt;&lt;P class=""&gt;&amp;nbsp;&lt;/P&gt;&lt;P class=""&gt;&lt;SPAN class=""&gt;The raw JSON is not saved, and neither is the merged JMP data table.&lt;/SPAN&gt;&lt;/P&gt;&lt;P class=""&gt;&lt;SPAN class=""&gt;So I wanted to speed things up by processing data in memory.&lt;/SPAN&gt;&lt;/P&gt;&lt;P class=""&gt;&lt;SPAN class=""&gt;Thanks!&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 06 Sep 2024 06:34:42 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/For-large-amounts-of-data-is-it-faster-to-use-python-to-process/m-p/795932#M97237</guid>
      <dc:creator>lala</dc:creator>
      <dc:date>2024-09-06T06:34:42Z</dc:date>
    </item>
    <item>
      <title>Re: For large amounts of data, is it faster to use python to process JSON asynchronously into structured data?</title>
      <link>https://community.jmp.com/t5/Discussions/For-large-amounts-of-data-is-it-faster-to-use-python-to-process/m-p/795947#M97238</link>
      <description>&lt;P&gt;So you are performing 5000 HTTP requests a minute (7.2 million a day), each returning a JSON response? That feels like a lot of requests to a single endpoint from one IP. Or are you getting those 5000 separate files a minute some other way?&lt;/P&gt;</description>
      <pubDate>Fri, 06 Sep 2024 07:16:47 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/For-large-amounts-of-data-is-it-faster-to-use-python-to-process/m-p/795947#M97238</guid>
      <dc:creator>jthi</dc:creator>
      <dc:date>2024-09-06T07:16:47Z</dc:date>
    </item>
    <item>
      <title>Re: For large amounts of data, is it faster to use python to process JSON asynchronously into structured data?</title>
      <link>https://community.jmp.com/t5/Discussions/For-large-amounts-of-data-is-it-faster-to-use-python-to-process/m-p/795952#M97239</link>
      <description>&lt;UL&gt;&lt;LI&gt;&lt;P class=""&gt;&lt;SPAN class=""&gt;200 a minute is all I can handle in real time right now.&lt;/SPAN&gt;&lt;/P&gt;&lt;P class=""&gt;&lt;SPAN class=""&gt;A lot of it is processed after the fact.&lt;/SPAN&gt;&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Fri, 06 Sep 2024 07:25:51 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/For-large-amounts-of-data-is-it-faster-to-use-python-to-process/m-p/795952#M97239</guid>
      <dc:creator>lala</dc:creator>
      <dc:date>2024-09-06T07:25:51Z</dc:date>
    </item>
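When only ~200 requests a minute can be handled in real time, capping download concurrency keeps the pipeline from overrunning itself; `asyncio.Semaphore` is the standard tool. A stdlib-only sketch in which `asyncio.sleep` stands in for the real HTTP call (e.g. via aiohttp, not shown), and the URLs are made up:

```python
import asyncio

async def fetch(url, limit):
    # The sleep stands in for a real HTTP request
    async with limit:  # at most `max_workers` fetches run at once
        await asyncio.sleep(0.01)
        return f'{url}: ok'

async def fetch_all(urls, max_workers=10):
    """Download all URLs concurrently, but never more than max_workers at a time."""
    limit = asyncio.Semaphore(max_workers)
    return await asyncio.gather(*(fetch(u, limit) for u in urls))

results = asyncio.run(fetch_all([f'url{i}' for i in range(25)]))
print(len(results))
```

`gather` preserves input order, so results line up with the URL list even though completion order varies.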
    <item>
      <title>Re: For large amounts of data, is it faster to use python to process JSON asynchronously into structured data?</title>
      <link>https://community.jmp.com/t5/Discussions/For-large-amounts-of-data-is-it-faster-to-use-python-to-process/m-p/802476#M97893</link>
      <description>&lt;P class=""&gt;&lt;SPAN class=""&gt;Well, by comparing the consumption of each step, less time to download the data each time.&lt;/SPAN&gt;&lt;/P&gt;&lt;P class=""&gt;&lt;SPAN class=""&gt;However, it takes more time to assemble and sort many data each time.&lt;/SPAN&gt;&lt;/P&gt;&lt;P class=""&gt;&amp;nbsp;&lt;/P&gt;&lt;P class=""&gt;&lt;SPAN class=""&gt;Ask experts: Which database has a speed advantage in this regard: splicing, sorting.&lt;/SPAN&gt;&lt;SPAN class=""&gt;The key is that this is only intermediate data, not stored.&lt;/SPAN&gt;&lt;/P&gt;&lt;P class=""&gt;&lt;SPAN class=""&gt;I found that JMP18's splicing speed is significantly not as fast as JMP14's.&lt;/SPAN&gt;&lt;/P&gt;&lt;P class=""&gt;&amp;nbsp;&lt;/P&gt;&lt;P class=""&gt;Thanks Experts!&lt;/P&gt;</description>
      <pubDate>Sun, 29 Sep 2024 02:31:24 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/For-large-amounts-of-data-is-it-faster-to-use-python-to-process/m-p/802476#M97893</guid>
      <dc:creator>lala</dc:creator>
      <dc:date>2024-09-29T02:31:24Z</dc:date>
    </item>
  </channel>
</rss>

