cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
lala
Level VIII

How can I quickly reassemble and generate different tables from many JMP tables?

There are now 2400 such JMP files in the same directory C:\8: JMP tables with file names with detailed time stamps and 5000 rows of data each.For example, C:\8\20241031112503.JMP.

S5 E5 S4 E4 S3 E3 S2 E2 S1 E1 B1 T1 B2 T2 B3 T3 B4 T4 B5 T5 enco
6024 2 6023 10 6021 6 6020 96 6019 18 6018 400 6017 16 6016 78 6015 41 6014 8 603198
1263 88 1262 290 1261 254 1260 347 1259 301 1258 544 1257 364 1256 630 1255 1123 1254 362 600377

Now need to reorganize the table to save the file according to the code number of the "enco" column, of course, the same "enco" so that the timestamp data in the original file name should be added to distinguish.

Instructions:

"enco" is not repeated on each line of the same file.And enco isn't exactly the same in 2400 files

 

 

 

The key is to find a way to do this quickly.Either JSL or python methods can be used with JMP software (indicating python in the JMP environment).
Of course, if the original file is in CSV format or some database format that is more convenient for this kind of splitting and sorting, you can provide better suggestions.Thank you!

7 REPLIES 7
lala
Level VIII

Re: How can I quickly reassemble and generate different tables from many JMP tables?

I tried to combine them all into a JMP table and then extract subsets.The speed is slow.

Whether to create different memory arrays in memory to concatenate is faster, but this will make the code more complex.

Thanks Experts!

jthi
Super User

Re: How can I quickly reassemble and generate different tables from many JMP tables?

I would first try with Multiple File Import if it can handle that (build single table with source (filename) and go from there). Then I would maybe try some JSL solution, then possible some other JSL solution and lastly Python Integration with DuckDB.

-Jarmo
lala
Level VIII

Re: How can I quickly reassemble and generate different tables from many JMP tables?

Thanks Experts!

Yes, I also want to try new methods.Ask experts how to implement Python and DuckDB specifically?Thank you very much!

jthi
Super User

Re: How can I quickly reassemble and generate different tables from many JMP tables?

I can provide suggestions for JMP (and JSL) but for that I do need more examples and proper understanding of what you are trying to achieve. Are you trying to have one unique file for each enco? Do all tables contain exactly the same columns? Do they always contain similar data (integers)? Do they always have just two rows? Are the file names always in the same format? Do the tables have headers properly set or is the first row character data with the headers?

 

I consider that (pure) Python questions belong (mostly) to other forums. If you have the data as .csv files using DuckDB should be simple task. Also when using Python it is definitely worth it to learn to read the documentation of each of the libraries you are using instead of just relying on LLM (Reading Multiple Files – DuckDB).

-Jarmo
lala
Level VIII

Re: How can I quickly reassemble and generate different tables from many JMP tables?

Yes, I also know that GPT-4o is not familiar with JSL of JMP software.

So later I let it handle CSV directly.

Then add the JSL syntax to make it modify again.

import os
import duckdb
import pandas as pd
from pathlib import Path
output_dir = Path(r"C:\3")
output_dir.mkdir(exist_ok=True)
con = duckdb.connect(":memory:")
def process_csv_files():
csv_files = list(Path(r"C:\8").glob("*.CSV"))
first_file = csv_files[0]
timestamp = first_file.stem
con.execute("""
CREATE TABLE all_data AS
SELECT *, ? as file_timestamp
FROM read_csv_auto(?)
""", [timestamp, str(first_file)])
for csv_file in csv_files[1:]:
timestamp = csv_file.stem
con.execute("""
INSERT INTO all_data
SELECT *, ? as file_timestamp
FROM read_csv_auto(?)
""", [timestamp, str(csv_file)])
enco_values = con.execute("SELECT DISTINCT enco FROM all_data").fetchall()
for (enco,) in enco_values:
query = """
SELECT *
FROM all_data
WHERE enco = ?
ORDER BY file_timestamp
"""
df = con.execute(query, [enco]).df()
output_file = output_dir / f"{enco}.csv"
df.to_csv(output_file, index=False)
if __name__ == "__main__":
try:
process_csv_files()
except Exception as e:
print(f"E: {str(e)}")
lala
Level VIII

Re: How can I quickly reassemble and generate different tables from many JMP tables?

By contrast

Processing CSV is much faster.

2024-11-01_18-09-48.png

 

lala
Level VIII

Re: How can I quickly reassemble and generate different tables from many JMP tables?

Compared, or directly use CSV libraries and dictionaries, containers are the fastest.