Search functionality for substructures of SMILES
Spotfire has an add-in (Signals Inventa, formerly Signals Lead Discovery) in which you can select a substructure of a SMILES graph and then search a database for all SMILES graphs that contain this substructure (isomorphic search). Does anyone know of software with this functionality that could (easily) be connected to JMP?
Re: Search functionality for substructures of SMILES
Hi @Bernd2Heinen,
There are several packages available for cheminformatics, but one of my favorites is RDKit.
You can use it via Python (and run it with JMP); see "Substructure Searching" at this link: https://www.rdkit.org/docs/GettingStartedInPython.html
I'm not a coder myself (my goal this year is to get to know Python and start using it), so note that you can also use RDKit through a low-code/no-code platform like KNIME.
Another option that could be done directly in JMP is a Regex pattern-matching script that highlights the rows containing the same pattern as your input/query. In the example from your image, a Regex search could find all rows in which the SMILES string contains the pattern "C=1(C=CC=CC1)" for the phenyl group.
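For illustration, here is a minimal sketch of what such a substructure search could look like in RDKit. The helper name `substructure_hits` and the small `library` list are made up for this example; the RDKit calls (`Chem.MolFromSmiles`, `Chem.MolFromSmarts`, `HasSubstructMatch`) are the ones described in the "Substructure Searching" section linked above.

```python
from rdkit import Chem


def substructure_hits(smiles_list, query_smarts):
    """Return the SMILES strings whose molecule contains the query substructure.

    Hypothetical helper for illustration; not part of RDKit itself.
    """
    query = Chem.MolFromSmarts(query_smarts)
    if query is None:
        raise ValueError(f"Could not parse query: {query_smarts!r}")
    hits = []
    for smi in smiles_list:
        mol = Chem.MolFromSmiles(smi)  # returns None for unparsable SMILES
        if mol is not None and mol.HasSubstructMatch(query):
            hits.append(smi)
    return hits


# Made-up example data: toluene written two ways, ethanol, cyclohexane.
library = ["Cc1ccccc1", "CC1=CC=CC=C1", "CCO", "C1CCCCC1"]

# Query for an aromatic six-membered carbon ring (phenyl).
phenyl_hits = substructure_hits(library, "c1ccccc1")
print(phenyl_hits)
```

Because the match is done on the molecular graph, both spellings of toluene should be found, while ethanol and cyclohexane are not.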
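As a rough sketch of the idea (written in Python rather than JSL, with made-up rows and a hypothetical helper `rows_matching`), the pattern has to be escaped first, because characters such as "(" and ")" are regex metacharacters:

```python
import re


def rows_matching(smiles_column, fragment):
    """Return the indices of rows whose SMILES text contains the fragment literally."""
    # re.escape makes "(", ")" and similar characters match themselves.
    pattern = re.compile(re.escape(fragment))
    return [i for i, smi in enumerate(smiles_column) if pattern.search(smi)]


# Made-up example rows; only rows 0 and 2 contain the phenyl text pattern.
smiles_column = ["CCOC=1(C=CC=CC1)", "CCO", "NC=1(C=CC=CC1)"]
highlighted = rows_matching(smiles_column, "C=1(C=CC=CC1)")
print(highlighted)
```

The returned indices could then be used to select or highlight the corresponding rows in the data table.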
I hope this answer will help you,
"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)
Re: Search functionality for substructures of SMILES
Thanks Victor,
I will take a look at RDKit, and going through KNIME might also help.
A Regex search was my first thought as well, but different SMILES strings can describe the same graph, so a pattern search would miss structures.
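A small illustration of the problem, using plain text matching in Python. The four strings below are all valid ways of writing the same benzene ring, but only the one spelled exactly like the query is found:

```python
query = "C=1(C=CC=CC1)"

# Four spellings of the same six-membered aromatic ring (benzene).
same_ring = ["C=1(C=CC=CC1)", "C1=CC=CC=C1", "c1ccccc1", "c2ccccc2"]

# Literal text search: only an identical spelling matches.
found = [s for s in same_ring if query in s]
missed = [s for s in same_ring if query not in s]

print("found: ", found)
print("missed:", missed)
```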
thanks again
Bernd
Re: Search functionality for substructures of SMILES
To get a unique SMILES string for each structure, you can use canonical SMILES. Each library/package (RDKit, ChemAxon, ...) may canonicalize differently, so canonical SMILES are not "universal", but if all the SMILES are generated with the same tool, each structure gets exactly one SMILES string, which avoids the problem you mentioned.
https://www.daylight.com/dayhtml/doc/theory/theory.smiles.html
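As a minimal sketch, assuming RDKit is the tool used for canonicalization (the helper name `canonical` and the example strings are made up), different spellings of the same molecule collapse to a single string:

```python
from rdkit import Chem


def canonical(smiles):
    """Return RDKit's canonical SMILES for the input, or None if it cannot be parsed."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    # MolToSmiles writes canonical SMILES by default.
    return Chem.MolToSmiles(mol)


# Three spellings of toluene.
variants = ["c1ccccc1C", "CC1=CC=CC=C1", "Cc1ccccc1"]
unique_forms = {canonical(s) for s in variants}
print(unique_forms)
```

This makes whole-structure comparison consistent within one tool. For substructure queries, a graph-based matcher such as RDKit's `HasSubstructMatch` is probably still the safer route than searching the canonical text.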
Hope this might help you,
Re: Search functionality for substructures of SMILES
Hi Bernd,
Not sure if this is helpful, but I found this add-in from Ian Cox in the knowledge base.
https://community.jmp.com/t5/JMP-Add-Ins/JMP-Add-In-to-Visualise-Molecular-SMILES-Strings/ta-p/22532
Regards, Lou
Re: Search functionality for substructures of SMILES
Hello @Bernd2Heinen ,
I created a new add-in, a Python wrapper for RDKit (it requires JMP 18).
You can search using Add-ins > Toolkit for Materials Informatics > Substructure Searching.
If you have any feedback, please let me know.
https://community.jmp.com/t5/JMP-Add-Ins/Toolkit-for-Materials-Informatics/ta-p/750690