<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Find and identify duplicates for unsorted data in Discussions</title>
    <link>https://community.jmp.com/t5/Discussions/Find-and-identify-duplicates-for-unsorted-data/m-p/369903#M61959</link>
    <description>&lt;P&gt;Here's another example using a column formula.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;dt = Open( "$SAMPLE_DATA/Big Class.jmp" );

// Create a column that indicates duplicate names (there are two "Robert")

dt &amp;lt;&amp;lt; New Column( "Duplicate", Formula( If( Col Number( Row(), :name ) &amp;gt; 1, 1, 0 ) ) );

&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Fri, 19 Mar 2021 22:09:46 GMT</pubDate>
    <dc:creator>ms</dc:creator>
    <dc:date>2021-03-19T22:09:46Z</dc:date>
    <item>
      <title>Find and identify duplicates for unsorted data</title>
      <link>https://community.jmp.com/t5/Discussions/Find-and-identify-duplicates-for-unsorted-data/m-p/369663#M61934</link>
      <description>&lt;P&gt;Fairly new to JMP, so my knowledge of how to use formulas is pretty limited.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If I have a column of values (character) and I want to identify them as duplicates in a different column, what formula would I use? (My data has 10 columns already sorted by a different column's values)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In the past, I have used the formula below (when the data that needs duplicates identified has been sorted alphabetically).&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;If(
	Row() == 1, 1,
	:Column1 != Lag( :Column1 ), 1,
	0
)&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Since I am not allowed to present data sorted in any other way than the way it comes in, I am kind of at a loss as to how to indicate duplicates.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Any and all help is appreciated.&lt;/P&gt;</description>
      <pubDate>Sat, 10 Jun 2023 23:27:30 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Find-and-identify-duplicates-for-unsorted-data/m-p/369663#M61934</guid>
      <dc:creator>CMG</dc:creator>
      <dc:date>2023-06-10T23:27:30Z</dc:date>
    </item>
    <item>
      <title>Re: Find and identify duplicates for unsorted data</title>
      <link>https://community.jmp.com/t5/Discussions/Find-and-identify-duplicates-for-unsorted-data/m-p/369675#M61936</link>
      <description>&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Maybe this line helps&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;dt = Current Data Table();

dt &amp;lt;&amp;lt; Select duplicate rows( Match( :column_1, :column_2, :column_3) );&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This selects the rows that are not unique in the combination of columns 1 to 3.&lt;/P&gt;
&lt;P&gt;With that selection you can delete the rows, hide and exclude them, fill an additional column, or do whatever else you need with them.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;r_select = dt &amp;lt;&amp;lt; Get Selected Rows; // this gets you a vector of the selected row numbers

dt &amp;lt;&amp;lt; Delete Rows( r_select );

// or, acting on the current selection
dt &amp;lt;&amp;lt; Hide and Exclude;&lt;/CODE&gt;&lt;/PRE&gt;
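&lt;P&gt;For the "fill an additional column" option, a minimal sketch could look like the following. It assumes a single placeholder column named column_1, and the indicator column name "Duplicate" is only illustrative; please check the Name Selection in Column arguments against the Scripting Index for your JMP version.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;dt = Current Data Table();

// Select the duplicate rows, matching on one column (placeholder name)
dt &amp;lt;&amp;lt; Select Duplicate Rows( Match( :column_1 ) );

// Write the selection into a new indicator column: 1 = selected, 0 = not selected
dt &amp;lt;&amp;lt; Name Selection in Column( Column Name( "Duplicate" ), Selected( 1 ), Unselected( 0 ) );

// Clear the selection so the table looks as it did before
dt &amp;lt;&amp;lt; Clear Select;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;This leaves the row order untouched, so the table can still be presented the way it came in.&lt;/P&gt;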
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 19 Mar 2021 15:34:15 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Find-and-identify-duplicates-for-unsorted-data/m-p/369675#M61936</guid>
      <dc:creator>Mauro_Gerber</dc:creator>
      <dc:date>2021-03-19T15:34:15Z</dc:date>
    </item>
    <item>
      <title>Re: Find and identify duplicates for unsorted data</title>
      <link>https://community.jmp.com/t5/Discussions/Find-and-identify-duplicates-for-unsorted-data/m-p/369692#M61937</link>
      <description>&lt;P&gt;Thank you for the response.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I don't think I explained myself well. What I want is a helper column to indicate if a certain value is duplicate, maybe with a "1" indicating all duplicates, and a "0" indicating no duplicates. I do not want to hide or delete the duplicates.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 19 Mar 2021 14:48:52 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Find-and-identify-duplicates-for-unsorted-data/m-p/369692#M61937</guid>
      <dc:creator>CMG</dc:creator>
      <dc:date>2021-03-19T14:48:52Z</dc:date>
    </item>
    <item>
      <title>Re: Find and identify duplicates for unsorted data</title>
      <link>https://community.jmp.com/t5/Discussions/Find-and-identify-duplicates-for-unsorted-data/m-p/369699#M61939</link>
      <description>&lt;P&gt;Below are at least two ways to do this with scripting:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;Names Default To Here(1);

dt = New Table("Untitled",
	Add Rows(7),
	Compress File When Saved(1),
	New Column("Column 1",
		Numeric,
		"Continuous",
		Format("Best", 12),
		Set Values([1, 1, 2, 3, 2, 5, 1]),
		Set Display Width(60)
	)
);


dt &amp;lt;&amp;lt; New column("Duplicate_formula", Numeric, &amp;lt;&amp;lt;Formula(
	If(Col Rank(1, :column 1) &amp;gt; 1, 1, 0))
);

dt &amp;lt;&amp;lt; Select duplicate rows(Match(:column 1));
dupRows = dt &amp;lt;&amp;lt; Get Selected Rows;
dt &amp;lt;&amp;lt; New Column("Duplicates", Numeric, Nominal);
Column(dt, "Duplicates")[dupRows] = 1;
Column(dt, "Duplicates")[dt &amp;lt;&amp;lt; Get Rows Where(IsMissing(:Duplicates))] = 0;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;You can also do this without any scripting:&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;Select the columns you are interested in.&lt;/LI&gt;&lt;LI&gt;From the Rows menu: Row Selection -&amp;gt; Select Duplicate Rows&lt;/LI&gt;&lt;LI&gt;From the Rows menu: Row Selection -&amp;gt; Name Selection in Column&lt;/LI&gt;&lt;LI&gt;Choose a column name, and label selected rows as 1 and unselected rows as 0&lt;/LI&gt;&lt;/OL&gt;</description>
      <pubDate>Fri, 19 Mar 2021 15:01:49 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Find-and-identify-duplicates-for-unsorted-data/m-p/369699#M61939</guid>
      <dc:creator>jthi</dc:creator>
      <dc:date>2021-03-19T15:01:49Z</dc:date>
    </item>
    <item>
      <title>Re: Find and identify duplicates for unsorted data</title>
      <link>https://community.jmp.com/t5/Discussions/Find-and-identify-duplicates-for-unsorted-data/m-p/369712#M61942</link>
      <description>&lt;P&gt;You can also do this with a formula column. The formula below uses the Big Class data table as an example:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;If( Row() == 1,
	Current Data Table() &amp;lt;&amp;lt; select duplicate rows(
		Match( :age, :sex, Empty() )
	)
);
If( Selected( Row State( Row() ) ),
	"Group 1",
	"Group 2"
);&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 19 Mar 2021 15:36:35 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Find-and-identify-duplicates-for-unsorted-data/m-p/369712#M61942</guid>
      <dc:creator>txnelson</dc:creator>
      <dc:date>2021-03-19T15:36:35Z</dc:date>
    </item>
    <item>
      <title>Re: Find and identify duplicates for unsorted data</title>
      <link>https://community.jmp.com/t5/Discussions/Find-and-identify-duplicates-for-unsorted-data/m-p/369903#M61959</link>
      <description>&lt;P&gt;Here's another example using a column formula.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;dt = Open( "$SAMPLE_DATA/Big Class.jmp" );

// Create a column that indicates duplicate names (there are two "Robert")

dt &amp;lt;&amp;lt; New Column( "Duplicate", Formula( If( Col Number( Row(), :name ) &amp;gt; 1, 1, 0 ) ) );

&lt;/CODE&gt;&lt;/PRE&gt;
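&lt;P&gt;As a follow-up, here is a minimal sketch of how the flagged rows could be listed once the column exists. It repeats the script above so it can run on its own; the variable name dupRows is only illustrative, and Run Formulas is called on the assumption that the formula may not have been evaluated yet when the rows are queried.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;Names Default To Here( 1 );
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
dt &amp;lt;&amp;lt; New Column( "Duplicate", Formula( If( Col Number( Row(), :name ) &amp;gt; 1, 1, 0 ) ) );

// Make sure the formula has been evaluated before reading its values
dt &amp;lt;&amp;lt; Run Formulas;

// Row numbers of every row whose name occurs more than once
dupRows = dt &amp;lt;&amp;lt; Get Rows Where( :Duplicate == 1 );
Show( dupRows );

// Optionally highlight those rows in the table
dt &amp;lt;&amp;lt; Select Rows( dupRows );&lt;/CODE&gt;&lt;/PRE&gt;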
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 19 Mar 2021 22:09:46 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Find-and-identify-duplicates-for-unsorted-data/m-p/369903#M61959</guid>
      <dc:creator>ms</dc:creator>
      <dc:date>2021-03-19T22:09:46Z</dc:date>
    </item>
    <item>
      <title>Re: Find and identify duplicates for unsorted data</title>
      <link>https://community.jmp.com/t5/Discussions/Find-and-identify-duplicates-for-unsorted-data/m-p/370378#M62012</link>
      <description>&lt;P&gt;Thank you very much. I was able to mark the duplicates without using any scripting.&lt;/P&gt;&lt;P&gt;Appreciate your help!&lt;/P&gt;</description>
      <pubDate>Mon, 22 Mar 2021 17:25:46 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Find-and-identify-duplicates-for-unsorted-data/m-p/370378#M62012</guid>
      <dc:creator>CMG</dc:creator>
      <dc:date>2021-03-22T17:25:46Z</dc:date>
    </item>
  </channel>
</rss>

