
Is there a function in JMP comparable to "proc contents" in SAS?

ursula_garczare

Community Trekker

Joined:

May 1, 2012


Dear all,

I cannot find a pretty basic feature, and I am assuming I am just being blind....

Is there some easy-to-use way to get a report on all the variables in a JMP data set and their attributes, something like proc contents in SAS? I do not want to do it by calling a SAS function, because in the long run my scripts should work independently of a SAS installation. Is there something implemented (I am using JMP 10)?

Regards,

Ursula

7 REPLIES
pmroz

Super User

Joined:

Jun 23, 2011

I don't think there is an equivalent to PROC CONTENTS in JMP.  You can see column properties one by one if you right click on a column and select Column Info.  There are many column properties that could be listed; here's a short program that lists some of the major ones in the Log window.  It uses Big Class as an example.

// Open the Big Class sample table that ships with JMP
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );

// Column names as strings, used for lookup and for printing
col_names = dt << Get Column Names( String );

For( i = 1, i <= N Items( col_names ), i++,
	one_col = Column( dt, col_names[i] );

	// Format and formula come back as expressions, so convert them to text
	col_data_type     = one_col << Get Data Type;
	col_modeling_type = one_col << Get Modeling Type;
	col_format        = Char( one_col << Get Format );
	col_role          = one_col << Get Role;
	col_formula       = Char( one_col << Get Formula );

	Print( "Column Name: " || col_names[i] );
	Print( "       Data Type: " || col_data_type );
	Print( "   Modeling Type: " || col_modeling_type );
	Print( "          Format: " || col_format );
	Print( "            Role: " || col_role );
	Print( "         Formula: " || col_formula );
);

Here's the output from the Log window:

"Column Name: name"
"       Data Type: Character"
"   Modeling Type: Nominal"
"          Format: Format(\!"Best\!", 9)"
"            Role: None"
"         Formula: Empty()"

"Column Name: age"
"       Data Type: Numeric"
"   Modeling Type: Ordinal"
"          Format: Format(\!"Fixed Dec\!", 5, 0)"
"            Role: None"
"         Formula: Empty()"

"Column Name: sex"
"       Data Type: Character"
"   Modeling Type: Nominal"
"          Format: Format(\!"Best\!", 1)"
"            Role: None"
"         Formula: Empty()"

"Column Name: height"
"       Data Type: Numeric"
"   Modeling Type: Continuous"
"          Format: Format(\!"Fixed Dec\!", 5, 0)"
"            Role: None"
"         Formula: Empty()"

"Column Name: weight"
"       Data Type: Numeric"
"   Modeling Type: Continuous"
"          Format: Format(\!"Fixed Dec\!", 5, 0)"
"            Role: None"
"         Formula: Empty()"

mpb

Super User

Joined:

Jun 23, 2011

Solution

"You can see column properties one by one if you right click on a column and select Column Info."

It may be worth mentioning that if you select all or some of the columns in a data set, right click on one of the selected columns and choose Column Info, you will get a scrollable window containing info for all of the selected columns. It is less compact than PMroz's script and does not have all the info in David's script/add-in, but it may be useful as well.

ursula_garczare

Community Trekker

Joined:

May 1, 2012

Dear mpb,

thanks for your answer :>). Basically, what I need is mostly for scripting and automating analyses. For that, David's tool is perfect, and the PMroz script is a good starting point for generating a report when the output is directed to a journal instead of the log.

Regards,

Ursula
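
A minimal sketch of that journal idea, for anyone who wants to try it: it reuses the column messages from the PMroz script, collects the attributes in a small summary table, and sends that table to the journal. The table name "Contents of ..." and the variable names are assumptions made for this example, and it has not been checked against JMP 10 specifically.

```jsl
// Sketch: collect column attributes in a data table instead of the Log.
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
col_names = dt << Get Column Names( String );

// One list per attribute, filled in column order
data_types     = {};
modeling_types = {};
formats        = {};
formulas       = {};

For( i = 1, i <= N Items( col_names ), i++,
	one_col = Column( dt, col_names[i] );
	Insert Into( data_types, one_col << Get Data Type );
	Insert Into( modeling_types, one_col << Get Modeling Type );
	Insert Into( formats, Char( one_col << Get Format ) );
	Insert Into( formulas, Char( one_col << Get Formula ) );
);

// The table name is a hypothetical choice for this example
contents = New Table( "Contents of " || (dt << Get Name) );
contents << New Column( "Column Name", Character, Set Values( col_names ) );
contents << New Column( "Data Type", Character, Set Values( data_types ) );
contents << New Column( "Modeling Type", Character, Set Values( modeling_types ) );
contents << New Column( "Format", Character, Set Values( formats ) );
contents << New Column( "Formula", Character, Set Values( formulas ) );

// Append the summary table to the current journal
contents << Journal;
```

Keeping the attributes in a data table rather than in printed strings means the result can also be sorted, filtered or saved like any other table.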

Hello again Ursula: if you look in the "Add-Ins" section of the JMP File Exchange now, you'll find an upgraded version of the original "Data Set Contents" script in which:

  • Two extra columns have been added to the summary table, showing the row number of the first instance of the minimum and maximum values respectively, which should help with the location of potential outliers in large data sets;
  • A third extra column shows whether each variable is Nominal, Ordinal or Continuous, and the total number of each of these modeling types present has been added to the parameter summary table;
  • A "Frequency" tab has been added, showing a bar chart of all the distinct levels of any selected column (either numeric or character) - the default maximum number of distinct levels that can be charted is 25, but this can be changed to 10, 50, 75 or no upper limit (though using this last one with more than 100 distinct levels can produce a rather messy chart);
  • The "Rename" bug has now been fixed, so if you accidentally try to rename a file as itself, nothing happens.

Would you like to download a copy and give it a try?

Regards,

David

Hi Ursula,

There's a script that's just been uploaded to the JMP File Exchange that I think should do at least part of what you need: it's called "Data Set Contents", and you'll find it in the Add-Ins section, almost at the bottom of the list of uploads.  It was written to provide a quick way to display an assortment of summary statistics on every column of any data table in whichever directory the user specifies when the script is run, and also includes some basic housekeeping functions for displaying, renaming, copying and deleting any data set in the directory.  You can either install it as an add-in or run it as a standalone script: both are included in the ZIP file.

Best regards,

David

ursula_garczare

Community Trekker

Joined:

May 1, 2012

Dear David,

it does what I need now and more, as it's really a nice little tool for getting an overview of a larger set of JMP files in a folder, and it is scriptable :>).

And, well, what I would need beyond that is a list of the values for each of the nominal variables, maybe with a threshold that can be set (only the ten values with the highest occurrences...). That is my wish list ;>).

Ursula
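
The wish above can be sketched in a few lines of JSL. This is only an illustration under stated assumptions, not the add-in's code: it counts the levels of each Nominal column with an associative array and prints the most frequent ones to the Log. The variable `top_n` is a hypothetical name for the threshold.

```jsl
// Sketch: print the most frequent levels of every Nominal column.
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
top_n = 10; // hypothetical threshold: how many levels to list per column

col_names = dt << Get Column Names( String );

For( i = 1, i <= N Items( col_names ), i++,
	one_col = Column( dt, col_names[i] );
	If( (one_col << Get Modeling Type) == "Nominal",
		// Count occurrences per level; unseen keys start at 0
		counts = Associative Array();
		counts << Set Default Value( 0 );
		For( r = 1, r <= N Rows( dt ), r++,
			key = Char( one_col[r] );
			counts[key] = counts[key] + 1;
		);

		levels       = counts << Get Keys;
		level_counts = counts << Get Values;
		n_levels     = N Items( levels );

		Print( "Column Name: " || col_names[i] );
		If( n_levels > 0,
			// Rank() gives the positions that sort the counts ascending,
			// so walk it backwards to list the largest counts first
			order = Rank( Matrix( level_counts ) );
			For( k = n_levels, k >= Max( 1, n_levels - top_n + 1 ), k--,
				Print( "   " || levels[order[k]] || ": " || Char( level_counts[order[k]] ) )
			);
		);
	);
);
```

Looping over the rows is slow on very large tables. It is used here only because it relies on nothing beyond basic column subscripting.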

Hi again Ursula - I'm glad it does what you need.  I like that idea of the list of levels for the nominal variables: give me a few days and I'll incorporate that.

In the meantime, I'm afraid I've spotted a bug - and it's potentially an important one.  If you try to rename a data set, but don't actually change the name from the default (which is simply the original name), it will delete the file you're trying to rename (because it renames a file by saving it under the new name, and then deleting the original).  It's easily fixed but obviously dangerous - so please don't do that.  I'll get it fixed ASAP.

David
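
For readers writing their own housekeeping scripts, the rename-by-save-then-delete pattern described above can be guarded as in the sketch below. It is not the add-in's actual fix. The function name `safe_rename` and its arguments are hypothetical, and it assumes a JMP version that provides `File Exists()` and `Delete File()`.

```jsl
// Hypothetical helper, not the add-in's code: rename a saved table by
// saving it under the new name and then deleting the original, while
// refusing the case where both paths name the same file.
safe_rename = Function( {dt, old_path, new_path},
	{Default Local},
	// Purely textual, case-insensitive comparison (a conservative assumption):
	// if the paths match, do nothing, so the only copy is never deleted.
	If( Lowercase( old_path ) == Lowercase( new_path ),
		Return( 0 )
	);
	dt << Save( new_path );
	// Delete the original only once the new file is confirmed on disk
	If( File Exists( new_path ),
		Delete File( old_path );
		Return( 1 );
	);
	0;
);
```

The function returns 1 when the rename went through and 0 when nothing was deleted. Two differently written paths that point to the same file would not be caught by the textual comparison, so paths should be normalized before calling it.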