
Re: Text based analysis

dannyfinn11

Occasional Contributor

Joined:

Jun 10, 2018

I think that would work. What would be the best way to get started?

markbailey

Staff

Joined:

Jun 23, 2011

Solution

Run the following script to simulate your data set:

Names Default to Here( 1 );

// simulate data set with client responses
dt = New Table( "Loans",
	Add Rows( 25 ),
	New Column( "Loan Description", "Character", "Unstructured Text" ),
	New Column( "Key Term", "Character", "Nominal" ),
	New Column( "Purpose of the Loan", "Character", "Nominal" ),
	New Column( "Modified Purpose of Loan", "Character", "Nominal" )
);

// make a list of some target terms
target term = List(
	"credit card balance ",
	"home improvement ",
	"new car "
);

For Each Row(
	// make unstructured text
	embedded term = target term[Random Integer( 1, N Items( target term ) )];
	description =
		Repeat( "blah ", Random Integer( 2, 5 ) ) ||
		embedded term ||
		Repeat( "blah ", Random Integer( 2, 5 ) );
	Column( dt, 1 )[] = description;
	// make structured response
	purpose = If( Random Uniform() < 0.25,
		"Other",
		embedded term
	);
	Column( dt, 3 )[] = purpose;
);

// hold off formula evaluation until both formulas are in place
dt << Suppress Formula Eval( 1 );

// now show one solution
Column( dt, 2 ) << Set Formula(
	Regex(
		:Loan Description,
		" credit card balance | home improvement | new car "
	)
);

Column( dt, 4 ) << Set Formula(
	If( :Purpose of the Loan == "Other",
		:Key Term,
		:Purpose of the Loan
	)
);

dt << Suppress Formula Eval( 0 ) << Run Formulas;

 

The script only builds the example, so ignore it for now and focus on the resulting data table:

Capture.PNG

I assume that your real data set has columns like the first (Loan Description) and the third (Purpose of the Loan) data columns in my example above.

You can now use the formulas that I made in the second and fourth data columns (Key Term and Modified Purpose of Loan) to begin your solution.
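As a rough sketch of how the same two formulas might be added to an existing table rather than the simulated one: this assumes your table is the current data table and that its columns are named Loan Description and Purpose of the Loan as in the example. The three terms in the pattern are only the simulated ones and would need to be replaced with your own list. Unlike the script above, this variant uses a capture group with the "\1" format so that the padding spaces are not returned, ignores case, and keeps the original purpose when no term is found. Those are my choices for illustration, not requirements.

```jsl
Names Default to Here( 1 );

// assumption: the real loan table is already open and is the current table
dt = Current Data Table();

// Key Term: return only the captured term (format "\1"), ignoring case
dt << New Column( "Key Term", Character, Nominal,
	Formula(
		Regex(
			:Loan Description,
			"(credit card balance|home improvement|new car)",
			"\1",
			IGNORECASE
		)
	)
);

// Modified Purpose: replace "Other" with the key term only when one was found
dt << New Column( "Modified Purpose of Loan", Character, Nominal,
	Formula(
		If( :Purpose of the Loan == "Other" & !Is Missing( :Key Term ),
			:Key Term,
			:Purpose of the Loan
		)
	)
);
```

Rows where the description contains none of the listed terms end up with an empty Key Term and keep "Other" as the modified purpose, which makes them easy to select and review by hand.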

Learn it once, use it forever!
dannyfinn11

Occasional Contributor

Joined:

Jun 10, 2018

Thank you so much! This will definitely put me on the right track. I very much appreciate the assistance!