Using JMP to visualize a solid state drive reconditioning process
May 29, 2014 9:51 AM
This past week, I noticed that my computer had seriously slowed down. My usual tasks seemed to take forever, and even my standard JMP demos were taking quite a bit longer than I was used to. I tried the normal things, such as repairing permissions, checking the memory and looking for corrupt kernel extensions, even going so far as to install a clean copy of OS X Mavericks to see if that would fix the behavior I was seeing.
Then it dawned on me that it might be something going on with my solid state drive (SSD). Since I still had the original hard drive (a standard spinning drive) that came with my MacBook Pro, I installed that and tried booting from the original drive.
A program I use to see how my computer is performing is Geekbench. It runs a number of processor-, graphics- and disk-intensive tasks and then reports single-core and multi-core performance scores, which you can compare against a database of similarly specified computers to see whether you are achieving comparable performance.
Well, as it turns out, the performance of my computer should be around 2,000 for single-core and 10,000 for multi-core. I was getting 700 for single-core and 3,000 for multi-core with the SSD. When I put the original HDD back into my laptop, performance increased to about 2,000 and 8,500, much closer to what it should be.
So obviously something was going on with my drive. Not giving up, I decided to see if I could recondition it. I also thought this was a great opportunity to collect and visualize some data using JMP. I used a program called DiskTester, part of the diglloydTools suite. One of the functions in DiskTester is a recondition SSD function. This writes a large chunk of data across all of the free space on a drive and lets you iterate a number of times. The program reports the chunk offset in MB, the average write speed, the current write speed, and the minimum and maximum write speeds.
The drive, according to the diglloydTools developer, “responds to this treatment by cleaning up and defragmenting itself.” If this process works on my drive, I should see some pretty bad performance for the first iteration that drastically improves after a few iterations.
So I erased my drive, booted from my other internal drive and started the reconditioning process, letting it run overnight and collecting eight iterations of raw data.
DiskTester gave me the option to copy the raw data to the clipboard, which I did, and I then saved it as a .txt file to import into JMP.
I’ll use File > Open to get the .txt file, and then when given the option, I’ll choose Open As: Data (Using Preview), which gives me an option to inspect the data before getting it into JMP.
I’m happy with the way the columns look, and I see by the 123 icon that all my data will be coming in as continuous, which is what I want. In this window, I have the option to give the columns names, which I will do.
And now I have my raw data in JMP, but there is one more step before I can visualize the results. You can see I am missing an important column: the iteration number. I’ll need this as a phase or grouping variable. Fortunately, I can generate it pretty easily in JMP. I'll create a new column and then right-click to get to the column info. When you create a new column, you have the option to initialize data. I'll pick Sequence Data, and then enter 1 for From, 8 for To (as I want iterations 1 through 8) and 1 for Step.
I know that my last block is 482,176 MB and the program is writing 128 MB chunks, which means each iteration will have 3,767 unique measurements. So I will put 3767 into the Repeat each value N times field.
I can check my work by looking at rows 3,767 and 3,768. Sure enough, 3,767 is labeled iteration 1, and 3,768 is labeled iteration 2. Now I’m ready to go.
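For readers who want to see the arithmetic behind this step outside of JMP, here is a minimal Python sketch of the equivalent of the Sequence Data initializer. The constants come from the numbers above; the resulting list is just a stand-in for the JMP column:

```python
# Hedged sketch of JMP's "Initialize Data > Sequence Data" step
# (From=1, To=8, Step=1, Repeat each value N times = 3767).
CHUNK_MB = 128            # DiskTester writes 128 MB chunks
LAST_BLOCK_MB = 482_176   # last chunk offset reported by DiskTester
ITERATIONS = 8

chunks_per_iteration = LAST_BLOCK_MB // CHUNK_MB  # 3,767 measurements
iteration = [i for i in range(1, ITERATIONS + 1)
             for _ in range(chunks_per_iteration)]

# Spot-check rows 3,767 and 3,768 (1-indexed), as described above:
print(iteration[3766], iteration[3767])  # 1 2
```

The same spot-check as in JMP applies: the last row of the first block is labeled 1, and the next row is labeled 2.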
I’ll use Graph Builder to see what’s going on in the data. Displaying the Offset in GB instead of MB will make the X-axis labels easier to read on all of my plots, so before I go any further, I'm going to transform MB to GB by right-clicking on Offset MB, clicking Formula and dividing Offset MB by 1000. This creates a custom transform column without my having to go back to the data table. Since I'll want to use this later, I'll rename it Offset GB, right-click and select Add to Data Table. The plot below shows the pattern of performance readings across the drive for each iteration. I’ve turned the transparency of the points down to 0.1 so that overlapping points are easier to see, and I’ve also added a lower spec limit of 110 MB/sec to the graph.
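Outside JMP, that formula column is a one-liner. A small Python sketch, using hypothetical offset values rather than the actual DiskTester output:

```python
# Sketch of the MB-to-GB transform done in JMP's formula column.
# The sample offsets below are hypothetical stand-ins.
offsets_mb = [0, 128, 256, 482_176]

offsets_gb = [mb / 1000 for mb in offsets_mb]  # same formula as the JMP column
print(offsets_gb)  # [0.0, 0.128, 0.256, 482.176]
```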
As you can see in the graph, the performance on the first iteration is all over the place. While the average write performance is decent (91 MB/sec), there are many 128 MB chunks that are being written much more slowly to the disk. By iteration 2, however, things are starting to improve drastically. The average write performance has increased to 115 MB/sec. By iteration 5, things are starting to settle in, and at that point, I seem to be seeing asymptotic behavior in write performance.
What remains through all the iterations, however, is a band of blocks whose write performance is in the high 40s MB/sec. Even by the end of the test, 130 blocks are in this lower-performing band. That is still a vast improvement over the first iteration, where 2,294 blocks fell below the spec limit. If I add a Local Data Filter to Graph Builder, I can focus on just the first and last iterations and compare performance. While write performance is highly variable on iteration 1, by iteration 8 there are three straight bands, indicating consistent performance across all the tested sectors of the drive. The cause of the lower-performing sectors is still a bit of a mystery to me, but I suspect it may be something in the operating system: the program could be interrupted by some other system task, causing a drop in measured performance. (If anyone has a better hypothesis, leave it in the comments.)
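The below-spec tally behind that comparison can be sketched in a few lines of Python. Note that the speed values here are made-up placeholders to show the counting logic, not the actual DiskTester measurements:

```python
# Hedged sketch: counting below-spec chunks per iteration, assuming
# rows of (iteration, write_speed_mb_s) like the JMP table.
SPEC_LIMIT = 110.0  # lower spec limit (MB/sec) used in the graph

# Hypothetical sample rows standing in for the real data:
rows = [(1, 45.2), (1, 120.5), (2, 118.0), (8, 48.9), (8, 131.4)]

below_spec = {}
for iteration, speed in rows:
    if speed < SPEC_LIMIT:
        below_spec[iteration] = below_spec.get(iteration, 0) + 1

print(below_spec)  # {1: 1, 8: 1}
```

Run over the full data set, this is the calculation that yields the 2,294 below-spec blocks on iteration 1 versus 130 on iteration 8.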
So the big question is: “Did it work?” Well, I am happy to report that it did. After running the reconditioning procedure and reinstalling the drive in my laptop, my Geekbench scores are back to 2,900/9,500, which is what I should expect given my hardware specifications. And the drastic slowdown I had noticed on my computer is gone.