Large CSV Files with Millions of Records: Delimit Rocks!

For the past couple of weeks, I’ve been working furiously with a colleague to meet expectations around some data extract work for a divestiture.  My colleague wrote a PowerShell script to pull the data, but we got heap space errors when we tried to extract all the Task records that matched our conditions.  We were pressed for time, so I went looking for a fast solution.

A while back, I’d learned of a tool that opens large CSV files in an Excel-like interface and lets you work with them.  The software is called Delimit and is made by Delimitware; you can read all about it at http://www.delimitware.com/.

Now, what I want to share with my Salesforce friends out in the world is that Delimit is AWESOME.  I was able to open a CSV with 1.2 million lines, and then extract/filter out 106,000 or so of those rows based on the 17,000+ IDs in a column from ANOTHER CSV I opened in Delimit.  Less than a minute after I kicked off the extract/filter, I had an extract CSV file with the filtered records I wanted.  The formula-based way to filter one CSV by the contents of another is documented in Delimit’s help.
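
For readers who would like to see the same idea in script form, here is a minimal Python sketch of filtering one big CSV by the IDs found in a second, smaller CSV.  This is not how Delimit does it, and it is not the PowerShell script we started with.  It is just an illustration, and the function name, file names, and column names (such as "WhatId") are assumptions made up for the example.  It reads the big file one row at a time instead of loading it all into memory, which is the kind of approach that tends to avoid heap space problems.

```python
import csv


def filter_csv_by_ids(big_path, ids_path, out_path, big_id_col, ids_col):
    """Copy rows from big_path to out_path when their big_id_col value
    appears in the ids_col column of ids_path. Returns the number of rows kept."""
    # Load the (small) ID list into a set for fast membership checks.
    with open(ids_path, newline="", encoding="utf-8") as f:
        wanted = {row[ids_col] for row in csv.DictReader(f) if row[ids_col]}

    kept = 0
    with open(big_path, newline="", encoding="utf-8") as src, \
         open(out_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)

        # Copy the header and find the position of the column to match on.
        header = next(reader)
        idx = header.index(big_id_col)
        writer.writerow(header)

        # Stream the big file row by row; only matching rows are written out.
        for row in reader:
            if len(row) > idx and row[idx] in wanted:
                writer.writerow(row)
                kept += 1
    return kept
```

A call might look like `filter_csv_by_ids("tasks.csv", "ids.csv", "tasks_filtered.csv", "WhatId", "Id")`, where all of those file and column names are hypothetical.  The sketch assumes both files have a header row, are UTF-8 encoded, and store the IDs in exactly the same form.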

The software is reasonably priced, and the vendor worked with me on a licensing model that suited my department.  If you’re looking for a solution for dealing with Salesforce export CSVs that contain millions of rows, check out this software.  And do yourself a favor and do what I did (out of sheer luck):  buy a license before the emergency in which you really need it!