Download PHP CSV Utilities v0.2
Read Documentation for PHP CSV Utilities
I have just wrapped up version 0.2 of our csv library. It includes several new features, the most exciting of which is the new Csv_Sniffer class.
Csv_Sniffer
sniff(string $sample)
Csv_Sniffer’s sniff method accepts a sample of csv data and attempts to deduce its format. In this library, there is a class called Csv_Dialect, which tells Csv_Reader and Csv_Writer the format they should read and write in. If Csv_Sniffer::sniff is successful, it will return a Csv_Dialect object representing the format of the csv file (or at least its best guess). If it cannot work out the format, it throws a Csv_Exception_CannotDetermineDialect. You can pass the returned dialect to Csv_Reader and it will know how to read the file. You can also pass it to Csv_Writer if you need to append to the file or write a new one in the same format.
try {
    $sample = implode("", array_slice(file('./data/products.csv'), 0, 20)); // grab the first 20 lines
    $sniffer = new Csv_Sniffer();
    $dialect = $sniffer->sniff($sample);
    $reader = new Csv_Reader('./data/products.csv', $dialect);
} catch (Csv_Exception_CannotDetermineDialect $e) {
    printf("<p>%s</p>", $e->getMessage());
}
hasHeader(string $sample)
Csv_Sniffer’s hasHeader method accepts a sample of csv data and attempts to detect whether the file has a header row. If it appears to, the method returns true.
$sample = implode("", array_slice(file('./data/products.csv'), 0, 20)); // grab the first 20 lines
$sniffer = new Csv_Sniffer();
if ($sniffer->hasHeader($sample)) {
    print "The file probably has a header";
} else {
    print "The file probably doesn't have a header";
}
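The post does not say how hasHeader reaches its guess. Purely as an illustration of one common heuristic (and not Csv_Sniffer's actual algorithm), the sketch below treats the first row as a header when none of its cells are numeric but at least one column below it is entirely numeric. The function name example_has_header and its parameters are hypothetical, and the line splitting is naive: it assumes no quoted field contains a line break.

```php
<?php
// Hypothetical heuristic, NOT taken from PHP CSV Utilities.
function example_has_header(string $sample, string $delimiter = ','): bool
{
    // Split the sample into rows, skipping blank lines.
    $rows = array();
    foreach (preg_split('/\R/', trim($sample, "\r\n")) as $line) {
        if ($line !== '') {
            $rows[] = str_getcsv($line, $delimiter, '"', '\\');
        }
    }
    if (count($rows) < 2) {
        return false; // not enough data to compare against
    }

    $first = array_shift($rows);
    foreach ($first as $cell) {
        if (is_numeric($cell)) {
            return false; // a numeric cell in the first row suggests data, not labels
        }
    }

    // Look for a column that is numeric in every remaining row.
    foreach (array_keys($first) as $index) {
        $columnIsNumeric = true;
        foreach ($rows as $row) {
            if (!isset($row[$index]) || !is_numeric($row[$index])) {
                $columnIsNumeric = false;
                break;
            }
        }
        if ($columnIsNumeric) {
            return true; // a text label sitting on top of a numeric column
        }
    }
    return false;
}

$withHeader = example_has_header("id,name,price\n1,Widget,9.99\n2,Gadget,4.50\n");
$withoutHeader = example_has_header("1,Widget,9.99\n2,Gadget,4.50\n");
```

A file whose columns are all text gives this heuristic nothing to go on, so it answers false in that case; that is one reason a "probably" belongs in any message built on it.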
Csv_Reader_String
This new class is exactly the same as Csv_Reader, except instead of accepting a filename and reading from a file, it reads directly from a string. This could be useful if for some reason somebody had stored csv data in a database and you were retrieving it from there, or if you needed to collect submitted csv data directly from a web form.
if (isset($_POST['csv_data'])) {
    $data = $_POST['csv_data'];
    $reader = new Csv_Reader_String($data);
    foreach ($reader as $row) {
        // now you could insert it into a database or whatever else you need to do with it
    }
}
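For readers curious what reading csv from a string involves, here is a minimal sketch using only PHP's built-in functions. It is not how Csv_Reader_String is implemented; the helper name example_rows_from_string is hypothetical. It copies the string into an in-memory stream so that fgetcsv() can deal with quoted fields, including ones that span lines.

```php
<?php
// Hypothetical helper, NOT part of PHP CSV Utilities.
function example_rows_from_string(string $data, string $delimiter = ','): array
{
    // Copy the string into an in-memory stream so fgetcsv() can read it
    // the same way it would read a file.
    $stream = fopen('php://memory', 'r+');
    fwrite($stream, $data);
    rewind($stream);

    $rows = array();
    while (($row = fgetcsv($stream, 0, $delimiter, '"', '\\')) !== false) {
        // fgetcsv() returns array(null) for a blank line; skip those.
        if ($row === array(null)) {
            continue;
        }
        $rows[] = $row;
    }
    fclose($stream);
    return $rows;
}

$rows = example_rows_from_string("id,name\n1,\"Widget, large\"\n\n2,Gadget\n");
```

Going through a stream rather than splitting on newlines first means a quoted field that contains a delimiter or a line break still comes back as a single cell.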
Csv_Dialect::__construct([array $options])
You may now pass an associative array to Csv_Dialect’s constructor to override any of its properties. While this doesn’t actually provide any new features, it definitely is a convenience.
$dialect = new Csv_Dialect(array(
    'quotechar' => "'",
    'escapechar' => "'",
    'quoting' => Csv_Dialect::QUOTE_NONNUMERIC,
));
$reader = new Csv_Reader('./data/orders.csv', $dialect);
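The options-array pattern itself is simple to sketch. The class below is a hypothetical stand-in, not Csv_Dialect's source: the delimiter property, the QUOTE_MINIMAL constant, and all default values are assumptions made for illustration. It overrides only the properties named in the array and rejects keys it does not recognise.

```php
<?php
// Hypothetical stand-in for a dialect class; NOT the library's Csv_Dialect.
// The delimiter property, QUOTE_MINIMAL, and the defaults are assumptions.
class ExampleDialect
{
    const QUOTE_MINIMAL = 0;
    const QUOTE_NONNUMERIC = 1;

    public $delimiter = ',';
    public $quotechar = '"';
    public $escapechar = '\\';
    public $quoting = self::QUOTE_MINIMAL;

    public function __construct(array $options = array())
    {
        foreach ($options as $name => $value) {
            // Only override properties that the class actually declares.
            if (!property_exists($this, $name)) {
                throw new InvalidArgumentException("Unknown dialect option: $name");
            }
            $this->$name = $value;
        }
    }
}

$exampleDialect = new ExampleDialect(array(
    'quotechar' => "'",
    'quoting' => ExampleDialect::QUOTE_NONNUMERIC,
));
```

Throwing on an unknown key is a design choice in this sketch: it turns a misspelled option name into an immediate error instead of a silently ignored setting.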
Plans for version 0.3
- Csv_Writer will write immediately, rather than when you call close() - This won’t change the interface at all. Instead of writing to disk when close() is called, the next version will write as soon as writeRow() or writeRows() is called.
- Interface changes for Csv_Sniffer - I don’t like how you have to pass the same sample data to both sniff() and hasHeader(). I will probably change it to accept $sample in its constructor instead. Another issue I have with it is that if there is a tie between delimiter characters in the sniff method, it just chooses by ASCII order. I would like to allow the user to specify an array of characters in order of priority in case of a tie.
- A more advanced unit testing interface - The unit tests I have written all run at once, and since they read and write actual csv files, the full run is beginning to take a while. I’m putting together an interface that will let me run tests separately or all together, as well as a way to time some operations so that I can speed them up as much as possible.
- Csv_Dialect classes for any and all formats I can dig up (Open Office, Miva Merchant, Google Docs and Spreadsheets, standard csv?, etc.)
- Csv_Mapper - A class that maps keys to columns so that you can access them like $row['first_name'].
- Even more documentation - I have written some documentation on the Google Code wiki, but I am planning on writing more consistent docs. The ones I have now are all sort of willy-nilly.
- Csv_Reader_Zip - A csv reader that can read zipped files
- Character encoding - This will be the first time I have really had to deal with multiple character encodings, so this may take me a while. I will need to do some research on the subject.
- More to come - I will finish writing about the new features and complete the docs within the next week or so, for I am tired and I’m going to bed.
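The tie-breaking idea from the Csv_Sniffer item in the list above could look something like the sketch below. This is an illustration only, not the library's sniffing algorithm: the function name example_guess_delimiter, the default candidate list, and the scoring rule (a delimiter must appear the same non-zero number of times on every line) are all assumptions, and quoted fields are ignored for brevity.

```php
<?php
// Hypothetical sketch, NOT Csv_Sniffer's actual algorithm.
// $priority lists candidate delimiters, most preferred first.
function example_guess_delimiter(string $sample, array $priority = array(',', "\t", ';', '|'))
{
    $lines = preg_split('/\R/', trim($sample, "\r\n"));
    $best = null;
    $bestCount = 0;

    // Candidates are visited in priority order and only a strictly higher
    // count replaces the current best, so on a tie the earlier one wins.
    foreach ($priority as $candidate) {
        $counts = array();
        foreach ($lines as $line) {
            $counts[] = substr_count($line, $candidate);
        }
        // A plausible delimiter appears the same non-zero number of times on every line.
        $consistent = count(array_unique($counts)) === 1 && $counts[0] > 0;
        if ($consistent && $counts[0] > $bestCount) {
            $best = $candidate;
            $bestCount = $counts[0];
        }
    }
    return $best; // null when no candidate is consistent
}

// Comma and semicolon both appear once per line, so the priority list decides.
$default = example_guess_delimiter("a,b;c\n1,2;3\n");
$preferSemicolon = example_guess_delimiter("a,b;c\n1,2;3\n", array(';', ','));
```

Letting the caller pass the list means the tie-break follows the caller's preference rather than character order.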
matthijs Says:
March 19th, 2008 at 3:04 am
Great work Luke. I’ll take a look and try the code.
Luke Visinoni Says:
March 21st, 2008 at 3:41 pm
I am putting together another post about this library and its development cycle. I’d like to make it sort of a conversational post, so if you don’t mind, pop on in that post as well and let me know what you think (It should be posted some time this weekend).
franck bret Says:
April 2nd, 2008 at 12:57 pm
I’ve tried the library, works great !
Here is a proposal function for the Reader Class, make an array with header row as keys.
public function toHeadedArray() {
    $return = array();
    $content = $this->toArray();
    $header = $this->getRow();
    //assign strings indice from header
    foreach ($content as $row => $value) {
        $return[] = array_combine($header, $content[$row]);
    }
    //remove first row, the one with header label
    array_shift($return);
    // be kinds, please rewind
    $this->rewind();
    return $return;
}
Luke Visinoni Says:
April 3rd, 2008 at 3:32 am
Thanks franck, I’ll make a note to add this type of functionality in the next version. I was thinking of doing something more along the lines of
$reader = new Csv_Reader();
$reader->hasHeader(true);
$array = $reader->toArray(); // this will now use the header as keys
The reason being that I would like for the Csv_Sniffer class to be able to tell Csv_Reader that the file has a header. So something like this:
$sniffer = new Csv_Sniffer('./data/orders.csv'); // reads first 20 lines and attempts to deduce format
$reader = new Csv_Reader('./data/orders.csv', $sniffer->sniff());
$reader->hasHeader($sniffer->hasHeader());
I’m still playing around with the interface though.