Friday 18 September 2009

readdir in the wrong order

This week we found an interesting bug (obvious in hindsight, but it still threw a spanner in the works): using the readdir function is not equivalent to ls'ing a directory, i.e. you can't be sure that the files will come back in alphanumeric order.

So, as you can probably work out, I had a method that used this function to obtain a list of files in a directory and then passed that list to another program.

We make a promise that if the two (or three, if multiplexed) files for a lane are generated with this program, everything will be in the correct order. The program, however, doesn't actually do any internal ordering of the list of files passed to it (where was that documented?), so passing it the raw list from readdir meant that the files weren't guaranteed to be in the order it needed. AAHHHH!

So, a quick sort of the list, and everything is now fine, but that was a bit of a surprise. So was spending 2 days trying to reorder the files that had been created wrongly, although I now have a script that can do this should it happen again, and on the farm it only takes about 10 minutes.

So, as I said, an interesting bug. I will just have to remember, if I do anything with a list from readdir in future, to run a sort on it afterwards, just in case.
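
For anyone wanting to see the fix spelled out, here is a minimal sketch of the "sort after reading the directory" habit. It is not the original code: it is written in Python, where os.listdir plays the role of readdir and is likewise documented as returning names in arbitrary order. The function name list_files_sorted is made up for illustration.

```python
import os


def list_files_sorted(directory):
    """Return the names of the regular files in `directory`, sorted.

    os.listdir, like readdir, makes no promise about order, so the
    sort is done explicitly before the list is handed to anything else.
    """
    names = os.listdir(directory)  # arbitrary, filesystem-dependent order
    files = [
        name for name in names
        if os.path.isfile(os.path.join(directory, name))  # skip subdirectories
    ]
    return sorted(files)  # plain string sort by code point
```

Note that sorted() compares strings by code point, which may differ from what ls shows under some locales, and that it puts file10 before file2. If the downstream program needs a particular order, it is safer to sort with a key that encodes that order than to assume the default is right.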

2 comments:

chorny said...

The order returned is not documented at all, so you should not rely on any order.

Entries are returned in the order provided by the OS. Returning a sorted list would mean having to read them all first, rather than one by one.

Unknown said...

Hi,

ls sorts by default. The order of readdir's output is not documented, but you can get the same order from readdir() and ls -1f in case you need it.