Limitations of the Mac OSX 'locate' Utility

2010-08-26 14:36

[Update 2010-08-28] Some helpful readers on a private mailing list provided clues to what is going on with locate. The update is at the bottom.

locate is a command-line (Terminal) utility that runs on the Unix underpinnings of Mac OSX. Its purpose is to help you find files, by their name or partial name, wherever they are stored in the file system. But OSX’s locate will not report on files stored in particular locations. It turns out to be a question of file permissions; the limitation may have been sensible on a multi-user Unix machine 20 years ago, but it makes little sense on a single-user personal computer such as mine.

For example, if you use the Finder to examine files in ~/Library/Preferences, and pick one of them by name, locate will not find it. The same goes for files in ~/Library/Cookies and ~/Library/FileSync. Yet the utility does index and report on files in ~/Library/Logs and ~/Library/Calendars.

The database that locate uses is created by running the script /usr/libexec/locate.updatedb. (Before Snow Leopard this script was run automatically, as a cron job, once per week; from Snow Leopard onward it is not run weekly, and it is left to the user to run it or to set up automatic execution.) Looking through this shell script, one sees that the directories /private/tmp, /private/var/folders, and /private/var/tmp are excluded from indexing (as is any Time Machine backup volume that happens to be mounted). Yet nothing in the script accounts for the directory-by-directory selectivity inside ~/Library.

After a little poking around, the pattern becomes evident: if a parent directory has Unix permissions of 755 = drwxr-xr-x (full access for Owner; Group and Others may list and traverse it but not write to it), its files are indexed. If the parent directory’s permissions are 700 = drwx------ (no access for Group and Others), the contained files are not indexed. This makes little sense, because the indexing command I use runs as root (sudo nice /usr/libexec/locate.updatedb), and files in the more restricted directories are visible to root.
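
To check the pattern against your own ~/Library, something like the following sketch may help. It labels each immediate subdirectory by whether Others have both the read and traverse bits set, which is the condition described above. The function name others_can_see is my own invention, and the "likely" labels are only a prediction from the permission bits of that one directory; the sketch does not consult the locate database or check the directories higher up the path.

```shell
#!/bin/sh
# others_can_see DIR: succeed if DIR is a directory whose mode grants
# both read (4) and execute/traverse (1) to "others", e.g. 755.
# (Hypothetical helper name; not part of Mac OSX.)
others_can_see() {
    # -prune keeps find from descending; -perm -005 requires both bits.
    [ -n "$(find "$1" -prune -type d -perm -005 2>/dev/null)" ]
}

# Label each immediate subdirectory of ~/Library.
for dir in "$HOME"/Library/*/; do
    [ -d "$dir" ] || continue
    if others_can_see "$dir"; then
        echo "likely indexed: $dir"
    else
        echo "likely skipped: $dir"
    fi
done
```

Going by the observations above, ~/Library/Logs would be expected under "likely indexed" and ~/Library/Preferences under "likely skipped", though the modes on your machine may differ.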

[Update 2010-08-28]

Thanks to a few knowledgeable folks on a private mailing list, here is what I have learned about OSX’s locate:

  • The locate shipped with Mac OSX is ancient; how old I am not sure, but the man page dates from 2006. (A more modern version is slocate, a Linux utility available via the MacPorts project.)
  • The database /var/db/locate.database includes all files on the volume, but locate only reports on files readable by user nobody, group nobody, or world.
  • One can edit a copy of /usr/libexec/locate.updatedb and put the edited copy in place either by hand or via launchctl. I replaced three occurrences of “nobody” with my username in that script, reran it, and now locate reports on all files under ~/Library — indeed on all files readable by my user.
  • The find utility, when run with sufficient privileges (e.g., as root), finds and reports on all files regardless of ownership or permissions. It’s great if you know where to begin looking, but slow and resource-intensive if you begin at /. The point of locate, as I originally used it when encountering the limitations described here, is to help you figure out where to begin looking. updatedb actually uses find to populate the database, which locate searches rapidly.
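
For anyone who wants to try the edit described in the third bullet, here is a minimal sketch of it. It assumes that a blanket replacement of the word “nobody” is equivalent to the three hand edits I made, which may not hold for other versions of the script. The helper name retarget_updatedb and the destination path in the usage note are made up for illustration, and the sketch assumes a username containing no characters special to sed (such as / or &).

```shell
#!/bin/sh
# retarget_updatedb SRC DEST USER
# Write a copy of SRC to DEST with every occurrence of "nobody"
# replaced by USER. SRC itself is never modified.
# (Hypothetical helper name; not part of Mac OSX.)
retarget_updatedb() {
    src=$1
    dest=$2
    user=$3
    sed "s/nobody/$user/g" "$src" > "$dest"
}
```

A possible use would be retarget_updatedb /usr/libexec/locate.updatedb "$HOME/locate.updatedb.mine" "$USER", followed by a diff of the two files to confirm that only the intended lines changed before putting the copy in place.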