Did Sophos Free A-V for Mac kill my Time Machine backups?

2010-11-09 10:50

[Update 2010-11-17] Sophos just announced an update to the free Mac anti-virus. Though Sophos did not say as much outright, it looks very much as if the initial release of SAV did indeed kill my (and others’) Time Machine backup(s).

Following our investigations, we have now modified the way in which Sophos Anti-Virus handles infections on TimeMachine backups. This modification was released on 16 November 2010, in the IDE file dloa-dei.ide. All computers that receive regular updates should now have been automatically updated.

[Update 2010-11-12] Graham Cluley posted the following on the Sophos support forum this afternoon:

Sophos is still investigating the issue reported by a small number of users on this forum about Time Machine backups being deleted whilst running Sophos Anti-Virus for Mac Home Edition. As a precautionary measure, while our investigation continues, we would recommend that, if you detect malware in your Time Machine backup, you do not tell Sophos to clean it up.

From a protection point of view, you are still safe. Sophos Anti-Virus for Mac Home Edition continues to protect you (through its on-access scanner), checking any file you access for malware, including files restored from backup.

As our investigations continue we will provide further updates.

[Update 2010-11-12] Graham Cluley has answered a few more questions by email to shed light on the workings of Sophos A-V in the presence of Time Machine; his answers are reprinted at the bottom. I commend Sophos for their responsiveness to this issue. Bottom line for me: I’m still running Time Machine (on a new disk); I am not running Sophos A-V.

[Update 2010-11-11] One other user of TM and Sophos A-V has reported what may be a similar data loss.

[Update 2010-11-10] A few additions are shown in light type; at the bottom is a helpful response from Sophos’s Graham Cluley.

The title isn’t merely rhetorical; I really want to know. I hope by this blog post to begin a conversation with folks from Sophos and those who know more about the inner workings of Time Machine. My backups are gone — 19 months of irreplaceable data — but perhaps this post and the ensuing conversation can spare others from similar trouble.

First, of course, I should have had a backup of my 500-GB Time Capsule. It’s been in the back of my mind for months to go out and acquire a suitable disk and get that essential job done. So, before we get started, let me just say: mea culpa. Also, once the trouble described below transpired, I uninstalled Sophos AV. Their uninstaller is unusually comprehensive: it seems to have removed any log files along with the software.

I downloaded and installed Sophos Anti-Virus for Mac Home Edition as soon as I heard about it, on Nov. 3. As a long-time Mac user I’ve been following the debate about whether or when the Mac platform would be targeted at a level that makes A-V protection prudent. My personal conclusion: by the time the Mac’s market share approaches 20%, we will be embroiled in the same battle that has swirled around PC users for a decade. Now that (by some estimates) the Mac has broken the 10% barrier, I was prepared for the first sounds of distant battle trumpets.

My work habits have not prepared me for life under an A-V regime. I run a few Windows instances under VMWare (Win2K, XP, and a now-crippled beta of Vista), but those environments are not usually live to the Net for more than 5 minutes per month. In my blissful Mac computing environment, my computer is indeed mine: a root login to the Unix underpinnings gives me (I had always imagined) complete visibility into and control over the workings of the system.

People who live with always-on A-V don’t think that way. Once their A-V package has identified and quarantined some threat, they might not be able to touch or examine the affected files until A-V has become satisfied that they are no longer dangerous.

After a day with Sophos Anti-Virus for Mac running uneventfully in the background, I initiated a full scan of the local hard drive. When it completed hours later, Sophos had identified two baddies: something in the folder the Mac environment shares with VMWare/XP, and a malicious JavaScript file I had downloaded months before in order to see how it works.

I was grateful for the heads-up, and impressed that Sophos had found anything at all to complain about. Intending to have a look at the infected JS file, I switched to Terminal (it’s one of a dozen or so programs that are always live while my computer is on) and attempted to examine bad.js via the Unix commands cat and vi. Each time I tried, including via sudo, an overriding Sophos dialog popped up informing me that a threat had been detected, and preventing my access. The A-V’s realtime scanning component was reacting to my attempt to touch the now-quarantined file in any way.

Eventually I calmed down from my automatic “Whose machine is this anyway?!” reaction and let myself be guided by Sophos’s advice. First I followed the offered links and read about what it meant that no automatic recovery was possible for these threats, and then deleted the infected files by hand as recommended.

Now the trouble starts. The next time the hourly Time Machine backup kicked in, TM identified files on the backup volume that had been deleted from my main disk — i.e., the malware instances. As soon as TM tried to touch those files on the backup volume, the A-V’s realtime component kicked in and blocked the access. This time the blocking dialog offered the option of deleting the offending file(s). Without much thought I clicked OK to let Sophos (try to) remove files from /Volumes/Time Machine Backups/…, forgetting that files on the backup volume are read-only to any process other than TM.

Um. Bad move.

Sophos went spinning pizza of death; after a couple of minutes I issued a three-finger salute and terminated the hung process. At some point here I probably stopped the Time Machine backup as well, without resorting to undue force (i.e., I selected Stop Backing Up from the TM menu bar control; I didn't force-quit the process).

The next time TM’s automatic backup kicked in, System Preferences reported that it saw 417 GB free on the Time Capsule — previously, the sparsebundle containing backups dating to April 2009 had left 117 GB free — and the date of the “Oldest Backup” was ominously blank. TM wanted to back up 67 GB. In a panic I stopped the backup, but the damage had already been done. Looking at the backup volume from Terminal, I saw only a single backup set — the very beginning of the one I had just stopped — and over 400 GB of free space. Nineteen months of my Mac life gone, poof. Had Sophos managed to delete or damage the sparsebundle? Had Time Machine damaged it for some reason when I killed the Sophos process?

The Sophos support forum does not currently have any posts mentioning such a problem. The free AV version does not include support. I reached out to Sophos’s @gcluley via Twitter, and his reply is below.

In our investigations and testing, we’ve found no incompatibilities with Time Machine. Hundreds of thousands of people are already using the free Sophos Anti-Virus for Mac Home Edition, and as far as I’m aware, you’re the only one who has reported a possible problem like this with TM.

We have also not seen this issue reported by existing paying customers (business users, universities including students, etc.) who have been using Sophos on their Macs for many years.

For what it’s worth, I use the free Mac product on my computers at home — all of which are backing up via Time Machine. I’ve had no problems.

I’m really sorry you had a bad experience, and I hope we’re able to find out what went wrong.

Here is Graham Cluley's further explanation, interspersed with my questions (received 2010-11-11):

> As I've read over the Slashdot comments and others, it's
> becoming clearer to me that my unfamiliarity with the way
> A-V works, and the way TM works, probably had a lot to do
> with triggering whatever disaster it was that happened.
> And my panicked reaction has pretty much guaranteed that
> there's no evidence left to speak of on my machine or on
> the Time Capsule disk. (I uninstalled the A-V and that
> seems to have cleaned up any log files that might have
> existed.)

Yes, uninstalling SAV will remove the log files.

> Is my supposition correct that when the A-V popped up a
> warning about the malware on the backup volume, that it was
> intercepting an attempt of TM at that point to delete it?
> Or was A-V independently scanning the new (backup) volume
> that had come into view?

So yes, SAV intercepted a syscall by the Time Machine process. What Time Machine was doing at the time depends on where it was in the process, but it is possible that it was checking the files in the sparse bundle (the backup image) to then determine the backup delta (what had changed on the system since the last backup).

Most of the time, the system itself keeps track of the changes to the file system, so Time Machine doesn’t need to scan files to know what to back up. However, when the system has failed to keep track of the changes (this could occur on near to full discs), TM does perform a scan.

If in this process it called a read or open against a file in the sparse bundle that was malicious or infected then we would have detected that, and then popped up the dialog.

> It would seem to be useful to feature a prominent note
> about A-V/TM interactions so that nobody else gets caught
> out as I did. For example: is it good practice to exclude
> the backup volume from scanning? A note to that effect could
> have prevented my trouble.

Our analysis to date, and the experience of existing users for some years, makes us reasonably confident that Sophos Anti-Virus cannot corrupt TM backups. If a user becomes confused, however, they might take a rash action (such as killing the TM process) that could lead to data corruption.

There would be no problem with you choosing to exclude TM from on-demand scans, and to configure our on-access scanner at its default setting of “deny access”. As long as malware stays in the Time Machine backup it is not active. Any attempt to open / copy an infection will still be intercepted by the on-access scanner. So you are safe.

As an aside, some users are describing what appears to be normal Time Machine behaviour. In other words, if the device being written to is starting to get full, then Time Machine removes its oldest backup and writes a new one. This continues just like log rotation, so older backups are removed to allow newer ones to be written.

I’ll ask our team to look into documenting a best practice guide that can talk about anti-virus scanning and backups. We can probably post that on the support forum.

In summary, we have been investigating the issue since you raised it, and our testing has confirmed our earlier belief that we are not the cause of the problem. Nevertheless we will continue to investigate as appropriate should other evidence come to light.
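
The “log rotation” behaviour Graham describes in his aside, where the oldest backup is removed once the disk is too full for the next one, can be sketched in a few lines of Python. This is a toy model only, not Time Machine's actual algorithm: the function name `prune_oldest`, the backup labels, and all sizes are hypothetical and chosen purely for illustration.

```python
from collections import deque

def prune_oldest(backups, capacity_gb, incoming_gb):
    """Remove the oldest backups until the incoming one fits.

    backups: deque of (label, size_gb) tuples, oldest first.
    Returns the labels that were removed. Hypothetical model only.
    """
    removed = []
    used = sum(size for _, size in backups)
    # Drop from the old end until there is room for the new backup.
    while backups and used + incoming_gb > capacity_gb:
        label, size = backups.popleft()
        used -= size
        removed.append(label)
    return removed

# Made-up numbers: a 500 GB disk holding three backups (450 GB used).
backups = deque([("oldest", 200), ("middle", 150), ("newest", 100)])
removed = prune_oldest(backups, capacity_gb=500, incoming_gb=120)
print(removed)                           # labels that were pruned
print([label for label, _ in backups])   # labels that remain
```

In this model, pruning removes only as many of the oldest backups as are needed to make room, and removes nothing at all when the incoming backup already fits in the free space.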