Catawba County Schools files injunction claiming Google spiders “hacked” their site

It’s a sad thing when people talk about technology they don’t understand.
Catawba County Schools had a document cached by Google’s spiders that may contain student Social Security numbers and other information useful for identity theft. They claim that because the file was stored on a DocuShare server, it had to be secure thanks to password protection.


A quick look at what DocuShare does shows that it is a web-based CMS that can be locked down. Unfortunately, if this server was serving up sensitive documents such as student records, it shouldn’t have been connected to the Web in the first place; it should have been purely intranet. That is strike one against the school system, for poor network design.
Strike two would be allowing someone to have open permissions. Since the server was connected to the Web, one of two things was going on. One is that DocuShare has a bug, in which case Xerox would be at fault. A quick search of Bugtraq revealed that the last reported security hole in DocuShare was back in 2002. The other, which is more likely, is that a student, faculty member, or staff member had open permissions, letting the Google spiders into an open folder.
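To make the "open folder" point concrete, here is a toy sketch of the decision a spider faces. It is a simplified model, not Google's actual crawler: the `Response` class and `should_index` function are hypothetical names invented for illustration. The idea is that a spider sends no username or password, so the only pages it can index are the ones the server hands out to anyone who asks.

```python
from dataclasses import dataclass


@dataclass
class Response:
    """Hypothetical stand-in for what a server returns to an anonymous request."""
    status: int      # HTTP status code
    body: str = ""   # page content, if any was returned


def should_index(response: Response) -> bool:
    # The crawler never supplies credentials. A 200 means the server gave the
    # content away freely; 401/403 means the server refused, and the crawler
    # has nothing to index.
    return response.status == 200


# An open folder answers anonymous requests with the document itself.
open_folder = Response(200, "student list")
# A locked-down folder answers with 401 Unauthorized and no document.
locked_folder = Response(401)

print(should_index(open_folder))
print(should_index(locked_folder))
```

If the document ended up in Google's cache, the likeliest explanation under this model is that the server answered an anonymous request with the file, not that anyone guessed a password.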
Running a DocuShare server doesn’t mean it’s automatically locked down. Policy breaches are very common in school systems. In the words of the CTO Judith Ray:

“One of the students on the list had a presence on the Web,” she said. “In Google’s effort to get information on her, one of its spiders latched onto her name in this document. We were not aware that password-protected sites are set up like that. To our knowledge, Google could only cache unsecure information that did not require a password or username.”

Umm. Let’s look up “spidering” or “web crawling,” shall we? Hmm. Nothing about hacking past a password prompt. Mainly because spiders don’t do that. They’re not tied to hacking tools, nor would anyone at Google have some sudden urge to “hack” Catawba County Schools. Really. Lastly, I am sure that no one bothered to set up partial website removal via robots.txt, as described in Google’s Help Center. If they didn’t want the site indexed at all, there was a method for that as well, using the automated public removal tool.
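For the curious, here is roughly how a well-behaved spider treats robots.txt, sketched with Python's standard `urllib.robotparser` module. The robots.txt content, the `/docushare/` path, and the `schools.example` host are all made up for illustration; I have no idea what the school system's actual URLs look like. Keep in mind that robots.txt is only a request that polite crawlers honor. It is not a lock, so it complements proper permissions rather than replacing them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking all crawlers to stay out of one directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /docushare/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text directly, so no network access is needed."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A polite crawler asks this question before fetching each URL.
allowed_public = parser.can_fetch(
    "Googlebot", "https://schools.example/index.html"
)
allowed_private = parser.can_fetch(
    "Googlebot", "https://schools.example/docushare/records.pdf"
)

print(allowed_public)
print(allowed_private)
```

Two lines in a text file at the site root would have been enough to ask spiders to skip that directory.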
All of the above is standard information technology and web practice, plus Google’s indexing policies, and can be researched very easily. A preliminary look at the available information points to misuse of software or policy violations rather than the far-fetched Google hack theory. What’s so amusing is that school officials are spreading FUD because they don’t know any better about how web crawling technology works.
Via Winston-Salem Journal