What is AG Gonzales Thinking?

Amazingly, Bush may just have lost my vote. Not because of threats to civil liberties, but merely through his administration's ungentlemanly thuggishness with aggregate data and his Justice Department's obsession with Project: No Child Sees A Behind.

When the Attorney General subpoenas Google asking for massive amounts of data, I think it's fair to ask what he wants to do with it. Having read the relevant court document, I've only come away more confused. Let's look at what the government has asked for:

  1. A list of 1 million random URLs available for search in Google. (This is down from a request for all URLs available through Google. The mind boggles at the size of that file.)
  2. All queries entered on Google's search engine over a one-week period (originally one month).

Those are some big files. While I agree with Chris that the privacy concerns aren't that significant (they're not asking for IP addresses), it still seems like a ridiculous fishing expedition.

The legal arguments for turning over the data are fairly straightforward. AG Gonzales's memo becomes an exercise in obfuscation, however, when it comes to how all these URLs are going to help his case. The data will allow the Government to "draw conclusions as to the prevalence of harmful-to-minors material on the portion of the internet available through search engines" (Motion at 8) or to "understand the behavior of web users" (Id. at 4). Apparently the AG needs massive data files to conclusively prove that (a) there's a lot of porn on the internet, and (b) people search for that porn. I simply can't believe that the ACLU wouldn't stipulate to those facts. (See UPDATE.)

Of course, one suspects those aren't the primary issues. This elaborate exercise in datamining is actually supposed to "measure the effectiveness of filtering technologies in screening [obscene material]." (Id.) But in the immortal words of Ogden Nash, "You can't get there from here," although I can see some stunningly bad ways to massage this data. For instance, you could have someone trawl through one million URLs and figure out how many were obscene sites. You could then run one week's worth of searches and figure out how often those obscene sites appeared. (That's a pretty big task in itself.) Finally, you could measure whether nasty sites still turned up when you added filtering software, from which you'd then derive the "effectiveness" of the filters.
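For the technically inclined, the three steps above can be sketched in a few lines of Python. This is only my guess at what such a calculation might look like, not anything described in the motion: the function names, the toy index, the example.com-style URLs and the choice of "share of queries that surface at least one flagged URL" as the measure are all assumptions made for illustration.

```python
from typing import Callable, List, Set

# Hypothetical stand-in for "one week of queries" and the results each returns.
TOY_INDEX = {
    "breast": ["health.example/cancer", "adult.example/a", "recipes.example/chicken"],
    "chicken recipes": ["recipes.example/chicken"],
    "adult videos": ["adult.example/a", "adult.example/b"],
    "weather": ["weather.example/today"],
}
QUERIES = list(TOY_INDEX)

# Hypothetical result of someone hand-labeling the URL sample as obscene.
FLAGGED = {"adult.example/a", "adult.example/b"}


def toy_search(query: str) -> List[str]:
    """Return the (made-up) result URLs for a query."""
    return TOY_INDEX.get(query, [])


def toy_filter(url: str) -> bool:
    """A made-up filter that blocks only one of the two flagged sites."""
    return url == "adult.example/a"


def exposure_rate(
    queries: List[str],
    search: Callable[[str], List[str]],
    flagged: Set[str],
    blocks: Callable[[str], bool] = lambda url: False,
) -> float:
    """Share of queries whose visible results include at least one flagged URL."""
    if not queries:
        return 0.0
    exposed = 0
    for query in queries:
        visible = [url for url in search(query) if not blocks(url)]
        if any(url in flagged for url in visible):
            exposed += 1
    return exposed / len(queries)


def filter_effectiveness(unfiltered: float, filtered: float) -> float:
    """Relative drop in exposure once the filter is switched on."""
    if unfiltered == 0:
        return 0.0
    return 1 - filtered / unfiltered


before = exposure_rate(QUERIES, toy_search, FLAGGED)
after = exposure_rate(QUERIES, toy_search, FLAGGED, toy_filter)
print(before, after, filter_effectiveness(before, after))
```

Note how much rides on choices the data cannot make for you: who decides what goes in FLAGGED, and whether a query like "breast" should count as an exposure at all.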

But this result is methodologically flawed. To be fair, one would have to account for which search strings were searching for porn in the first place, an inherently subjective matter. Searches for "breast," for instance, can have any meaning from the pornographic to the medical to the culinary. Further, one would have to assume that the filter itself is the only source of control that comes with filtering software. Most programs include simple add-ons that let parents see what has been browsed on the machine. The most effective "filter"? Simply telling the child, "I can track what you see, and if I find you've been visiting Playboy.com, I'll punish you once for breaking my rules on porn and a second time for not being able to find any better dirty material in all the great wide internet."

That, however, is the closest I can get to "proving" the effectiveness or otherwise of filters from the data the AG wants. The best I can see resulting from this subpoena is a set of spurious statistical arguments that will "show" that some mythical aggregate internet user will stumble upon pornography once every X number of days. Given that the government's civil liberties credentials aren't everything they could be right now, it would seem prudent for the AG to outline in detail exactly how he plans on using this data before throwing requests for data at one of the most-used (and possibly most-beloved) companies out on the Net.

Then again, I could be missing something. Comments on exactly how one measures the effectiveness of filtering software from these two massive data files (or privacy problems that I might have overlooked) are very welcome.

UPDATE: Above I say that I can't believe the ACLU isn't willing to stipulate to some very broad claims, a point which is flippant enough to obscure my argument. For clarity, I can see why the AG would want some relatively solid data on the prevalence of pornography online, but don't see why one has to subpoena search engines to get that data. Assuming the DoJ has the number-crunching resources necessary to process Google's records if it gets them, it must also be able to send out spiders to index portions of the net, or to run simulated searches based upon the most common search terms used. Certainly this could be handled without the ugly mallet of a subpoena and the thuggish aura it exudes.

Comments

I see.... So this ONE hits closer to *home* than spying on ole Grandma!!! Hahahahahaha *wink*
As I've often pointed out, I'm not a fetishist for the Bill of Rights: they're pragmatic compromises that I don't imbue with any mystical importance. On the other hand, the dislike of Gonzales's subpoenas stems from the same source as my objection to lawsuits against fast food companies for making people fat: using the law for ends I'd consider illegitimate. (And in this case, seemingly technically clumsy.) You, of course, get a laugh, but mostly because you continue to insist that my concerns must be yours.
This is what stops you voting for Bush? Truly we live in interesting times... Still, it's unlikely anyone will vote for Bush again, because he won't be running for anything again. Anyway. Could Gonzales be trying to establish a precedent 'we can request masses of data on vague contexts'? I have no idea. Or more likely, some staffer has decided this is the best way to solve a problem they don't understand - incompetence explains most conspiracies.
Martin: I sort of figured people would take it as read that I meant "Bush may have lost my vote for the Republican Party." But it's always good to have people keep me on my toes. And I'm willing to bet you're right on an explanation, but I was really hoping some commenter would give me a justification that wasn't a cheap excuse. In any event, suffice it to say that I'm not that worried about "creeping fascism" or the rest of such theoretical problems. However, like most Republicans I am worried about the power of the state. When it asks for more than it needs, even if there is a legal restriction, then that does bother me.
"A list of 1 million random URLs available for search in Google. (This down from a request for all URLs available through Google. The mind boggles at the size of that file.)" I suspect that it would take all the lawyers at Justice well into the second term of Bush's successor in office (or the term of Bush's successor's successor if s/he's not reelected) to go through all the URLs available through Google....
Incidentally: All URLs available through Google. You may find the user interface unsuitable for browsing them, though.
