
What's a 'Website,' Anyway?

As a recent Indiana Supreme Court case amply demonstrates, the term "website" is not nearly precise enough for use in our criminal law, and judges and legislators need to stop pretending that it is.

The Volokh Conspiracy


There are many strange things about the Internet and the manner in which it operates. On occasion, our law-makers—legislators and judges—have managed to make law that adapts and conforms, more-or-less, to the new landscape. Other times, not so much; some features of Internet communication continue to mystify them, and terrible law gets made because of those misunderstandings.

On March 28, 2015, thirty-four-year-old Kristopher Weida had sexual intercourse with his sixteen-year-old niece. The State charged him with felony incest, and he pleaded guilty. The trial court sentenced him to three years—one year in prison and two years of probation.

The court imposed a number of probation conditions, two of which Weida challenged as unconstitutional. The first (Condition #8) prohibited him "from accessing or using certain web sites, chat rooms, or instant messaging programs frequented by children." The second (Condition #26) prohibited him from accessing the Internet "or any other public or private computer network" without the prior approval of his probation officer.

Violation of either of these conditions would, of course, subject Weida to the prospect of re-incarceration.

Weida challenged Condition 8 on the ground that it was unconstitutionally vague.** On the face of it, it looks like he has a good argument. Imagine for the moment that you were told that you could not, on pain of imprisonment, "access or use certain web sites frequented by children." Surely you'd say: "Yes, but which websites?" What does it mean to say that "certain web sites frequented by children" are off-limits without any indication of which ones those might be? And how is one supposed to know whether a website is, or is not, "frequented by children"?

**Weida also challenged both Conditions as unconstitutional abridgements of his First Amendment right to access the Internet. That is a substantial and important claim, but, because I want to focus entirely on the vagueness challenge to Condition 8, I won't discuss his First Amendment claim further here.

The Indiana Supreme Court, in a recent decision, got this one all wrong, holding (unanimously, no less) that Condition 8 was not unconstitutionally vague. In doing so, the court fell into a deep epistemological hole: the "Illusory Website Boundary Trap."

Here, again, is Condition 8, in full:

"You are prohibited from accessing or using certain web sites, chat rooms, or instant messaging programs frequented by children. You are prohibited from deleting, erasing, or tampering with information on your personal computer with intent to conceal an activity prohibited by this condition."

The court first cleared up that little grammatical peculiarity involving the proscription on accessing "certain web sites … frequented by children." The court gave Condition 8 a limiting construction: reading it "in context," alongside a number of other provisions limiting Weida's rights to communicate with children, it declared that

"… a person of ordinary intelligence, reading all the probation conditions in context, would understand that Condition 8 forbids him from visiting websites that allow him to contact or communicate with children."

So that's the answer to the question about which websites frequented by children are off-limits: The ones that allow him to contact or communicate with children. This, the court stated, cures the ambiguity; it provides Weida with "fair notice of what is generally proscribed," while still permitting him, "like normal Hoosiers, [to] use the Internet to communicate with adults and complete the myriad everyday online tasks."

But actually, it does no such thing. There are, to start, some pretty serious ambiguities remaining. Does a website allow Weida to "communicate with children" if he can post comments at the site that persons under the age of 18 can access and read? If he can read posts that may have been authored by "children"? Or does "communicate with children" refer only to one-to-one communication of some sort?

But put those objections aside; there is a much more serious, and more far-reaching, problem here. Determining whether a "website" does, or does not, allow users to communicate with children is a task lying somewhere between the unbelievably difficult and the literally impossible. Websites are too big, and the boundaries between different websites are too indistinct, so that one can never tell where Website 1 (which allows users to communicate with children) ends and Website 2 (which doesn't) begins.

For ordinary conversation, those ambiguities don't matter very much. The meaning of "The Volokh Conspiracy has moved to the Reason website," or "The Amalgamated Bank's website finally allows me to check my balance," or "I think they're live-streaming the game at the ESPN website," is clear enough.

But if your ability to avoid prison depends on your ability to determine whether the ESPN (or any other) website allows you to communicate with children—or, for that matter, whether the ESPN website allows you to view pornographic images, or to read terrorist manifestos—these problems are fatal.

To illustrate, suppose Weida does a very "normal Hoosier" thing: he accesses the "Indiana Pacers" page at the ESPN website (located at http://www.espn.com/nba/team/_/name/ind/indiana-pacers ).

Has he violated Condition 8? Has he accessed a website that allows him to contact or communicate with children?

Have a look for yourselves; as far as I can tell, there's nothing at that Indiana Pacers Page that allows Weida to contact or communicate with children, or with anybody else.

But of course the Pacers Page is just a small part of "the ESPN website," and we'll need to look around a bit to see if, elsewhere on that website, such contact/communication is allowed.

Before we do so, let's pause a moment to get our bearings and to consider the scope of the undertaking. [This is the "unbelievably difficult" portion of the task.] The ESPN website, whatever it contains and wherever it begins and ends, could be a very, very, very large place, consisting of many thousands, or hundreds of thousands, or millions of individual pages. There is no technical limit to the number of pages a website may contain, and many of the better-known Internet websites (Amazon, YouTube, Blogger, Reddit, Yahoo, Facebook, Instagram, LinkedIn, etc.) link together many millions of individual web pages. Saying "you may not access that website if it allows you to communicate with children" is a bit like pointing to a 100,000-story skyscraper and saying: "You may not go into that building if it allows you to buy alcohol." [And we can pass over the illogic inherent in prohibiting "access" to websites that allow you to communicate with children when you need to access a site in order to determine whether it does or does not offer such a capability.]

The Pacers Page has a prominent link to the ESPN "home page" (www.espn.com), which is a good place to start our search; surely that page is part of "the ESPN website."

But again, I don't see any way to communicate with other users, children or adults, on that page either.

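Let's keep looking, shall we? The ESPN home page does contain links to a number of what are called "ESPN Sites": SportsNation, espnW, Doubletruck, Insider, X Games, ESPN Deportes, fivethirtyeight.com, theundefeated.com, secsports.com, and espnfc.us.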

Which (if any) of these are part of "the ESPN website," and which (if any) are different websites? How can you tell?

You and I can shrug our shoulders and say: "Who cares?" or "What difference does it make?" Mr. Weida, of course, cannot. By following the link to "SportsNation," for instance, we might find that we can indeed "communicate with children."* If SportsNation is part of "the ESPN website," then Weida violates Condition 8 when he accesses the Pacers Page, because the Pacers Page is part of a website (the ESPN website) that allows users to communicate with children.

*"SportsNation" does indeed appear to offer a "chat" feature [http://www.espn.com/sportsnation/chat/archive] allowing users to send and receive messages in real time to/from other users (some of whom I would assume may be under the age of 18). ESPNFC also lets users communicate with other users (some of whom I would assume may be under the age of 18) via their Facebook feeds.

On the other hand, if following the link to SportsNation leads to a different website—call it "the SportsNation website"—the fact that one can communicate with children there is irrelevant to the question whether "the ESPN website" allows such communication.
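To make the fork concrete, here is a minimal Python sketch, offered as an illustration rather than as anything the court or ESPN uses. The two boundary rules (`by_host` and `by_host_and_section`) and the `violates` helper are hypothetical names of my own; the only facts taken from above are the two URLs. The point is that the answer to "did he violate Condition 8?" flips depending on which boundary rule one happens to assume.

```python
from urllib.parse import urlsplit

# The two addresses discussed in the text.
PACERS_PAGE = "http://www.espn.com/nba/team/_/name/ind/indiana-pacers"
CHAT_PAGE = "http://www.espn.com/sportsnation/chat/archive"


def by_host(url):
    """Hypothetical rule 1: a 'website' is everything served under one hostname."""
    return urlsplit(url).hostname


def by_host_and_section(url):
    """Hypothetical rule 2: a 'website' is one hostname plus its first path segment."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    first = segments[0] if segments else ""
    return (parts.hostname, first)


def violates(visited, pages_with_contact, boundary):
    """True if the visited page shares a 'website' with any page that allows contact."""
    site = boundary(visited)
    return any(boundary(page) == site for page in pages_with_contact)


for rule in (by_host, by_host_and_section):
    print(rule.__name__, violates(PACERS_PAGE, [CHAT_PAGE], rule))
```

Under the first rule the Pacers Page and the chat page fall on the same "website"; under the second they do not. Neither rule is more "correct" than the other, which is exactly the difficulty.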

So which is it? And how can we tell? What should we be looking for to make the determination? What makes a website a website, anyway?

To begin with, notice that the cues we ordinarily use to make determinations like this in real-space have gone missing here. When I went out this morning and bought a tube of toothpaste, a bunch of pencils, and some milk, I made all three purchases at the same store. How did I know it was all the same store? Let's see:

  • The toothpaste, stationery, and food aisles were quite close together—walking time < 30 seconds;
  • they were all under one roof;
  • there were no walls or gates separating the aisles from one another;
  • employees and customers could wander from one aisle to another without impediment;
  • a single entrance and a single exit served all three aisles;
  • there was a big sign over the main entrance that read "CVS";
  • cashiers were available to charge me for anything I took from any of the three aisles.

None of those, however, will help us decide whether the SportsNation page and the Pacers Page are part of a single website. The two pages are "close together" in the sense that one click of the mouse can take you from one to the other and back again, and they're both similarly "close to" the ESPN home page. But in the strange new geometry of the Net, that's true for all pages; everything is one click away from everything else. There are no entrances, exits, roofs, or walls to guide us.
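A toy model makes the point about distance. The sketch below, in Python, treats pages as nodes and links as edges, and counts clicks with a breadth-first search. The link table is an assumption: it contains only the handful of links described above (the Pacers Page linking to the ESPN home page, and the home page linking to two of the "ESPN Sites"), not the real link structure of any of these pages.

```python
from collections import deque

PACERS = "http://www.espn.com/nba/team/_/name/ind/indiana-pacers"
HOME = "http://www.espn.com/"
SPORTSNATION = "http://www.espn.com/sportsnation"
FIVETHIRTYEIGHT = "https://fivethirtyeight.com/"

# Assumed, heavily simplified link table: page -> pages it links to.
LINKS = {
    PACERS: [HOME],
    HOME: [SPORTSNATION, FIVETHIRTYEIGHT],
    SPORTSNATION: [],
    FIVETHIRTYEIGHT: [],
}


def clicks(start, goal, links):
    """Fewest clicks needed to get from start to goal, or None if unreachable."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, distance = queue.popleft()
        if page == goal:
            return distance
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, distance + 1))
    return None


for goal in (SPORTSNATION, FIVETHIRTYEIGHT):
    print(goal, clicks(HOME, goal, LINKS))
```

In this model a page under espn.com and a page under an entirely different domain are exactly the same number of clicks from the home page, so "closeness" cannot tell us which of them, if either, belongs to the same website.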

Nor can we fall back on a technical definition of the term, for there doesn't appear to be one. I'm no engineer, so correct me if I'm wrong, but the term hardly shows up in the network engineering literature with which I'm familiar. For instance, the Internet Engineering Task Force's compilation of the many thousands of Internet standards and Internet protocols, the fundamental design specifications that define the technical architecture of the Internet (including the World Wide Web), contains nothing whatsoever about "websites." Web linking, web sockets, web file transmission, web pages, web hosts, web transaction security: all are dealt with by various standards and specifications, but not "websites." [And yes, I did check for the two-word phrase "web site," too.]

So that leaves us with the ordinary, everyday meaning of the term, which I think is captured pretty well in this definition (from the American Heritage Dictionary):

A "website" is "a set of interconnected webpages, usually including a homepage, generally located on the same server, and prepared and maintained as a collection of information by a person, group, or organization." (emphasis added).

That makes sense. If the same person or entity (ESPN, Inc., say) is responsible for preparing and maintaining both the SportsNation page and the Pacers Page as part of a "set of interconnected webpages" ("the ESPN website"), then the two pages are both part of that website. If not, they're not.

That's a nice simple rule. The problem, though, is that none of the information needed to make that determination is ascertainable by users. [This is the "literally impossible" portion of our program.]

If this were a class on Internet law, I'd give you the following assignment: determine what "person, group, or organization" maintains each of those 10 "ESPN sites" and you'd see what I mean.

Some of those sites prominently (e.g., www.espn.com/doubletruck/ and www.espn.com/insider/) or not-so-prominently (fivethirtyeight.com, www.espn.com/espnw/) display the familiar ESPN logo on their Home Pages, along with a prominent link to the ESPN Home Page. Others don't (e.g., theundefeated.com, xgames.espn.com/xgames/, www.secsports.com). But that doesn't tell you anything about the entities or organizations responsible for those pages; anyone can put a link to the ESPN Home Page on their own Home Page, and ESPN, Inc., like any trademark owner, can license unrelated third parties to use its logo in any number of ways, including in its webpage design.

Nor do the webpage URLs give us the information we need to determine whether the same, or different, organizations are involved in maintaining and operating the various pages. Some of these "ESPN sites" have URLs at the espn.com domain: SportsNation at www.espn.com/sportsnation, "espnW" at www.espn.com/espnW, Doubletruck at www.espn.com/doubletruck, "Insider" at www.espn.com/insider, "X Games" at xgames.espn.com, and ESPN Deportes at espndeportes.espn.com. Others are located at other domains (fivethirtyeight.com, theundefeated.com, secsports.com, and espnfc.us).

But that, too, provides no information about whether a single entity (ESPN, Inc.) is or is not the party responsible for the pages located at those addresses. It could be that all 10 are owned and operated by ESPN, Inc., or it could be that none of them is. As the owner of the "espn.com" domain, ESPN, Inc. is entirely free to contract with third parties to, in effect, "lease space" at that domain for their own operations,* and, conversely, there is absolutely nothing to prevent the owner of the espn.com domain from purchasing rights to other domains (fivethirtyeight.com, theundefeated.com, etc.) for its own use.

*Many large and successful online businesses operate in this fashion. WordPress, for instance, is one of many thousands of sites that function as a "hosting service," or "platform," for user-created web pages and blogs; over 70 million pages are served under the wordpress.com domain, covering everything from astronomy (https://10minuteastronomy.wordpress.com/) to local Florida politics (https://clearwaterneighborhoodscoalition.wordpress.com) to bluegrass music (https://tennysonbluegrass.wordpress.com/) to stamp collecting (https://traffordps.wordpress.com/), and everything in between. Are these different blog pages all part of the "WordPress website," or are they more than 70 million different websites? If some of them include user-to-user communication functionality, does that mean that the WordPress website is off-limits to Mr. Weida?
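For readers who want to try this at home, here is a rough Python sketch that sorts the addresses mentioned above into "websites" under two mechanical rules. Both rules are assumptions of mine, not anyone's official definition: one groups by full hostname, the other by the last two labels of the hostname. The second is a deliberately naive stand-in for a "registered domain"; a careful version would consult the Public Suffix List, which this sketch does not do.

```python
from collections import defaultdict
from urllib.parse import urlsplit

# Addresses exactly as they appear in the text.
ESPN_SITES = [
    "www.espn.com/sportsnation",
    "www.espn.com/espnW",
    "www.espn.com/doubletruck",
    "www.espn.com/insider",
    "xgames.espn.com",
    "espndeportes.espn.com",
    "fivethirtyeight.com",
    "theundefeated.com",
    "secsports.com",
    "espnfc.us",
]
WORDPRESS_BLOGS = [
    "https://10minuteastronomy.wordpress.com/",
    "https://clearwaterneighborhoodscoalition.wordpress.com",
    "https://tennysonbluegrass.wordpress.com/",
    "https://traffordps.wordpress.com/",
]


def hostname(address):
    """Hostname of an address, tolerating addresses written without a scheme."""
    if "//" not in address:
        address = "//" + address  # lets urlsplit treat the first part as the host
    return urlsplit(address).hostname


def last_two_labels(address):
    """Naive 'domain': the final two dot-separated labels of the hostname."""
    return ".".join(hostname(address).split(".")[-2:])


def group(addresses, rule):
    """Sort addresses into candidate 'websites' according to the given rule."""
    sites = defaultdict(list)
    for address in addresses:
        sites[rule(address)].append(address)
    return dict(sites)


for label, addresses in (("ESPN sites", ESPN_SITES), ("WordPress blogs", WORDPRESS_BLOGS)):
    for rule in (hostname, last_two_labels):
        print(label, rule.__name__, len(group(addresses, rule)))
```

The ten "ESPN sites" come out as seven websites under one rule and five under the other, and the four blogs as four websites or one. Neither count says anything about who actually prepares and maintains the pages, which is the only thing the dictionary definition cares about.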

This problem (the impossibility of determining the boundaries of a website that has multiple services within it or linked to it, some of which may be operated independently of one another, others not) is not an abstract or abstruse technical detail regarding the nature of communication on the World Wide Web; it is an inherent consequence of the linking architecture and linking infrastructure that underlie all Web communication. Everyone who spends any time at all on the Web knows the feeling of being unable to determine "where you are" after even a short Web browsing session, as you follow the links that take you from espn.com to fivethirtyeight.com to https://fivethirtyeight.com/features/eclipse-towns-planning/ to https://www.washingtonpost.com/news/speaking-of-science/wp/2017/08/02/august-total-solar-eclipse-gives-scientists-rare-chance-to-study-suns-corona/ to https://www.theguardian.com/world to http://www.100resilientcities.org/ and on and on. Without a reasonable way to determine where one website ends and another begins, Condition 8, even as narrowed by the court, imposes an impossible burden on Weida.

You might think that I'm making a mountain out of a molehill in all this, but I don't think I am, for two reasons. First, there are hundreds of thousands, and quite possibly millions, of people whose access to particular "websites" is restricted in one way or another. That would include many people who are currently incarcerated or, like Weida, subject to parole, probation, supervised release, or other continuing supervision by the criminal justice system. It would also include the many hundreds of thousands of individuals on state sex offender registries; many state registry statutes (even after last Term's decision in Packingham v. North Carolina) either prohibit access to specified categories of "websites," or require registrants to inform law enforcement of the identifiers that they use when accessing particular "websites." So getting rid of the term in these rules would have substantial real-world consequences, for the better.

And beyond that, one does not have to be Nostradamus to think that we are entering a period in which the regulation of online activity is going to take a more vigorous turn, what with the clamor over Facebook's privacy policies, Cambridge Analytica, the dissemination of fake news, and the rest of it. It would be nice to think that however that turns out, it will at least be based on a more solid semantic foundation than has been the case up to now.