
Uppity Young Geek Steals Legitimately Downloads Millions of Documents. Again.



Aaron Swartz's smirky* visage has appeared on Hit&Run before, when he took it upon himself to download millions of pages of public court records from the PACER database—which generally charges 8 cents per page—using a free law library account and then make them public in 2009. The FBI looked into the incident, decided they couldn't make anything stick and let the former Reddit co-owner and digital activist go free. 

Now the transparency vigilante has done it again. Only this time he wound up in handcuffs—facing a $1 million fine and/or 35 years in prison—and the hows and whys of this case are much murkier than his last caper:

The grand jury indictment accuses Swartz of evading MIT's attempts to kick his laptop off the network while downloading more than four million documents from JSTOR, a not-for-profit company that provides searchable, digitized copies of academic journals. The scraping, which took place from September 2010 to January 2011 via MIT's network, was invasive enough to bring down JSTOR's servers on several occasions.

Swartz had to sneak in and out of server closets to do the JSTOR scraping, which makes the whole thing seem seedier and more illegal. But the differences between the PACER grab and the JSTOR grab are less about how he got the data than what kind of data it was, and what he did with it. In both cases, he seems to have done little more than violate the sites' Terms of Service to get the docs—a move that has been greeted with varying degrees of condemnation by the courts. But as Ars Technica explains:

There's an important difference between PACER and JSTOR. As works of the federal government, PACER documents are in the public domain. In contrast, many JSTOR documents are protected by copyright. The PACER documents Swartz downloaded are now available for download. Distributing the JSTOR documents, in contrast, would be a clear case of copyright infringement.

Whether he intended to release all those JSTOR documents, many of which are copyrighted, the way he released the public domain PACER data is far from clear, although a peek at Swartz's resume suggests that all parties would be right to be suspicious. Still: 

His history includes a study co-authored with Shireen Barday, which looked through thousands of law review articles looking for law professors who had been paid by industry patrons to write papers. That study was published in 2008 in the Stanford Law Review.

JSTOR says it got the docs back from Swartz and isn't behind the prosecution. Meanwhile, Swartz is out on $100,000 bail.

More on PACER and transparency chic here.

*As a lifelong smirker myself, I do not mean to use the term derogatorily here.



  1. The pigs are pissed they couldn’t get him on PACER, so they will go to the mat on this one to make something stick. Since JSTOR hasn’t pressed charges, that’s pretty much the indication. I hope he has a good lawyer.

    1. It is too bad, so many of my more abstract thinking-threads through the internet end up at JSTOR on the cover page of a really cool paper.

      I hope they make it quick for him at least in Gitmo, use a large caliber round and all.

      1. nah he’ll be assigned…ahh, make that sentenced to govt counter-hacking

    2. I’m having trouble having a huge amount of love for Mr. Swartz.

      You know what they say:

      When marching in the streets to depose a tyrant, the person next to you may be marching because the tyrant didn’t go far enough.

      1. From Aaron Swartz’s PCCC mission statement:

        In 2008, one first-time progressive candidate in a key congressional district went through four campaign managers before losing.

        Another spent $47,000 to retain a media firm that never produced a single TV ad. Another spent $40,000 on field consultants – enough to pay 10 field staffers for two months, but which only bought a few hand-holding consultant calls. And others wasted thousands of dollars and weeks of staff time designing C-rate websites.

        Every election cycle, inexperienced candidates who run on bold progressive ideas – candidates who political insiders predict “can’t win” – come within a few points of victory. But too many lose winnable races due to the mistakes and inefficiencies of their campaigns.

        Apparently, these are the people we want running our government healthcare and pension funds.

    3. The pigs are pissed

      What “pigs,” specifically?

  2. My main reaction is, “Man, that is one punchable face.”

    I’m not proud of that, but it’s my initial reaction.

    1. Very high meme potential.

    2. Me too.


      1. Why are you advocating punching him in the face? I thought you followed the non-aggression principle. Why can’t I discern between jokes, reality, and sarcasm?

        1. Oh, thanks a lot Nutra Sweet, I just read your post and then went and punched my smirky neighbor in the face. I was thinking of helping him with some tree trimming, then read your invitation to violence and obviously could not resist.

          Next time, at least advocate violence against some political figure. If I am bound to commit violent crimes from reading your posts they might as well be worthwhile.

          1. I am the violence inherent in the system.

            1. Great, now I punched my DVD copy of Holy Grail and broke it. Thanks a lot, SugarFree.

              1. Don’t ever go up against munge when DEATH is on the line!

      2. I bet you could count the number of faces you’ve punched on one hand, just like the number of vaginas you’ve touched. Ha! Vaginas!

        1. Vaginas! I got so PWN’D!

        2. Something about an Alaskan cannery worker…

        3. Yeah, cuz something’s wrong with you if you’re not violent and promiscuous!

        4. Wow, somebody got under your skins…

    3. I fucking hate him, just looking at him.

      I’ve got no problem with PACER end run, and I don’t like the overdrive copyright law is operating at these days, but this latest isn’t quite as defensible, if, in fact, it can be proven that he was going to publish copyrighted works.

      1. Pro L’s last sentence – that’s it – that’s where I get a little, “…nnnnggghhhh, I was with you RIGHT up to there, then…not so much.”

        PUNCH! And that’s just cause you look like a smarmy DICK, Swartz.

      2. How is anyone going to prove that?

        The prosecutor will of course try, and will go for life in prison–or possibly death, of course–for attempting to publish copyrighted works.

        1. They probably can’t.

        2. And you’re right–copyright infringement is now treason.

          1. Hyperbole is the language of…never mind.

      3. Based on past behavior, wouldn’t the assumption be he was going to run data-mining on the papers?

        Which would not be a violation.

        Note: I have written scrapers for a client in the past.

        1. That’s what I’m thinking. If he didn’t actually release the copyrighted material, what can they get him for other than breaching terms of service? Is that even a crime?

          1. His intent is irrelevant. What he did was download a lot of copyrighted documents in a way that a terms of service contract says he can’t. What the courts should be looking at is, what are the penalties outlined in the terms of service for this breach?

            And why are the government even pursuing this matter when the offended party seems disinclined to press charges?

            I suspect that the JSTOR folks might be better off hiring this guy to walk them through how to beef up their security to keep other people from doing the same thing.

            1. I don’t think the feds are pursuing this because of the copyright violations. He’s in trouble because he repeatedly circumvented efforts by JSTOR and MIT to prevent him from accessing the JSTOR data and MIT network – by changing IP addresses and MAC addresses when they were blocked, and ultimately breaking in to a network closet in order to plug directly into a switch.

    4. My main reaction is, “Man, that is one punchable face.”

      I was actually surprised when that wasn’t the alt-text.

      Question: more punchable or less punchable than Ezra Klein’s?

      (Blah blah I am not advocating punching ANYONE in the face blah blah)

  3. There is nothing to indicate that he planned to “free” any copyrighted material.

    1. I bet they can’t prove it, but they can allege it and cost him a lot of money in defense. Which is one of the problems with our hyperactive copyright protection regime.

      1. I predict the copyright thing will not be his biggest problem. Sneaking into server closets while MIT attempted to kick him off the network, and bringing down JSTOR’s servers several times, is what they’ll get him for.

        1. Good point. Wouldn’t JSTOR have to go after him for copyright anyway? The article said it was the Feds going after him, not JSTOR. They must think they can get him for some type of criminal mischief… over the wires… interstate… BS.

  4. I’m for IP and capitalism, but when the house of cards that academic publishing has built itself crumbles, I’m planning to spend the day getting drunk and laughing so hard my ribs break.

    1. Ditto with the insane Westlaw/Lexis scam.

      1. I wrote a paper about the insane faux-economy of academic publishing. It has to be one of the biggest non-illegal scams. Even a Ponzi scheme pays out for some people; Elsevier is the only winner in that whole fucking mess.

        1. Suck it, bitch. We haven’t survived 400 years by playing nice.

        2. This. Academic publishing has to be the industry with the least value added as a percentage of profit.

          1. Is there a more blatant example of the captive audience problem and the third party player problem working together to form a toxic new compound? Or at least one with such a simple solution in ratio to the damage it does?

    2. As an opponent of IP, I can break those ribs for you for free, if you’d like.

      1. I bet you can count the number of ribs you’ve broken on one hand. It is amazing how many “anarcho-capitalists” wish LITERAL DEATH on something something something.

        1. (breaks Warty’s rib)

          That’s one. And it will eventually cause you LITERAL DEATH. And that’s what I wish.

        2. Did I miss something?

          Did Minge have another “tractor pull” moment, and I missed it?

          1. Look at the Ezra Klein thread from yesterday.

            1. Ok. That explains. The words “Ezra Klein” tend to make me skip threads.

              1. What? Those are usually the best ones!

                1. ehh …. shooting fish in a dixie cup is cool the first few times you watch it ….. after that, I prefer to at least have a small pond and give the fish a sporting chance. Him and Krugman are basically just real life trolls. Their arguments are so transparently self-refuting I don’t think they even deserve attention most of the time.

    3. Was going to reply higher in the thread, and saw that you guys already made my points.

      How much does it cost financially, adding a peer review step to scientific publishing? Enough to justify the subscription rates from Elsevier/SpringerLink?

      1. The referees are paid nothing, but the time it takes may add up to something.

  5. I’m finding it very difficult to feel more than just a smidgen of sympathy for this guy.…..?source=fb

    1. You noticed, too.

  6. Since he was, apparently, unauthorized to use the MIT network then the act of copying seems to me to be the infringing act not his intent (whether it existed or not) to publish elsewhere.

    He looks like one of the “information wants to be free” crowd and that together with the smirk makes me want to smash his face.

    1. I can’t blame you. If ever there were a face that needed smashing, it’s that one.

    2. If the mere downloading is the issue, what he planned to do with it does matter, as there’s always a fair use argument. Also, the damages for one guy seeing everything versus him publishing it to the world would vary dramatically.

      What I could see him legitimately on the hook for are whatever use fees he evaded in his activities.

      1. The fair use argument would be particularly applicable here, since he has previously published meta-research about academic papers.

        So he can argue that his action was research, and therefore fair use.

    3. Information wants to be free!

      1. So does smallpox.

        1. We Shall O-o-verc-ome!!!
          We Shall O-o-verc-ome!!!
          We Shall O-o-verc-ome!!!

    4. Actually he was authorized to use the network, at least initially as a guest. And like all guests, he DID have legal access to JSTOR.

      It was only when they noticed the flood of network activity from his downloading script that they tried to block him.

      In any case, he didn’t publish anything, and maybe he wanted to do as mentioned, to look for financial backers of papers so it all comes down to can the government criminalize violating TOS. Even JSTOR themselves do not want to push forward with the charge:

      “It’s even more strange because the alleged victim has settled any claims against Aaron, explained they’ve suffered no loss or damage, and asked the government not to prosecute,”

  7. So we know that the feds don’t like him for the PACER thing, but there’s something else here: How did he get into the network closet at MIT? Why was he able to walk into it again with his face deliberately obscured from the security camera? Is MIT leaning on the feds to prosecute because they want to know who was on the inside helping him? Just wondering.

    1. If ever there were a face that should be deliberately obscured, it’s that one.

  8. All of you who want to punch him in the face are clearly anti-Semites. Clearly.

    1. Is he Jewish? Doesn’t matter. I’m Jewish, and I think his face looks pretty damn punchable.

      1. The sheer number of people wanting to punch him does, in fact, indicate that he is Jewish. When they wandered the desert for 40 years, they were constantly being punched in the face.

        1. Of course you would be the one to be glib about the sufferings of the Heebs. You disgust me.

    2. He looks Finnish to me.

      1. He looks kinda elvish to me. Especially around the ears.

    3. I got nothin’ against the Hebes. MY FIRST GIRLFRIEND WAS TEH JOO!

  9. You feel it, too, and it’s not anti-Semite. He’s a smarmy ass–I’m sure of it.

    1. So you’re admitting to being an anti-Smarmite, then? You’re despicable.

      1. It’s true.

  10. JSTOR says it got the docs back from Swartz

    How can they possibly know this? Couldn’t he have just kept a copy?

    1. Yeah, a 1TB portable drive goes for 70 bucks these days. A terabyte is an insane amount of straight text data.

    2. Yeah, I don’t get that concept with digital data. Asking for the documents to be returned means… nothing.

  11. Which way to the end of the face-punching queue?

  12. I don’t feel any urge to punch this particular face. Something is amiss.

    1. Neither do I. I like anti-authority assholes, personally.

      1. Yeah. Hopefully he asks the judge, “U MAD?” in his trial.

        1. Judge PWNED!

  13. “The scraping, which took place from September 2010 to January 2011 via MIT’s network, was invasive enough to bring down JSTOR’s servers on several occasions.

    Swartz had to sneak in and out of server closets to do the JSTOR scraping, which makes the whole thing seem seedier and more illegal.”

    Did the person writing the headline read the article, or was snarking about “legitimate” too much of a temptation? Can you not have headlines which contradict what is written in the article?

    1. He did legitimately have access to JSTOR documents, like I mentioned above, if that’s the contradiction you’re trying to point out.

      It was only after the high network activity on JSTOR servers, and the back-and-forth banning of his IP addresses, that he resorted to physically accessing the server closets.

      Curiously it seems they were not protected or physically restricted, since the actual indictment does not list any sort of breaking-and-entering crime.

      1. That bit about getting into the network closet strikes me as curious as well. Did he have some sort of legitimate access?

        1. I find it hard to believe MIT, of all institutions, would be dumb enough to leave server closets accessible.

      2. If I am understanding correctly, what he did impaired the functioning of the network for everybody else, got him kicked off, whereby he tried the same thing again, and resorted to physically accessing the servers directly. At some point this goes beyond “legitimate”.

  14. One of the things I find interesting about this is that…there seem to be different standards we the public hold people to.

    I know the beef against Murdoch and company isn’t necessarily the methods he used as much as who the victims of those hacks were. I don’t think people would have been so upset if News of the World had just hacked politicians’ voice mail, etc. It was the teenage murder victim that really set people off…

    And I guess that’s why we don’t mind people like this guy and WikiLeaks doing what they do. In other words, hacking seems to have become a generally accepted practice–so long as the victims don’t generally elicit sympathy.

    1. As an addendum, I suspect people tend to tolerate this behavior more when it isn’t seen as an attempt to make money.

      Rupert Murdoch is a greedy capitalist–so hacking to make money is evil. There’s no indication this guy was trying to make any money from what he was doing–so that makes it okay.

      I’m not saying that’s the way it should be–but public perception of this stuff does seem to be split along those lines. Who the victims were and whether the hackers were trying to make money from it.

      1. You’re comparing apples & baseballs here. He is charged with violating terms of service, not with hacking into private communications.

        2. What worries me is none of the above. It’s the implication that’s drawn here in terms of criminalizing TOS violations, since there is no actual hacking involved. There is no cracking of code, no breaking of systems.

        When he did have legitimate access as a guest on the MIT network, he fundamentally used the same means as anyone else would to obtain the JSTOR documents. Except his was simply automated.

        He evaded banning by the sysadmins initially by grabbing another IP address (not spoofing mind you) so again no hacking there.

        Furthermore, what’s also worrisome is why the Feds are adamant on pursuing this when MIT/JSTOR themselves do not want to press charges and have recommended the gov not prosecute?

        1. turn him into a second Lamo.

          1. Recruiting him? ..doubtful with a very lefty guy like him.

            I’m thinking they want to establish some precedent in making the 6 felony charges stick for “lighter weight” seemingly non-criminal (misdemeanor at worst) type of activity.

            1. JSTOR has denied seeking prosecution, but they were obviously aware of it.

              If I had a stake in those copyrights for all those copyrighted JSTOR documents, I would have assumed those documents were being downloaded to be redistributed.

        2. “There is no cracking of code, no breaking of systems”

          Aren’t you just talking semantics there?

          Here’s what I read:

          “JSTOR detected the script and blocked his IP address. The complaint alleges that there followed a game of cat and mouse in which Swartz repeatedly changed his IP and MAC address to evade JSTOR and MIT’s efforts to block access. Swartz also bought a second laptop to speed up the downloading process. Finally, on October 9, JSTOR gave up and blocked the entire MIT campus from using JSTOR.”


          “he entered an MIT network closet, “hard-wired into the network and assigned himself two IP addresses. He hid the Acer laptop and a succession of external storage drives under a box in the closet, so that they would not be obvious to anyone who might enter the closet.”


          Compared to someone dialing into someone’s voice mail and trying the default password or typically used passwords, this JSTOR caper looks more like hacking to the average person on the street.

          Again, the big difference–from the average Joe’s perspective–appears to mostly be about the relative sympathy to the victim and the question of whether it was done for money.

  15. Very simple: he violated the private contract and stole the private property of a company. IP is sacrosanct to freedom. Jail that fucker!

  16. “JSTOR says it got the docs back from Swartz…”

    No harm, no foul. He clearly gave back all the stolen electrons.

    1. Electricity is merely electrons. Do you condone tapping into a power grid without paying for it? After all, it’s only electrons.

      1. I think you missed the point.
