Wayback Machine, where you can find the previous iterations of websites. Every time a presidential term ends, you have been running the End of Term Archive to ensure that the information of a presidential administration is preserved when it changes over, like we see from Obama to Trump.
Brewster Kahle talking:
His administration, or upcoming administration, has promised radical change, even potentially canceling whole departments. So the services that those departments have traditionally provided are now online and could be deleted, changed or modified in ways that we really can’t foresee. So, where we’ve always gone and preserved paper records, which provides some level of preservation, digital is a new aspect. And it goes much beyond, just like the previous speaker just said, just recording webpages. We need the whole databases and the structures that science now depends on. But it’s now within an administration where we’re really not sure what’s coming up.
Things like the whole WhiteHouse.gov site, of course, just disappear. And so, anybody accessing any of the press releases or any of the information that used to be on it will get broken links. Some of the browser manufacturers are starting to point to the Wayback Machine, which we encourage, to be able to continue to find information that used to be on those sites. But it’s now beyond just that. It’s also social media feeds that can be manipulated and changed retroactively, which is done all the time now by a very media-savvy upcoming administration. So, I think we will see more control of the message, especially through the digital channels, and that makes archives, libraries and permanent access even more important.
The Wayback Machine operates by crawling the World Wide Web, actually with many, many partners, and adding those crawls into the Internet Archive’s collections. And from Archive.org, you can type in a URL or search to find a website, and then see the web as it was and surf the web as it was. You could see President-elect Trump’s 2008 and 2012 election websites or Hillary Clinton’s old Senate websites. So these websites are now available again as they were. But they’re just pictures of webpages, so they’re not the services behind them. They’re not the databases that climate scientists need and are currently using, such as NOAA’s and NASA’s data sets, that have services on them. We would love to make it so that we’re not just taking snapshots of websites, but whole web services get archived such that they can be used as they were in 2016. So we’re calling out to federal webmasters to work with us to archive the whole working systems themselves in snapshot form.
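As a rough illustration of typing in a URL to see the web as it was, the sketch below builds Wayback Machine addresses in Python. It assumes the commonly used replay form `https://web.archive.org/web/<timestamp>/<url>` and the availability lookup at `https://archive.org/wayback/available`. The function names `wayback_url` and `availability_query` are hypothetical, and nothing is fetched over the network.

```python
from urllib.parse import urlencode

# Assumed public endpoints; check them against current Internet Archive documentation.
WAYBACK_PREFIX = "https://web.archive.org/web"
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def _check_timestamp(timestamp: str) -> None:
    """Accept 4 to 14 digits, i.e. any prefix of YYYYMMDDhhmmss from the year onward."""
    if not timestamp.isdigit() or not 4 <= len(timestamp) <= 14:
        raise ValueError("timestamp must be 4 to 14 digits, e.g. 2016 or 20161231")


def wayback_url(original_url: str, timestamp: str = "2016") -> str:
    """Build a replay address asking for the capture closest to `timestamp`."""
    _check_timestamp(timestamp)
    return f"{WAYBACK_PREFIX}/{timestamp}/{original_url}"


def availability_query(original_url: str, timestamp: str = "2016") -> str:
    """Build a query address asking whether a capture near `timestamp` exists."""
    _check_timestamp(timestamp)
    query = urlencode({"url": original_url, "timestamp": timestamp})
    return f"{AVAILABILITY_ENDPOINT}?{query}"


if __name__ == "__main__":
    print(wayback_url("whitehouse.gov", "2016"))
    print(availability_query("whitehouse.gov", "20161231"))
```

Opening the first printed address in a browser would show only a snapshot of the page, which is the limitation described above: the databases and services behind the site are not part of the capture.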
There are groups that are collecting the web FTP sites now. They’re going in and writing special scripts to download all of the different data records that are in these databases. There are groups in Toronto. There’s going to be a hackathon at the Internet Archive on January 7th to try to help tour through the important parts of the federal record, so that we can then make a record outside of the government to make sure that it’s permanently available. Beyond that, we need to move it to other countries, because the history of libraries is one of loss. Usually libraries are burned, like the Library of Alexandria in ancient times, and they’re burned by governments. The new guys just don’t want the old stuff around. They’re often sorry about it tens or hundreds of years later. But if you didn’t make a copy, then it’s just gone. So the idea of having multiple copies keeps stuff safe.
Laurie Allen talking:
I’ll also point out that the Bush administration also closed, or attempted to close, some EPA libraries all over the country. I think we know that people are concerned. I think there’s good reason to be tremendously concerned. My partners on this project in the Program for Environmental Humanities at Penn have been talking to so many scientists who are deeply concerned. And I think the important point is that it’s better to be safe than sorry. As Brewster just said, lots of copies keep stuff safe. It’s kind of a good rule. And it is the role of all of us to make sure that this material continues to be available. And so, yes, we’re concerned, but more than that, it’s just wise to take steps to make sure that, whatever happens, these important facts remain available to future researchers.
Brewster Kahle talking:
How do we stop things from getting hacked? I think it’s copies, really, and putting them on other sides of fault lines, whether it’s earthquakes or hard drives failing or institutional failure, law changes, regime change. So, Canada is warm to digital libraries in many ways that the United States is becoming potentially less so. So the idea is to have multiple legs to the stool. We looked at the television archive, so we will record all of television at the Internet Archive, to find out what the Trump campaign promises had been. And things like closing part of the internet up, or threatening freedom of the press, or actively hating journalists: all of these go against the things that libraries are built on, the idea of having ongoing access to information, historical information. These are what make libraries work. And so, let’s just plan for whatever might happen. And who knows? Maybe it’s going to be just a dry run and we never needed to do it, but it’s a good idea in any case.
The campaign promises that have been made in the past, or policies and the like, can be changed by anybody that controls the current website. So those who control the present control the past. And as Orwell has warned, those who control the present control the future, so we really need to make sure there’s a record of these things. So, Pence has made those go away. Within a day of getting control of dot-gov, they put up websites trumpeting Trump properties, which were taken down very quickly. And so, there’s active management of what people can see on the World Wide Web. So, Archive.org is a free resource for being able to see what was on those websites before. We’ve seen press releases change. George W. Bush announced from the aircraft carrier, and the headline of the press release read, that combat operations in Iraq had ceased. And then, a couple months later, it changed to say major combat operations had ceased. And then, a couple years after that, still during the same administration, they removed the press release altogether. So, I’m not sure what is more Orwellian: not telling you that you’ve changed a previous press release or making it go away altogether. But unless we have libraries, we wouldn’t know any of that happened.
The Internet Archive, working with partners, has been archiving tweets, YouTube, Instagram, these different platforms. Facebook makes it very difficult, unfortunately, to record what has been said and now potentially later deleted. All these things are deleted at some point. The companies go under or whatever. And so, we keep a record of these pronouncements. There are now 10,000 official government Twitter channels, so we archive those. But we also do the ones from the campaigns and surrogates and the like, to be able to make rich data sets and make those available back to researchers, so that we can know what it is that was promised.
Television, for instance, is very difficult to access. But on TV.Archive.org, another free resource, you can search based on what people said, including on Democracy Now!, and retrieve clips to put into your blogs and think critically about what has happened. If you can’t quote, compare and contrast, then it just flows over you, and you say, “Wait a minute. I think I remember,” but you don’t really remember. So the key thing is to be able to quote, compare and contrast. And libraries are there to preserve a permanent record of things that are often ephemeral, like television, Twitter, websites and the like. And it’s of growing importance.
The internet is, I think, just an amazing experiment in sharing and mutual trust. And people are putting their ideas out there in a very public forum. We need to ensure that that trust is warranted, and that we don’t see so much spying that people run away from it, thinking that they’re going to get in trouble for it. These are very important things that have made the World Wide Web possible in the first place. And it may be hard to remember, but it used to be very difficult to get this type of information. [inaudible] the government records might go into the National Archives after an administration changed, and then you’d have to wait six months, 12 months, to be able to even make a request for one document at a time. But now we have the opportunity to see what’s changed, what the developments are, but also enjoy the benefits of enormous taxpayer funding towards building databases around climate change, about the weather data, that are much more available than they ever were before. Let’s keep that going. Let’s continue to build on the trust that has been the hallmark of the World Wide Web. We just need libraries and archives, academics, and people working on federal websites who may be displaced as changes in administration happen, to work together to make permanent what it is the taxpayers have paid for.
Brewster Kahle: computer engineer, internet entrepreneur, activist and digital librarian. He is the founder of the Internet Archive.
Laurie Allen: assistant director for digital scholarship at the University of Pennsylvania Libraries and member of the Data Refuge Project to rescue climate and environmental data.
— source democracynow.org