History As Big Data: 500 Years Of Book Images And Mapping Millions Of Books – Forbes

Source: History As Big Data: 500 Years Of Book Images And Mapping Millions Of Books – Forbes

This is a fascinating post about the value of libraries in this digital age, with one online library in particular in mind: the Internet Archive (see link below). Books make up only part of its digital collection, but from that preserved data Kalev Leetaru was able to assemble a collage of book images spanning 500 years and more than 1,000 libraries.

Read on and marvel at this online history of books.

…Libraries, on the other hand, filled with endless rows of dusty books, are likely not the first thing that comes to mind. Yet, what if we could use libraries to reimagine our past, creating a gallery of all the images from half a millennium of books or creating a 215-year animated map of human history as seen through millions of books?

Libraries have reinvented themselves in the digital era and one library in particular, the Internet Archive, stands among the forefront of the big data era. The Archive, most famous for its historical archive of the Internet, today holds more than 23 petabytes of historical data that is growing at a rate of 50-60 terabytes per week. On its servers reside more than 436 billion web pages back to 1996, 750,000 television shows back to 2009, over 100,000 pieces of software dating back 30 years, and over half a billion pages of books dating back 500 years from over 1,000 libraries around the world. It is that last collection, of millions of books dating to the year 1500, that we will explore further here.

This is the image found on the Forbes website, along with this further description by Leetaru.


What would it look like to reimagine the book not as pages of text, but as a global distributed gallery of illustrations, drawings, charts, maps, and photographs that together comprise one of the world’s greatest art collections? In Fall 2013 I approached the Internet Archive with the idea of using computer algorithms to extract every image found on all 600 million pages of their digitized book collection, along with the text surrounding each image and the basic metadata about the book. In just over a month I did precisely that, creating a massive gallery that is slowly being uploaded to Flickr.

