Libarc
Libarc is a C++ library for reading the contents of GZIP-compressed ARC files, which are generated by the Internet Archive's Heritrix web crawler.[1]
Written in | C++ |
---|---|
Type | library or framework |
Overview
Libarc allows users to open and scan the contents of GZIP-compressed ARC files. It also provides an iterator that walks over the contents of an ARC file, member by member.
Users can specify a media type to restrict iteration to members of that type. For each member, the library gives access to the information in the member's URL record, the response headers from the HTTP server, and the member's data in a single API call.
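The kind of information held in a member's URL record, and the effect of filtering by media type, can be illustrated with a minimal sketch. It does not use libarc's API: the names `UrlRecord`, `parse_url_record`, and `filter_by_media_type` are hypothetical, and the code assumes the ARC version 1 URL record layout (URL, IP address, archive date, content type, and length, separated by spaces) applied to header lines that have already been decompressed.

```cpp
#include <cassert>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical type for illustration; not part of libarc.
// Fields follow the ARC version 1 URL record layout (an assumption here).
struct UrlRecord {
    std::string url;
    std::string ip_address;
    std::string archive_date;  // e.g. YYYYMMDDhhmmss
    std::string content_type;  // media type, e.g. "text/html"
    long long length = 0;      // size of the member's data in bytes
};

// Parse one space-separated URL record line.
// Returns std::nullopt if the line does not have exactly five valid fields.
std::optional<UrlRecord> parse_url_record(const std::string& line) {
    std::istringstream in(line);
    UrlRecord rec;
    if (!(in >> rec.url >> rec.ip_address >> rec.archive_date
             >> rec.content_type >> rec.length)) {
        return std::nullopt;  // too few fields or non-numeric length
    }
    if (rec.length < 0) {
        return std::nullopt;  // a negative length is not meaningful
    }
    std::string extra;
    if (in >> extra) {
        return std::nullopt;  // unexpected trailing field
    }
    return rec;
}

// Keep only the records whose content type equals media_type.
// The comparison is exact and case-sensitive; malformed lines are skipped.
std::vector<UrlRecord> filter_by_media_type(
        const std::vector<std::string>& header_lines,
        const std::string& media_type) {
    std::vector<UrlRecord> matches;
    for (const auto& line : header_lines) {
        std::optional<UrlRecord> rec = parse_url_record(line);
        if (rec && rec->content_type == media_type) {
            matches.push_back(*rec);
        }
    }
    return matches;
}
```

Calling `filter_by_media_type(lines, "text/html")` on a list of header lines would return only the HTML members, which mirrors in spirit how restricting the media type limits the members a caller sees.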
In addition to the API reference documentation, there are two other sources: Programming with libarc, which describes the libarc API, and the license and copyright policies held by Basis Technology Corp.