Index

A C D G H I M N P R S T V W 
All Classes All Packages

A

available() - Method in class com.github.bottomlessarchive.warc.service.AvailableInputStream
 
AvailableInputStream - Class in com.github.bottomlessarchive.warc.service
This class is a workaround for a bug in GZIPInputStream.
AvailableInputStream() - Constructor for class com.github.bottomlessarchive.warc.service.AvailableInputStream
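
The AvailableInputStream entry only says that the class works around a GZIPInputStream bug. As an illustration, and assuming the bug in question is that GZIPInputStream (in many JDK versions) consults available() on the underlying stream when deciding whether another gzip member follows, a wrapper of that kind could look like the sketch below. The index does not confirm this, and AlwaysAvailableStream and the other names are hypothetical, not the library's code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class AlwaysAvailableDemo {

    // Hypothetical wrapper (an assumption, not the library's class): it never
    // reports 0 available bytes, so GZIPInputStream keeps probing for another
    // gzip member until the underlying stream really hits end of input.
    static class AlwaysAvailableStream extends FilterInputStream {
        AlwaysAvailableStream(InputStream in) {
            super(in);
        }

        @Override
        public int available() throws IOException {
            return Math.max(1, super.available());
        }
    }

    // Compresses one string into a single gzip member.
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return out.toByteArray();
    }

    // Concatenates two gzip members and reads them back through the wrapper.
    static String roundTrip(String first, String second) {
        try {
            ByteArrayOutputStream joined = new ByteArrayOutputStream();
            joined.write(gzip(first));
            joined.write(gzip(second));
            InputStream raw = new ByteArrayInputStream(joined.toByteArray());
            try (InputStream in = new GZIPInputStream(new AlwaysAvailableStream(raw))) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(roundTrip("record one\n", "record two\n"));
    }
}
```

Compressed WARC files are commonly stored as one gzip member per record, which is why reading past the first member matters for a WARC parser.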
 

C

close() - Method in class com.github.bottomlessarchive.warc.service.AvailableInputStream
 
com.github.bottomlessarchive.warc.service - package com.github.bottomlessarchive.warc.service
 
com.github.bottomlessarchive.warc.service.content.domain - package com.github.bottomlessarchive.warc.service.content.domain
 
com.github.bottomlessarchive.warc.service.content.request - package com.github.bottomlessarchive.warc.service.content.request
 
com.github.bottomlessarchive.warc.service.content.request.domain - package com.github.bottomlessarchive.warc.service.content.request.domain
 
com.github.bottomlessarchive.warc.service.content.response - package com.github.bottomlessarchive.warc.service.content.response
 
com.github.bottomlessarchive.warc.service.content.response.domain - package com.github.bottomlessarchive.warc.service.content.response.domain
 
com.github.bottomlessarchive.warc.service.header - package com.github.bottomlessarchive.warc.service.header
 
com.github.bottomlessarchive.warc.service.http - package com.github.bottomlessarchive.warc.service.http
 
com.github.bottomlessarchive.warc.service.record - package com.github.bottomlessarchive.warc.service.record
 
com.github.bottomlessarchive.warc.service.record.domain - package com.github.bottomlessarchive.warc.service.record.domain
 
CONTINUATION - com.github.bottomlessarchive.warc.service.record.domain.WarcRecordType
Record blocks from 'continuation' records must be appended to corresponding prior record block(s) (e.g. from other WARC files) to create the logically complete full-sized original record.
CONVERSION - com.github.bottomlessarchive.warc.service.record.domain.WarcRecordType
A 'conversion' record shall contain an alternative version of another record’s content that was created as the result of an archival process.
createWarcRecord(BoundedInputStream) - Method in class com.github.bottomlessarchive.warc.service.content.request.RequestContentBlockFactory
 
createWarcRecord(HeaderGroup, BoundedInputStream) - Method in class com.github.bottomlessarchive.warc.service.record.WarcRecordFactory
Creates a WARC record with the specified WARC headers.

D

DEFAULT_CHARSET - Static variable in class com.github.bottomlessarchive.warc.service.WarcReader
The default Charset used by the parser when no other Charset is provided.
DefaultContentBlock - Class in com.github.bottomlessarchive.warc.service.content.domain
A simple implementation of WarcContentBlock for most WARC record types.
DefaultContentBlock(InputStream) - Constructor for class com.github.bottomlessarchive.warc.service.content.domain.DefaultContentBlock
Constructs a DefaultContentBlock from the given InputStream.

G

getCharset() - Method in class com.github.bottomlessarchive.warc.service.content.response.domain.ResponseContentBlock
The charset of the response.
getContentBlock() - Method in class com.github.bottomlessarchive.warc.service.record.domain.WarcRecord
Returns the WARC record's WarcContentBlock.
getHeader(String) - Method in class com.github.bottomlessarchive.warc.service.content.request.domain.RequestContentBlock
Returns the value of a header from the request.
getHeader(String) - Method in class com.github.bottomlessarchive.warc.service.content.response.domain.ResponseContentBlock
Returns the value of a header from the response.
getHeader(String) - Method in class com.github.bottomlessarchive.warc.service.record.domain.WarcRecord
 
getHeaders() - Method in class com.github.bottomlessarchive.warc.service.content.request.domain.RequestContentBlock
Return all of the headers of a WARC request.
getHeaders() - Method in class com.github.bottomlessarchive.warc.service.content.response.domain.ResponseContentBlock
Return all of the headers of a WARC response.
getHeaders() - Method in class com.github.bottomlessarchive.warc.service.record.domain.WarcRecord
 
getPayload() - Method in class com.github.bottomlessarchive.warc.service.content.domain.DefaultContentBlock
Returns the content block stream as the payload.
getPayload() - Method in interface com.github.bottomlessarchive.warc.service.content.domain.WarcContentBlock
Returns an InputStream of the WARC payload. The payload is the data referred to, or contained by, a WARC record as a meaningful subset of the content block.
getPayloadAsString() - Method in class com.github.bottomlessarchive.warc.service.content.response.domain.ResponseContentBlock
Return the payload as a String instance.
getRecordId() - Method in class com.github.bottomlessarchive.warc.service.record.domain.WarcRecord
Returns WARC-Record-ID of a WARC record.
getString(byte[], int, int, Charset) - Static method in class com.github.bottomlessarchive.warc.service.http.HttpParser
Converts the byte array of HTTP content characters to a string.
getType() - Method in class com.github.bottomlessarchive.warc.service.record.domain.WarcRecord
Returns the type of a WARC record.
getWarcContentBlock() - Method in class com.github.bottomlessarchive.warc.service.record.domain.WarcRecord
Deprecated, for removal: This API element is subject to removal in a future version.

H

hasNext() - Method in class com.github.bottomlessarchive.warc.service.SafeWarcRecordIterator
 
hasNext() - Method in class com.github.bottomlessarchive.warc.service.WarcRecordIterator
 
HeaderParser - Class in com.github.bottomlessarchive.warc.service.header
 
HeaderParser() - Constructor for class com.github.bottomlessarchive.warc.service.header.HeaderParser
 
HttpParser - Class in com.github.bottomlessarchive.warc.service.http
 

I

isContinuation() - Method in class com.github.bottomlessarchive.warc.service.record.domain.WarcRecord
 
isConversion() - Method in class com.github.bottomlessarchive.warc.service.record.domain.WarcRecord
 
isMetadata() - Method in class com.github.bottomlessarchive.warc.service.record.domain.WarcRecord
 
isRequest() - Method in class com.github.bottomlessarchive.warc.service.record.domain.WarcRecord
 
isResource() - Method in class com.github.bottomlessarchive.warc.service.record.domain.WarcRecord
 
isResponse() - Method in class com.github.bottomlessarchive.warc.service.record.domain.WarcRecord
 
isRevisit() - Method in class com.github.bottomlessarchive.warc.service.record.domain.WarcRecord
 
isWarcinfo() - Method in class com.github.bottomlessarchive.warc.service.record.domain.WarcRecord
Deprecated, for removal: This API element is subject to removal in a future version.
isWarcInfo() - Method in class com.github.bottomlessarchive.warc.service.record.domain.WarcRecord
 
iteratorOf(InputStream) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordIteratorFactory
 
iteratorOf(InputStream, Charset) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordIteratorFactory
 
iteratorOf(InputStream, Charset, boolean) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordIteratorFactory
 
iteratorOf(String) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordIteratorFactory
 
iteratorOf(URI) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordIteratorFactory
 
iteratorOf(URL) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordIteratorFactory
 

M

METADATA - com.github.bottomlessarchive.warc.service.record.domain.WarcRecordType
A 'metadata' record contains content created in order to further describe, explain, or accompany a harvested resource, in ways not covered by other record types.

N

newResponseContentBlock(InputStream) - Method in class com.github.bottomlessarchive.warc.service.content.response.ResponseContentBlockFactory
Create a ResponseContentBlock from a content block InputStream of a response WARC entry.
next() - Method in class com.github.bottomlessarchive.warc.service.SafeWarcRecordIterator
 
next() - Method in class com.github.bottomlessarchive.warc.service.WarcRecordIterator
 

P

parse() - Method in class com.github.bottomlessarchive.warc.service.WarcReader
Parses a WARC record according to the WARC format specification and creates a WarcRecord object.
parseHeaders(InputStream, Charset) - Static method in class com.github.bottomlessarchive.warc.service.http.HttpParser
Parses headers from the given stream.
parseHeaders(HttpMessage) - Method in class com.github.bottomlessarchive.warc.service.header.HeaderParser
 
payload - Variable in class com.github.bottomlessarchive.warc.service.content.domain.DefaultContentBlock
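
The parse() entry refers to the WARC format specification. In that format a record starts with a version line, followed by named header fields, a blank line, and a content block whose size is given by the Content-Length field. The sketch below reads one such record using only the standard library. WarcRecordSketch, ParsedRecord, and the helper methods are hypothetical names and do not reflect how WarcReader is actually implemented.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class WarcRecordSketch {

    /** Hypothetical holder for one parsed record; not the library's WarcRecord. */
    static final class ParsedRecord {
        final String version;
        final Map<String, String> headers;
        final byte[] block;

        ParsedRecord(String version, Map<String, String> headers, byte[] block) {
            this.version = version;
            this.headers = headers;
            this.block = block;
        }

        String blockAsText() {
            return new String(block, StandardCharsets.UTF_8);
        }
    }

    /** Reads bytes up to "\n" and drops a trailing "\r"; returns null at end of input. */
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            line.write(b);
        }
        if (b == -1 && line.size() == 0) {
            return null;
        }
        String text = new String(line.toByteArray(), StandardCharsets.UTF_8);
        return text.endsWith("\r") ? text.substring(0, text.length() - 1) : text;
    }

    /** Parses the version line, the header fields, and Content-Length bytes of block. */
    static ParsedRecord parse(InputStream in) throws IOException {
        String version = readLine(in);
        if (version == null || !version.startsWith("WARC/")) {
            throw new IOException("Missing WARC version line");
        }
        // Simplification: header names are matched exactly and folded lines are ignored.
        Map<String, String> headers = new LinkedHashMap<>();
        String line;
        while ((line = readLine(in)) != null && !line.isEmpty()) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                throw new IOException("Malformed header line: " + line);
            }
            headers.put(line.substring(0, colon).trim(), line.substring(colon + 1).trim());
        }
        String lengthValue = headers.get("Content-Length");
        if (lengthValue == null) {
            throw new IOException("Content-Length header is required");
        }
        int length = Integer.parseInt(lengthValue);
        byte[] block = in.readNBytes(length);
        if (block.length != length) {
            throw new IOException("Truncated content block");
        }
        return new ParsedRecord(version, headers, block);
    }

    /** Convenience wrapper that parses a record held in a string. */
    static ParsedRecord parseRaw(String raw) {
        try {
            return parse(new ByteArrayInputStream(raw.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String raw = "WARC/1.0\r\nWARC-Type: response\r\nContent-Length: 5\r\n\r\nhello\r\n\r\n";
        ParsedRecord record = parseRaw(raw);
        System.out.println(record.headers.get("WARC-Type"));
        System.out.println(record.blockAsText());
    }
}
```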
 

R

read() - Method in class com.github.bottomlessarchive.warc.service.AvailableInputStream
 
read(byte[]) - Method in class com.github.bottomlessarchive.warc.service.AvailableInputStream
 
read(byte[], int, int) - Method in class com.github.bottomlessarchive.warc.service.AvailableInputStream
 
readLine(InputStream, Charset) - Static method in class com.github.bottomlessarchive.warc.service.http.HttpParser
Read up to "\n" from an (unchunked) input stream.
readRawLine(InputStream) - Static method in class com.github.bottomlessarchive.warc.service.http.HttpParser
Returns a byte array read from an (unchunked) input stream.
readRecord() - Method in class com.github.bottomlessarchive.warc.service.WarcReader
Read a WARC record from the provided data source.
REQUEST - com.github.bottomlessarchive.warc.service.record.domain.WarcRecordType
A 'request' record holds the details of a complete scheme-specific request, including network protocol information, where possible.
RequestContentBlock - Class in com.github.bottomlessarchive.warc.service.content.request.domain
 
RequestContentBlock() - Constructor for class com.github.bottomlessarchive.warc.service.content.request.domain.RequestContentBlock
 
RequestContentBlockFactory - Class in com.github.bottomlessarchive.warc.service.content.request
 
RequestContentBlockFactory() - Constructor for class com.github.bottomlessarchive.warc.service.content.request.RequestContentBlockFactory
 
RESOURCE - com.github.bottomlessarchive.warc.service.record.domain.WarcRecordType
A 'resource' record contains a resource, without full protocol response information.
RESPONSE - com.github.bottomlessarchive.warc.service.record.domain.WarcRecordType
A 'response' record should contain a complete scheme-specific response, including network protocol information, where possible.
ResponseContentBlock - Class in com.github.bottomlessarchive.warc.service.content.response.domain
An implementation of the WarcContentBlock interface that handles the content blocks of WARC responses.
ResponseContentBlock() - Constructor for class com.github.bottomlessarchive.warc.service.content.response.domain.ResponseContentBlock
 
ResponseContentBlockFactory - Class in com.github.bottomlessarchive.warc.service.content.response
This class is responsible for creating new ResponseContentBlock instances.
ResponseContentBlockFactory() - Constructor for class com.github.bottomlessarchive.warc.service.content.response.ResponseContentBlockFactory
 
REVISIT - com.github.bottomlessarchive.warc.service.record.domain.WarcRecordType
A 'revisit' record describes the revisitation of content already archived, and might include only an abbreviated content body which has to be interpreted relative to a previous record.

S

SafeWarcRecordIterator - Class in com.github.bottomlessarchive.warc.service
 
SafeWarcRecordIterator() - Constructor for class com.github.bottomlessarchive.warc.service.SafeWarcRecordIterator
 
streamOf(InputStream) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordStreamFactory
 
streamOf(InputStream, Charset) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordStreamFactory
 
streamOf(InputStream, Charset, boolean) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordStreamFactory
 
streamOf(InputStream, Charset, boolean, List<WarcRecordType>) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordStreamFactory
 
streamOf(InputStream, Charset, List<WarcRecordType>) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordStreamFactory
 
streamOf(InputStream, List<WarcRecordType>) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordStreamFactory
 
streamOf(String) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordStreamFactory
 
streamOf(String, WarcRecordType...) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordStreamFactory
 
streamOf(String, List<WarcRecordType>) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordStreamFactory
 
streamOf(URI) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordStreamFactory
 
streamOf(URI, WarcRecordType...) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordStreamFactory
 
streamOf(URI, List<WarcRecordType>) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordStreamFactory
 
streamOf(URL) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordStreamFactory
 
streamOf(URL, WarcRecordType...) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordStreamFactory
 
streamOf(URL, List<WarcRecordType>) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordStreamFactory
 

T

toString() - Method in class com.github.bottomlessarchive.warc.service.content.domain.DefaultContentBlock
 

V

valueOf(String) - Static method in enum com.github.bottomlessarchive.warc.service.record.domain.WarcRecordType
Returns the enum constant of this type with the specified name.
values() - Static method in enum com.github.bottomlessarchive.warc.service.record.domain.WarcRecordType
Returns an array containing the constants of this enum type, in the order they are declared.
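
The valueOf(String) and values() entries are the standard methods every Java enum provides. As a rough illustration of how they behave, the sketch below uses a hypothetical RecordKind enum that mirrors the constants listed in this index (the declaration order is an assumption) and maps a lowercase WARC-Type header value to a constant. Whether the library performs the lookup this way is not stated in the index.

```java
import java.util.Locale;

public class WarcTypeLookupDemo {

    // Hypothetical stand-in that mirrors the constants listed in this index;
    // the declaration order shown here is an assumption.
    enum RecordKind {
        WARCINFO, RESPONSE, RESOURCE, REQUEST, METADATA, REVISIT, CONVERSION, CONTINUATION
    }

    // Maps a WARC-Type header value such as "response" to an enum constant.
    // Enum.valueOf throws IllegalArgumentException when no constant has that name.
    static RecordKind fromHeaderValue(String warcType) {
        return RecordKind.valueOf(warcType.trim().toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(fromHeaderValue("response"));
        // values() returns the constants in declaration order.
        for (RecordKind kind : RecordKind.values()) {
            System.out.println(kind.ordinal() + " " + kind);
        }
    }
}
```

Because valueOf is case-sensitive and fails on unknown names, callers that handle arbitrary archives may want to catch IllegalArgumentException for record types outside this list.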

W

WarcContentBlock - Interface in com.github.bottomlessarchive.warc.service.content.domain
Represents the content block of a WARC record. Known implementations: RequestContentBlock, ResponseContentBlock, DefaultContentBlock.
WarcFormatException - Exception in com.github.bottomlessarchive.warc.service
 
WarcFormatException(String) - Constructor for exception com.github.bottomlessarchive.warc.service.WarcFormatException
 
WarcFormatException(String, Throwable) - Constructor for exception com.github.bottomlessarchive.warc.service.WarcFormatException
 
WARCINFO - com.github.bottomlessarchive.warc.service.record.domain.WarcRecordType
A 'warcinfo' record describes the records that follow it, up through end of file, end of input, or until next 'warcinfo' record.
WarcNetworkException - Exception in com.github.bottomlessarchive.warc.service
 
WarcNetworkException(String, Throwable) - Constructor for exception com.github.bottomlessarchive.warc.service.WarcNetworkException
 
WarcParsingException - Exception in com.github.bottomlessarchive.warc.service
 
WarcParsingException(String, Throwable) - Constructor for exception com.github.bottomlessarchive.warc.service.WarcParsingException
 
WarcReader - Class in com.github.bottomlessarchive.warc.service
This class provides basic functions to read and parse a WARC file.
WarcReader(InputStream) - Constructor for class com.github.bottomlessarchive.warc.service.WarcReader
Create a new WarcReader and set the provided stream as the data source.
WarcReader(InputStream, Charset) - Constructor for class com.github.bottomlessarchive.warc.service.WarcReader
Create a new WarcReader and set the provided stream as the data source.
WarcReader(InputStream, Charset, boolean) - Constructor for class com.github.bottomlessarchive.warc.service.WarcReader
Create a new WarcReader and set the provided stream as the data source.
WarcReader(URL) - Constructor for class com.github.bottomlessarchive.warc.service.WarcReader
Create a new WarcReader and set the file on the provided URL location as the data source.
WarcReader(URLConnection, Charset, boolean) - Constructor for class com.github.bottomlessarchive.warc.service.WarcReader
Create a new WarcReader and set the file on the provided URLConnection as the data source.
WarcReader(URL, Charset) - Constructor for class com.github.bottomlessarchive.warc.service.WarcReader
Create a new WarcReader and set the file on the provided URL location as the data source.
WarcReader(URL, Charset, boolean) - Constructor for class com.github.bottomlessarchive.warc.service.WarcReader
Create a new WarcReader and set the file on the provided URL location as the data source.
WarcRecord<T extends WarcContentBlock> - Class in com.github.bottomlessarchive.warc.service.record.domain
Basic constituent of a WARC file.
WarcRecord() - Constructor for class com.github.bottomlessarchive.warc.service.record.domain.WarcRecord
 
WarcRecordFactory - Class in com.github.bottomlessarchive.warc.service.record
 
WarcRecordFactory() - Constructor for class com.github.bottomlessarchive.warc.service.record.WarcRecordFactory
 
WarcRecordIterator<T extends WarcContentBlock> - Class in com.github.bottomlessarchive.warc.service
 
WarcRecordIterator() - Constructor for class com.github.bottomlessarchive.warc.service.WarcRecordIterator
 
WarcRecordIteratorFactory - Class in com.github.bottomlessarchive.warc.service
 
WarcRecordStreamFactory - Class in com.github.bottomlessarchive.warc.service
 
WarcRecordType - Enum in com.github.bottomlessarchive.warc.service.record.domain
Describes the various types of WARC records.