Index
A C D G H I M N P R S T V W
All Classes All Packages
A
- available() - Method in class com.github.bottomlessarchive.warc.service.AvailableInputStream
- AvailableInputStream - Class in com.github.bottomlessarchive.warc.service
-
This class is a hack to bypass a bug in the GZIPInputStream.
- AvailableInputStream() - Constructor for class com.github.bottomlessarchive.warc.service.AvailableInputStream
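The index does not say which GZIPInputStream bug the class works around. One commonly reported issue is that, in many JDK versions, GZIPInputStream consults available() on the underlying stream when deciding whether another gzip member follows, so a source that reports 0 can end decompression early. The sketch below shows one possible workaround under that assumption. PeekingAvailableStream and ZeroAvailableStream are hypothetical names and are not the library's implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

/** Hypothetical stand-in: reports available() by peeking one byte ahead. */
class PeekingAvailableStream extends FilterInputStream {

    PeekingAvailableStream(InputStream source) {
        // A one-byte pushback buffer lets available() peek without losing data.
        super(new PushbackInputStream(source, 1));
    }

    @Override
    public int available() throws IOException {
        PushbackInputStream pushback = (PushbackInputStream) in;
        int next = pushback.read(); // may block until a byte arrives
        if (next == -1) {
            return 0; // end of stream
        }
        pushback.unread(next);
        return 1; // at least one more byte can be read
    }
}

/** Hypothetical source that always reports 0, as some network streams do. */
class ZeroAvailableStream extends FilterInputStream {

    ZeroAvailableStream(InputStream source) {
        super(source);
    }

    @Override
    public int available() {
        return 0;
    }
}

class AvailableDemo {
    public static void main(String[] args) throws IOException {
        InputStream source = new ZeroAvailableStream(
                new ByteArrayInputStream(new byte[] {7, 8}));
        InputStream wrapped = new PeekingAvailableStream(source);

        System.out.println("before reading: " + wrapped.available());
        wrapped.read();
        wrapped.read();
        System.out.println("after reading everything: " + wrapped.available());
    }
}
```

The trade-off in this sketch is that available() may block while it waits for the peeked byte, which the standard InputStream contract normally avoids.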
C
- close() - Method in class com.github.bottomlessarchive.warc.service.AvailableInputStream
- com.github.bottomlessarchive.warc.service - package com.github.bottomlessarchive.warc.service
- com.github.bottomlessarchive.warc.service.content.domain - package com.github.bottomlessarchive.warc.service.content.domain
- com.github.bottomlessarchive.warc.service.content.request - package com.github.bottomlessarchive.warc.service.content.request
- com.github.bottomlessarchive.warc.service.content.request.domain - package com.github.bottomlessarchive.warc.service.content.request.domain
- com.github.bottomlessarchive.warc.service.content.response - package com.github.bottomlessarchive.warc.service.content.response
- com.github.bottomlessarchive.warc.service.content.response.domain - package com.github.bottomlessarchive.warc.service.content.response.domain
- com.github.bottomlessarchive.warc.service.header - package com.github.bottomlessarchive.warc.service.header
- com.github.bottomlessarchive.warc.service.http - package com.github.bottomlessarchive.warc.service.http
- com.github.bottomlessarchive.warc.service.record - package com.github.bottomlessarchive.warc.service.record
- com.github.bottomlessarchive.warc.service.record.domain - package com.github.bottomlessarchive.warc.service.record.domain
- CONTINUATION - com.github.bottomlessarchive.warc.service.record.domain.WarcRecordType
-
Record blocks from 'continuation' records must be appended to corresponding prior record block(s) (e.g. from other WARC files) to create the logically complete full-sized original record.
- CONVERSION - com.github.bottomlessarchive.warc.service.record.domain.WarcRecordType
-
A 'conversion' record shall contain an alternative version of another record’s content that was created as the result of an archival process.
- createWarcRecord(BoundedInputStream) - Method in class com.github.bottomlessarchive.warc.service.content.request.RequestContentBlockFactory
- createWarcRecord(HeaderGroup, BoundedInputStream) - Method in class com.github.bottomlessarchive.warc.service.record.WarcRecordFactory
-
Creates a WARC record with specified WARC Headers.
D
- DEFAULT_CHARSET - Static variable in class com.github.bottomlessarchive.warc.service.WarcReader
-
The default Charset used by the parser when no other Charset is provided.
- DefaultContentBlock - Class in com.github.bottomlessarchive.warc.service.content.domain
-
A simple implementation of a WarcContentBlock for most WARC types.
- DefaultContentBlock(InputStream) - Constructor for class com.github.bottomlessarchive.warc.service.content.domain.DefaultContentBlock
-
DefaultContentBlock constructor
G
- getCharset() - Method in class com.github.bottomlessarchive.warc.service.content.response.domain.ResponseContentBlock
-
The charset of the response.
- getContentBlock() - Method in class com.github.bottomlessarchive.warc.service.record.domain.WarcRecord
-
Returns the WARC record's WarcContentBlock.
- getHeader(String) - Method in class com.github.bottomlessarchive.warc.service.content.request.domain.RequestContentBlock
-
Return a value of a header from the request.
- getHeader(String) - Method in class com.github.bottomlessarchive.warc.service.content.response.domain.ResponseContentBlock
-
Return a value of a header from the response.
- getHeader(String) - Method in class com.github.bottomlessarchive.warc.service.record.domain.WarcRecord
- getHeaders() - Method in class com.github.bottomlessarchive.warc.service.content.request.domain.RequestContentBlock
-
Return all of the headers of a WARC request.
- getHeaders() - Method in class com.github.bottomlessarchive.warc.service.content.response.domain.ResponseContentBlock
-
Return all of the headers of a WARC response.
- getHeaders() - Method in class com.github.bottomlessarchive.warc.service.record.domain.WarcRecord
- getPayload() - Method in class com.github.bottomlessarchive.warc.service.content.domain.DefaultContentBlock
-
Return the content block stream as the payload.
- getPayload() - Method in interface com.github.bottomlessarchive.warc.service.content.domain.WarcContentBlock
-
Return an InputStream of the WARC payload. The payload is referred to, or contained by, a WARC record as a meaningful subset of the content block.
- getPayloadAsString() - Method in class com.github.bottomlessarchive.warc.service.content.response.domain.ResponseContentBlock
-
Return the payload as a String instance.
- getRecordId() - Method in class com.github.bottomlessarchive.warc.service.record.domain.WarcRecord
-
Returns WARC-Record-ID of a WARC record.
- getString(byte[], int, int, Charset) - Static method in class com.github.bottomlessarchive.warc.service.http.HttpParser
-
Converts the byte array of HTTP content characters to a string.
- getType() - Method in class com.github.bottomlessarchive.warc.service.record.domain.WarcRecord
-
Returns the type of a WARC record.
- getWarcContentBlock() - Method in class com.github.bottomlessarchive.warc.service.record.domain.WarcRecord
-
Deprecated, for removal: This API element is subject to removal in a future version.
H
- hasNext() - Method in class com.github.bottomlessarchive.warc.service.SafeWarcRecordIterator
- hasNext() - Method in class com.github.bottomlessarchive.warc.service.WarcRecordIterator
- HeaderParser - Class in com.github.bottomlessarchive.warc.service.header
- HeaderParser() - Constructor for class com.github.bottomlessarchive.warc.service.header.HeaderParser
- HttpParser - Class in com.github.bottomlessarchive.warc.service.http
I
- isContinuation() - Method in class com.github.bottomlessarchive.warc.service.record.domain.WarcRecord
- isConversion() - Method in class com.github.bottomlessarchive.warc.service.record.domain.WarcRecord
- isMetadata() - Method in class com.github.bottomlessarchive.warc.service.record.domain.WarcRecord
- isRequest() - Method in class com.github.bottomlessarchive.warc.service.record.domain.WarcRecord
- isResource() - Method in class com.github.bottomlessarchive.warc.service.record.domain.WarcRecord
- isResponse() - Method in class com.github.bottomlessarchive.warc.service.record.domain.WarcRecord
- isRevisit() - Method in class com.github.bottomlessarchive.warc.service.record.domain.WarcRecord
- isWarcinfo() - Method in class com.github.bottomlessarchive.warc.service.record.domain.WarcRecord
-
Deprecated, for removal: This API element is subject to removal in a future version.
- isWarcInfo() - Method in class com.github.bottomlessarchive.warc.service.record.domain.WarcRecord
- iteratorOf(InputStream) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordIteratorFactory
- iteratorOf(InputStream, Charset) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordIteratorFactory
- iteratorOf(InputStream, Charset, boolean) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordIteratorFactory
- iteratorOf(String) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordIteratorFactory
- iteratorOf(URI) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordIteratorFactory
- iteratorOf(URL) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordIteratorFactory
M
- METADATA - com.github.bottomlessarchive.warc.service.record.domain.WarcRecordType
-
A 'metadata' record contains content created in order to further describe, explain, or accompany a harvested resource, in ways not covered by other record types.
N
- newResponseContentBlock(InputStream) - Method in class com.github.bottomlessarchive.warc.service.content.response.ResponseContentBlockFactory
-
Create a ResponseContentBlock from a content block InputStream of a response WARC entry.
- next() - Method in class com.github.bottomlessarchive.warc.service.SafeWarcRecordIterator
- next() - Method in class com.github.bottomlessarchive.warc.service.WarcRecordIterator
P
- parse() - Method in class com.github.bottomlessarchive.warc.service.WarcReader
-
Parses a WARC record according to the WARC format specification and creates a WarcRecord object.
- parseHeaders(InputStream, Charset) - Static method in class com.github.bottomlessarchive.warc.service.http.HttpParser
-
Parses headers from the given stream.
- parseHeaders(HttpMessage) - Method in class com.github.bottomlessarchive.warc.service.header.HeaderParser
- payload - Variable in class com.github.bottomlessarchive.warc.service.content.domain.DefaultContentBlock
R
- read() - Method in class com.github.bottomlessarchive.warc.service.AvailableInputStream
- read(byte[]) - Method in class com.github.bottomlessarchive.warc.service.AvailableInputStream
- read(byte[], int, int) - Method in class com.github.bottomlessarchive.warc.service.AvailableInputStream
- readLine(InputStream, Charset) - Static method in class com.github.bottomlessarchive.warc.service.http.HttpParser
-
Read up to "\n" from an (unchunked) input stream.
- readRawLine(InputStream) - Static method in class com.github.bottomlessarchive.warc.service.http.HttpParser
-
Return byte array from an (unchunked) input stream.
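The two line-reading entries above can be illustrated with a minimal sketch that uses only the standard library. It assumes that a raw line includes its terminating "\n", that the decoded line drops a trailing "\r\n" or "\n", and that both helpers return null at end of stream. The index does not confirm any of these details, and LineReadingSketch is a hypothetical class, not the library's HttpParser.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper class; the method names only mirror the index entries. */
class LineReadingSketch {

    /** Returns the bytes up to and including the next '\n', or null at end of stream. */
    static byte[] readRawLine(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            buffer.write(b);
            if (b == '\n') {
                break;
            }
        }
        return buffer.size() == 0 ? null : buffer.toByteArray();
    }

    /** Decodes one line with the given charset, without its line terminator. */
    static String readLine(InputStream in, Charset charset) throws IOException {
        byte[] raw = readRawLine(in);
        if (raw == null) {
            return null;
        }
        int length = raw.length;
        if (raw[length - 1] == '\n') {
            length--; // drop "\n"
            if (length > 0 && raw[length - 1] == '\r') {
                length--; // drop the "\r" of a "\r\n" pair
            }
        }
        return new String(raw, 0, length, charset);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "WARC/1.0\r\nWARC-Type: response\r\n\r\n"
                .getBytes(StandardCharsets.UTF_8);
        InputStream in = new ByteArrayInputStream(data);
        String line;
        while ((line = readLine(in, StandardCharsets.UTF_8)) != null) {
            System.out.println("[" + line + "]");
        }
    }
}
```

Reading byte by byte keeps the sketch from consuming anything past the line terminator, which matters when the rest of the stream is a binary content block.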
- readRecord() - Method in class com.github.bottomlessarchive.warc.service.WarcReader
-
Read a WARC record from the provided data source.
- REQUEST - com.github.bottomlessarchive.warc.service.record.domain.WarcRecordType
-
A 'request' record holds the details of a complete scheme-specific request, including network protocol information, where possible.
- RequestContentBlock - Class in com.github.bottomlessarchive.warc.service.content.request.domain
- RequestContentBlock() - Constructor for class com.github.bottomlessarchive.warc.service.content.request.domain.RequestContentBlock
- RequestContentBlockFactory - Class in com.github.bottomlessarchive.warc.service.content.request
- RequestContentBlockFactory() - Constructor for class com.github.bottomlessarchive.warc.service.content.request.RequestContentBlockFactory
- RESOURCE - com.github.bottomlessarchive.warc.service.record.domain.WarcRecordType
-
A 'resource' record contains a resource, without full protocol response information.
- RESPONSE - com.github.bottomlessarchive.warc.service.record.domain.WarcRecordType
-
A 'response' record should contain a complete scheme-specific response, including network protocol information, where possible.
- ResponseContentBlock - Class in com.github.bottomlessarchive.warc.service.content.response.domain
-
An implementation of the WarcContentBlock interface that handles the content blocks of WARC responses.
- ResponseContentBlock() - Constructor for class com.github.bottomlessarchive.warc.service.content.response.domain.ResponseContentBlock
- ResponseContentBlockFactory - Class in com.github.bottomlessarchive.warc.service.content.response
-
This class is responsible for creating new ResponseContentBlock instances.
- ResponseContentBlockFactory() - Constructor for class com.github.bottomlessarchive.warc.service.content.response.ResponseContentBlockFactory
- REVISIT - com.github.bottomlessarchive.warc.service.record.domain.WarcRecordType
-
A 'revisit' record describes the revisitation of content already archived, and might include only an abbreviated content body which has to be interpreted relative to a previous record.
S
- SafeWarcRecordIterator - Class in com.github.bottomlessarchive.warc.service
- SafeWarcRecordIterator() - Constructor for class com.github.bottomlessarchive.warc.service.SafeWarcRecordIterator
- streamOf(InputStream) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordStreamFactory
- streamOf(InputStream, Charset) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordStreamFactory
- streamOf(InputStream, Charset, boolean) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordStreamFactory
- streamOf(InputStream, Charset, boolean, List<WarcRecordType>) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordStreamFactory
- streamOf(InputStream, Charset, List<WarcRecordType>) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordStreamFactory
- streamOf(InputStream, List<WarcRecordType>) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordStreamFactory
- streamOf(String) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordStreamFactory
- streamOf(String, WarcRecordType...) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordStreamFactory
- streamOf(String, List<WarcRecordType>) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordStreamFactory
- streamOf(URI) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordStreamFactory
- streamOf(URI, WarcRecordType...) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordStreamFactory
- streamOf(URI, List<WarcRecordType>) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordStreamFactory
- streamOf(URL) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordStreamFactory
- streamOf(URL, WarcRecordType...) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordStreamFactory
- streamOf(URL, List<WarcRecordType>) - Static method in class com.github.bottomlessarchive.warc.service.WarcRecordStreamFactory
T
- toString() - Method in class com.github.bottomlessarchive.warc.service.content.domain.DefaultContentBlock
V
- valueOf(String) - Static method in enum com.github.bottomlessarchive.warc.service.record.domain.WarcRecordType
-
Returns the enum constant of this type with the specified name.
- values() - Static method in enum com.github.bottomlessarchive.warc.service.record.domain.WarcRecordType
-
Returns an array containing the constants of this enum type, in the order they are declared.
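valueOf(String) and values() are the methods that the Java compiler generates for every enum, so their behavior can be shown without the library. The sketch below uses a hypothetical stand-in enum, RecordTypeSketch, with the eight constants listed in this index; the declaration order and the fromHeaderValue helper are assumptions made for illustration.

```java
import java.util.Locale;

/** Hypothetical stand-in for WarcRecordType; the constant order is assumed. */
enum RecordTypeSketch {
    CONTINUATION, CONVERSION, METADATA, REQUEST, RESOURCE, RESPONSE, REVISIT, WARCINFO;

    /** Hypothetical helper: maps a lower-case header value such as "response" to a constant. */
    static RecordTypeSketch fromHeaderValue(String warcType) {
        // valueOf matches constant names exactly, so normalize the case first.
        return RecordTypeSketch.valueOf(warcType.trim().toUpperCase(Locale.ROOT));
    }
}

class RecordTypeDemo {
    public static void main(String[] args) {
        System.out.println(RecordTypeSketch.valueOf("RESPONSE"));
        System.out.println(RecordTypeSketch.fromHeaderValue("warcinfo"));
        System.out.println(RecordTypeSketch.values().length);

        try {
            RecordTypeSketch.valueOf("response"); // the name is case-sensitive
        } catch (IllegalArgumentException e) {
            System.out.println("no constant named 'response'");
        }
    }
}
```

Because valueOf throws IllegalArgumentException for any name that is not an exact match, callers handling untrusted input usually normalize it first or catch the exception.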
W
- WarcContentBlock - Interface in com.github.bottomlessarchive.warc.service.content.domain
-
The WarcContentBlock interface represents the content block of a WARC record. Known implementations of this interface: RequestContentBlock, ResponseContentBlock, DefaultContentBlock.
- WarcFormatException - Exception in com.github.bottomlessarchive.warc.service
- WarcFormatException(String) - Constructor for exception com.github.bottomlessarchive.warc.service.WarcFormatException
- WarcFormatException(String, Throwable) - Constructor for exception com.github.bottomlessarchive.warc.service.WarcFormatException
- WARCINFO - com.github.bottomlessarchive.warc.service.record.domain.WarcRecordType
-
A 'warcinfo' record describes the records that follow it, up through end of file, end of input, or until next 'warcinfo' record.
- WarcNetworkException - Exception in com.github.bottomlessarchive.warc.service
- WarcNetworkException(String, Throwable) - Constructor for exception com.github.bottomlessarchive.warc.service.WarcNetworkException
- WarcParsingException - Exception in com.github.bottomlessarchive.warc.service
- WarcParsingException(String, Throwable) - Constructor for exception com.github.bottomlessarchive.warc.service.WarcParsingException
- WarcReader - Class in com.github.bottomlessarchive.warc.service
-
This class provides basic functions to read and parse a WARC file.
- WarcReader(InputStream) - Constructor for class com.github.bottomlessarchive.warc.service.WarcReader
-
Create a new WarcReader and set the provided stream as the data source.
- WarcReader(InputStream, Charset) - Constructor for class com.github.bottomlessarchive.warc.service.WarcReader
-
Create a new WarcReader and set the provided stream as the data source.
- WarcReader(InputStream, Charset, boolean) - Constructor for class com.github.bottomlessarchive.warc.service.WarcReader
-
Create a new WarcReader and set the provided stream as the data source.
- WarcReader(URL) - Constructor for class com.github.bottomlessarchive.warc.service.WarcReader
-
Create a new WarcReader and set the file on the provided URL location as the data source.
- WarcReader(URLConnection, Charset, boolean) - Constructor for class com.github.bottomlessarchive.warc.service.WarcReader
-
Create a new WarcReader and set the file on the provided URLConnection as the data source.
- WarcReader(URL, Charset) - Constructor for class com.github.bottomlessarchive.warc.service.WarcReader
-
Create a new WarcReader and set the file on the provided URL location as the data source.
- WarcReader(URL, Charset, boolean) - Constructor for class com.github.bottomlessarchive.warc.service.WarcReader
-
Create a new WarcReader and set the file on the provided URL location as the data source.
- WarcRecord<T extends WarcContentBlock> - Class in com.github.bottomlessarchive.warc.service.record.domain
-
Basic constituent of a WARC file.
- WarcRecord() - Constructor for class com.github.bottomlessarchive.warc.service.record.domain.WarcRecord
- WarcRecordFactory - Class in com.github.bottomlessarchive.warc.service.record
- WarcRecordFactory() - Constructor for class com.github.bottomlessarchive.warc.service.record.WarcRecordFactory
- WarcRecordIterator<T extends WarcContentBlock> - Class in com.github.bottomlessarchive.warc.service
- WarcRecordIterator() - Constructor for class com.github.bottomlessarchive.warc.service.WarcRecordIterator
- WarcRecordIteratorFactory - Class in com.github.bottomlessarchive.warc.service
- WarcRecordStreamFactory - Class in com.github.bottomlessarchive.warc.service
- WarcRecordType - Enum in com.github.bottomlessarchive.warc.service.record.domain
-
Describes the various types of WARC records.