Class ReadDocumentsFromPdfChainBase<I>
java.lang.Object
com.github.hakenadu.javalangchains.chains.data.reader.ReadDocumentsFromPdfChainBase<I>
- Type Parameters:
I - input type to read PDFs from
- Direct Known Subclasses:
ReadDocumentsFromInMemoryPdfChain, ReadDocumentsFromPdfChain
public abstract class ReadDocumentsFromPdfChainBase<I> extends Object implements Chain<I,Stream<Map<String,String>>>
Provides base functionality for all PDF-reading chains.
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
protected class | ReadDocumentsFromPdfChainBase.PdDocumentWrapper | (PDDocument, PDF-Name) pair
static class | ReadDocumentsFromPdfChainBase.PdfReadMode | this enum is used to configure how each PDF's content is read into a string
-
Constructor Summary
Constructors
Modifier | Constructor | Description
protected | ReadDocumentsFromPdfChainBase(ReadDocumentsFromPdfChainBase.PdfReadMode readMode, boolean parallel) | creates a ReadDocumentsFromPdfChainBase
-
Method Summary
-
Constructor Details
-
ReadDocumentsFromPdfChainBase
protected ReadDocumentsFromPdfChainBase(ReadDocumentsFromPdfChainBase.PdfReadMode readMode, boolean parallel)
creates a ReadDocumentsFromPdfChainBase
- Parameters:
readMode - readMode
parallel - parallel
-
-
Method Details
-
loadPdDocuments
protected abstract Stream<ReadDocumentsFromPdfChainBase.PdDocumentWrapper> loadPdDocuments(I input) throws IOException
Loads the PDF documents from an input instance.
- Parameters:
input - input instance
- Returns:
a stream of PdDocumentWrapper instances, each a (PDDocument, PDF-Name) pair
- Throws:
IOException - on error loading a PDF
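The split between the abstract loadPdDocuments hook and the inherited run method can be illustrated with a small, dependency-free sketch. Everything in it is a hypothetical stand-in, not the library's code: SketchChain replaces Chain, SketchDocumentWrapper replaces PdDocumentWrapper (with a plain String in place of a PDDocument), and the "source" and "content" map keys are assumed for illustration. The read mode is left out, and the way the parallel flag is applied is only a guess at what such a flag might do.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical, simplified stand-in for the library's Chain<I, O> interface.
interface SketchChain<I, O> {
    O run(I input);
}

// Stand-in for PdDocumentWrapper: pairs a "document" with its name.
// A plain String replaces PDDocument so the sketch needs no dependencies.
final class SketchDocumentWrapper {
    final String text;
    final String name;

    SketchDocumentWrapper(String text, String name) {
        this.text = text;
        this.name = name;
    }
}

// Stand-in for the abstract base: run() is shared, loading is left to subclasses.
abstract class SketchReadChainBase<I> implements SketchChain<I, Stream<Map<String, String>>> {
    private final boolean parallel;

    protected SketchReadChainBase(boolean parallel) {
        this.parallel = parallel;
    }

    // Subclasses decide how an input of type I becomes wrapped documents.
    protected abstract Stream<SketchDocumentWrapper> loadDocuments(I input) throws IOException;

    @Override
    public Stream<Map<String, String>> run(I input) {
        final Stream<SketchDocumentWrapper> loaded;
        try {
            loaded = loadDocuments(input);
        } catch (IOException e) {
            // Chain.run declares no checked exception in this sketch, so wrap it.
            throw new UncheckedIOException(e);
        }
        // Assumption: the flag simply switches the stream to parallel processing.
        Stream<SketchDocumentWrapper> docs = parallel ? loaded.parallel() : loaded;
        return docs.map(d -> {
            Map<String, String> doc = new LinkedHashMap<>();
            doc.put("source", d.name);   // key names are assumptions
            doc.put("content", d.text);
            return doc;
        });
    }
}

// Hypothetical subclass: the input is an in-memory map of name -> text.
final class SketchInMemoryChain extends SketchReadChainBase<Map<String, String>> {
    SketchInMemoryChain(boolean parallel) {
        super(parallel);
    }

    @Override
    protected Stream<SketchDocumentWrapper> loadDocuments(Map<String, String> input) {
        return input.entrySet().stream()
                .map(e -> new SketchDocumentWrapper(e.getValue(), e.getKey()));
    }
}

class SketchDemo {
    public static void main(String[] args) {
        List<Map<String, String>> docs = new SketchInMemoryChain(false)
                .run(Map.of("a.pdf", "hello"))
                .collect(Collectors.toList());
        System.out.println(docs);
    }
}
```

In this arrangement a subclass only has to describe how its input type is turned into wrapped documents; the conversion into a stream of string maps stays in one place in the base class.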
-
run
Description copied from interface: Chain
Execute this Chain
-