Class JtokkitTextSplitter

java.lang.Object
com.github.hakenadu.javalangchains.chains.qa.split.MaxLengthBasedTextSplitter
com.github.hakenadu.javalangchains.chains.qa.split.JtokkitTextSplitter
All Implemented Interfaces:
TextSplitter

public final class JtokkitTextSplitter
extends MaxLengthBasedTextSplitter
This TextSplitter splits documents into chunks based on their token count. Tokens are counted using jtokkit.
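
As a rough illustration of what splitting by token count means, the following sketch greedily packs sentences into chunks that stay within a token limit. It is not the library's implementation: the class TokenLimitSplitSketch, its methods, and the whitespace-based countWords counter are hypothetical stand-ins (jtokkit would count real tokens with an Encoding), and the sentences are assumed to be already separated.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

// Hypothetical sketch, not part of java-langchains or jtokkit.
final class TokenLimitSplitSketch {

    // Assumption: a stand-in "tokenizer" that counts whitespace-separated words.
    // A real tokenizer such as jtokkit would usually report different counts.
    static int countWords(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    // Greedily appends sentences to the current chunk and starts a new chunk
    // as soon as adding the next sentence would exceed maxTokens.
    // A single sentence longer than maxTokens is kept as its own oversized chunk.
    static List<String> split(List<String> sentences, int maxTokens,
                              ToIntFunction<String> tokenCounter) {
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String sentence : sentences) {
            String candidate = current.length() == 0
                    ? sentence
                    : current + " " + sentence;
            if (current.length() > 0 && tokenCounter.applyAsInt(candidate) > maxTokens) {
                // The candidate is too long: close the current chunk, start a new one.
                chunks.add(current.toString());
                current = new StringBuilder(sentence);
            } else {
                current = new StringBuilder(candidate);
            }
        }
        if (current.length() > 0) {
            chunks.add(current.toString());
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> sentences = List.of(
                "One two three.", "Four five.", "Six seven eight nine.");
        for (String chunk : split(sentences, 5, TokenLimitSplitSketch::countWords)) {
            System.out.println(chunk);
        }
    }
}
```

With the actual class, the counting is done by the jtokkit Encoding passed to the constructor, for example new JtokkitTextSplitter(encoding, 500), where 500 is an arbitrary example limit.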
  • Constructor Details

    • JtokkitTextSplitter

      public JtokkitTextSplitter(com.knuddels.jtokkit.api.Encoding encoding, int maxTokens, TextStreamer textStreamer)
      Creates an instance of JtokkitTextSplitter.
      Parameters:
      encoding - the jtokkit Encoding used to count tokens
      maxTokens - maximum number of tokens for each chunk
      textStreamer - the TextStreamer used for streaming the base text
    • JtokkitTextSplitter

      public JtokkitTextSplitter(com.knuddels.jtokkit.api.Encoding encoding, int maxTokens)
      Creates an instance of JtokkitTextSplitter with sentence-based text streaming.
      Parameters:
      encoding - the jtokkit Encoding used to count tokens
      maxTokens - maximum number of tokens for each chunk
  • Method Details