Class GroupDataRDD


  • public class GroupDataRDD
    extends java.lang.Object
    An RDD that holds Group-level data.
    Author:
    Anthony Bradley
    • Constructor Summary

      Constructors 
      Constructor Description
      GroupDataRDD(org.apache.spark.api.java.JavaPairRDD<java.lang.String,org.biojava.nbio.structure.Group> groupRdd)
      Constructs a GroupDataRDD from a JavaPairRDD of Group.
    • Method Summary

      Modifier and Type Method Description
      void cacheData()
      Cache the data so that it can be reused across multiple processing steps.
      java.util.Map<java.lang.String,java.lang.Long> countByGroupName()
      Count the number of times each group name appears.
      AtomDataRDD getAtoms()
      Get the atoms from the groups.
      org.apache.spark.api.java.JavaPairRDD<java.lang.String,org.biojava.nbio.structure.Group> getGroupRdd()
      Get the JavaPairRDD of Group data.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • GroupDataRDD

        public GroupDataRDD(org.apache.spark.api.java.JavaPairRDD<java.lang.String,org.biojava.nbio.structure.Group> groupRdd)
        Constructs a GroupDataRDD from a JavaPairRDD of Group.
        Parameters:
        groupRdd - the input JavaPairRDD of Group
    • Method Detail

      • cacheData

        public void cacheData()
        Cache the data so that it can be reused across multiple processing steps.
      • getGroupRdd

        public org.apache.spark.api.java.JavaPairRDD<java.lang.String,org.biojava.nbio.structure.Group> getGroupRdd()
        Get the JavaPairRDD of Group data.
        Returns:
        the JavaPairRDD of Group data
      • countByGroupName

        public java.util.Map<java.lang.String,java.lang.Long> countByGroupName()
        Count the number of times each group name appears.
        Returns:
        a map from each group name (e.g. LYS for lysine) to the number of times it appears in the RDD
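As a rough illustration of the counting that countByGroupName describes, the plain-Java sketch below tallies group names from an in-memory list of key/name pairs. It uses neither Spark nor BioJava and is not the library's implementation: the class GroupNameCountSketch, the helper entry, and the sample keys are hypothetical, and each Group is reduced to its name string.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class GroupNameCountSketch {

    // Hypothetical stand-in for one (key, Group) pair; the Group is reduced to its name.
    static Map.Entry<String, String> entry(String key, String groupName) {
        return new SimpleEntry<String, String>(key, groupName);
    }

    // Counts how often each group name occurs among the pairs.
    // A TreeMap is used only so that the result prints in a stable, sorted order.
    static Map<String, Long> countByGroupName(List<Map.Entry<String, String>> groups) {
        return groups.stream()
                .collect(Collectors.groupingBy(
                        e -> e.getValue(),
                        TreeMap::new,
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> groups = Arrays.asList(
                entry("key-1", "LYS"),
                entry("key-2", "ALA"),
                entry("key-3", "LYS"));
        System.out.println(countByGroupName(groups)); // prints {ALA=1, LYS=2}
    }
}
```

In this sketch the result is an ordinary java.util.Map held in memory, matching the return type documented above, so it suits inspection and reporting rather than further distributed processing.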
      • getAtoms

        public AtomDataRDD getAtoms()
        Get the atoms from the groups.
        Returns:
        the atoms for all the groups
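One plausible reading of "the atoms for all the groups" is a flattening step in which every group contributes its atoms to a single collection. The plain-Java sketch below shows that idea on an in-memory map, under the assumption that this is roughly what happens. It does not use Spark, BioJava, or AtomDataRDD: the class GroupAtomsSketch, the method flattenAtoms, and the sample keys and atom names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class GroupAtomsSketch {

    // Hypothetical helper: merges the per-group atom names into one list,
    // following the iteration order of the input map.
    static List<String> flattenAtoms(Map<String, List<String>> atomsByGroup) {
        List<String> atoms = new ArrayList<>();
        for (List<String> groupAtoms : atomsByGroup.values()) {
            atoms.addAll(groupAtoms);
        }
        return atoms;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the output order is predictable.
        Map<String, List<String>> atomsByGroup = new LinkedHashMap<>();
        atomsByGroup.put("group-1", Arrays.asList("N", "CA", "C", "O"));
        atomsByGroup.put("group-2", Arrays.asList("N", "CA"));
        System.out.println(flattenAtoms(atomsByGroup)); // prints [N, CA, C, O, N, CA]
    }
}
```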