Class WorkSheet


  • public class WorkSheet
    extends Object
    Handles very large spreadsheets of expression data, so the memory footprint is kept low.
    Author:
    Scooter Willis
    • Method Detail

      • clear

        public void clear()
        Attempts to free up memory held by the worksheet.
      • randomlyDivideSave

        public void randomlyDivideSave​(double percentage,
                                       String fileName1,
                                       String fileName2)
                                throws Exception
        Split a worksheet randomly. Used for creating a discovery/validation data set. The first file will match the percentage and the second file will hold the remainder.
        Parameters:
        percentage -
        fileName1 -
        fileName2 -
        Throws:
        Exception
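        The split described above can be illustrated with a small stand-alone sketch. It is not the WorkSheet implementation: the class RandomSplitSketch and its split method are hypothetical, and it assumes that rows are what gets divided and that percentage is a fraction between 0 and 1, neither of which is stated in this documentation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical helper for illustration; not part of WorkSheet. */
final class RandomSplitSketch {

    /**
     * Shuffles a copy of the row names and cuts it at the given fraction.
     * Index 0 of the result holds the "percentage" part, index 1 the remainder.
     * Assumption: percentage is a fraction in [0, 1].
     */
    static List<List<String>> split(List<String> rows, double percentage, long seed) {
        List<String> shuffled = new ArrayList<>(rows);
        Collections.shuffle(shuffled, new Random(seed));
        int cut = (int) Math.round(shuffled.size() * percentage);
        cut = Math.max(0, Math.min(shuffled.size(), cut)); // keep the cut inside the list
        List<List<String>> parts = new ArrayList<>();
        parts.add(new ArrayList<>(shuffled.subList(0, cut)));
        parts.add(new ArrayList<>(shuffled.subList(cut, shuffled.size())));
        return parts;
    }

    public static void main(String[] args) {
        List<String> rows = List.of("r1", "r2", "r3", "r4", "r5", "r6", "r7", "r8", "r9", "r10");
        List<List<String>> parts = split(rows, 0.7, 42L);
        System.out.println("discovery:  " + parts.get(0));
        System.out.println("validation: " + parts.get(1));
    }
}
```

        Passing a fixed seed keeps the split reproducible, which is convenient when the discovery and validation sets need to be regenerated later.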
      • getCopyWorkSheetSelectedRows

        public static WorkSheet getCopyWorkSheetSelectedRows​(WorkSheet copyWorkSheet,
                                                             ArrayList<String> rows)
                                                      throws Exception
        Create a copy of a worksheet restricted to the selected rows. Useful when shuffling columns or rows for testing, as a way to keep a duplicate of the original worksheet.
        Parameters:
        copyWorkSheet -
        rows -
        Returns:
        Throws:
        Exception
      • getCopyWorkSheet

        public static WorkSheet getCopyWorkSheet​(WorkSheet copyWorkSheet)
                                          throws Exception
        Create a copy of a worksheet. Useful when shuffling columns or rows for testing, as a way to keep a duplicate of the original worksheet.
        Parameters:
        copyWorkSheet -
        Returns:
        Throws:
        Exception
      • getMetaDataColumns

        public ArrayList<String> getMetaDataColumns()
        Returns:
      • shuffleColumnsAndThenRows

        public void shuffleColumnsAndThenRows​(ArrayList<String> columns,
                                              ArrayList<String> rows)
                                       throws Exception
        Randomly shuffle the columns and rows. The shuffled values should be constrained to the same data type; otherwise the result probably doesn't make any sense.
        Parameters:
        columns -
        rows -
        Throws:
        Exception
      • shuffleColumnValues

        public void shuffleColumnValues​(ArrayList<String> columns)
                                 throws Exception
        Shuffle column values to allow for randomized testing. The columns in the list will be shuffled together.
        Parameters:
        columns -
        Throws:
        Exception
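        One reading of "shuffled together" is that a single random permutation is applied to every listed column, so values that shared a row before the shuffle still share a row afterwards. The sketch below illustrates that reading on a plain map of column name to values. ShuffleTogetherSketch and shuffleTogether are hypothetical names, and this is an assumption about the behaviour rather than the WorkSheet implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Hypothetical helper for illustration; not part of WorkSheet. */
final class ShuffleTogetherSketch {

    /** Reorders the values of the listed columns using one shared random permutation. */
    static void shuffleTogether(Map<String, List<String>> table, List<String> columns, Random rnd) {
        if (columns.isEmpty()) {
            return;
        }
        int n = table.get(columns.get(0)).size();
        // Build one permutation of the row indices and reuse it for every listed column.
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            order.add(i);
        }
        Collections.shuffle(order, rnd);
        for (String column : columns) {
            List<String> old = table.get(column);
            List<String> moved = new ArrayList<>(n);
            for (int index : order) {
                moved.add(old.get(index));
            }
            table.put(column, moved);
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> table = new LinkedHashMap<>();
        table.put("age", new ArrayList<>(List.of("31", "45", "52")));
        table.put("stage", new ArrayList<>(List.of("I", "II", "III")));
        table.put("gene1", new ArrayList<>(List.of("0.4", "1.2", "3.3")));
        shuffleTogether(table, List.of("age", "stage"), new Random(7L));
        System.out.println(table); // age/stage pairs stay aligned; gene1 is untouched
    }
}
```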
      • shuffleRowValues

        public void shuffleRowValues​(ArrayList<String> rows)
                              throws Exception
        Shuffle row values to allow for randomized testing. The rows in the list will be shuffled together.
        Parameters:
        rows -
        Throws:
        Exception
      • hideMetaDataColumns

        public void hideMetaDataColumns​(boolean value)
        Parameters:
        value -
      • hideMetaDataRows

        public void hideMetaDataRows​(boolean value)
        Parameters:
        value -
      • setMetaDataRowsAfterRow

        public void setMetaDataRowsAfterRow()
      • setMetaDataColumnsAfterColumn

        public void setMetaDataColumnsAfterColumn()
      • setMetaDataRowsAfterRow

        public void setMetaDataRowsAfterRow​(String row)
        Parameters:
        row -
      • setMetaDataColumnsAfterColumn

        public void setMetaDataColumnsAfterColumn​(String column)
        Parameters:
        column -
      • setMetaDataColumns

        public void setMetaDataColumns​(ArrayList<String> metaDataColumns)
        Clears existing meta data columns and sets new ones
        Parameters:
        metaDataColumns -
      • markMetaDataColumns

        public void markMetaDataColumns​(ArrayList<String> metaDataColumns)
        Marks columns as containing meta data
        Parameters:
        metaDataColumns -
      • markMetaDataColumn

        public void markMetaDataColumn​(String column)
        Parameters:
        column -
      • isMetaDataColumn

        public boolean isMetaDataColumn​(String column)
        Parameters:
        column -
        Returns:
      • isMetaDataRow

        public boolean isMetaDataRow​(String row)
        Parameters:
        row -
        Returns:
      • markMetaDataRow

        public void markMetaDataRow​(String row)
        Parameters:
        row -
      • setMetaDataRows

        public void setMetaDataRows​(ArrayList<String> metaDataRows)
        Parameters:
        metaDataRows -
      • hideRow

        public void hideRow​(String row,
                            boolean hide)
        Parameters:
        row -
        hide -
      • hideColumn

        public void hideColumn​(String column,
                               boolean hide)
        Parameters:
        column -
        hide -
      • replaceColumnValues

        public void replaceColumnValues​(String column,
                                        HashMap<String,​String> values)
                                 throws Exception
        Replace values in a column using a lookup map, for example where 0 maps to one label and 1 maps to a different label
        Parameters:
        column -
        values -
        Throws:
        Exception
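        The lookup-style replacement can be sketched on a plain list of column values. ReplaceValuesSketch and replace are hypothetical names, and the choice to leave values that have no map entry unchanged is an assumption; this documentation does not say what WorkSheet does in that case.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not part of WorkSheet. */
final class ReplaceValuesSketch {

    /**
     * Returns a new list in which every value found as a key in the map
     * is replaced by the mapped value.
     * Assumption: values without a map entry are kept as they are.
     */
    static List<String> replace(List<String> columnValues, Map<String, String> values) {
        List<String> replaced = new ArrayList<>(columnValues.size());
        for (String value : columnValues) {
            replaced.add(values.getOrDefault(value, value));
        }
        return replaced;
    }

    public static void main(String[] args) {
        Map<String, String> values = new HashMap<>();
        values.put("0", "censored");
        values.put("1", "event");
        System.out.println(replace(List.of("0", "1", "1", "NA"), values));
        // prints [censored, event, event, NA]
    }
}
```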
      • applyColumnFilter

        public void applyColumnFilter​(String column,
                                      ChangeValue changeValue)
                               throws Exception
        Apply a filter to a column to change values, for example from numeric to nominal based on some range
        Parameters:
        column -
        changeValue -
        Throws:
        Exception
      • addColumn

        public void addColumn​(String column,
                              String defaultValue)
        Parameters:
        column -
        defaultValue -
      • addColumns

        public void addColumns​(ArrayList<String> columns,
                               String defaultValue)
        Add columns to worksheet and set default value
        Parameters:
        columns -
        defaultValue -
      • addRow

        public void addRow​(String row,
                           String defaultValue)
        Parameters:
        row -
        defaultValue -
      • addRows

        public void addRows​(ArrayList<String> rows,
                            String defaultValue)
        Add rows to the worksheet and fill in default value
        Parameters:
        rows -
        defaultValue -
      • isValidRow

        public boolean isValidRow​(String row)
        Parameters:
        row -
        Returns:
      • isValidColumn

        public boolean isValidColumn​(String col)
        Parameters:
        col -
        Returns:
      • setCacheDoubleValues

        public void setCacheDoubleValues​(boolean value)
        Parameters:
        value -
      • changeRowHeader

        public void changeRowHeader​(ChangeValue changeValue)
        Parameters:
        changeValue -
      • changeColumnHeader

        public void changeColumnHeader​(ChangeValue changeValue)
        Parameters:
        changeValue -
      • changeColumnsHeaders

        public void changeColumnsHeaders​(LinkedHashMap<String,​String> newColumnValues)
                                  throws Exception
        Rename columns: each column named by a key in the map is renamed to the corresponding value
        Parameters:
        newColumnValues -
        Throws:
        Exception
      • getRandomDataColumns

        public ArrayList<String> getRandomDataColumns​(int number)
        Parameters:
        number -
        Returns:
      • getRandomDataColumns

        public ArrayList<String> getRandomDataColumns​(int number,
                                                      ArrayList<String> columns)
        Parameters:
        number -
        columns -
        Returns:
      • getAllColumns

        public ArrayList<String> getAllColumns()
        Get the list of column names including those that may be hidden
        Returns:
      • getColumns

        public ArrayList<String> getColumns()
        Get the list of column names. Does not include hidden columns
        Returns:
      • getDiscreteColumnValues

        public ArrayList<String> getDiscreteColumnValues​(String column)
                                                  throws Exception
        Get back a list of unique values in the column
        Parameters:
        column -
        Returns:
        Throws:
        Exception
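        Collecting the unique values of a column can be sketched with a LinkedHashSet. DiscreteValuesSketch and distinct are hypothetical names, and returning the values in order of first appearance is an assumption; the ordering used by WorkSheet is not documented here.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

/** Hypothetical helper for illustration; not part of WorkSheet. */
final class DiscreteValuesSketch {

    /** Returns each distinct value once, in order of first appearance. */
    static List<String> distinct(List<String> columnValues) {
        return new ArrayList<>(new LinkedHashSet<>(columnValues));
    }

    public static void main(String[] args) {
        System.out.println(distinct(List.of("low", "high", "low", "medium", "high")));
        // prints [low, high, medium]
    }
}
```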
      • getAllRows

        public ArrayList<String> getAllRows()
        Get all rows including those that may be hidden
        Returns:
      • getRows

        public ArrayList<String> getRows()
        Get the list of row names. Will exclude hidden values
        Returns:
      • getDataRows

        public ArrayList<String> getDataRows()
        Get the list of row names
        Returns:
      • getLogScale

        public WorkSheet getLogScale​(double base)
                              throws Exception
        Get the log scale of this worksheet. A zero value is set to 0.1 before the transform because log(0) is undefined.
        Parameters:
        base -
        Returns:
        Throws:
        Exception
      • getLogScale

        public WorkSheet getLogScale​(double base,
                                     double zeroValue)
                              throws Exception
        Get the log scale of this worksheet
        Parameters:
        base -
        zeroValue -
        Returns:
        Throws:
        Exception
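        The per-cell arithmetic behind a log-scale transform with a zero substitute can be sketched as follows. LogScaleSketch and logScale are hypothetical names; the sketch assumes that zeroValue is the number substituted for an exact zero before taking the logarithm, which matches the 0.1 default described for the one-argument overload but is not confirmed for the two-argument one.

```java
/** Hypothetical helper for illustration; not part of WorkSheet. */
final class LogScaleSketch {

    /**
     * Computes log_base(value) using the change-of-base rule
     * log_b(x) = ln(x) / ln(b).
     * Assumption: an exact zero is replaced by zeroValue first,
     * because log(0) is undefined.
     */
    static double logScale(double value, double base, double zeroValue) {
        double safe = (value == 0.0) ? zeroValue : value;
        return Math.log(safe) / Math.log(base);
    }

    public static void main(String[] args) {
        System.out.println(logScale(8.0, 2.0, 0.1));  // close to 3
        System.out.println(logScale(0.0, 10.0, 0.1)); // close to -1, since 0 is replaced by 0.1
    }
}
```

        The substitute value matters: a smaller zeroValue pushes former zeros further below the rest of the data on the log scale.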
      • swapRowAndColumns

        public WorkSheet swapRowAndColumns()
                                    throws Exception
        Swap the row and columns returning a new worksheet
        Returns:
        Throws:
        Exception
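        Swapping rows and columns amounts to a transpose that writes into a new grid and leaves the input alone. The sketch below shows this on a plain String[][]; TransposeSketch and transpose are hypothetical names and the code is not taken from WorkSheet.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; not part of WorkSheet. */
final class TransposeSketch {

    /** Returns a new grid in which cell [r][c] of the input appears at [c][r]. */
    static String[][] transpose(String[][] cells) {
        int rows = cells.length;
        int cols = (rows == 0) ? 0 : cells[0].length;
        String[][] swapped = new String[cols][rows];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                swapped[c][r] = cells[r][c];
            }
        }
        return swapped;
    }

    public static void main(String[] args) {
        String[][] cells = {
            {"1", "2", "3"},
            {"4", "5", "6"}
        };
        System.out.println(Arrays.deepToString(transpose(cells)));
        // prints [[1, 4], [2, 5], [3, 6]]
    }
}
```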
      • unionWorkSheetsRowJoin

        public static WorkSheet unionWorkSheetsRowJoin​(String w1FileName,
                                                       String w2FileName,
                                                       char delimitter,
                                                       boolean secondSheetMetaData)
                                                throws Exception
        Combine two worksheets by joining on rows. Rows that are found in one but not the other are removed. If the second sheet is meta data then a meta data column will be added between the two joined columns.
        Parameters:
        w1FileName -
        w2FileName -
        delimitter -
        secondSheetMetaData -
        Returns:
        Throws:
        Exception
      • unionWorkSheetsRowJoin

        public static WorkSheet unionWorkSheetsRowJoin​(WorkSheet w1,
                                                       WorkSheet w2,
                                                       boolean secondSheetMetaData)
                                                throws Exception
        Combine two worksheets by joining on rows. Rows that are found in one but not the other are removed. If the second sheet is meta data then a meta data column will be added between the two joined columns.
        Parameters:
        w1 -
        w2 -
        secondSheetMetaData -
        Returns:
        Throws:
        Exception
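        The row join described above behaves like an inner join keyed on row name. A minimal sketch on plain maps of row name to cell values follows. RowJoinSketch and joinOnRows are hypothetical names, and the sketch leaves out the meta data column handling controlled by secondSheetMetaData.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not part of WorkSheet. */
final class RowJoinSketch {

    /**
     * Keeps only the rows present in both inputs and concatenates their values,
     * first sheet on the left, second sheet on the right.
     */
    static Map<String, List<String>> joinOnRows(Map<String, List<String>> w1,
                                                Map<String, List<String>> w2) {
        Map<String, List<String>> joined = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : w1.entrySet()) {
            List<String> other = w2.get(entry.getKey());
            if (other == null) {
                continue; // row is missing from the second sheet, so it is dropped
            }
            List<String> combined = new ArrayList<>(entry.getValue());
            combined.addAll(other);
            joined.put(entry.getKey(), combined);
        }
        return joined; // rows found only in w2 were never visited, so they are dropped too
    }

    public static void main(String[] args) {
        Map<String, List<String>> w1 = new LinkedHashMap<>();
        w1.put("sample1", List.of("0.5", "1.1"));
        w1.put("sample2", List.of("0.7", "0.9"));
        Map<String, List<String>> w2 = new LinkedHashMap<>();
        w2.put("sample2", List.of("female"));
        w2.put("sample3", List.of("male"));
        System.out.println(joinOnRows(w1, w2));
        // prints {sample2=[0.7, 0.9, female]}
    }
}
```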
      • readCSV

        public static WorkSheet readCSV​(String fileName,
                                        char delimiter)
                                 throws Exception
        Read a CSV/Tab delimited file where you pass in the delimiter
        Parameters:
        fileName -
        delimiter -
        Returns:
        Throws:
        Exception
      • readCSV

        public static WorkSheet readCSV​(InputStream is,
                                        char delimiter)
                                 throws Exception
        Read a CSV/Tab delimited file where you pass in the delimiter
        Parameters:
        is -
        delimiter -
        Returns:
        Throws:
        Exception
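        Reading delimited text from a stream with a caller-supplied delimiter can be sketched with the standard library alone. DelimitedReadSketch and read are hypothetical names. The sketch splits naively on the delimiter, does not handle quoted fields, and assumes UTF-8 input; how WorkSheet.readCSV treats quoting, encoding, and header rows is not documented here.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of WorkSheet. */
final class DelimitedReadSketch {

    /** Splits every non-empty line on the delimiter, keeping trailing empty cells. */
    static List<String[]> read(InputStream is, char delimiter) {
        // Quote the delimiter so characters such as '|' are not treated as regex syntax.
        String separator = Pattern.quote(String.valueOf(delimiter));
        List<String[]> lines = new ArrayList<>();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(is, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    continue;
                }
                lines.add(line.split(separator, -1)); // -1 keeps trailing empty cells
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        byte[] data = "id\tgene1\tgene2\ns1\t0.5\t\n".getBytes(StandardCharsets.UTF_8);
        for (String[] cells : read(new ByteArrayInputStream(data), '\t')) {
            System.out.println(Arrays.toString(cells));
        }
    }
}
```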
      • saveCSV

        public void saveCSV​(String fileName)
                     throws Exception
        Save the worksheet as a CSV file
        Parameters:
        fileName -
        Throws:
        Exception
      • setRowHeader

        public void setRowHeader​(String value)
        Parameters:
        value -
      • appendWorkSheetColumns

        public void appendWorkSheetColumns​(WorkSheet worksheet)
                                    throws Exception
        Add columns from a second worksheet, joined by common row. If the appended worksheet doesn't contain a row that is in the master worksheet, then a default value of "" is used. Rows in the appended worksheet that are not found in the master worksheet are not added.
        Parameters:
        worksheet -
        Throws:
        Exception
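        Unlike the row join, this append keeps every row of the master worksheet, which makes it resemble a left join. The sketch below shows the fill-with-"" behaviour on plain maps of row name to cell values. AppendColumnsSketch, appendColumns, and the appendedColumnCount parameter are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not part of WorkSheet. */
final class AppendColumnsSketch {

    /**
     * Extends every master row with the cells of the matching appended row.
     * Master rows without a match receive "" in each new column; appended rows
     * that are not in the master are ignored.
     */
    static void appendColumns(Map<String, List<String>> master,
                              Map<String, List<String>> appended,
                              int appendedColumnCount) {
        for (Map.Entry<String, List<String>> entry : master.entrySet()) {
            List<String> extra = appended.get(entry.getKey());
            for (int c = 0; c < appendedColumnCount; c++) {
                entry.getValue().add(extra == null ? "" : extra.get(c));
            }
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> master = new LinkedHashMap<>();
        master.put("s1", new ArrayList<>(List.of("0.5")));
        master.put("s2", new ArrayList<>(List.of("0.7")));
        Map<String, List<String>> appended = new LinkedHashMap<>();
        appended.put("s2", List.of("female"));
        appended.put("s3", List.of("male"));
        appendColumns(master, appended, 1);
        System.out.println(master);
        // prints {s1=[0.5, ], s2=[0.7, female]}
    }
}
```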
      • appendWorkSheetRows

        public void appendWorkSheetRows​(WorkSheet worksheet)
                                 throws Exception
        Add rows from a second worksheet, joined by common column. If the appended worksheet doesn't contain a column that is in the master worksheet, then a default value of "" is used. Columns in the appended worksheet that are not found in the master worksheet are not added.
        Parameters:
        worksheet -
        Throws:
        Exception
      • save

        public void save​(OutputStream outputStream,
                         char delimitter,
                         boolean quoteit)
                  throws Exception
        Parameters:
        outputStream -
        delimitter -
        quoteit -
        Throws:
        Exception
      • getIndexColumnName

        public String getIndexColumnName()
        Returns:
        the indexColumnName
      • setIndexColumnName

        public void setIndexColumnName​(String indexColumnName)
        Parameters:
        indexColumnName - the indexColumnName to set
      • getMetaDataColumnsHashMap

        public LinkedHashMap<String,​String> getMetaDataColumnsHashMap()
        Returns:
        the metaDataColumnsHashMap
      • getRowHeader

        public String getRowHeader()
        Returns:
        the rowHeader