Dear all,
I am writing a Flink program (a recommender system) that needs the rating
matrix as state. Because the matrix is very sparse, I implemented a sparse
binary matrix class that stores only the ones instead of the whole matrix, to
save memory, and I keep an instance of it in a ValueState. Unexpectedly, the
performance became terrible and the job is now very slow. Does anyone have a
suggestion for how to find out what the problem is?
My first implementation of the rating matrix state:
MapState<String, Map<String, Float>> ratingMatrix;
The second implementation (the slow one) of the rating matrix state:
ValueState<SparseBinaryMatrix> userItemRatingHistory;
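One difference between the two layouts that can be checked outside Flink is how much data has to be serialized to touch a single user's ratings. With a MapState, each map entry is a separate state entry; with a ValueState, the whole matrix is one value. If the job uses a state backend that serializes state on every access (for example the RocksDB backend; with the heap backend this does not apply), the whole-object layout round-trips far more bytes per update. The sketch below is only an illustration under that assumption: it uses plain Java serialization as a stand-in for the real serializer, and the class StateSizeSketch, its methods, and the "user"/"item" keys are made up for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;

// Hypothetical, Flink-free illustration. Plain Java serialization is used
// here only as a stand-in for whatever serializer the job really uses.
final class StateSizeSketch {

    // Returns the size in bytes of an object after Java serialization.
    static int serializedSize(Serializable value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.size();
    }

    // Builds a small rating matrix: user -> (item -> rating).
    static HashMap<String, HashMap<String, Float>> buildMatrix(int users, int ratingsPerUser) {
        HashMap<String, HashMap<String, Float>> matrix = new HashMap<>();
        for (int u = 0; u < users; u++) {
            HashMap<String, Float> row = new HashMap<>();
            for (int i = 0; i < ratingsPerUser; i++) {
                row.put("item" + i, 1.0f);
            }
            matrix.put("user" + u, row);
        }
        return matrix;
    }

    public static void main(String[] args) throws IOException {
        HashMap<String, HashMap<String, Float>> matrix = buildMatrix(1000, 5);
        // Roughly what a "whole matrix in one value" layout serializes per access.
        int whole = serializedSize(matrix);
        // Roughly what a "one entry per user" layout serializes per access.
        int oneRow = serializedSize(matrix.get("user0"));
        System.out.println("whole matrix: " + whole + " bytes, one user: " + oneRow + " bytes");
    }
}
```

The absolute numbers are not meaningful for a real job, since Flink would use its own serializers (or Kryo as a fallback for types it cannot analyze); only the ratio between the two sizes is the point of the comparison.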
Here is part of the SparseBinaryMatrix class:
import java.io.Serializable;
import java.util.ArrayList;
import java.util.HashMap;

public class SparseBinaryMatrix implements Serializable {
    private ArrayList<Row> content;
    private int rowLength;
    // Label <-> index lookups for columns and rows
    private HashMap<String, Integer> columnLabels;
    private HashMap<Integer, String> inverseColumnLabels;
    private HashMap<String, Integer> rowLabels;
    private HashMap<Integer, String> inverseRowLabels;
    private enum LabelerType { Row, Column }
    public Integer colNumber;
    public Integer rowNumber;

    // This constructor initializes the matrix with zeros.
    // Rows are not pre-allocated: the loop below is commented out.
    public SparseBinaryMatrix(int rows, int columns) {
        content = new ArrayList<>(rows);
        rowLength = columns;
        // for (int i = 0; i < rows; i++)
        //     content.add(new Row(columns));
    }
    // ... (rest of the class omitted)
}
Could the dependency on another class (Row) be what causes this terrible
performance? Row is a class I implemented myself, and this is part of it:
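import java.io.Serializable;
import java.util.ArrayList;

public class Row implements Serializable {
    // This is an alternating sorted array
    private ArrayList<Integer> content;
    private int length = 0;

    public Row(int numbColumns) {
        length = numbColumns;
        // content was never initialized in this constructor; create it
        // before any column is touched.
        content = new ArrayList<>();
        for (int i = 0; i < numbColumns; i++)
            setColumnToZero(i);
    }

    public Row(int[] initialValues) {
        length = initialValues.length;
        content = new ArrayList<>(length);
        for (int i = 0; i < length; i++)
            setColumn(i, initialValues[i]);
    }
    // ... (rest of the class omitted, including setColumn and setColumnToZero)
}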
Regards,
Heidy