In the past, I used Apache POI to parse Excel, but I don’t like dealing with the model conversions and numeric type conversions by myself, and the POI uses too much memory when the Excel file is large. Later I heared about the EasyExcel which is an open source project created by Alibaba, so I decided to try it out.
Introduction from offical website:
A quick, easy and avoiding OOM Excel tool written in Java
EasyExcel is an easy-to-use, reading/writting Excel with low memory open source project. Support to handle bigger than 100MB of Excel while saving the memory as much as possible. GitHub: https://github.com/alibaba/easyExcel
<!-- The followings are optional, add if you need --> <dependency> <groupId>org.projectlombok</groupId> <artifactId>lombok</artifactId> <version>1.18.10</version> <scope>provided</scope> </dependency> <dependency> <groupId>javax.validation</groupId> <artifactId>validation-api</artifactId> <version>2.0.1.Final</version> </dependency>
The mininum demo for reading Excel
1. Data class DemoBizExcelRow.java, storing one row data of Excel
1 2 3 4 5 6
@Data// use Lombok, or add get()/set() publicclassDemoBizExcelRow{ private String string; private Date date; private Double doubleData; }
2. Custom data row listener class DemoDataListener.java, parsing data row by row
If you need the parsed results, you can receive them by passing them as arguments to the constructor as follows. If you want to save something into the DB in this listener class, please see the official example DemoDataListener.java
List<DemoBizExcelRow> list = new ArrayList<DemoBizExcelRow>();
publicDemoDataListener(){ }
/** * If you are using Spring, please use this constructor. * Every time you create Listener, it is needed to passing the class managed by Spring into the method. */ publicDemoDataListener(List<DemoBizExcelRow> list){ this.list = list; }
/** * This function is called every time while the every row data is parsed. * * @param data * one row value. It is same as {@link AnalysisContext#readRowHolder()} * @param context */ @Override publicvoidinvoke(DemoBizExcelRow data, AnalysisContext context){ list.add(data); }
/** * This is called after all the date parsing is done. * * @param context */ @Override publicvoiddoAfterAllAnalysed(AnalysisContext context){ } }
3. Use the above to read Excel
1 2 3 4 5 6 7 8 9 10 11
String file = "D:\\Demo.xlsx"; List<DemoExcelRow> list = new ArrayList<DemoExcelRow>();
It is easy to use, there are only two things you need to do: 1. define data class 2. custom listener class (for getting Excel data or operating dao, or whatever else)
Custom reading Excel
The above is good enough, but in my own actual usage, I still need other functions:
A table header may has multiple rows, and headers are different in many bussiness Excel files
Get the header info, including at least colume name, column order
Verify the colume name, so as to ensure the Excel meets the requirements, if not, return directly
Verify the cell data after it is parsed, e.g. required field can’t be blank, value for some column can’t exceed a range, and value for a particular column has to be converted based on ENUM
Get the sets of error colume in each row, however, it is default to ignore the remaining cells and skip to the next row when it encounters a certain error cell
1. The result class for returning
In the above example, only the valid data List<DemoExcelRow> can be returned, but now I want to get a) valid data b) error info c) table header info, so it is convenient to design a result class to solve this problem.
/** * table header map. * row index -> header data of one row */ Map<Integer, Map<Integer, CellData>> rowIdx2HeadMap;
/** * valid data map. * row index -> content data of one row */ Map<Integer, T> rowIdx2RowDataMap;
/** * error row/column index map. * row index -> error column index */ Map<Integer, Set<Integer>> rowIdx2ErrColIdxMap;
/** * Initialize fields, let them not be null */ publicReadExcelResult(){ this.rowIdx2HeadMap = new TreeMap<>(); this.rowIdx2RowDataMap = new TreeMap<>(); this.rowIdx2ErrColIdxMap = new TreeMap<>(); } }
@Data// use lombok, or add get()/set() publicclassDemoExcelRow{
/** * The number of header row. * row index starts from 1 */ @ExcelIgnore privatefinalstaticint HEAD_ROW_NUMBER = 5; /** * The number of column. * last_index==COLUMN_LAST_NUMBER-1 * column index starts from 0. */ @ExcelIgnore privatefinalstaticint COLUMN_LAST_NUMBER = 37; /** * The expected header. * <p>Mainly used for table legality checks. Only the necessary fields have to be verified, i.e., a subset of the actual header </p> * <p>If it is null, do not verify the header.</p> * <p>key是表头排序,即columnIndex,从0开始;</p> * <p>value是表头名,可以忽略前后空格,但必须包含中间空格和换行</p> */ @ExcelIgnore privatefinalstatic Map<Integer, String> HEAD_CHECK_MAP = new HashMap<>() { { put( 0, "索引号"); put( 1, "选取样本特征"); put( 2, "发函单位(客户)*"); } };
/*---- The following correspond to each field in the DB table ----*/
Field subject is a converter, we’ll talk about it in the next section
Field doubleData is limited to two decimal points by the annotation @NumberFormat("#.##"), specified at the 6th column by the annotation @ExcelProperty(index = 5)
Default all fields can find a matched column in Excel, but @ExcelIgnore shows the field is not in Excel
HEAD_ROW_NUMBER shows that the table header is on this row
Put any fields of header in HEAD_CHECK_MAP to check if the Excel is valid
3. Converter
Converter SubjectConverter.java will turn the text of some column into number through the enum, Eg. in this demo, convert “应收账款” to 16.
I only changed the content of convertToJavaData() in the following class, the rest is inherited and generated by the IDE automatically, so let’s ignore them for brevity.
/** * Convert Excel objects to Java objects * * @param cellData Excel cell data.NotNull. * @param contentProperty Content property.Nullable. * @param globalConfiguration Global configuration.NotNull. * @return Data to put into a Java object * @throws Exception Exception. */ @Override public String convertToJavaData(CellData cellData, ExcelContentProperty contentProperty, GlobalConfiguration globalConfiguration)throws Exception { String key = cellData.getStringValue(); Integer val = CONTACT_SUBJECT_MAP.get(key); if (val == null) { thrownew ParseException("fail to convert 会计科目: " + key, -1); } return val.toString(); }
}
4. Custom listener for parsing cell
ReadAllCellDataThrowExceptionLastListener.java is copied from the offical class ModelBuildEventListener.java, I just changed the code that stops parsing the rest of the current row when an error cell is encountered to read the rest of the row.
Only the modified codes are showed here, go ahead to ModelBuildEventListener.java to see the rest which is ignored.
ReadRowHolder readRowHolder = context.readRowHolder(); int rowIndex = readRowHolder.getRowIndex();
rowIdx2HeadMap.put(rowIndex, headMap);
/* * 表头合法性校验 */ if (headCheckMap != null && !headCheckMap.isEmpty()) { int headRowNumber = context.readSheetHolder().getHeadRowNumber(); if (headRowNumber == rowIndex + 1) { for (Integer key : headCheckMap.keySet()) { String expect = headCheckMap.get(key).trim(); CellData cell = headMap.get(key); if (null == cell) { //模板不符!退出 thrownew ExcelHeadException("表头与预期不符。未找到表头:" + expect); }
String real = cell.getStringValue(); real = (real==null? null : real.trim()); if (!expect.equalsIgnoreCase(real)) { //模板不符!退出 thrownew ExcelHeadException("表头与预期不符。期望:" + expect + " <--> 实际:" + real); } } } }
}
/** * When analysis one row trigger invoke function. * 自动跳过空行,即,空行不会进入这个函数 * * @param data one row value. Is is same as {@link AnalysisContext#readRowHolder()} * @param context - */ @Override publicvoidinvoke(T data, AnalysisContext context){
currentRowErrorColumnIndexSet = new TreeSet<>();
ReadRowHolder readRowHolder = context.readRowHolder(); int rowIndex = readRowHolder.getRowIndex();
/** * if have something to do after all analysis * * @param context - */ @Override publicvoiddoAfterAllAnalysed(AnalysisContext context){ readExcelResult.setRowIdx2HeadMap(rowIdx2HeadMap); readExcelResult.setRowIdx2ErrColIdxMap(rowIdx2ErrColIdxMap); readExcelResult.setRowIdx2RowDataMap(rowIdx2RowDataMap); }
/** * The current method is called when extra information is returned * * @param extra extra information * @param context - */ @Override publicvoidextra(CellExtra extra, AnalysisContext context){ }
/** * All listeners receive this method when any one Listener does an error report. If an exception is thrown here, the * entire read will terminate. * 获取其他异常下会调用本接口。抛出异常则停止读取。如果这里不抛出异常则 继续读取下一行。 * * @param exception - * @param context - */ @Override publicvoidonException(Exception exception, AnalysisContext context)throws Exception {
EasyExcel supports the listener for parsing cell by default, that is ModelBuildEventListener.java above mentioned, but if you need to use your custom listener, you should comment out the default useDefaultListener(false) first, then register your own registerReadListener(new ReadAllCellDataThrowExceptionLastListener())
In the case of the custom cell listener is used, we can’t pass in your custom listener for parsing row through EasyExcel.read(), only can do it through registering registerReadListener(), and there is an important point we have to register the custom cell listener first then regiser the custom row listener
Supplement
Format conversion
Date
1 2 3
@ExcelProperty("开始时间") @DateTimeFormat("yyyy-MM-dd") private Date startTime;