0%

Java | Read Excel using EasyExcel

What is EasyExcel

In the past, I used Apache POI to parse Excel, but I don’t like dealing with the model conversions and numeric type conversions by myself, and the POI uses too much memory when the Excel file is large. Later I heared about the EasyExcel which is an open source project created by Alibaba, so I decided to try it out.

Introduction from offical website:

A quick, easy and avoiding OOM Excel tool written in Java

EasyExcel is an easy-to-use, reading/writting Excel with low memory open source project. Support to handle bigger than 100MB of Excel while saving the memory as much as possible. GitHub: https://github.com/alibaba/easyExcel

Import dependency

A maven pom demo as below:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
<properties>
<jdk.version>11</jdk.version>
</properties>

<dependency>
<groupId>com.alibaba</groupId>
<artifactId>easyExcel</artifactId>
<version>2.2.7</version>
</dependency>

<!-- The followings are optional, add if you need -->
<dependency>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
<version>1.18.10</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>javax.validation</groupId>
<artifactId>validation-api</artifactId>
<version>2.0.1.Final</version>
</dependency>

The mininum demo for reading Excel

1. Data class DemoBizExcelRow.java, storing one row data of Excel

1
2
3
4
5
6
@Data // use Lombok, or add get()/set()
public class DemoBizExcelRow {
private String string;
private Date date;
private Double doubleData;
}

2. Custom data row listener class DemoDataListener.java, parsing data row by row

If you need the parsed results, you can receive them by passing them as arguments to the constructor as follows.
If you want to save something into the DB in this listener class, please see the official example DemoDataListener.java

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
public class DemoDataListener extends AnalysisEventListener<DemoBizExcelRow> {

List<DemoBizExcelRow> list = new ArrayList<DemoBizExcelRow>();

public DemoDataListener() { }

/**
* If you are using Spring, please use this constructor.
* Every time you create Listener, it is needed to passing the class managed by Spring into the method.
*/
public DemoDataListener(List<DemoBizExcelRow> list) {
this.list = list;
}

/**
* This function is called every time while the every row data is parsed.
*
* @param data
* one row value. It is same as {@link AnalysisContext#readRowHolder()}
* @param context
*/
@Override
public void invoke(DemoBizExcelRow data, AnalysisContext context) {
list.add(data);
}

/**
* This is called after all the date parsing is done.
*
* @param context
*/
@Override
public void doAfterAllAnalysed(AnalysisContext context) { }
}

3. Use the above to read Excel

1
2
3
4
5
6
7
8
9
10
11
String file = "D:\\Demo.xlsx";
List<DemoExcelRow> list = new ArrayList<DemoExcelRow>();

try {
EasyExcel.read(file, DemoExcelRow.class, new DemoDataListener<>(list))
.sheet(0)
.headRowNumber(1)
.doRead();
} catch (Exception e) {
LOGGER.error("e={}", e);
}

It is easy to use, there are only two things you need to do: 1. define data class 2. custom listener class (for getting Excel data or operating dao, or whatever else)

Custom reading Excel

The above is good enough, but in my own actual usage, I still need other functions:

  • A table header may has multiple rows, and headers are different in many bussiness Excel files
  • Get the header info, including at least colume name, column order
  • Verify the colume name, so as to ensure the Excel meets the requirements, if not, return directly
  • Verify the cell data after it is parsed, e.g. required field can’t be blank, value for some column can’t exceed a range, and value for a particular column has to be converted based on ENUM
  • Get the sets of error colume in each row, however, it is default to ignore the remaining cells and skip to the next row when it encounters a certain error cell

1. The result class for returning

In the above example, only the valid data List<DemoExcelRow> can be returned, but now I want to get a) valid data b) error info c) table header info, so it is convenient to design a result class to solve this problem.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
@Data
public class ReadExcelResult<T> {

/**
* table header map.
* row index -> header data of one row
*/
Map<Integer, Map<Integer, CellData>> rowIdx2HeadMap;

/**
* valid data map.
* row index -> content data of one row
*/
Map<Integer, T> rowIdx2RowDataMap;

/**
* error row/column index map.
* row index -> error column index
*/
Map<Integer, Set<Integer>> rowIdx2ErrColIdxMap;

/**
* Initialize fields, let them not be null
*/
public ReadExcelResult() {
this.rowIdx2HeadMap = new TreeMap<>();
this.rowIdx2RowDataMap = new TreeMap<>();
this.rowIdx2ErrColIdxMap = new TreeMap<>();
}
}

2. Data class DemoExcelRow.java

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
@Data // use lombok, or add get()/set()
public class DemoExcelRow {

/**
* The number of header row.
* row index starts from 1
*/
@ExcelIgnore
private final static int HEAD_ROW_NUMBER = 5;
/**
* The number of column.
* last_index==COLUMN_LAST_NUMBER-1
* column index starts from 0.
*/
@ExcelIgnore
private final static int COLUMN_LAST_NUMBER = 37;
/**
* The expected header.
* <p>Mainly used for table legality checks. Only the necessary fields have to be verified, i.e., a subset of the actual header </p>
* <p>If it is null, do not verify the header.</p>
* <p>key是表头排序,即columnIndex,从0开始;</p>
* <p>value是表头名,可以忽略前后空格,但必须包含中间空格和换行</p>
*/
@ExcelIgnore
private final static Map<Integer, String> HEAD_CHECK_MAP = new HashMap<>() {
{
put( 0, "索引号");
put( 1, "选取样本特征");
put( 2, "发函单位(客户)*");
}
};

/*---- The following correspond to each field in the DB table ----*/

@ExcelProperty({"索引号"})
private String string;
@ExcelProperty(value = "会计科目", converter = SubjectConverter.class)
private String subject;
@ExcelProperty("存款日期")
private Date date;
@ExcelProperty(index = 5)
@NumberFormat("#.##")
private Double doubleData;

public static int getHeadRowNumber() {
return HEAD_ROW_NUMBER;
}

public static int getColumnLastNumber() {
return COLUMN_LAST_NUMBER;
}

public static Map<Integer, String> getHeadCheckMap() {
return HEAD_CHECK_MAP;
}
}

Code explanation:

  • Field subject is a converter, we’ll talk about it in the next section
  • Field doubleData is limited to two decimal points by the annotation @NumberFormat("#.##"), specified at the 6th column by the annotation @ExcelProperty(index = 5)
  • Default all fields can find a matched column in Excel, but @ExcelIgnore shows the field is not in Excel
  • HEAD_ROW_NUMBER shows that the table header is on this row
  • Put any fields of header in HEAD_CHECK_MAP to check if the Excel is valid

3. Converter

Converter SubjectConverter.java will turn the text of some column into number through the enum, Eg. in this demo, convert “应收账款” to 16.

I only changed the content of convertToJavaData() in the following class, the rest is inherited and generated by the IDE automatically, so let’s ignore them for brevity.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
public class SubjectConverter implements Converter<String> {

// some codes are ignored ...

/**
* 会计科目
*/
public final static Map<String, Integer> CONTACT_SUBJECT_MAP = new HashMap<>() {
{
put("应收账款", 16);
put("预收款项", 17);
put("应付账款", 18);
put("预付款项", 19);
}
};

/**
* Convert Excel objects to Java objects
*
* @param cellData Excel cell data.NotNull.
* @param contentProperty Content property.Nullable.
* @param globalConfiguration Global configuration.NotNull.
* @return Data to put into a Java object
* @throws Exception Exception.
*/
@Override
public String convertToJavaData(CellData cellData, ExcelContentProperty contentProperty, GlobalConfiguration globalConfiguration) throws Exception {
String key = cellData.getStringValue();
Integer val = CONTACT_SUBJECT_MAP.get(key);
if (val == null) {
throw new ParseException("fail to convert 会计科目: " + key, -1);
}
return val.toString();
}

}

4. Custom listener for parsing cell

ReadAllCellDataThrowExceptionLastListener.java is copied from the offical class ModelBuildEventListener.java, I just changed the code that stops parsing the rest of the current row when an error cell is encountered to read the rest of the row.

Only the modified codes are showed here, go ahead to ModelBuildEventListener.java to see the rest which is ignored.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
public class ReadAllCellDataThrowExceptionLastListener extends
AbstractIgnoreExceptionReadListener<Map<Integer, CellData>> {

// some codes are ignored ...

private Object buildUserModel(Map<Integer, CellData> cellDataMap, ReadHolder currentReadHolder,
AnalysisContext context) {
ExcelReadHeadProperty ExcelReadHeadProperty = currentReadHolder.ExcelReadHeadProperty();
Object resultModel;
try {
resultModel = ExcelReadHeadProperty.getHeadClazz().getDeclaredConstructor().newInstance();
} catch (Exception e) {
throw new ExcelDataConvertException(context.readRowHolder().getRowIndex(), 0,
new CellData(CellDataTypeEnum.EMPTY), null,
"Can not instance class: " + ExcelReadHeadProperty.getHeadClazz().getName(), e);
}
Map<Integer, Head> headMap = ExcelReadHeadProperty.getHeadMap();
Map<String, Object> map = new HashMap<String, Object>(headMap.size() * 4 / 3 + 1);
Map<Integer, ExcelContentProperty> contentPropertyMap = ExcelReadHeadProperty.getContentPropertyMap();
LinkedList<ExcelDataConvertException> exceptionLinkedList = new LinkedList<>();
for (Map.Entry<Integer, Head> entry : headMap.entrySet()) {
Integer index = entry.getKey();
if (!cellDataMap.containsKey(index)) {
continue;
}
CellData cellData = cellDataMap.get(index);
if (cellData.getType() == CellDataTypeEnum.EMPTY) {
continue;
}
ExcelContentProperty ExcelContentProperty = contentPropertyMap.get(index);
try {
Object value = ConverterUtils.convertToJavaObject(cellData, ExcelContentProperty.getField(),
ExcelContentProperty, currentReadHolder.converterMap(), currentReadHolder.globalConfiguration(),
context.readRowHolder().getRowIndex(), index);
if (value != null) {
map.put(ExcelContentProperty.getField().getName(), value);
}
} catch (ExcelDataConvertException e) {
exceptionLinkedList.add(e);
}
}

if (CollectionUtils.isEmpty(exceptionLinkedList)) {
// 没有异常,则转换为需要的map
BeanMap.create(resultModel).putAll(map);
return resultModel;
} else {
// 存在异常,挨个抛出,最后一个异常往外抛结束运行
for (int i = 0; i < exceptionLinkedList.size(); i++) {
ExcelDataConvertException exception = exceptionLinkedList.get(i);
if (i == exceptionLinkedList.size() - 1) {
// the last one
throw exception;
} else {
handleException(context, exception);
}
}
return null;
}

}

private void handleException(AnalysisContext analysisContext, Exception e) {
for (ReadListener readListenerException : analysisContext.currentReadHolder().readListenerList()) {
try {
readListenerException.onException(e, analysisContext);
} catch (RuntimeException re) {
throw re;
} catch (Exception e1) {
throw new ExcelAnalysisException(e1.getMessage(), e1);
}
}
}
}

5. Custom listener for parsing row

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
public class MyBizDemoListener<T> extends AnalysisEventListener<T> {

/**
* The expected header.
* <p>用于表格合法性校验。这里可以只校验必要的字段,即,配置实际Excel的表头字段的子集。</p>
* <p>当为null时,不校验表头</p>
* <p>key是表头排序,即columnIndex,从0开始;</p>
* <p>value是表头名,可以忽略前后空格,但必须包含中间空格和换行</p>
*/
protected Map<Integer, String> headCheckMap;
/**
* 读取Excel后,存入该对象
*/
protected ReadExcelResult<T> readExcelResult;
/**
* 表头行/列
*/
protected Map<Integer, Map<Integer, CellData>> rowIdx2HeadMap = new TreeMap<>();
/**
* 有效行/列信息
*/
protected Map<Integer, T> rowIdx2RowDataMap = new TreeMap<>();
/**
* 错误行/列信息
*/
protected Map<Integer, Set<Integer>> rowIdx2ErrColIdxMap = new TreeMap<>();
/**
* 当前行的错误列index集合
*/
protected Set<Integer> currentRowErrorColumnIndexSet;

public MyBizDemoListener() {
this(new ReadExcelResult<>(), null);
}

/**
* @param readExcelResult 读取Excel后,存入该对象
* @param headCheckMap 表格头校验map。A map contains columnIndex->header column name. If null, do not check header
*/
public MyBizDemoListener(ReadExcelResult<T> readExcelResult, Map<Integer, String> headCheckMap) {
this.readExcelResult = readExcelResult;
this.headCheckMap = headCheckMap;
}

/**
* 处理表头
* 一行一行调用该函数
* @param headMap -
* @param context -
*/
@Override
public void invokeHead(Map<Integer, CellData> headMap, AnalysisContext context) {
super.invokeHead(headMap, context);

ReadRowHolder readRowHolder = context.readRowHolder();
int rowIndex = readRowHolder.getRowIndex();

rowIdx2HeadMap.put(rowIndex, headMap);

/*
* 表头合法性校验
*/
if (headCheckMap != null && !headCheckMap.isEmpty()) {
int headRowNumber = context.readSheetHolder().getHeadRowNumber();
if (headRowNumber == rowIndex + 1) {
for (Integer key : headCheckMap.keySet()) {
String expect = headCheckMap.get(key).trim();
CellData cell = headMap.get(key);
if (null == cell) {
//模板不符!退出
throw new ExcelHeadException("表头与预期不符。未找到表头:" + expect);
}

String real = cell.getStringValue();
real = (real==null? null : real.trim());
if (!expect.equalsIgnoreCase(real)) {
//模板不符!退出
throw new ExcelHeadException("表头与预期不符。期望:" + expect + " <--> 实际:" + real);
}
}
}
}

}

/**
* When analysis one row trigger invoke function.
* 自动跳过空行,即,空行不会进入这个函数
*
* @param data one row value. Is is same as {@link AnalysisContext#readRowHolder()}
* @param context -
*/
@Override
public void invoke(T data, AnalysisContext context) {

currentRowErrorColumnIndexSet = new TreeSet<>();

ReadRowHolder readRowHolder = context.readRowHolder();
int rowIndex = readRowHolder.getRowIndex();

/**
* 非空校验方法。业务强相关,略去源码。可以通过特定的注解判断必填
*/
ReadExcelUtil.checkNotEmpty(data, context, currentRowErrorColumnIndexSet);

/**
* 值有效性校验。业务强相关,略去源码
*/
ReadExcelUtil.checkFieldValueInStringMap(data, "sendType", ReadExcelConstant.SEND_TYPE_MAP, context, currentRowErrorColumnIndexSet);

if (currentRowErrorColumnIndexSet.isEmpty()) {
rowIdx2RowDataMap.put(rowIndex, data);
} else {
Set<Integer> errColIdxMap = rowIdx2ErrColIdxMap.get(rowIndex);
if (errColIdxMap != null) {
currentRowErrorColumnIndexSet.addAll(errColIdxMap);
}
rowIdx2ErrColIdxMap.put(rowIndex, currentRowErrorColumnIndexSet);
}

}

/**
* if have something to do after all analysis
*
* @param context -
*/
@Override
public void doAfterAllAnalysed(AnalysisContext context) {
readExcelResult.setRowIdx2HeadMap(rowIdx2HeadMap);
readExcelResult.setRowIdx2ErrColIdxMap(rowIdx2ErrColIdxMap);
readExcelResult.setRowIdx2RowDataMap(rowIdx2RowDataMap);
}


/**
* The current method is called when extra information is returned
*
* @param extra extra information
* @param context -
*/
@Override
public void extra(CellExtra extra, AnalysisContext context) { }

/**
* All listeners receive this method when any one Listener does an error report. If an exception is thrown here, the
* entire read will terminate.
* 获取其他异常下会调用本接口。抛出异常则停止读取。如果这里不抛出异常则 继续读取下一行。
*
* @param exception -
* @param context -
*/
@Override
public void onException(Exception exception, AnalysisContext context) throws Exception {

// 如果是某一个单元格的转换异常,能获取到具体行号
// 如果要获取头的信息,配合invokeHeadMap使用
if (exception instanceof ExcelDataConvertException) {
ExcelDataConvertException ExcelDataConvertException = (ExcelDataConvertException)exception;
Integer cellRowIndex = ExcelDataConvertException.getRowIndex();
Integer cellColumnIndex = ExcelDataConvertException.getColumnIndex();

String cellColumnString = CellReference.convertNumToColString(cellColumnIndex);
LOGGER.error("第{}行{}列,数值转换异常:{}", cellRowIndex+1, cellColumnString, exception.getMessage());

Set<Integer> errColIdxMap = rowIdx2ErrColIdxMap.get(cellRowIndex);
if (errColIdxMap == null) {
errColIdxMap = new TreeSet<>();
}
errColIdxMap.add(cellColumnIndex);
rowIdx2ErrColIdxMap.put(cellRowIndex, errColIdxMap);
} else if (exception instanceof ExcelHeadException) {

LOGGER.error(exception.getMessage());

// 表格不符合规范,抛出异常,触发终止解析
throw exception;
} else {
LOGGER.error("第{}行解析失败,但是继续解析下一行。exception: \n{}",
context.readRowHolder().getRowIndex() + 1,
Arrays.toString(exception.getStackTrace()).replaceAll(",", "\n"));
}
}
}

Code explanation:

  • In invokeHead(), verify the Excel header by checking headCheckMap which is passed in through the constructor method
  • In onException(), parsing Excel is aborted only when throw exception, else continue to the next row
  • In invoke(), deal with the required values, validity check and so on

6. Use the above to read Excel

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
String file = "D:\\Demo.xlsx";

ReadExcelResult<DemoBizExcelRow> readExcelResult = new ReadExcelResult<>();
ExcelReader ExcelReader = null;
try {
MyBizDemoListener<DemoBizExcelRow> myBizDemoListener = new MyBizDemoListener<>(readExcelResult, DemoBizExcelRow.getHeadCheckMap());
ExcelReader = EasyExcel.read(file, DemoBizExcelRow.class, null)
.useDefaultListener(false)
.registerReadListener(new ReadAllCellDataThrowExceptionLastListener())
.registerReadListener(myBizDemoListener)
.build();
ReadSheet readSheet = EasyExcel.readSheet(0).headRowNumber(DemoBizExcelRow.getHeadRowNumber()).build();
ExcelReader.read(readSheet);
} catch (Exception e) {
if (e instanceof ExcelHeadException) {
LOGGER.error("Excel模板错误!");
} else {
LOGGER.error("其他异常");
}
} finally {
if (ExcelReader != null) {
ExcelReader.finish();
}
}

// print readExcelResult

Code explanation:

  • EasyExcel supports the listener for parsing cell by default, that is ModelBuildEventListener.java above mentioned, but if you need to use your custom listener, you should comment out the default useDefaultListener(false) first, then register your own registerReadListener(new ReadAllCellDataThrowExceptionLastListener())
  • In the case of the custom cell listener is used, we can’t pass in your custom listener for parsing row through EasyExcel.read(), only can do it through registering registerReadListener(), and there is an important point we have to register the custom cell listener first then regiser the custom row listener

Supplement

Format conversion

Date

1
2
3
@ExcelProperty("开始时间")
@DateTimeFormat("yyyy-MM-dd")
private Date startTime;
  • 2020/1/2 –> 1577894400000(即,2020-01-02)
  • 你好 –> throw new Exception()
  • // 转换失败时,会抛异常
1
2
3
@ExcelProperty("结束时间")
@DateTimeFormat("yyyy-MM-dd")
private String endTime;
  • 2019/2/5 –> 2019-02-05
  • 你好 –> 你好
  • // 转换失败时,保持原值,不会抛异常

Number

1
2
3
@ExcelProperty("存钱")
@NumberFormat("#.##")
private Double deposit;
  • 1.2345 –> 1.23
  • 你好 –> throw new Exception()
  • // 转换失败时,会异常
1
2
3
@ExcelProperty("取钱")
@NumberFormat("#.##")
private String withdraw;
  • 1.2345 –> 1.23
  • 你好 –> 你好
  • // 转换失败时,保持原值,不会抛异常

Welcome to my other publishing channels