Tuesday, 24 October 2017

How to Read Excel Files in Java using Apache POI

In this blog, we will show you how to read excel file using Apache POI (Poor Obfuscation Implementation) library which is very popular amongst the available library.

Apache POI library

Apache POI is the pure Java API for reading and writing Excel files in both formats XLS (Excel 2003 and earlier) and XLSX (Excel 2007 and later).

To use Apache POI in Java project:

For non-Maven projects:

Download the latest library from here: Apache POI - Download Release Artifacts
Extract the zip file and add the appropriate JAR files to the  project’s classpath:

  • If reading and writing only Excel 2003 format, only the file poi-VERSION.jar is enough.
  • If reading and writing Excel 2007 format, we have to include the following files:
    • poi-ooxml-VERSION.jar
    • poi-ooxml-schemas-VERSION.jar
    • xmlbeans-VERSION.jar  
For Maven projects:

Add the following dependency to the project’s pom.xml file:

  • For Excel 2003 format only(XLS)
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>VERSION</version>
</dependency

  • For Excel 2007 format(XLSX)
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>VERSION</version>
</dependency


Note: VERSION need to replace with the POI version that you are using(for example: 3.9)

The Apache API Basics

There are two main prefixes which we will encounter while working with Apache POI APIs:

HSSF: denotes the API is for working with Excel 2003 and earlier.
XSSF: denotes the API is for working with Excel 2007 and later.

To get started the Apache POI API, we just need to understand and use the following 4 interfaces:
  • Workbook: high-level representation of an Excel workbook. Implementation classes are: 
    • HSSFWorkbook
    • XSSFWorkbook.
  • Sheet: high-level representation of an Excel worksheet. Implementation classes are:
    • HSSFSheet 
    • XSSFSheet
  • Row: high-level representation of a row in a spreadsheet. Implementation classes are:
    • HSSFRow 
    • XSSFRow
  • Cell: high-level representation of a cell in a row. Implementation classes are:
    • HSSFCell 
    • XSSFCell

    Let's see the example how to read excel(xlsx) file using the Apache POI library. Below is the sample of excel file that wants to read


    Download the sample file from here
    Download the project from here

    Below is the code to read the file:

    package com.test.ankur;

    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.util.Iterator;

    import org.apache.poi.ss.usermodel.Cell;
    import org.apache.poi.ss.usermodel.Row;
    import org.apache.poi.ss.usermodel.Sheet;
    import org.apache.poi.ss.usermodel.Workbook;
    import org.apache.poi.xssf.usermodel.XSSFWorkbook;
    public class ReadExcel {
          
           public static void main(String[] args) throws IOException {
            String excelFilePath = "C:/Ankur/ReadFile.xlsx";
            FileInputStream inputStream = new FileInputStream(new File(excelFilePath));
            Workbook workbook = new XSSFWorkbook(inputStream);
            Sheet sheet = workbook.getSheetAt(0);
            Iterator<Row> rowIterator = sheet.iterator();
            
            while (rowIterator.hasNext()) {
                Row nextRow = rowIterator.next();
                Iterator<Cell> cellIterator = nextRow.cellIterator();
                
                while (cellIterator.hasNext()) {
                    Cell cell = cellIterator.next();
                    
                    switch (cell.getCellType()) {
                        case Cell.CELL_TYPE_STRING:
                            System.out.print(cell.getStringCellValue());
                            break;
                        case Cell.CELL_TYPE_BOOLEAN:
                            System.out.print(cell.getBooleanCellValue());
                            break;
                        case Cell.CELL_TYPE_NUMERIC:
                            System.out.print(cell.getNumericCellValue());
                            break;
                    }
                    System.out.print(" | ");
                }
                System.out.println();
            }
                    
            inputStream.close();
        }
    }


    Run the program and see the output as below:




    No comments:

    Post a Comment