I am trying to get the current page while reading a PDF with PDFBox. Here is the code I wrote.

    public class PDFTextExtractor {
        ArrayList extractText(String fileName) throws Exception {
            PDDocument document = null;
            try {
                document = PDDocument.load( new File(fileName) );
                PDFTextAnalyzer stripper = new PDFTextAnalyzer();
                stripper.setSortByPosition( true );
                stripper.setStartPage( 0 );
                stripper.setEndPage( document.getNumberOfPages() );
                Writer dummy = new OutputStreamWriter(new ByteArrayOutputStream());
                stripper.writeText(document, dummy);
                return stripper.getCharactersList();
            }
            finally {
                if( document != null ) {
                    document.close();
                }
            }
        }
    }

To get the details of each character, I wrote the following class.

    public class PDFTextAnalyzer extends PDFTextStripper {
        public PDFTextAnalyzer() throws IOException {
            super();
            // TODO Auto-generated constructor stub
        }

        private ArrayList<CharInfo> charactersList = new ArrayList<CharInfo>();

        public ArrayList<CharInfo> getCharactersList() {
            return charactersList;
        }

        public void setCharactersList(ArrayList<CharInfo> charactersList) {
            this.charactersList = charactersList;
        }

        @Override
        protected void writeString(String string, List<TextPosition> textPositions) throws IOException {
            System.out.println("----->"+document.getPages().getCount());
            /* for(int i = 0 ; i < document.getPages().getCount();i++) { */
            float docHeight = document.getPage(1).getMediaBox().getHeight();
            for (TextPosition text : textPositions) {
                /*
                 * System.out.println((int)text.getUnicode().charAt(0)+" "+text.
                 * getUnicode()+ " [(X=" + text.getXDirAdj()+" "+text.getX() + ",Y="
                 * + text.getYDirAdj() + ") height=" + text.getHeightDir() +
                 * " width=" + text.getWidthDirAdj() + "]");
                 */
                );
            }
        }
    }

But I cannot get the page number. Please see the line comment "page number of the current text". Is there a way to get the page number?
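One possible approach, sketched under the assumption that this is PDFBox 2.x (the `PDDocument.load` and `document.getPages().getCount()` calls in the question suggest so): `PDFTextStripper` exposes a protected `getCurrentPageNo()` that can be called from inside `writeString`. The class name `PageAwareTextStripper` and the holder `PagedChar` below are hypothetical and only stand in for `PDFTextAnalyzer` and `CharInfo`, whose definitions are not shown in the post.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.util.ArrayList;
import java.util.List;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;
import org.apache.pdfbox.text.TextPosition;

public class PageAwareTextStripper extends PDFTextStripper {

    /** Hypothetical holder; stands in for the CharInfo class from the question. */
    public static final class PagedChar {
        public final int pageNumber;   // 1-based page the character was found on
        public final String unicode;   // text of the character
        public final float x;          // direction-adjusted X
        public final float y;          // direction-adjusted Y
        public final float pageHeight; // media box height of that page

        PagedChar(int pageNumber, String unicode, float x, float y, float pageHeight) {
            this.pageNumber = pageNumber;
            this.unicode = unicode;
            this.x = x;
            this.y = y;
            this.pageHeight = pageHeight;
        }
    }

    private final List<PagedChar> characters = new ArrayList<>();

    public PageAwareTextStripper() throws IOException {
        super();
        setSortByPosition(true);
    }

    @Override
    protected void writeString(String string, List<TextPosition> textPositions) throws IOException {
        // getCurrentPageNo() is 1-based; PDDocument.getPage(int) takes a 0-based index.
        int pageNumber = getCurrentPageNo();
        float pageHeight = document.getPage(pageNumber - 1).getMediaBox().getHeight();

        for (TextPosition text : textPositions) {
            characters.add(new PagedChar(pageNumber, text.getUnicode(),
                    text.getXDirAdj(), text.getYDirAdj(), pageHeight));
        }
    }

    /** Runs the stripper over every page and returns the collected characters. */
    public static List<PagedChar> extract(PDDocument document) throws IOException {
        PageAwareTextStripper stripper = new PageAwareTextStripper();
        stripper.setStartPage(1); // page numbers start at 1
        stripper.setEndPage(document.getNumberOfPages());
        // The written text is discarded; only the writeString callbacks matter here.
        Writer sink = new OutputStreamWriter(new ByteArrayOutputStream());
        stripper.writeText(document, sink);
        return stripper.characters;
    }
}
```

Note that `document.getPage(1)` in the question always reads the media box of the second page, whatever page is being processed; indexing with `getCurrentPageNo() - 1` instead should return the height of the page the text actually belongs to.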