
Reading a PDF file as a stream in Java

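Popularity: 102   Published: 2016-04-23 12:25:03.0
I am reading a PDF file as a stream in Java. Only the first page's content is printed to the console; none of the other pages are read. The console shows: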

2012-7-31 11:22:26 org.apache.pdfbox.util.PDFStreamEngine processOperator
INFO: unsupported/disabled operation: EI



How can I fix this?

------Solution--------------------
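import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Extracts the text by running the external pdftotext tool instead of PDFBox.
// pdftotext must be installed and on the PATH.
public String getPdfContent(String filePath) {
    // -enc UTF-8: output encoding; -q: suppress messages; "-": write the text to stdout
    String[] cmd = {"pdftotext", "-enc", "UTF-8", "-q", filePath, "-"};
    StringBuilder sb = new StringBuilder();
    Process p = null;
    try {
        p = Runtime.getRuntime().exec(cmd);
        // Read the tool's stdout as UTF-8; the reader is closed automatically.
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = br.readLine()) != null) {
                sb.append(line).append(' ');
            }
        }
        p.waitFor();
    } catch (IOException e) {
        // pdftotext could not be started or its output could not be read
        e.printStackTrace();
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
    } finally {
        if (p != null) {
            p.destroy();
        }
    }
    // Empty if the tool failed before producing any output
    return sb.toString();
}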
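
The method above swallows failures and returns whatever text was read, so a missing pdftotext binary or an unreadable file looks the same as an empty PDF. A minimal sketch of a stricter variant follows, assuming Java 7 or later and pdftotext on the PATH. The class name PdfTextExtractor and the helpers buildCommand, joinLines and extract are hypothetical names introduced for illustration only; they do not come from the original answer.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

public class PdfTextExtractor {

    // Hypothetical helper: builds the same pdftotext command line as the answer above.
    public static List<String> buildCommand(String filePath) {
        return Arrays.asList("pdftotext", "-enc", "UTF-8", "-q", filePath, "-");
    }

    // Hypothetical helper: reads a UTF-8 stream and appends a space after each line,
    // matching the behavior of the original loop.
    public static String joinLines(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = br.readLine()) != null) {
                sb.append(line).append(' ');
            }
        }
        return sb.toString();
    }

    // Runs pdftotext and throws on a non-zero exit code instead of returning partial text.
    public static String extract(String filePath) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(buildCommand(filePath))
                // Send the child's stderr to this process's stderr so it cannot fill up and block.
                .redirectError(ProcessBuilder.Redirect.INHERIT)
                .start();
        String text = joinLines(p.getInputStream());
        int exit = p.waitFor();
        if (exit != 0) {
            throw new IOException("pdftotext exited with code " + exit);
        }
        return text;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: java PdfTextExtractor <file.pdf>");
            return;
        }
        System.out.println(extract(args[0]));
    }
}
```

Calling PdfTextExtractor.extract("report.pdf") (the file name is a placeholder) should either return the text of all pages or throw, which makes a failed run easier to tell apart from a PDF that contains no extractable text.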