An Analysis of Java's BufferedReader

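I have recently been working on a file search feature and found that BufferedReader is quite efficient, although it also has a few small problems. So I am writing down what I have learned about how BufferedReader works,

so that I can adapt BufferedReader myself to fit our business needs.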

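BufferedReader has three main functions:

1. readLine: first reads a chunk of characters into a char[] array in memory, then searches that array for \n or \r\n; once a line break is found, the line before it is returned.

2. read: returns a single char from the cached char[].

3. public int read(char[] buffer, int offset, int length): copies a run of chars from the cached char[] into the caller's array.

If the cached data runs out, fillBuf() first fetches more from disk (through the underlying reader), and then the operation continues.

By spending some memory to save IO operations, this improves read efficiency.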
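To make the three entry points concrete, here is a small sketch that calls each of them once on the same reader. The class name `ReadModesDemo` and the sample text are made up for illustration, and a `StringReader` stands in for a file on disk.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

class ReadModesDemo {
    // Hypothetical sample text; a StringReader stands in for a file on disk.
    static final String SAMPLE = "first line\r\nsecond line\nrest";

    /** Reads one line, then one char, then a block of up to 4 chars. */
    static String[] readThreeWays() {
        try (BufferedReader reader = new BufferedReader(new StringReader(SAMPLE))) {
            String line = reader.readLine();             // whole first line, without the \r\n
            int single = reader.read();                  // one char from the cache
            char[] block = new char[4];
            int n = reader.read(block, 0, block.length); // a run of chars from the cache
            return new String[] {
                line, String.valueOf((char) single), new String(block, 0, n)
            };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String part : readThreeWays()) {
            System.out.println(part);
        }
    }
}
```

All three calls are served from the same internal char[] cache, so the underlying reader is only touched when that cache has to be refilled.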

BufferedReader mainly relies on three indices to decide how to read from its cache:

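* { X X X X X X X X X X X X - - }
*       ^     ^             ^
*       |     |             |
*     mark   pos           end
*
* pos: the current cursor position; every read advances pos.
* end: the end of the valid data in the char[] array; once pos reaches end, fillBuf() has to be called.
* mark: used together with reset(), which rolls pos back to the position stored in mark.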
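The interplay of the three indices can be sketched with a toy class. This is not the library's implementation: `ThreeIndexBuffer` is a hypothetical name, and as a deliberate simplification a refill here always overwrites the buffer and drops the mark, whereas the real fillBuf() tries to preserve it.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class ThreeIndexBuffer {
    private final Reader in;
    private final char[] buf;
    private int mark = -1; // -1 means "no mark set"
    private int pos = 0;   // index of the next char to hand out
    private int end = 0;   // one past the last valid char in buf

    ThreeIndexBuffer(Reader in, int capacity) {
        this.in = in;
        this.buf = new char[capacity];
    }

    /** Returns the next char, or -1 at end of stream. */
    int read() {
        if (pos == end && fill() <= 0) {
            return -1;
        }
        return buf[pos++];
    }

    void mark() {
        mark = pos;
    }

    /** Moves pos back to mark; returns false if no valid mark exists. */
    boolean reset() {
        if (mark == -1) {
            return false;
        }
        pos = mark;
        return true;
    }

    /** Simplification: a refill overwrites the whole buffer and drops the mark. */
    private int fill() {
        try {
            int n = in.read(buf, 0, buf.length);
            if (n > 0) {
                mark = -1;
                pos = 0;
                end = n;
            }
            return n;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static String demo() {
        ThreeIndexBuffer b = new ThreeIndexBuffer(new StringReader("abcdef"), 4);
        StringBuilder out = new StringBuilder();
        out.append((char) b.read()); // pos moves 0 -> 1
        b.mark();                    // mark = 1
        out.append((char) b.read()); // pos moves 1 -> 2
        out.append((char) b.read()); // pos moves 2 -> 3
        b.reset();                   // pos goes back to 1
        out.append((char) b.read()); // same char as right after mark()
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```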

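Taking readLine as an example, here is a more detailed walk-through. The rough sequence is:

The job of readLine is to read the text line by line, treating \r\n or \n (or a lone \r) as the end-of-line marker.

1. Some caller invokes readLine.

2. checkNotClosed verifies that this BufferedReader has not been closed.

3. If the buffer is exhausted (pos == end), fillBuf() is called; if fillBuf() returns -1, the stream has ended and readLine returns null.

4. If the buffer holds data, buf is scanned until a \n or \r is found; the chars between pos and that position are turned into a String and returned.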
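The steps above can be sketched in a much smaller form. The class below is a simplified stand-in, not the library code: `LineScanSketch` is a hypothetical name, it has no locking and no mark support, and it uses a `skipLf` flag instead of the `eol` variable of the real method. The tiny capacity in `demo()` is chosen only to force lines to span several refills.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class LineScanSketch {
    private final Reader in;
    private final char[] buf;
    private int pos = 0;
    private int end = 0;
    private boolean skipLf = false; // true right after a '\r' ended a line

    LineScanSketch(Reader in, int capacity) {
        this.in = in;
        this.buf = new char[capacity];
    }

    private int fillBuf() throws IOException {
        int n = in.read(buf, 0, buf.length);
        if (n > 0) {
            pos = 0;
            end = n;
        }
        return n;
    }

    /** Returns the next line without its terminator, or null at end of stream. */
    String readLine() throws IOException {
        StringBuilder line = null;
        while (true) {
            if (pos == end && fillBuf() <= 0) {
                return line == null ? null : line.toString();
            }
            if (skipLf) {
                skipLf = false;
                if (buf[pos] == '\n') { // second half of a \r\n pair
                    pos++;
                    continue;
                }
            }
            if (line == null) {
                line = new StringBuilder();
            }
            for (int i = pos; i < end; i++) {
                char ch = buf[i];
                if (ch == '\n' || ch == '\r') {
                    line.append(buf, pos, i - pos);
                    pos = i + 1;
                    skipLf = (ch == '\r');
                    return line.toString();
                }
            }
            // No line break in this chunk: keep it and refill.
            line.append(buf, pos, end - pos);
            pos = end;
        }
    }

    static List<String> demo() {
        List<String> lines = new ArrayList<>();
        LineScanSketch r = new LineScanSketch(new StringReader("a\r\nb\n\nc"), 3);
        try {
            String line;
            while ((line = r.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```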

The annotated code is as follows:

    /**
     * Returns the next line of text available from this reader. A line is
     * represented by zero or more characters followed by {@code '\n'},
     * {@code '\r'}, {@code "\r\n"} or the end of the reader. The string does
     * not include the newline sequence.
     * Reads the text line by line, using \r\n or \n as the end-of-line marker.
     * @return the contents of the line or {@code null} if no characters were
     *         read before the end of the reader has been reached.
     * @throws IOException
     *             if this reader is closed or some other I/O error occurs.
     */
    public String readLine() throws IOException {
        synchronized (lock) {
            checkNotClosed(); // check whether the reader has been closed
            /* has the underlying stream been exhausted? is there any buf left to read? */
            if (pos == end && fillBuf() == -1) {
                return null;
            }
            int count = 0;
            // charPos is the index into the char[] buf that is being examined
            for (int charPos = pos; charPos < end; charPos++) {
                count++;
                // custom guard: give up if 9 * 1024 chars were scanned without a line break
                if (count > 1024 * 9) {
                    Log.e("TAG", "OOps, the line is too large!" + this.getClass().getName());
                    return null;
                }
                char ch = buf[charPos];
                if (ch > '\r') { // above '\r', so it can be neither \n nor \r: keep scanning
                    continue;
                }
                if (ch == '\n') { // the \n case
                    String res = new String(buf, pos, charPos - pos);
                    pos = charPos + 1;
                    return res;
                } else if (ch == '\r') { // the \r or \r\n case
                    String res = new String(buf, pos, charPos - pos);
                    pos = charPos + 1;
                    if (((pos < end) || (fillBuf() != -1))
                            && (buf[pos] == '\n')) {
                        pos++;
                    }
                    return res;
                }
            }
            char eol = '\0';
            StringBuilder result = new StringBuilder(80);
            /* Typical Line Length */
            result.append(buf, pos, end - pos);
            // no line break found yet: keep calling fillBuf and appending
            // to the string until a line break or the end of the file
            while (true) {
                count++; // note: in this loop count grows once per refill, not once per char
                if (count > 1024 * 9) {
                    Log.e("TAG", "OOps, the line is too large!" + this.getClass().getName());
                    return null;
                }
                pos = end;
                /* Are there buffered characters available? */
                if (eol == '\n') {
                    return result.toString();
                }
                // attempt to fill buffer
                if (fillBuf() == -1) { // nothing left to read: return what was collected
                    // characters or null.
                    return result.length() > 0 || eol != '\0'
                            ? result.toString()
                            : null;
                }
                // pos and end have changed after the refill; keep looking for the line break
                for (int charPos = pos; charPos < end; charPos++) {
                    char c = buf[charPos];
                    if (eol == '\0') {
                        if ((c == '\n' || c == '\r')) {
                            eol = c;
                        }
                    } else if (eol == '\r' && c == '\n') {
                        if (charPos > pos) {
                            result.append(buf, pos, charPos - pos - 1);
                        }
                        pos = charPos + 1;
                        return result.toString();
                    } else {
                        if (charPos > pos) {
                            result.append(buf, pos, charPos - pos - 1);
                        }
                        pos = charPos;
                        return result.toString();
                    }
                }
                if (eol == '\0') { // still no line break: append the whole chunk and loop again
                    result.append(buf, pos, end - pos);
                } else {
                    result.append(buf, pos, end - pos - 1);
                }
            }
        }
    }

    /**
     * Populates the buffer with data. It is an error to call this method when
     * the buffer still contains data; ie. if {@code pos < end}.
     * Reads text from the underlying source into the char[] array.
     * @return the number of bytes read into the buffer, or -1 if the end of the
     *      source stream has been reached.
     */
    private int fillBuf() throws IOException {
        // assert(pos == end);
        // buf is allocated in the constructor; the default size is 8k
        if (mark == -1 || (pos - mark >= markLimit)) {
            /* mark isn't set or has exceeded its limit. use the whole buffer */
            int result = in.read(buf, 0, buf.length);
            if (result > 0) {
                mark = -1;
                pos = 0;
                end = result; // end becomes the number of chars just read
            }
            return result;
        }
        if (mark == 0 && markLimit > buf.length) {
            /* the only way to make room when mark=0 is by growing the buffer:
               markLimit is larger than the current buf.length, so buf has to be enlarged */
            int newLength = buf.length * 2;
            if (newLength > markLimit) {
                newLength = markLimit;
            }
            char[] newbuf = new char[newLength];
            System.arraycopy(buf, 0, newbuf, 0, buf.length);
            buf = newbuf;
        } else if (mark > 0) {
            // mark > 0: drop everything before mark so the data from mark onward starts at index 0
            /* make room by shifting the buffered data to left mark positions */
            System.arraycopy(buf, mark, buf, 0, buf.length - mark);
            pos -= mark; // shift pos by the same amount so it still points at the same char
            end -= mark;
            mark = 0;
        }
        /* Set the new position and mark position */
        int count = in.read(buf, pos, buf.length - pos);
        if (count != -1) {
            end += count;
        }
        return count;
    }
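The readLine() listing above guards against oversized lines with its count check and returns null when the guard trips, which a caller cannot tell apart from end of stream. As one possible alternative, the sketch below puts a length limit on top of the public Reader API and signals an oversized line with an exception. `BoundedLineReader`, `maxChars` and the "<too long>" marker are hypothetical names for illustration, and the reader passed in is assumed to support mark() (a BufferedReader does). Reading char by char is slower than scanning the buffer directly, so treat this as a sketch of the idea rather than a tuned replacement.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class BoundedLineReader {
    /**
     * Reads one line of at most maxChars chars from a reader that supports mark().
     * Returns null at end of stream; throws IOException if the line is too long.
     */
    static String readLine(Reader in, int maxChars) throws IOException {
        StringBuilder line = new StringBuilder();
        boolean sawAny = false;
        int c;
        while ((c = in.read()) != -1) {
            sawAny = true;
            if (c == '\n') {
                return line.toString();
            }
            if (c == '\r') {
                in.mark(1); // peek one char to swallow the \n of a \r\n pair
                if (in.read() != '\n') {
                    in.reset();
                }
                return line.toString();
            }
            if (line.length() >= maxChars) {
                throw new IOException("line longer than " + maxChars + " chars");
            }
            line.append((char) c);
        }
        return sawAny ? line.toString() : null;
    }

    static List<String> demo() {
        List<String> out = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new StringReader("ab\r\ncdefgh\nx"))) {
            String line;
            while ((line = readLine(in, 4)) != null) {
                out.add(line);
            }
        } catch (IOException e) {
            out.add("<too long>"); // the second line exceeds the 4-char limit
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```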

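A plain-language explanation of mark() and reset() in Java

mark works like a bookmark: it records a position in the buffer that belongs to this BufferedReader, and a later call to reset returns to that marked position. The mark method takes an int argument that tells the system how many characters you want to be able to read while the mark stays valid. Once more characters than that have been read, the system is allowed to invalidate the mark, and you should not be surprised or blame it. This is tied to the buffer: the longer the distance you ask for, the larger the buffer the system has to allocate in order to keep your mark.

    // e.g.
    // reader is a BufferedReader
    reader.mark(50);       // ask for the mark to stay valid for the next 50 characters; the buffer is guaranteed to hold at least 50 characters
    int a = reader.read(); // read one character
    int b = reader.read(); // read another character
    // after some processing, we find that we need to read them again
    reader.reset();
    reader.read();         // returns the same character as a
    reader.read();         // returns the same character as b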
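For completeness, here is the same mark/reset idea as a small runnable program. The class name `MarkResetDemo` and the input string are made up for illustration, and a `StringReader` again stands in for a real file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

class MarkResetDemo {
    /** Reads two chars, rewinds to the mark, and reads the same two chars again. */
    static String replay() {
        try (BufferedReader reader = new BufferedReader(new StringReader("xyz123"))) {
            StringBuilder seen = new StringBuilder();
            reader.mark(50);                   // bookmark the current position
            seen.append((char) reader.read()); // first char
            seen.append((char) reader.read()); // second char
            reader.reset();                    // back to the bookmark
            seen.append((char) reader.read()); // first char again
            seen.append((char) reader.read()); // second char again
            return seen.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(replay());
    }
}
```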