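While building a file search feature recently, I found that BufferedReader is quite efficient, but it also has a few small problems. So I am writing down what I learned from studying its source,
so that I can customize BufferedReader myself to fit our business needs.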
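BufferedReader has three main functions:
1. readLine(): first reads a chunk of input into an in-memory char[], then scans that array for \n, \r or \r\n; when a terminator is found, the line before it is returned.
2. read(): returns a single char from the cached char[].
3. public int read(char[] buffer, int offset, int length): copies a run of chars from the cached char[] into the caller's array.
If the cached data has run out, fillBuf() fetches more from the underlying reader (the disk, in the case of a file) and the operation then continues.
Trading memory for fewer I/O calls is what makes reading faster.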
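To make the three entry points concrete, here is a minimal sketch that calls each of them once on the same reader. The class name ReadMethodsDemo and the helper demo are made up for illustration, and a StringReader stands in for a file so that the example runs without touching the disk.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class ReadMethodsDemo {
    /** Calls readLine(), read() and read(char[], int, int) once each on one reader. */
    static String demo(String text) {
        // Assumption: StringReader replaces the file reader of the real use case.
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String firstLine = reader.readLine();        // the terminator is stripped
            int single = reader.read();                  // one char, or -1 at end of input
            char[] chunk = new char[4];
            int n = reader.read(chunk, 0, chunk.length); // chars copied, or -1 at end of input
            return firstLine + "|" + (char) single + "|" + new String(chunk, 0, Math.max(n, 0));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo("hello\nworld!\n")); // prints hello|w|orld
    }
}
```

All three calls are served from the same internal char[], which is why they can be mixed freely on one reader.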
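BufferedReader uses three markers to decide how to read from the cache:
{ X X X X X X X X X X X X - - }
      ^     ^             ^
      |     |             |
    mark   pos           end

pos: the current cursor position; every read moves pos forward.
end: the end of the valid data in the char[]; once pos reaches it, fillBuf() has to run.
mark: used together with reset(), which rolls pos back to the mark position.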
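The interplay of pos and end can be shown with a toy reader. This is only a sketch of the idea, not the real BufferedReader: TinyBufferedReader, drain and the refills counter are hypothetical names, and mark is left out on purpose to keep it short.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Toy reader for illustration only; it is not the real BufferedReader and has no mark. */
public class TinyBufferedReader {
    private final Reader in;
    private final char[] buf;
    private int pos;     // index of the next char to hand out
    private int end;     // one past the last valid char in buf
    private int refills; // how many times the underlying reader delivered data

    TinyBufferedReader(Reader in, int size) {
        this.in = in;
        this.buf = new char[size];
    }

    /** Returns the next char, touching the underlying reader only when pos == end. */
    int read() throws IOException {
        if (pos == end) {
            int n = in.read(buf, 0, buf.length);
            if (n <= 0) {
                return -1; // end of input
            }
            refills++;
            pos = 0;
            end = n;
        }
        return buf[pos++];
    }

    /** Reads everything and reports the text plus the number of refills. */
    static String drain(String text, int size) {
        TinyBufferedReader reader = new TinyBufferedReader(new StringReader(text), size);
        StringBuilder sb = new StringBuilder();
        try {
            for (int c = reader.read(); c != -1; c = reader.read()) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb + " (" + reader.refills + " refills)";
    }

    public static void main(String[] args) {
        System.out.println(drain("abcdefgh", 4)); // prints abcdefgh (2 refills)
    }
}
```

Eight chars are handed out one by one, but the underlying reader only delivers data twice; that gap is the whole benefit of the buffer.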
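Let's take readLine as an example and analyze it in detail.
readLine reads the text one line at a time, treating \r\n, \r or \n as the end of a line. The rough sequence is:
1. Some caller invokes readLine.
2. checkNotClosed() verifies that this BufferedReader has not been closed.
3. If pos == end, fillBuf() is called; if it returns -1, readLine returns null right away.
4. If the buffer is not empty, readLine scans buf until it hits \n (or \r), records that position, and returns the chars before it as a String.
5. If the current buffer contains no terminator, its chars are appended to a StringBuilder and fillBuf() is called again, until a terminator or the end of the input is reached.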
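The observable result of these steps can be checked with a small sketch. LineEndingsDemo and lines are illustrative names; the input deliberately mixes all three terminators and ends without one.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class LineEndingsDemo {
    /** Collects every line that readLine() returns until it reports null. */
    static List<String> lines(String text) {
        List<String> result = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                result.add(line); // terminators are never part of the line
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(lines("a\nb\r\nc\rd")); // prints [a, b, c, d]
    }
}
```

Note that null is returned only at the end of the input; an empty line comes back as an empty String.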
The annotated code is as follows:
/**
 * Returns the next line of text available from this reader. A line is
 * represented by zero or more characters followed by {@code '\n'},
 * {@code '\r'}, {@code "\r\n"} or the end of the reader. The string does
 * not include the newline sequence.
 * Reads the text line by line, using \r\n, \r or \n as the terminator.
 *
 * @return the contents of the line or {@code null} if no characters were
 *         read before the end of the reader has been reached.
 * @throws IOException
 *             if this reader is closed or some other I/O error occurs.
 */
public String readLine() throws IOException {
    synchronized (lock) {
        checkNotClosed(); // make sure the reader has not been closed
        /* has the underlying stream been exhausted? (is there anything left to read?) */
        if (pos == end && fillBuf() == -1) {
            return null;
        }
        int count = 0;
        for (int charPos = pos; charPos < end; charPos++) {
            // charPos is the index in buf that is currently being examined
            count++;
            if (count > 1024 * 9) {
                // my addition: give up after 9 * 1024 chars without a line break
                // (buf holds chars, not bytes, so this is 9216 characters)
                Log.e("TAG", "OOps, the line is too large!" + this.getClass().getName());
                return null;
            }
            char ch = buf[charPos];
            if (ch > '\r') {
                // cannot be \n or \r, keep scanning
                continue;
            }
            if (ch == '\n') { // the line ends with \n
                String res = new String(buf, pos, charPos - pos);
                pos = charPos + 1;
                return res;
            } else if (ch == '\r') { // the line ends with \r, possibly followed by \n
                String res = new String(buf, pos, charPos - pos);
                pos = charPos + 1;
                if (((pos < end) || (fillBuf() != -1))
                        && (buf[pos] == '\n')) {
                    pos++;
                }
                return res;
            }
        }

        char eol = '\0';
        StringBuilder result = new StringBuilder(80);
        /* Typical Line Length */

        result.append(buf, pos, end - pos);
        while (true) {
            // no terminator in the buffer: keep calling fillBuf and appending
            // until a terminator is found or the end of the file is reached
            count++; // in this loop count goes up once per refill
            if (count > 1024 * 9) {
                Log.e("TAG", "OOps, the line is too large!" + this.getClass().getName());
                return null;
            }
            pos = end;

            /* Are there buffered characters available? */
            if (eol == '\n') {
                return result.toString();
            }
            // attempt to fill buffer
            if (fillBuf() == -1) {
                // end of input: return what has been collected so far,
                // characters or null.
                return result.length() > 0 || eol != '\0'
                        ? result.toString()
                        : null;
            }
            for (int charPos = pos; charPos < end; charPos++) {
                // pos and end have changed after the refill; keep looking for a terminator
                char c = buf[charPos];
                if (eol == '\0') {
                    if ((c == '\n' || c == '\r')) {
                        eol = c;
                    }
                } else if (eol == '\r' && c == '\n') {
                    if (charPos > pos) {
                        result.append(buf, pos, charPos - pos - 1);
                    }
                    pos = charPos + 1;
                    return result.toString();
                } else {
                    if (charPos > pos) {
                        result.append(buf, pos, charPos - pos - 1);
                    }
                    pos = charPos;
                    return result.toString();
                }
            }
            if (eol == '\0') {
                // still no terminator: append the whole buffer and go around again
                result.append(buf, pos, end - pos);
            } else {
                // the terminator was the last char in the buffer: append everything before it
                result.append(buf, pos, end - pos - 1);
            }
        }
    }
}
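One side effect of the length check above is that returning null for an overlong line looks the same to the caller as the end of the file. As an alternative that needs no copy of the BufferedReader source, here is a hedged sketch of a helper that enforces the limit from the outside and reports it with an exception. BoundedLineReader, its readLine helper, demo and maxChars are all made-up names, and reading char by char through read() is simpler but likely slower than the scan inside the real readLine.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class BoundedLineReader {
    /**
     * Hypothetical helper: reads one line like readLine(), but throws once
     * more than maxChars chars are seen without a terminator.
     */
    static String readLine(BufferedReader reader, int maxChars) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1) {
            if (c == '\n') {
                return sb.toString();
            }
            if (c == '\r') {
                // Swallow the '\n' of a "\r\n" pair; otherwise step back with mark/reset.
                reader.mark(1);
                if (reader.read() != '\n') {
                    reader.reset();
                }
                return sb.toString();
            }
            if (sb.length() >= maxChars) {
                throw new IOException("line longer than " + maxChars + " chars");
            }
            sb.append((char) c);
        }
        return sb.length() > 0 ? sb.toString() : null; // null only at end of input
    }

    /** Reads all lines and renders them as [line][line]..., or appends <error> on overflow. */
    static String demo(String text, int maxChars) {
        StringBuilder out = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = readLine(reader, maxChars)) != null) {
                out.append('[').append(line).append(']');
            }
        } catch (IOException e) {
            out.append("<error>");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo("ab\r\ncd\nefghij", 4)); // prints [ab][cd]<error>
    }
}
```

Whether an exception or a truncated line is the better reaction depends on the business case; the point of the sketch is only that the caller can tell the two situations apart.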
/**
 * Populates the buffer with data. It is an error to call this method when
 * the buffer still contains data; ie. if {@code pos < end}.
 * Loads text from the underlying reader into the char[] array.
 *
 * @return the number of bytes read into the buffer, or -1 if the end of the
 *         source stream has been reached.
 */
private int fillBuf() throws IOException {
    // assert(pos == end);
    // buf is allocated in the constructor; the default size is 8k
    if (mark == -1 || (pos - mark >= markLimit)) {
        /* mark isn't set or has exceeded its limit. use the whole buffer
           (no mark, or the limit has been passed: simply read up to buf.length chars) */
        int result = in.read(buf, 0, buf.length);
        if (result > 0) {
            mark = -1;
            pos = 0;
            end = result; // end becomes the number of chars actually read
        }
        return result;
    }

    if (mark == 0 && markLimit > buf.length) {
        /* the only way to make room when mark=0 is by growing the buffer
           (markLimit is larger than the current buf.length, so buf must grow) */
        int newLength = buf.length * 2;
        if (newLength > markLimit) {
            newLength = markLimit;
        }
        char[] newbuf = new char[newLength];
        System.arraycopy(buf, 0, newbuf, 0, buf.length);
        buf = newbuf;
    } else if (mark > 0) {
        // mark > 0: drop everything before mark so the data from mark onwards starts at index 0
        /* make room by shifting the buffered data to left mark positions */
        System.arraycopy(buf, mark, buf, 0, buf.length - mark);
        pos -= mark; // pos and end move left by mark so they still point at the same chars
        end -= mark;
        mark = 0;
    }

    /* Set the new position and mark position */
    int count = in.read(buf, pos, buf.length - pos);
    if (count != -1) {
        end += count;
    }
    return count;
}
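The growing branch of fillBuf() cannot be observed directly, because the buffer is private, but its effect can. The sketch below (MarkGrowthDemo, readTwice and take are illustrative names) creates a reader with an 8-char buffer, sets a mark with a limit of 20, reads 15 chars and then resets. That only works if the reader kept more than 8 chars around.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class MarkGrowthDemo {
    /** Marks, reads n chars, resets, reads n chars again; returns both runs. */
    static String readTwice(String text, int bufferSize, int readAheadLimit, int n) {
        try (BufferedReader reader = new BufferedReader(new StringReader(text), bufferSize)) {
            reader.mark(readAheadLimit); // limit may be larger than the buffer itself
            String first = take(reader, n);
            reader.reset();              // back to the marked position
            String second = take(reader, n);
            return first + "/" + second;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads up to n chars one at a time. */
    private static String take(BufferedReader reader, int n) throws IOException {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            int c = reader.read();
            if (c == -1) {
                break;
            }
            sb.append((char) c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints 0123456789abcde/0123456789abcde
        System.out.println(readTwice("0123456789abcdefghij", 8, 20, 15));
    }
}
```

How the extra room is obtained (doubling as in the code above, or some other strategy) is an implementation detail; the sketch relies only on the documented mark/reset contract.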
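A plain-language explanation of mark() and reset() in Java
mark works like a bookmark: it places a marker in the buffer of this BufferedReader, and a later call to reset() takes you back to the marked spot. mark takes an int argument that tells the reader how many chars you want to be able to read while the mark stays valid. Once you have read more than that, the reader is allowed to invalidate the mark, and you should neither be surprised nor blame it. This is tied to the buffer: the longer the distance you ask for, the larger the buffer the reader has to allocate to keep your mark.
// e.g.
// reader is a BufferedReader
reader.mark(50); // ask for the mark to stay valid for 50 chars; the reader makes sure the buffer can hold at least 50 chars
int a = reader.read(); // reads one char
int b = reader.read(); // reads another char
// after some processing we find that we need to read them again
reader.reset();
reader.read(); // returns the same char as a
reader.read(); // returns the same char as b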
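The "mark may stop being valid" part can be tried out with a small sketch. MarkLimitDemo and tryReset are illustrative names. Staying within the limit must work; what happens after reading past the limit is left to the implementation, so the second call in main is shown without an expected result (with the fillBuf() code shown earlier it would fail, because the mark is dropped on the next refill).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class MarkLimitDemo {
    /** Marks, reads charsToRead chars, then reports whether reset() still works. */
    static String tryReset(String text, int bufferSize, int readAheadLimit, int charsToRead) {
        try (BufferedReader reader = new BufferedReader(new StringReader(text), bufferSize)) {
            reader.mark(readAheadLimit);
            for (int i = 0; i < charsToRead; i++) {
                reader.read();
            }
            try {
                reader.reset();
                return "reset ok, next char: " + (char) reader.read();
            } catch (IOException e) {
                // The mark was invalidated; this is allowed once the limit is exceeded.
                return "reset failed";
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Within the limit: prints reset ok, next char: a
        System.out.println(tryReset("abcdefghij", 4, 2, 1));
        // Past the limit: the outcome depends on the implementation.
        System.out.println(tryReset("abcdefghij", 4, 2, 5));
    }
}
```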