Gap buffer
A gap buffer in computer science is a dynamic array that allows efficient insertion and deletion operations clustered near the same location. Gap buffers are especially common in text editors, where most changes to the text occur at or near the current location of the cursor. The text is stored in a large buffer in two contiguous segments, with a gap between them for inserting new text. Moving the cursor involves copying text from one side of the gap to the other (sometimes copying is delayed until the next operation that changes the text). Insertion adds new text at the end of the first segment; deletion deletes it.
Text in a gap buffer is represented as two strings, which take very little extra space and which can be searched and displayed very quickly, compared to more sophisticated data structures such as linked lists. However, operations at different locations in the text and ones that fill the gap (requiring a new gap to be created) may require copying most of the text, which is especially inefficient for large files. The use of gap buffers is based on the assumption that such recopying occurs rarely enough that its cost can be amortized over the more common cheap operations. This makes the gap buffer a simpler alternative to the rope for use in text editors[1] such as Emacs.[2]
Example
Below are some examples of operations with buffer gaps. The gap is represented by the empty space between the square brackets. This representation is a bit misleading: in a typical implementation, the endpoints of the gap are tracked using pointers or array indices, and the contents of the gap are ignored; this allows, for example, deletions to be done by adjusting a pointer without changing the text in the buffer. It is a common programming practice to use a semi-open interval for the gap pointers, i.e. the start-of-gap points to the invalid character following the last character in the first buffer, and the end-of-gap points to the first valid character in the second buffer (or equivalently, the pointers are considered to point "between" characters).
Initial state:
This is the way [ ]out.
User inserts some new text:
This is the way the world started [ ]out.
User moves the cursor before "started"; system moves "started " from the first buffer to the second buffer.
This is the way the world [ ]started out.
User adds text filling the gap; system creates new gap:
This is the way the world as we know it [ ]started out.
See also
- Dynamic array, the special case of a gap buffer where the gap is always at the end
- Zipper (data structure), conceptually a generalization of the gap buffer.
- Linked list
- Circular buffer
- Rope (computer science)
- Piece table - data structure used by Bravo and Microsoft Word
References
- Mark C. Chu-Carroll. "Gap Buffers, or, Don’t Get Tied Up With Ropes?" ScienceBlogs, 2009-02-18. Accessed 2013-01-30.
- emacs gap buffer info Accessed 2013-01-30.
External links
- Implementation in C
- Overview and implementation in .NET/C#
- Brief overview and sample C++ code
- Implementation of a cyclic sorted gap buffer in .NET/C#
- Use of gap buffer in early editor. (First written somewhere between 1969 and 1971)
- emacs gap buffer info(Emacs gap buffer reference)
- Flexichain: An editable sequence and its gap-buffer implementation
- Text showdown: Gap Buffers vs Ropes