Feb 10, 2008

Array vs. List

Memory allocation
Most often, arrays are static, with their size defined upon creation. Additionally, the memory allocated for arrays is contiguous. Therefore, they are typically used when the maximum number of elements is known at design time. The drawback to this approach is that large arrays require large amounts of memory, which may go unused, especially those designed for a maximum number of elements that will often not approach their capacity. And on some platforms, such as certain handheld devices that use older operating systems, memory constraints could limit the size of the arrays you can use.

On the other hand, linked lists are usually dynamic. They can grow and shrink as needed at runtime. Due to this trait, linked lists are more appealing when the number of elements is unknown. Also, the linked list memory is allocated on an element-by-element basis and thus is rarely contiguous. The downside to being able to deal with uncertainty is that adding and deleting elements to linked lists requires more overhead than merely assigning values to preallocated array elements. But the only limits on how much memory may be allocated to a linked list are imposed by the size of the memory heap used by the application.

Accessing elements
The elements within arrays are accessed by their indices. Thus, data access is easy and fast if you know which element to retrieve. If you don’t know the index of the element needed, but the elements are sorted based on some key value, you can perform highly efficient search algorithms to locate specific elements. These algorithms allow only a minimal number of comparisons to locate a unique element. There are also several established and efficient algorithms for sorting and merging arrays. However, arrays are inefficient when the ordering of their elements is likely to change. Maintaining a sorted array upon element deletion or insertion could require the transfer of every element in the array.

Linked lists are usually traversed element by element until a match is found. Because the memory for linked lists is not guaranteed to be contiguous, this list traversal is the only method for searching the list (without involving the use of other data structures as indices). The upside of noncontiguous memory is that reordering the list simply involves manipulating the links. Insertion or deletion of an element requires only a couple of pointer modifications. The transfer of the actual data isn’t required at all.

Breaking the rules
Using language-specific constructs may allow for the best of both worlds. With C, pointers to variables or objects can be used as arrays of the corresponding type if they are pointed to the first element in an allocated array. This allows a pointer to be used as an array, but when resizing is necessary, the realloc() function allocates a new block of memory and transfers all existing elements to the new location. This technique allows for dynamic resizing of an array while maintaining contiguous memory and element indexing.

With Java, the provided linked-list class offers an indexed linked list that supports all of the standard list methods (top, next, previous, etc.) as well as indexed operation. The indexOf(), get(), and set() methods allow array-like access to the elements of the list. Additionally, Java provides an ArrayList class that represents a resizable-array implementation of the list class. Both of these classes support methods for returning true arrays from their list representations.

Programming languages continue to become more advanced, and there is less distinction between the various types of data implementations as their structures expand to include the strengths and correct the deficiencies found in the standard models. However, it will always be important to remember where these structures originated and how they are still used within the newer classes. Although these newer implementations hide the details from the programmer, the computational overhead and resources required do not change.

Making the decision
If your data is best represented using a multidimensional structure, or the number of elements is known in advance and will remain consistent, an array is best. If your data is easily represented in one dimension, and the number of elements is unknown or is expected to change often throughout the operation of your program, a linked list is more efficient.

If your data will be searched and accessed often but will change infrequently, the array offers the least overhead for your expected operations. If you expect to be regularly adding or subtracting elements, especially if you need to maintain a sorted order, the versatility of the linked list will be of greater benefit.

In the end, your data and your operational needs will decide which structure is best for your application.

No comments: