List in C#: Implementation and Features

List is one of the most used collections in C#. Let’s inspect the features of the list and see how parts of it are implemented.

this article is about list<टी> From system.collection.generic namespace. More specifically, the article describes some features of the internal implementation and list<टी>, This is the most commonly used collection in C#, and it’s not just my opinion – that’s what Andrew Troelsen, Philippe Japiks and Jon Skeet write about in their books. And it makes sense: list<टी> It’s easy to work with. It is quite flexible and thus covers most of the daily tasks for developers. It also has a large number of helper methods. The capabilities of collections are further expanded with the help of LINQ.

inside the list<टी>

source code of list<टी> The class is available on GitHub. This means that we can see its implementation. Let us analyze the most important aspects.

list<टी> A class is a sequentially and dynamically resizable list of objects. under the hood, list<टी> Based on an array.

list<टी> The square has three main areas:

  1. Tea[] _Belongings is an internal array. The list is created based on this array.
  2. int _size Stores information about the number of items in the list.
  3. int _version Stores the version of the archive.

How to add an item to a list

The list is dynamically resizable. What happens when we add an item to the list?

public void Add(T item)
{
  _version++;
  T[] array = _items;
  int size = _size;
  if ((uint)size < (uint)array.Length)
  {
    _size = size + 1;
    array[size] = item;
  }
  else
  {
    AddWithResize(item);
  }
}

First, the value of _Edition The area has increased by one. We will examine it a little later in this article to see why this happens. After that, two local variables are created – array with elements of Tea type and Shape Of Integer type. These variables are assigned the values ​​of the appropriate fields. Next, if there is still space for an element in an array, the array element is replaced with size + 1 index. If there is no more space left in an array, then AddWithResize method is called.

private void AddWithResize(T item)
{
  Debug.Assert(_size == _items.Length);
  int size = _size;
  Grow(size + 1);
  _size = size + 1;
  _items[size] = item;
}

right here Grow The method is called to increase the current size of the internal array. Then, the same actions are performed as add Method to add elements if available space is left.

Let’s take a closer look at the Grow method:

private void Grow(int capacity)
{
  ....

  int newcapacity = _items.Length == 0 ? DefaultCapacity : 2 * _items.Length;

  if ((uint)newcapacity > Array.MaxLength) newcapacity = Array.MaxLength;

  if (newcapacity < capacity) newcapacity = capacity;

  Capacity = newcapacity;
}

algorithm of Grow The method is as follows:

  • If an internal array is empty, the capacity of the list is four, otherwise the length of the array is doubled;
  • If the new capacity value exceeds the maximum possible length of the array, the capacity will be equal to Array.MaxLength,
  • If the new value of the collection capacity is less than the current capacity, the new capacity will be equal to the current capacity;
  • eventually, new capacity has written to ability Property.

Why do we need the _version field?

why do we need _Edition Field whose value changed add method? As mentioned above, this field allows you to track the inventory version. The value of this field is checked when the list is enumerated. For example, let’s see For each method:

public void ForEach(Action<T> action)
{
  ....
  int version = _version;

  for (int i = 0; i < _size; i++)
  {
    if (version != _version)
    {
      break;
    }
    action(_items[i]);
  }

  if (version != _version)
    ThrowHelper
      .ThrowInvalidOperationException_InvalidOperation_EnumFailedVersion();
}

Before the start of the calculation, the value of _Edition The field is saved in a variable. If the list is changed during the traversal, the traversal is stopped, and System.InvalidOperationException is thrown. _Edition Fields are tracked as list<टी>.Counter, So, if we change while traversing the list For eachThis will result in an exception being thrown as well.

ability

list<टी> There is a constructor that takes a number (initial capacity) as the first argument.

List<int> list = new List<int>(8);

If a developer knows in advance what list size they need, they can set it. This eliminates unnecessary copy operations and memory allocation for a new array when a new item is added.

By the way, we can also manage the size of the internal array by using ability Property:

Let’s look at the code for this property:

public int Capacity
{
  get => _items.Length;
  set
  {
    if (value < _size)
    {
      ThrowHelper.ThrowArgumentOutOfRangeException(....);
    }

    if (value != _items.Length)
    {
      if (value > 0)
      {
        T[] newItems = new T[value];
        if (_size > 0)
        {
          Array.Copy(_items, newItems, _size);
        }
        _items = newItems;
      }
      else
      {
        _items = s_emptyArray;
      }
    }
  }
}

Received returns the accessor _item.length The value, that is, the length of the internal array.

group The accessor works as follows:

  • If value If the number of items in the collection is less than that, an exception is thrown.
  • If value is not equal to the length of the internal array and value is greater than zero, a new array whose capacity is equal to value has been made;
  • If the number of items in the list is greater than zero, the items from the old array are copied to the new one;
  • If value is null, then an empty array is assigned to the field, which is an internal array.

Other features of the list ways

insert

The insert method allows us to insert an item into the collection only within the bounds of the collection. If the number of items in the collection is equal to the size of the inner array, the array’s capacity is increased via grow(_size+1) way. If we try to insert an item by an index which is greater than inventory count, System.ArgumentOutOfRangeException is thrown.

List<string> list = new List<string>() { "1", "2"};
list.Insert(1, "10"); // OK
list.Insert(2, "15"); // OK
list.Insert(10, 12); // throw exception

Such behavior persists even with explicit management of the size of the internal array.

look at the example:

List<string> list = new List<string>() { "1", "2"};
list.Capacity = 8;
list.Insert(3, "3");

ability The property is assigned eight, and the internal array is resized. However, this does not allow us to insert an item in a position greater than inventory count, As a result of executing the code, an exception is thrown.

clean

This method clears the collection. As a result of this operation, to count property will be null. Items of reference type get a default value. If collection items are structures and have fields of reference type, these fields also get a default value. Note that the size of the internal array remains unchanged. if before clean call ability property was equal to eight, after that clean, The size of the array remains equal to eight. To clear the memory allocated for the array, we need to call trimaccess Later clean,

trimaccess

This method makes the size of the internal array equal to the number of items in the list. We should use it when we know that no more items will be added to the collection.

list.Clear();
list.TrimExcess();

sort and order

There are several differences between these two methods:

  • sort falls under the law list<टी> class, while order by LINQ to is an extension method;
  • sort method modifies the initial collection, while order by returns the sorted copy with IOrderedEnumerable type;
  • order by The method performs static sorting; does not serialize. if we use sort method, equivalent objects can be reordered.

a bit about performance

list vs ArrayList

list<टी> have in common. This means that when we create a list, we have to specify what types of objects it works with.

List<string> list = new List<string>();

Jeffrey Richter in his book “CLR Via C#” describes the following advantages of generics:

  • source code security;
  • security type;
  • cleaner code;
  • better performance.

In this same book (beginning of 12th chapter) there is a good example of comparison list<टी> And array list, its non-generic analogue. The essence of this test is to add an item to the list and assign the same item to a variable ten million times.

Below is an example of a test array list With a value type:

public void ValueTypeArrayList()
{
  ArrayList a = new ArrayList();
  for (Int32 n = 0; n < _count; n++)
  {
    a.Add(n);
    Int32 x = (Int32)a[n];
  }
}

was tested with items of value (int32) and reference (string) type.

After rewriting the code given in the book and testing it with benchmark.net, I got the following results:

We can see from the results that list<टी> works with algorithms int32 much faster than array list with int32, 13 times faster! In addition, memory is allocated four times less list<टी>,

Due to the fact that many packings are made during the operation array list The number of operations, garbage collection also increases. And getting an item requires unpacking. All this leads to a decrease in performance.

The difference with reference types is insignificant because there are no packing and unpacking operations, and these operations are heavy. Depending on the code, a small difference in speed appears due to the type conversion operation.

Benefits of assigning value to capacity

As mentioned above, if a developer knows the size of the list in advance, they can specify it.

Let’s do a little test.

public void ListWithoutCapacity()
{
  for (int i = 0; i < Count; i++)
  {
    List<int> list = new List<int>();
    for (int j = 0; j < Length; j++)
    {
      list.Add(j);
    }
  }
}

Here, 150,000 items are added list, Let’s do this operation 1000 times. Then, compare the performance with the same method but with the specified abilitywhich is equal to the number of addition operations.

The results show that the time taken to execute the method without ability Twice more than the preset. Also, in this case the memory is allocated four times as much. Such operations eliminate the 17 unnecessary copy operations on each iteration of the outer loop.

What’s the fastest way to determine if a list contains items?

Let’s take three options for determining that the list is not empty:

  1. Use to count LINQ to method and compare result to 0;
  2. Use to count Compare property and result to 0;
  3. Use any LINQ to Extension Method.

After testing, we get the following results for a list of 1 500 000 items:

Accessing the Fastest Option to count property since it returns the value of _Shape Farm.

to count The method tries to convert the initial collection ICollection, If the conversion is successful, the method returns the value of to count Property. If the conversion fails, we have to traverse the entire collection to count the number of elements. Luckily, list<टी> This is the interface.

any method returns Truth If it finds at least one element in the collection.

conclusion

we can safely say that list<टी> There is a more user friendly version of arrays. For example, it is more convenient to work with a list when the number of sequence elements is unknown in advance.

There are many more collections in C# which help developers in their work. Some of them are more specific, and some of them less so. I hope this article helps you too, and makes your understanding of lists a little better :).

Leave a Comment