C# Tips and Tricks
LINQ is a .NET library for processing sequences of items. It is handy when you need to convert data from one format to another, or otherwise shuffle and adjust lists and arrays. It also facilitates and encourages a more functional style in which side effects and mutability are de-emphasized. We use it extensively in CKAN.
using System.Linq;
A lambda is a function without a name: it accepts zero or more parameters and usually returns a value. Lambdas are useful for passing predicates or callbacks to other code without having to define a full named method in a class every time. LINQ code typically makes heavy use of lambdas, because most LINQ functions accept other functions as parameters to specify how to do their work.
There are multiple ways to create a lambda in C#, but in CKAN we usually use the => operator. First comes a new variable name for the parameter, then the => operator, followed by the expression to return. If you need more complex logic than can be fit into a simple expression, you can use a code block enclosed in curly braces.
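// Func<int, int> (from the System namespace) takes an int and returns an int.
// var does not work here: the compiler cannot infer the type of x on its own.
Func<int, int> myLambda = x => x + 2;
Func<int, int> myLongLambda = x =>
{
    int y = 10;
    y = y * x;
    return x + y;
};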
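As a minimal sketch of passing a lambda to other code, the example below defines a hypothetical helper named ApplyTwice (it is not part of CKAN or .NET) that accepts a Func<int, int> and calls it twice. The lambda at the call site is converted to the delegate type that the parameter declares.

```csharp
using System;

public static class LambdaSketch
{
    // Hypothetical helper for illustration: applies f to value, then applies f to the result
    public static int ApplyTwice(Func<int, int> f, int value)
    {
        return f(f(value));
    }

    public static int Example()
    {
        // x => x + 2 becomes a Func<int, int> because that is what ApplyTwice expects
        return ApplyTwice(x => x + 2, 10);   // (10 + 2) + 2 = 14
    }
}
```

Calling LambdaSketch.Example() returns 14. This is the same mechanism LINQ functions rely on when they accept a predicate or a selector.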
The default return type for most LINQ functions is IEnumerable<T>. This interface represents a generic sequence of elements of some type and is implemented by arrays and by all of the common generic collection classes such as List<T>. You can call a LINQ function on any object of those types.
For Dictionary<KeyType, ValueType> objects, T is KeyValuePair<KeyType, ValueType>. You can use LINQ to treat a dictionary as a sequence, but each "element" will be made up of some key and value in a pair structure.
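A small sketch of what this looks like in practice, using a made-up dictionary of names and ages (the data and the class name are illustrative assumptions). Each lambda parameter is a KeyValuePair<string, int>, so the key and value are read through its Key and Value properties.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DictionarySketch
{
    // Illustrative data only
    private static readonly Dictionary<string, int> Ages = new Dictionary<string, int>
    {
        { "Ada",   36 },
        { "Grace", 85 },
        { "Alan",  41 },
    };

    // Sum the values: each element "pair" is a KeyValuePair<string, int>
    public static int TotalAge()
    {
        return Ages.Sum(pair => pair.Value);   // 36 + 85 + 41 = 162
    }

    // Build one string per entry from its key and value
    public static List<string> Labels()
    {
        return Ages.Select(pair => pair.Key + " is " + pair.Value).ToList();
    }
}
```

Note that a Dictionary does not promise any particular enumeration order, so code like Labels() should not rely on the entries coming back in insertion order.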
If there are elements in a sequence that you want to exclude, you can use Where to select which ones to keep:
int[] numbers = new int[] { 0, 1, 2, 3, 4, 5, 6, 7, 8 };
var odds = numbers.Where(element => element % 2 == 1);
If you want to replace each element in a sequence with a value generated from it, you can use Select to apply an expression to each element, similar to map in other languages:
int[] numbers = new int[] { 0, 1, 2, 3, 4, 5, 6, 7, 8 };
var squares = numbers.Select(element => element * element);
You can ensure that a sequence has no duplicated elements simply by calling Distinct:
int[] withDuplicates = new int[] { 0, 1, 2, 1, 3, 1, 4, 1, 5 };
var nonDuplicated = withDuplicates.Distinct();
You can also group identical or similar elements with GroupBy, which returns a sequence of groups, each of which is a subsequence of the original plus a Key property identifying the group:
int[] withCommonSquares = new int[] { -4, -3, -2, -1, 0, 1, 2, 3, 4 };
var groupedBySquares = withCommonSquares.GroupBy(element => element * element);
foreach (var group in groupedBySquares)
{
    Console.WriteLine("Processing group {0}", group.Key);
    Console.WriteLine("Elements: {0}", string.Join(", ", group));
}
The OrderBy function can be used to rearrange the elements of a sequence according to the value of some expression based on each element.
string[] unsorted = new string[] { "Einstein", "Bohr", "Feynman", "Planck", "Maxwell" };
var sortedBySecondChar = unsorted.OrderBy(element => element[1]);
To convert a LINQ expression to a specific type of collection, several helper functions are provided:
return original.ToList();
return original.ToArray();
return original.ToDictionary(element => element.MakeKey(),
                             element => element.MakeValue());
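For a concrete version of these conversions, here is a short sketch over an illustrative array of names (the data and class name are assumptions). ToDictionary takes one lambda to compute each key and another to compute each value.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ConversionSketch
{
    // Illustrative data only
    private static readonly string[] Names = { "Bohr", "Einstein", "Planck" };

    // Materialize the lengths into a List<int>: 4, 8, 6
    public static List<int> Lengths()
    {
        return Names.Select(name => name.Length).ToList();
    }

    // Key each name to its length: "Bohr" -> 4, "Einstein" -> 8, "Planck" -> 6
    public static Dictionary<string, int> LengthByName()
    {
        return Names.ToDictionary(name => name,
                                  name => name.Length);
    }
}
```

One thing to keep in mind is that ToDictionary throws an ArgumentException if two elements produce the same key, so it may be worth calling Distinct or GroupBy first when duplicates are possible.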
Since LINQ functions work with IEnumerable<T> sequences, and also return those same sequences, it's possible to call a LINQ function directly on the return value of another LINQ function. You can exploit this to do a great deal of complex processing of a sequence in very few lines:
return original.Where(x => x.IsGoodElement())
               .Select(x => x.ImportantProperty())
               .Distinct()
               .OrderBy(x => x.SortingProperty())
               .ToArray();
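To make the chain above concrete, the following sketch runs the same sequence of calls over a small illustrative array of integers (the data and class name are assumptions), with the intermediate contents noted at each step.

```csharp
using System.Linq;

public static class ChainSketch
{
    public static int[] DistinctEvenSquares()
    {
        int[] numbers = { 5, 3, 8, 3, 6, 1, 8, 2 };

        return numbers.Where(n => n % 2 == 0)   // 8, 6, 8, 2
                      .Select(n => n * n)       // 64, 36, 64, 4
                      .Distinct()               // 64, 36, 4
                      .OrderBy(n => n)          // 4, 36, 64
                      .ToArray();
    }
}
```

Each call receives the sequence produced by the previous one, so the steps read top to bottom in the order they are applied.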
You may be surprised to learn that when most of the above examples execute, no calculations are performed! By default, most LINQ functions are lazily evaluated; rather than returning a simple list of all elements, they return a special sequence object whose enumerator will generate the elements as needed.
This has an important consequence for performance: Enumerators only generate as many elements as are actually requested! So if you have a potentially large sequence, but you only need the first few elements, LINQ will only calculate the first few elements, meaning you do not pay CPU cycles for the ones you won't use, and you don't have to write any special "stop early" logic.
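One way to see this for yourself is to count how often a lambda actually runs. The sketch below (class and variable names are illustrative) builds a Select over a million numbers, then asks for only the first three results with Take.

```csharp
using System.Linq;

public static class LazySketch
{
    // Returns { evaluations before enumerating, evaluations after taking 3 elements }
    public static int[] CountEvaluations()
    {
        int evaluations = 0;

        // Building the query does not run the lambda at all
        var squares = Enumerable.Range(1, 1000000)
                                .Select(n => { evaluations++; return (long)n * n; });
        int before = evaluations;   // 0

        // Only the three requested elements are ever calculated
        var firstThree = squares.Take(3).ToList();
        int after = evaluations;    // 3

        return new[] { before, after };
    }
}
```

Even though the source has a million elements, the lambda runs three times, and it does not run at all until ToList starts pulling elements.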
The yield return statement can be used to write a lazily evaluated function without LINQ. Be careful, though: sequence elements are not remembered after they are provided to your code! If your function does something expensive at the start and then returns the elements lazily, the expensive part will be performed again every time the sequence is enumerated from the start!
var mySequence = myLazyFunc();
// 1. First evaluation happens here
if (mySequence.Any())
{
    Console.WriteLine("OK");
}
// 2. Second evaluation happens here
foreach (var elt in mySequence)
{
    // Do stuff with elt
}
// 3. Third evaluation happens here
var stuff = mySequence.Select(x => x.Something()).ToList();
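The repeated evaluation can be made visible with a counter. In this sketch, MyLazyFunc is a hypothetical stand-in for myLazyFunc above, and incrementing SetupRuns stands in for the expensive work at the start of the function.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class YieldSketch
{
    public static int SetupRuns = 0;

    // Hypothetical lazy function: the body only runs when someone enumerates the result
    public static IEnumerable<int> MyLazyFunc()
    {
        SetupRuns++;   // pretend this line is expensive
        for (int i = 1; i <= 3; i++)
        {
            yield return i;
        }
    }

    public static int CountSetupRuns()
    {
        SetupRuns = 0;
        var mySequence = MyLazyFunc();   // nothing has run yet

        bool any = mySequence.Any();     // 1. first evaluation

        int total = 0;
        foreach (var elt in mySequence)  // 2. second evaluation
        {
            total += elt;
        }

        var doubled = mySequence.Select(x => x * 2).ToList();   // 3. third evaluation

        return SetupRuns;   // 3
    }
}
```

Each of the three uses asks the sequence for a fresh enumerator, which restarts the function body from its first line.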
You can reduce this to a single evaluation by creating a normal list with ToList. Of course this means you lose the benefits of lazy evaluation as well, as the entire sequence will be calculated and stored; you may want to try the CKAN extension function Memoize to get the best of both worlds: an enumerator that only generates as many elements as needed and only calculates them once even if you re-use them.
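To illustrate the idea behind memoizing a sequence, here is a rough sketch of a caching wrapper. It is an assumption about how such a helper could work, not CKAN's actual Memoize implementation, and the MemoizedSequence and MemoizeSketch names are hypothetical. It is not thread-safe and never disposes the source enumerator, which a production version would need to handle.

```csharp
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch: pulls from the source only when needed and remembers what it has seen
public sealed class MemoizedSequence<T> : IEnumerable<T>
{
    private readonly IEnumerator<T> source;
    private readonly List<T> cache = new List<T>();
    private bool exhausted = false;

    public MemoizedSequence(IEnumerable<T> sequence)
    {
        // Getting the enumerator does not generate any elements yet
        source = sequence.GetEnumerator();
    }

    public IEnumerator<T> GetEnumerator()
    {
        for (int i = 0; ; i++)
        {
            if (i >= cache.Count)
            {
                // Past the end of the cache: ask the source for one more element
                if (exhausted || !source.MoveNext())
                {
                    exhausted = true;
                    yield break;
                }
                cache.Add(source.Current);
            }
            yield return cache[i];
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

public static class MemoizeSketch
{
    // Returns { calls after First(), calls after ToList(), calls after a second ToList() }
    public static int[] CountCalls()
    {
        int calls = 0;
        var lazy = new[] { 1, 2, 3 }.Select(x => { calls++; return x * 10; });
        var memo = new MemoizedSequence<int>(lazy);

        int first = memo.First();   // generates one element
        int afterFirst = calls;     // 1
        var all = memo.ToList();    // reuses the cached element, generates the other two
        int afterList = calls;      // 3
        var again = memo.ToList();  // served entirely from the cache
        int afterAgain = calls;     // 3

        return new[] { afterFirst, afterList, afterAgain };
    }
}
```

The lambda runs once per element no matter how many times the wrapper is enumerated, and elements that are never requested are never calculated.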