Getting Started with LINQ (3/3)

Language Integrated Query or LINQ is a C# feature that allows you to query different data sources using a unified language syntax.

In the first two parts we learned what LINQ is and when to use it, then went through everyday operations: projection, filtering, sorting, set operations, aggregation, and so on. If you’re not familiar with those, I suggest taking a peek:

Getting Started with LINQ (1/3)
Getting Started with LINQ (2/3)

In this final part we’ll go over methods to generate, partition, compare and join collections. Once again, we’ll be working with a List imported from JSON. Here, you can acquire the data as well as learn how to import it into C#.

GENERATORS

Empty Collection

Let’s kick things off with creating empty collections. This is one way to declare an empty Students collection:

var emptyCollection = Enumerable.Empty<Student>();

Alternatively you can do this too:

var emptyCollection = new List<Student>();

In this example, imagine you have to return an empty collection from a method. Here is one way to do it:

public IEnumerable<Student> GetStudents()
{
    return Enumerable.Empty<Student>();
}

Or declare a new empty List when the return type is List<T>:

public List<Student> GetStudents()
{
    return new List<Student>();
}

If you’re using C# 12 (which ships with .NET 8) or later, you can shorten this with a collection expression []. This works for both IEnumerable<T> and List<T>:

public IEnumerable<Student> GetStudents()
{
    return [];
}

public List<Student> GetStudents()
{
    return [];
}

This also works when declaring empty collections:

List<Student> emptyCollection = [];

Range

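This operator generates a sequence of consecutive integers. The first argument is the starting value and the second is the number of elements to generate (not the end value):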

IEnumerable<int> fiveNumbers = Enumerable.Range(1, 5);
// [1, 2, 3, 4, 5]
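Because the range above starts at 1, the count and the last value happen to be the same number. Here is a minimal sketch with a different start value (the variable name fromFive is just for illustration) that makes the difference visible:

```csharp
using System;
using System.Linq;

// Range(start, count): the second argument is how many numbers to produce,
// not the value to stop at.
int[] fromFive = Enumerable.Range(5, 3).ToArray();

Console.WriteLine(string.Join(", ", fromFive)); // 5, 6, 7
```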

We can also use Range to generate a new collection of students:

IEnumerable<Student> newStudents = Enumerable.Range(1, 3).Select(count =>
{
    // creates a new Student for each count (1 to 3)
    return new Student()
    {
        ID = count,
        Name = "Placeholder",
        Country = "Placeholder",
        Age = count * 10
    };
});

Repeat

The Repeat operator generates a sequence that contains one repeated value. For example, let’s create five identical students. Note that Student is a class, so all five elements reference the same object:

var repeatTimes = 5;

var csharpStudent = Enumerable.Repeat(new Student()
{
    ID = 1,
    Name = "C#",
    Country = string.Empty
}, repeatTimes);
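One thing worth knowing about Repeat with reference types is that it repeats the reference, not a copy. The sketch below uses a List<string> as a stand-in for Student (an assumption made only to keep the example self-contained) and shows how to get separate objects when you need them:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// A List<string> stands in for Student here; both are reference types.
var template = new List<string> { "C#" };
List<List<string>> copies = Enumerable.Repeat(template, 3).ToList();

// All three slots point at the same object, so one change shows up everywhere.
copies[0].Add("LINQ");
Console.WriteLine(copies[2].Count);                       // 2
Console.WriteLine(ReferenceEquals(copies[0], copies[2])); // True

// To get distinct objects, build a new one per element instead.
List<List<string>> distinct = Enumerable.Range(0, 3)
    .Select(_ => new List<string> { "C#" })
    .ToList();
Console.WriteLine(ReferenceEquals(distinct[0], distinct[2])); // False
```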

PARTITION

Take, TakeLast, TakeWhile

The Take operator is used to limit the number of items in the collection.

IEnumerable<string> firstThreeNames = students.Select(s => s.Name).Take(3);
// [“Mirza”, “Armin”, “Alan”]

IEnumerable<string> lastThreeNames = students.Select(s => s.Name).TakeLast(3);
// [“Eddy”, “Abdurahman”, “Amy”]

If the number specified is greater than the number of elements in the collection, Take simply returns all the elements without throwing an error.

The TakeWhile operator returns elements from the start of the collection for as long as the condition holds, and stops at the first element that fails it. That’s why we sort by age first. In this example, we limit the collection to students under 20 years old.

IEnumerable<string> teens = students
.OrderBy(s => s.Age) // order from youngest to oldest
.TakeWhile(s => s.Age < 20) // take younger than 20. ignore the rest
.Select(s => s.Name); // take only names
// [“Mirza”, “Farook”, “Alan”, “Eddy”, “Abdurahman”]
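To see why the sorting matters, here is a small sketch on an unsorted array of ages (made-up values, not taken from students.json) that compares TakeWhile with Where:

```csharp
using System;
using System.Linq;

// Hypothetical, unsorted ages.
int[] ages = { 18, 25, 17, 19 };

// TakeWhile stops at the first element that fails the condition (25).
int[] prefix = ages.TakeWhile(a => a < 20).ToArray();

// Where checks every element, so 17 and 19 are kept as well.
int[] filtered = ages.Where(a => a < 20).ToArray();

Console.WriteLine(string.Join(", ", prefix));   // 18
Console.WriteLine(string.Join(", ", filtered)); // 18, 17, 19
```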

Skip, SkipLast, SkipWhile

The Skip operator ignores the specified number of elements from the start of the collection and returns the rest. SkipLast does the same from the end.

IEnumerable<string> lastFive = students.Skip(5).Select(s => s.Name);
// [“Raj”, “Nihad”, “Eddy”, “Abdurahman”, “Amy”] (remaining five)

IEnumerable<string> firstFive = students.SkipLast(5).Select(s => s.Name);
// [“Mirza”, “Armin”, “Alan”, “Seid”, “Farook”]

SkipWhile skips elements for as long as the condition holds and returns the rest. Here we’re ignoring all the students that are under 20.

IEnumerable<string> twentyAndOver = students
    .OrderBy(s => s.Age)
    .SkipWhile(s => s.Age < 20)
    .Select(s => s.Name);
// ["Armin", "Raj", "Nihad", "Seid", "Amy"]

The Skip and Take operators are commonly used when creating API pagination.
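A minimal pagination sketch, assuming a 1-based page number and a fixed page size (pageNumber and pageSize are illustrative names, and the array is a shortened stand-in for the students collection), could look like this:

```csharp
using System;
using System.Linq;

// Shortened stand-in for the students collection and hypothetical paging values.
string[] names = { "Mirza", "Armin", "Alan", "Seid", "Farook", "Raj", "Nihad" };
int pageNumber = 2; // 1-based
int pageSize = 3;

string[] page = names
    .Skip((pageNumber - 1) * pageSize) // skip the items on earlier pages
    .Take(pageSize)                    // take one page worth of items
    .ToArray();

Console.WriteLine(string.Join(", ", page)); // Seid, Farook, Raj
```

Since Take doesn’t throw when fewer items remain, the last page simply comes back shorter.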

EQUALITY

SequenceEqual

The SequenceEqual operator compares two collections element by element, in order, to determine if they’re equal or not. Let’s demonstrate that with a simple example:

string[] countries = { "Bosnia", "UK", "Turkey" };
string[] countries2 = { "Bosnia", "UK", "Turkey" };

var isEqual = countries.SequenceEqual(countries2);
// true

If we changed the order in either collection, the result would be false.

string[] countries = { "Bosnia", "UK", "Turkey" };
string[] countries2 = { "Turkey", "Bosnia", "UK" };

var isEqual = countries.SequenceEqual(countries2);
// false

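Back to the students collection, we can pick the students from the same country in two different ways and compare the results. Keep in mind that TakeWhile stops at the first student who isn’t from Bosnia, so the result below is true only if every student from Bosnia sits at the start of the list:

// I created this to avoid writing the predicate `s => s.Country == "Bosnia"` twice
Func<Student, bool> isFromBosnia = s => s.Country == "Bosnia";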

var studentsFromBosnia = students.TakeWhile(isFromBosnia);
var studentsFromBosnia2 = students.Where(isFromBosnia);

var isEqual = studentsFromBosnia.SequenceEqual(studentsFromBosnia2);
// true
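SequenceEqual also has an overload that accepts an IEqualityComparer<T>. As a rough sketch (the shuffled array is made up for this example), sorting both sides and passing a case-insensitive comparer lets us compare collections regardless of order and casing:

```csharp
using System;
using System.Linq;

string[] countries = { "Bosnia", "UK", "Turkey" };
string[] shuffled = { "turkey", "bosnia", "uk" }; // hypothetical second list

// Different order and casing: the default comparison fails.
bool strict = countries.SequenceEqual(shuffled);

// Sort both sides and compare case-insensitively.
bool relaxed = countries
    .OrderBy(c => c, StringComparer.OrdinalIgnoreCase)
    .SequenceEqual(
        shuffled.OrderBy(c => c, StringComparer.OrdinalIgnoreCase),
        StringComparer.OrdinalIgnoreCase);

Console.WriteLine(strict);  // False
Console.WriteLine(relaxed); // True
```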

JOINS

In this section we’ll look at various ways to combine collections.

Zip

The Zip operator in LINQ pairs elements from two collections based on their positions (indexes). Let’s create a new collection that will be paired with the countries collection:

int[] countryCodes = { 387, 44, 20, 90, 91, 86, 1 };
var distinctCountries = students.DistinctBy(s => s.Country).Select(s => s.Country);

Now let’s merge the two using Zip:

var countriesMerge = countryCodes.Zip(distinctCountries);
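The single-argument Zip returns a sequence of tuples with First and Second elements. Here is a small sketch with shortened sample arrays (the variable names are illustrative) that also shows what happens when the two collections differ in length:

```csharp
using System;
using System.Linq;

// Shortened sample values; the second array is one element shorter on purpose.
int[] codes = { 387, 44, 90 };
string[] names = { "Bosnia", "UK" };

// Zip(second) yields (First, Second) tuples.
var pairs = codes.Zip(names).ToList();

foreach (var (code, name) in pairs)
{
    Console.WriteLine($"{name}: +{code}");
}
// Bosnia: +387
// UK: +44

// Zip stops at the end of the shorter sequence, so 90 is left unpaired.
Console.WriteLine(pairs.Count); // 2
```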

Concat

The Concat operator is used to concatenate (join) multiple collections together.

int[] nums = { 1, 2, 3, 4, 5 };
int[] newNums = { 100, 2, 300, 4, 500 };

var totalNums = nums.Concat(newNums);

The Concat operator seems similar to Union. Both operators join collections. However, there are some differences when using Concat:

No duplicate elements are removed
The order is preserved
The second collection is added to the end of the first
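To make the differences concrete, here is a quick side-by-side sketch using the same two arrays (the result variable names are just for illustration):

```csharp
using System;
using System.Linq;

int[] nums = { 1, 2, 3, 4, 5 };
int[] newNums = { 100, 2, 300, 4, 500 };

// Concat appends everything; Union drops values it has already seen.
int[] concatenated = nums.Concat(newNums).ToArray();
int[] united = nums.Union(newNums).ToArray();

Console.WriteLine(string.Join(", ", concatenated));
// 1, 2, 3, 4, 5, 100, 2, 300, 4, 500
Console.WriteLine(string.Join(", ", united));
// 1, 2, 3, 4, 5, 100, 300, 500
```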

Let’s take a collection of students’ ages and combine it with another, randomly generated ages collection.

var studentsAges = students.Select(s => s.Age);

// Create the generator once and reuse it for every element
var random = new Random();
int minAge = 65;
int maxAge = 100;

var eldersAges = Enumerable.Range(1, 10)
    // Random age from 65 to 99 (the upper bound of Next is exclusive)
    .Select(_ => random.Next(minAge, maxAge));

IEnumerable<int> combinedAges = studentsAges.Concat(eldersAges);
int totalAges = combinedAges.Count(); // 20

SelectMany

In the students.json file we’re using, we know that each student object has a classes property, which represents an array of objects. How can we access those?

{
    "ID": 1,
    "Name": "Mirza",
    "Age": 18,
    "Country": "Bosnia",
    "Classes": [
        {
            "ID": 1,
            "Title": "CAD"
        },
        {
            "ID": 2,
            "Title": "IT"
        }
    ]
}

Bad way

The first choice would be to use the Select() projection operator:

var classesList = students.Select(s => s.Classes);

Since s.Classes is a collection as well, the variable classesList is of type IEnumerable<List<Classes>>. To get the list of titles we need to loop through the outer collection and then through each inner collection:

var classesList = new List<string>();

foreach (var stud in students)
{
    foreach (var cl in stud.Classes)
    {
        classesList.Add(cl.Title);
    }
}

However, there is a simpler way.

Better way

Using the SelectMany() projection operator we can drill into the inner array with ease.

IEnumerable<string> classTitles = students
    .SelectMany(s =>
        s.Classes.Select(c => c.Title)
    );

SelectMany() projects each student to its inner collection and then flattens all of those inner collections into a single sequence.
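The difference between the two operators is easiest to see side by side. This is a minimal sketch that uses tuples instead of the Student class, with a made-up second student, so it can run on its own:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Tuples stand in for Student; the second entry is hypothetical.
var students = new[]
{
    (Name: "Mirza", Classes: new[] { "CAD", "IT" }),
    (Name: "Armin", Classes: new[] { "Math" }),
};

// Select keeps the nesting: one string[] per student.
IEnumerable<string[]> nested = students.Select(s => s.Classes);

// SelectMany flattens the inner arrays into one sequence of titles.
List<string> flat = students.SelectMany(s => s.Classes).ToList();

Console.WriteLine(nested.Count());          // 2
Console.WriteLine(string.Join(", ", flat)); // CAD, IT, Math
```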

If we expand our student object with a new Hobbies property that contains a List:

public class Student {
    ….
    public List<string> Hobbies { get; set; } = [];
}

And then add a few to one of our students:

students.First().Hobbies = new List<string> { "Games", "Hiking", "Blogging" };

We can easily extract it once again with SelectMany():

var hobbies = students.SelectMany(s => s.Hobbies);

As opposed to doing:

var hobbiesList = new List<string>();

foreach (var stud in students)
{
    foreach (var hob in stud.Hobbies)
    {
        hobbiesList.Add(hob);
    }
}

Alternative SelectMany

SelectMany() also has an overload that accepts two parameters:

The first parameter is again a function that selects the inner collection we’re trying to extract
The second is a function that receives each element of the original collection together with each element of its inner collection, and returns the result

var data = students.SelectMany(
    s => s.Hobbies,
    // element of the original collection, element of the inner collection
    (student, hobby) => hobby // return whatever shape you need here
);

Let’s use this to create a combination of student names and hobbies:

var hobbies = students.SelectMany(
    s => s.Hobbies,
    (student, hobby) => new { Name = student.Name, Hobby = hobby }
);

The output contains one item per hobby: the name of the student followed by that hobby.
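Here is a self-contained sketch of the same idea, again with tuples standing in for Student and a made-up second student who has no hobbies:

```csharp
using System;
using System.Linq;

// Tuples stand in for Student; the second entry is hypothetical.
var students = new[]
{
    (Name: "Mirza", Hobbies: new[] { "Games", "Hiking" }),
    (Name: "Armin", Hobbies: Array.Empty<string>()),
};

// The result selector runs once per (student, hobby) pair.
var lines = students
    .SelectMany(s => s.Hobbies, (student, hobby) => $"{student.Name}: {hobby}")
    .ToList();

lines.ForEach(Console.WriteLine);
// Mirza: Games
// Mirza: Hiking
```

A student with an empty Hobbies collection produces no pairs, so they don’t appear in the output at all.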

Join

The Join operator is used to create a combination of two collections. For this example I created a new countries collection that we’ll join with the students collection.

public class Country
{
    public int ID { get; set; }
    public string Name { get; set; }
    public string CapitalCity { get; set; }
    public string Continent { get; set; }
}

var countries = new List<Country>
{
    new Country() { ID = 1, Name = "Bosnia", CapitalCity = "Sarajevo", Continent = "Europe" },
    new Country() { ID = 2, Name = "UK", CapitalCity = "London", Continent = "Europe" },
    new Country() { ID = 3, Name = "Egypt", CapitalCity = "Cairo", Continent = "Africa" },
    new Country() { ID = 4, Name = "Turkey", CapitalCity = "Ankara", Continent = "Asia" },
    new Country() { ID = 5, Name = "India", CapitalCity = "New Delhi", Continent = "Asia" },
    new Country() { ID = 6, Name = "China", CapitalCity = "Beijing", Continent = "Asia" },
    new Country() { ID = 7, Name = "USA", CapitalCity = "Washington", Continent = "North America" },
    // Countries below have no students:
    new Country() { ID = 8, Name = "Croatia", CapitalCity = "Zagreb", Continent = "Europe" },
    new Country() { ID = 9, Name = "Serbia", CapitalCity = "Belgrade", Continent = "Europe" },
};

All students have a country property and we’ll use that to link the two collections. Here is a basic join:

var studentsCountriesJoin = students.Join(
    countries,
    student => student.Country,
    country => country.Name,
    (student, country) => (student: student.Name, continent: country.Continent)
);

Let’s clarify what happened here.

students is the outer collection that is joined with the inner collection (countries). That’s the students.Join(countries) part.

Then we determine which properties to join on. We join the two on the country name:

student => student.Country,
country => country.Name,

// SQL equivalent
ON Student.Country = Country.Name

Then the result selector receives each matching pair (student, country).

And then we decide what we’re going to return. In this case it’s a tuple with two named elements, student and continent:

( student: student.Name, continent: country.Continent )

The outcome of the join

var studentsCountriesJoin = students.Join(
    countries,
    student => student.Country,
    country => country.Name,
    (student, country) => (student: student.Name, continent: country.Continent)
);

is a collection of (student, continent) tuples, one for each student whose country has a match in the countries collection.

LINQ also allows joining by multiple properties as well as applying multiple Joins. More on that in the video.
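As a rough sketch of joining by multiple properties, both key selectors can return an anonymous type with the same property names and types. The people and campuses collections below are invented purely for this example:

```csharp
using System;
using System.Linq;

// Hypothetical data: the city name repeats across countries,
// so a single-property key would produce wrong matches.
var people = new[]
{
    (Name: "Ana", City: "Cambridge", Country: "UK"),
    (Name: "Bob", City: "Cambridge", Country: "USA"),
};
var campuses = new[]
{
    (City: "Cambridge", Country: "UK", Campus: "East"),
    (City: "Cambridge", Country: "USA", Campus: "West"),
};

var matches = people.Join(
    campuses,
    p => new { p.City, p.Country }, // composite key on the outer side
    c => new { c.City, c.Country }, // same property names and types on the inner side
    (p, c) => $"{p.Name} -> {c.Campus}")
    .ToList();

matches.ForEach(Console.WriteLine);
// Ana -> East
// Bob -> West
```

This works because anonymous types compare by the values of their properties, so two keys with the same City and Country are treated as equal.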

Join & Group

Let’s again join students and countries and then group students by the continents they’re from. The desired structure will look like:

{
    "Europe": [List of students where continent is "Europe"],
    "Africa": [List of students where continent is "Africa"],

}

var studentsByContinents = students
    .Join(
        countries,
        student => student.Country,
        country => country.Name,
        // Note we do not need to write { Name = student.Name, Continent = country.Continent }
        // C# infers the property names for us
        (student, country) => new { student.Name, country.Continent }
    )
    // Now comes the grouping part
    .GroupBy(g => g.Continent)
    .ToDictionary(
        // The continent is the key
        g => g.Key,
        // Value is the list of student names
        g => g.Select(sc => sc.Name).ToList());

GroupJoin

The GroupJoin operator is used to group elements from the second sequence (right side) that match each element from the first sequence (left side). It produces a hierarchical result set.

To get started, let’s look again at our join of students and countries.

var studentsCountriesJoin = students.Join(
    countries,
    student => student.Country,
    country => country.Name,
    (student, country) => new { student.Name, Country = country.Name, country.Continent }
);

We know that the output here is going to be a collection containing each student’s name, country and continent. Now let’s see the GroupJoin:

var studentsGroupedByCountry = countries.GroupJoin(
    students,
    country => country.Name,
    student => student.Country,
    (country, studentsGroup) =>
        new { country.Continent, Country = country.Name, Students = studentsGroup.Select(s => s.Name) }
);

Let’s analyze what happened here:

First of all, we can see that we grouped students by country while joining
Second, Students is a nested collection of student names for each country, not a flat list of names
Most importantly, we have countries without students. There are no students from the countries at the bottom of the list, and GroupJoin() includes them anyway with an empty Students collection.

In SQL terms,

The Join() operator represents the INNER JOIN as it produces a result where only the matching elements from both sequences are included (only the students and the countries with students).
The GroupJoin() operator is the closest thing to a LEFT OUTER JOIN as it produces all records from the outer collection (countries), together with the matching records from the inner collection (students). As we can see, all countries are in the result, even those without matching students.
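GroupJoin() returns one row per country with a nested group, whereas a SQL LEFT OUTER JOIN returns flat rows. One common way to flatten the result is to follow GroupJoin() with SelectMany() and DefaultIfEmpty(). Here is a minimal sketch with simplified, partly made-up data (the student Sam and the "(no students)" placeholder are assumptions for this example):

```csharp
using System;
using System.Linq;

// Simplified stand-ins for the countries and students collections.
string[] countries = { "Bosnia", "UK", "Croatia" };
var students = new[]
{
    (Name: "Mirza", Country: "Bosnia"),
    (Name: "Sam", Country: "UK"), // hypothetical student
};

var rows = countries
    .GroupJoin(
        students,
        country => country,
        student => student.Country,
        (country, matches) => (Country: country, Matches: matches))
    .SelectMany(
        // DefaultIfEmpty supplies a placeholder when a country has no students
        x => x.Matches.Select(s => s.Name).DefaultIfEmpty("(no students)"),
        (x, name) => $"{x.Country}: {name}")
    .ToList();

rows.ForEach(Console.WriteLine);
// Bosnia: Mirza
// UK: Sam
// Croatia: (no students)
```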

If we apply grouping again, we get essentially the same students-per-continent result we had above with Join & Group:

var studentsGroupedByCountry = countries.GroupJoin(
    students,
    country => country.Name,
    student => student.Country,
    (country, studentsGroup) => new { country.Continent, Students = studentsGroup.Select(s => s.Name) }
);

var groupedByContinent = studentsGroupedByCountry
    .GroupBy(x => x.Continent)
    .Select(g => new
    {
        Continent = g.Key,
        Students = g.SelectMany(x => x.Students).ToList()
    })
    .ToList();

Wrapping Up

That’s all I wanted to share on LINQ. If you learned something new don’t forget to hit the follow button. Also, follow me on Twitter to stay up to date with my upcoming content.

Bye for now 👋
