LINQ stands for Language Integrated Query, a technology that integrates data-querying capabilities directly into the C# language.
What does that mean? In short, you can use query syntax to filter, order, aggregate, or group data with a minimal amount of code, so you often do not have to write an ordering or grouping mechanism of your own from scratch. You “only” need to write the query itself to be able to use the results.
Let’s dive right into it by using an example:
    using System;
    using System.Linq;

    // Data source
    var customers = new[]
    {
        new { FirstName = "Kalia", LastName = "Burt" },
        new { FirstName = "Christine", LastName = "Carver" },
        new { FirstName = "Tanisha", LastName = "Dotson" },
        new { FirstName = "Josiah", LastName = "Morales" },
        new { FirstName = "Venus", LastName = "Hubbard" },
        new { FirstName = "Jelani", LastName = "Miles" },
        new { FirstName = "Rooney", LastName = "Herrera" },
        new { FirstName = "Bert", LastName = "Moon" },
        new { FirstName = "Bort", LastName = "Moon" }
    };

    // Query expression: keep only the customers whose last name is "Moon".
    var relevantCustomers =
        from customer in customers
        where customer.LastName == "Moon"
        select customer;

    // Execute the query.
    foreach (var customer in relevantCustomers)
    {
        Console.WriteLine($"{customer.LastName}, {customer.FirstName}");
    }

    // Output:
    // Moon, Bert
    // Moon, Bort
See how easy that was? From our data, we selected the relevant customers with a single query expression. No helper methods, and no need for custom filtering code.
But what is really awesome about LINQ is that you can use the same basic expression against multiple LINQ-enabled data sources. If you are an experienced developer, you have probably seen things like:
    var post = db.Posts.Where(o => o.Id == 1).FirstOrDefault();
And you guessed correctly, that is a LINQ expression. You don’t need to have the results in memory already to be able to use LINQ: given a suitable LINQ provider, a LINQ query can retrieve data from a SQL database or an XML file.
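To make the "same expression, different data source" idea concrete, here is a minimal sketch that runs the familiar from/where/select shape over an XML document instead of an array, using LINQ to XML (`System.Xml.Linq`). The XML content and the element and attribute names are made up for this illustration; they do not come from a real system.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical XML data, invented for this sketch.
var xml = XDocument.Parse(
    "<customers>" +
    "<customer firstName='Bert' lastName='Moon' />" +
    "<customer firstName='Bort' lastName='Moon' />" +
    "<customer firstName='Venus' lastName='Hubbard' />" +
    "</customers>");

// Same query shape as the in-memory example, but the source is XML elements.
var moons =
    from customer in xml.Descendants("customer")
    where customer.Attribute("lastName")?.Value == "Moon"
    select customer.Attribute("firstName")?.Value;

// Materialize the results so they can be inspected more than once.
var moonNames = moons.ToList();

foreach (var name in moonNames)
{
    Console.WriteLine(name);
}

// Output:
// Bert
// Bort
```

Only the data source and the way a field is read differ from the array version; the query keywords stay the same.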
But we might be getting a bit ahead of ourselves. Let’s first look at the various ways to write a LINQ expression. The first is the Query syntax, which is the most readable and often the easiest to understand:
    var students = new[]
    {
        new { Name = "Fallon Norris", Grade = 4, Email = "fallonnorris7107@outlook.ca", Zip = "52462" },
        new { Name = "Zenaida Pace", Grade = 5, Email = "zenaidapace8936@protonmail.ca", Zip = "55596" },
        new { Name = "Dora Fry", Grade = 2, Email = "dorafry@google.net", Zip = "57232" },
        new { Name = "Basil Farrell", Grade = 3, Email = "basilfarrell4451@icloud.com", Zip = "41748" },
        new { Name = "Cairo Galloway", Grade = 3, Email = "cairogalloway1741@hotmail.ca", Zip = "34176" }
    };

    // Query
    var goodStudents =
        from student in students
        where student.Grade > 3
        orderby student.Grade descending
        select student;

    // Execute the query.
    foreach (var student in goodStudents)
    {
        Console.WriteLine($"{student.Name}: {student.Grade}");
    }

    // Output:
    // Zenaida Pace: 5
    // Fallon Norris: 4
Then, we have the Method syntax. You will probably see and use this syntax most often, at least that has been my experience, even if I find the Query syntax more naturally readable.
Yes, you might scoff at this notion, but just wait until we come to the more complex joins and aggregates.
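The important thing to stress is that the performance is identical, because the compiler translates Query syntax into the same method calls that you would write in Method syntax.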
    var averageStudents = students
        .Where(s => s.Grade <= 3)
        .OrderByDescending(s => s.Grade);

    // Execute the query.
    foreach (var student in averageStudents)
    {
        Console.WriteLine($"{student.Name}: {student.Grade}");
    }

    // Output:
    // Basil Farrell: 3
    // Cairo Galloway: 3
    // Dora Fry: 2
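To see that the two syntaxes are interchangeable, the following sketch writes the same filter-and-sort once in Query syntax and once in Method syntax, then compares the results. The `grades` array is a simplified stand-in invented for this example, not the `students` data above.

```csharp
using System;
using System.Linq;

// Hypothetical sample data, invented for this sketch.
var grades = new[] { 4, 5, 2, 3, 3 };

// Query syntax.
var viaQuery =
    from g in grades
    where g > 3
    orderby g descending
    select g;

// Method syntax: the same filter and ordering as explicit method calls.
var viaMethods = grades
    .Where(g => g > 3)
    .OrderByDescending(g => g);

// SequenceEqual compares both results element by element, in order.
bool sameResults = viaQuery.SequenceEqual(viaMethods);

Console.WriteLine(sameResults);                 // True
Console.WriteLine(string.Join(", ", viaQuery)); // 5, 4
```

Which form to use is therefore a question of readability, not of behavior.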
And lastly, we can also have Mixed syntax. Mixed syntax means that we use both Query syntax and Method syntax in the same query to achieve our results:
    // Query syntax defines the query...
    var extraordinaryStudents =
        from student in students
        where student.Grade == 5
        select student;

    // ...and Method syntax counts the results.
    int numberOfExtraordinaryStudents = extraordinaryStudents.Count();

    // Or even both in a single statement:
    int numberOfExtraordinaryStudents2 = (
        from s in students
        where s.Grade > 4
        select s
    ).Count();
In short, you can use LINQ to write queries that filter, order, or group data. It is also important to know that a query is simply a set of instructions that are executed only when the result is requested.
In the ToList() example further below, the goodStudentsList variable stores the results of the goodStudents query.
It is often good practice to store the results in a new variable, because LINQ uses what is called Deferred Query Execution. When you write a query, the results are not stored in the query variable; only the query definition is stored there. The query is evaluated, and the results produced, only when it is executed.
So, when is a query executed? In this example, the query is executed when we iterate over the query variable:
    var averageStudents = students
        .Where(s => s.Grade <= 3)
        .OrderByDescending(s => s.Grade);

    // Execute the query.
    foreach (var student in averageStudents)
    {
        Console.WriteLine($"{student.Name}: {student.Grade}");
    }
In this example, the query is executed when the ToList() method is called:
    // Query
    var goodStudents =
        from student in students
        where student.Grade > 3
        orderby student.Grade descending
        select student;

    // Execution
    var goodStudentsList = goodStudents.ToList();
In short, LINQ is lazy: a query is only evaluated when its results are needed. As you can imagine, that lets us reuse the same query to get different results after the source data changes. It can also lead to performance problems. If you need the list of students in multiple places and iterate over the query variable twice, the filtering is applied twice as well.
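Both effects can be made visible with a small sketch that counts how often the filter actually runs. The `grades` list and the `predicateCalls` counter are hypothetical helpers invented for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data, invented for this sketch.
var grades = new List<int> { 4, 5, 2, 3, 3 };

int predicateCalls = 0;

// Defining the query does not run the filter yet.
var goodGrades = grades.Where(g =>
{
    predicateCalls++; // count every time the filter is applied
    return g > 3;
});
int callsAfterDefinition = predicateCalls;      // 0

// Each enumeration re-runs the filter against the current source data.
int firstCount = goodGrades.Count();            // 2 (the grades 4 and 5)
grades.Add(5);
int secondCount = goodGrades.Count();           // 3, the new grade is picked up
int callsAfterTwoEnumerations = predicateCalls; // 5 + 6 = 11

// ToList() executes the query once and stores the results.
var snapshot = goodGrades.ToList();             // 6 more filter calls
grades.Add(4);                                  // does not affect the stored list
int snapshotCount = snapshot.Count;             // still 3
int callsAfterSnapshot = predicateCalls;        // 17

Console.WriteLine($"{callsAfterDefinition}, {firstCount}, {secondCount}");             // 0, 2, 3
Console.WriteLine($"{callsAfterTwoEnumerations}, {snapshotCount}, {callsAfterSnapshot}"); // 11, 3, 17
```

Reading from `snapshot` no longer triggers the filter, which is why materializing with ToList() helps when the same results are needed in several places.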
I hope this short introduction to LINQ was at least a bit useful. In any case, it was necessary, as I plan to amuse myself by writing more complex LINQ queries, in the hope that they solve some of the more common problems in filtering, sorting, and aggregating data. Until next time, I wish you all the best.