MichaelvdV@Atos

Exploring lesser known C# Language features Part I

Blog Post created by MichaelvdV@Atos on Jun 27, 2011

Many of us use Microsoft .NET in our daily efforts to create applications for the PI system. There still seems to be a roughly 50/50 split between VB.NET and C# developers. I myself am an avid C# developer, and I really like the language. The fact that I’m not a VB.NET programmer has nothing to do with the language itself; it has more to do with my history as a developer.

C# (and .NET in general) has a lot of really nice features, and it keeps getting better: every release brings major new functionality. Examples are the introduction of generics in .NET 2.0, WCF and WPF in .NET 3.0, LINQ and lambdas in .NET 3.5, and now dynamic typing in .NET 4.0.

Often, these major changes are accompanied by smaller ones, and many of the smaller changes exist to make the bigger ones possible. For instance, LINQ (a big change) relies heavily on ‘Extension Methods’ (a smaller change). Lambdas are also a very big part of LINQ, and they are becoming more and more widely accepted. Embedding a more functional programming paradigm into an object-oriented language seems to have been a very smart decision that works really well for us programmers.

In this blog post series, I would like to look at some of the lesser-known constructs in C#. It is sometimes difficult to decide which ones count as ‘lesser known’. You may already be familiar with some of them, but I’m sure there will be a few that are new to you!

The ‘yield’ keyword

I think this is a prime example of a lesser-known, and therefore less-used, construct that could really affect your programming style in a good way.

The yield keyword is used in what we call iterators. An iterator can be a method, get accessor or operator that returns a sequence that can be enumerated. A very good example would be an implementation of the ‘GetEnumerator()’ method, which is the method that gets called when you use a foreach statement to loop over a collection (for instance an array or List<T>). You can only use the yield statement in a member whose return type is IEnumerable, IEnumerable<T>, IEnumerator or IEnumerator<T>.
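To make the ‘GetEnumerator()’ case a bit more concrete, here is a minimal sketch of a collection class that implements it with ‘yield return’. The ‘Countdown’ class and its members are made up for this illustration; they are not part of the samples in this post.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical collection type, used only for illustration.
public class Countdown : IEnumerable<int>
{
    private readonly int _start;

    public Countdown(int start)
    {
        _start = start;
    }

    // Iterator block: the compiler generates the IEnumerator<int> for us.
    public IEnumerator<int> GetEnumerator()
    {
        for (int i = _start; i >= 0; i--)
            yield return i;
    }

    // The non-generic version simply forwards to the generic one.
    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

public static class CountdownDemo
{
    public static void Main()
    {
        // foreach calls GetEnumerator() behind the scenes.
        foreach (var number in new Countdown(3))
            Console.WriteLine(number); // writes 3, 2, 1 and 0, one per line
    }
}
```

Without ‘yield’, we would have to write a separate enumerator class that keeps track of the current position by hand.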

Let’s say we want to create a method that returns a collection of data. A simple example would be:

        static void Main(string[] args)
        {
            var sinusoid = GetSinus(1, 10);
            Console.WriteLine("Looping");
            foreach (var number in sinusoid)
                Console.WriteLine(number);
            Console.ReadLine();
        }

        // Builds the complete list of sine values before returning it
        public static List<double> GetSinus(double from, double to)
        {
            List<double> returnList = new List<double>();
            Console.WriteLine("Getting sinus");
            for (double i = from; i < to; i++)
            {
                var sin = Math.Sin(i);
                returnList.Add(sin);
            }

            return returnList;
        }

As you can see, to create a collection of sine values, we first create a new empty List<double>. We use a ‘for’ loop to go through the input range, and we add every calculated sine value to the list. Once we are done, we return the entire list.

If we run this code, the output will be:

[Screenshot: 4466.output1.png]

Let’s introduce a method that does the same, but uses the ‘yield’ keyword.

        public static IEnumerable<double> GetSinus(double from, double to)
        {
            Console.WriteLine("Getting sinus");
            for (double i = from; i < to; i++)
                yield return Math.Sin(i);
        }

There are several changes in this implementation. For instance, the return type is now ‘IEnumerable<double>’, and we no longer use a return list. We just loop through the range using a ‘for’ loop, and we hand back each calculated sine value using the ‘yield return’ statement.

If we run the application with our new GetSinus method, the output will be:

[Screenshot: 2474.output2.png]

Something has changed in the output… I invite you to have another look at the first code sample and at both output samples. You will notice that something is off… We will come back to that later.

In addition to the ‘yield return <expression>’ statement, we can also use ‘yield break’, as in the following example:

        public static IEnumerable<double> GetSinus(double from, double to)
        {
            Console.WriteLine("Getting sinus");
            for (double i = from; i < to; i++)
            {
                if (i > 5)
                    yield break; // ends the iteration, nothing more is returned
                else
                    yield return Math.Sin(i);
            }
        }

This example ends the iteration as soon as i > 5: ‘yield break’ leaves the for loop and ends the execution of the method, so no further values are returned. You can use this, for instance, if you trap exceptions in your iterator block and want to end the iteration when an error occurs.
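The exception scenario could look roughly like the sketch below. The ‘ParseUntilError’ method and its input are invented for this illustration. One thing to keep in mind is that C# does not allow ‘yield return’ inside a try block that has a catch clause, while ‘yield break’ inside the catch block is fine, so the risky work happens inside the try and the value is returned after it.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class YieldBreakDemo
{
    // Hypothetical helper: parses texts into doubles and stops at the first bad entry.
    public static IEnumerable<double> ParseUntilError(IEnumerable<string> texts)
    {
        foreach (var text in texts)
        {
            double value;
            try
            {
                value = double.Parse(text, CultureInfo.InvariantCulture);
            }
            catch (FormatException)
            {
                // 'yield break' is allowed in a catch block; 'yield return' is not.
                yield break;
            }

            // 'yield return' has to live outside the try/catch.
            yield return value;
        }
    }

    public static void Main()
    {
        var texts = new[] { "1.5", "2.5", "oops", "4.5" };

        // Only the first two values are written; the iteration ends at "oops".
        foreach (var value in ParseUntilError(texts))
            Console.WriteLine(value);
    }
}
```

Whether silently ending the sequence is the right reaction to an error depends on your application; in many cases, letting the exception travel up to the caller is the better choice.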

Back to the output of the two samples…

You will notice that the “Looping” and “Getting sinus” lines are written in a different order. In the second sample, “Getting sinus” is written after “Looping”. This means that the body of the method is only executed once the foreach loop runs. But how can this be? We created the variable ‘sinusoid’ and assigned it the result of the ‘GetSinus’ method before the foreach loop.

This behavior is called ‘Deferred Execution’, or sometimes ‘lazy evaluation’. It means that an expression is not evaluated until its realized value is actually needed. In this case, the values aren’t needed until the foreach loop; that’s why “Getting sinus” is written after “Looping”. If you have used LINQ or LINQ to SQL before, you may have noticed this behavior there too.

For this sample, we can avoid this behavior by changing the assignment of the ‘sinusoid’ variable to:

            var sinusoid = GetSinus(1, 10).ToList();

In this case, the realized values of the GetSinus iterator are needed immediately, because we use the .ToList() extension method (from the System.Linq namespace) to create a List<double>.

The purpose of ‘Deferred Execution’ is to improve performance when manipulating large data collections, because values are only computed when, and if, they are actually needed. This is especially true when working with multiple ‘chained’ queries or manipulations.
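A small sketch can show why chaining benefits from this. The ‘Naturals’ iterator and the ‘Produced’ counter below are invented for the illustration. The source is an endless sequence, so an eager approach could never finish, yet the chained query only pulls as many values as it needs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChainedQueriesDemo
{
    // Counts how many values the iterator actually produced.
    public static int Produced;

    // Hypothetical source: an endless sequence 1, 2, 3, ...
    public static IEnumerable<int> Naturals()
    {
        for (int i = 1; ; i++)
        {
            Produced++;
            yield return i;
        }
    }

    public static void Main()
    {
        // Nothing runs yet: this only describes the chain of operations.
        var firstEvenSquares = Naturals()
            .Where(n => n % 2 == 0)
            .Select(n => n * n)
            .Take(3);

        foreach (var square in firstEvenSquares)
            Console.WriteLine(square);   // 4, 16, 36

        // Only the values 1 through 6 were ever produced by the source.
        Console.WriteLine(Produced);     // 6
    }
}
```

Each element travels through the whole chain before the next one is requested, so no intermediate collections are created along the way.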

One effect of this lazy evaluation is that you can ‘reuse’ your expression, for instance, after your data changes. Let’s see an example of that:

        static void Main(string[] args)
        {
            //Create an array of ints
            var numbers = new int[] { 1, 5, 7, 8, 2, 7, 9, 4, 3 };

            //Create a LINQ query that selects all numbers lower than 5
            var lowNumbers = from n in numbers
                             where n < 5
                             select n;

            //Print all the low numbers
            Console.WriteLine("First time printing low numbers");
            foreach (var lowNumber in lowNumbers)
            {
                Console.WriteLine(lowNumber);
            }

            //Loop through the numbers array, and decrease every number by 2
            for (int i = 0; i < numbers.Length; i++)
            {
                numbers[i] = numbers[i] - 2;
            }

            //Print all the low numbers again, using the same lowNumbers variable
            Console.WriteLine("Second time printing low numbers");
            foreach (var lowNumber in lowNumbers)
            {
                Console.WriteLine(lowNumber);
            }

            Console.ReadLine();
        }

The output of this example will be:

[Screenshot: 2352.output3.png]

At first sight, this is a bit counter-intuitive. Shouldn’t the contents of the ‘lowNumbers’ variable stay the same throughout the whole execution of this code? This is simply how ‘Deferred Execution’ behaves.

The best way to see it is that ‘lowNumbers’ is not assigned the output of the LINQ query, but the LINQ query itself. Every time the realized values of ‘lowNumbers’ are needed, the LINQ query gets executed again.

Again, we can avoid this effect by assigning ‘lowNumbers’ like this:

            var lowNumbers = (from n in numbers
                              where n < 5
                              select n).ToList();

This way, the realized values of the LINQ query are needed immediately to create the List<int> using the .ToList() extension method.

If we do that, the output of our sample will be:

[Screenshot: 6404.output4.png]

This shows that we turned the ‘Lazy Evaluation’ into ‘Eager Evaluation’ by using the .ToList() method.

Summary

We can use the ‘yield’ statement to create iterator blocks. Iterator blocks return a sequence of type ‘IEnumerable<T>’ (or IEnumerable, IEnumerator or IEnumerator<T>).

When we use the ‘yield’ statement, we don’t need a temporary collection to hold the values we return. This means we write less code, and the code becomes more readable.

Iterator blocks use Lazy Evaluation. We can turn this into eager evaluation by realizing the values immediately, for instance with .ToList().

· In lazy evaluation, a single element of the source collection is processed during each call to the iterator. This is the typical way in which iterators are implemented.
· In eager evaluation, the first call to the iterator will result in the entire collection being processed. A temporary copy of the source collection might also be required. For example, the OrderBy method has to sort the entire collection before it returns the first element.
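The difference between the two bullets can be made visible with a small logging experiment. This is only a sketch: the ‘Source’ and ‘Trace’ helpers are invented here, and they simply record when an element is handed out by the source and when it reaches the foreach loop.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyVersusEagerDemo
{
    // Hypothetical source that logs every element it hands out.
    public static IEnumerable<int> Source(List<string> log)
    {
        foreach (var n in new[] { 3, 1, 2 })
        {
            log.Add("produce " + n);
            yield return n;
        }
    }

    // Runs a query over the logging source and records what the consumer sees.
    public static List<string> Trace(Func<IEnumerable<int>, IEnumerable<int>> query)
    {
        var log = new List<string>();
        foreach (var n in query(Source(log)))
            log.Add("consume " + n);
        return log;
    }

    public static void Main()
    {
        // Where streams: produce and consume alternate.
        // produce 3, consume 3, produce 1, consume 1, produce 2, consume 2
        Console.WriteLine(string.Join(", ", Trace(s => s.Where(n => n > 0))));

        // OrderBy buffers: everything is produced before the first consume.
        // produce 3, produce 1, produce 2, consume 1, consume 2, consume 3
        Console.WriteLine(string.Join(", ", Trace(s => s.OrderBy(n => n))));
    }
}
```

Note that OrderBy is still deferred in the sense that nothing happens until the foreach loop starts; it just has to read the whole source before it can return its first element.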

Further reading

The ‘yield’ statement on MSDN

Iterators on MSDN

Deferred Execution and Lazy Evaluation in LINQ to XML on MSDN

101 LINQ Samples: Query Execution
