From C# To Perl: Performance

Caveats

Let’s talk about the performance of C# vs. Perl. First, let me just say that a performance comparison of a statically typed, precompiled language vs. a dynamically typed scripting language is inherently unfair. Yes, .NET compiles to bytecode rather than directly to native code, but the comparison is still unfair. A statically typed, precompiled language avoids the overhead of compiling source code on startup and allows for more optimization, particularly removing virtual calls when they are not needed.

Comparison

With that being said, how much faster is C# in benchmarks? My usual source when comparing the performance of programming languages is The Computer Language Benchmarks Game. Here you can see the comparison of C# vs. Perl. One thing to note is that I am comparing the performance of the Mono implementation of C# vs. Perl, not the Microsoft .NET implementation. The performance of Mono is comparable to that of .NET these days and the benchmarks game runs its benchmarks on Ubuntu, so I’m just going to run with it. Also, while small benchmarks have severe limitations, they will serve the purpose here.

We can see three metrics in the comparison: time, memory, and code size. Various benchmarks stress each differently, but the trends are clear. First, C# is a hell of a lot faster than Perl. As I stated above, this shouldn’t surprise anybody.

The more interesting comparisons are memory and code. We can see that Mono uses 13x more memory on the mandelbrot benchmark, but anywhere from the same amount to 3x more on every other benchmark. The C# programs also require up to 5x more code than their Perl equivalents. You can see the source code for the benchmarks that required much more code in C# below.

k-nucleotide: C# Perl
reverse-complement: C# Perl

You can see that the code for the k-nucleotide benchmark is much more concise in Perl. The verbosity of C# in this case hinders the ability to quickly understand the algorithm. The same is true for reverse-complement, though not quite as badly, with Perl’s concise file-handling operations being the main difference.

You can also see the same comparison made on a quad-core 32-bit machine. The same trend holds and you can also see that Mono makes better use of multiple cores. Unfortunately, benchmarks are not available for Perl on 64-bit processors.

Does It Matter?

So what? C# is faster. How often does it really matter? If you aren’t doing heavy number crunching, it probably doesn’t. These days performance is often limited by outside factors, such as network latency or database performance. But let’s say you have identified a bottleneck in your code and it is definitely your code that is the bottleneck. How would you go about improving performance in each language?

You have similar options for both languages. In C# you can link to native code (usually a compiled C/C++ binary) or you can declare your code to be unsafe to get direct access to pointers. Either way, it’s going to introduce some pain. Integrating C# with native code always made me feel dirty and I usually ended up putting a wrapper class around it to hide the ugliness. Declaring your code to be unsafe isn’t much better. Most of its speed benefits come from getting the runtime out of the way, meaning you lose most of the benefits of C# over C/C++.

Perl gives you a few options that are mostly equivalent to those in C#. First, you can inline C code with the Inline module. This is admittedly pretty sweet for simple cases. You don’t have to use C, but if you are doing it for performance reasons you probably will. This is great if you have a single performance-sensitive algorithm in your module, because, as the name implies, you can do it inline. However, there are limitations to what you can do with this method. If you need to improve the performance of a whole module, you are best served by using XS. This is not for the faint of heart, but it will allow you to transparently call C code from within Perl.

Conclusion

What we have found is pretty much what was expected. C# is much faster, but uses more memory than Perl and is more verbose. They both have similar options to improve performance by integrating with code written in C, although Perl’s seem slightly more natural to me.

Once again though, does it really matter? Most programmers are not tackling performance-sensitive problems and hardware is cheap. If performance does matter, you’re probably better off with C#, but in the grand scheme of things, which tools your team is familiar with is the far more important factor.

Welcome

Welcome to the newly launched website for Curmudgeonly Software. The goal is for this to be an umbrella for all of my open source work. For now, you can see projects I have contributed to on the Projects page. All technical posts that formerly resided at curiouscurmudgeon.com will also be moved over at some point and all future technical posts will be posted here.

From C# To Perl: Garbage Collection

I don’t think that it is a stretch to say that every modern language that isn’t built especially for performance or systems programming uses a garbage collector. The programmer productivity gains that you get by lifting the burden of manual memory management can be large. Sure, you lose some control, but most tasks don’t require that level of control. C/C++ is always there if by chance you actually do need it. With that being said, let’s look at how the Perl and C# garbage collectors compare.

Perl employs a simple reference counting garbage collector. A reference counting garbage collector works exactly as the name implies. It keeps a count of the number of references to a piece of data on the heap. When the number of references to an object reaches zero, the collector knows that it can reclaim the data. The major limitation of this algorithm is that it cannot handle circular references, because the reference counts inside the cycle will never reach zero. Even if your application can no longer access the data, the collector will never reclaim it, leading to a memory leak. Fortunately, circular data structures don’t come up in most applications, but you must be aware when they do. Perl allows you to get around this by weakening a reference. A reference that has been weakened does not increment the reference count, allowing the collector to reclaim circular data structures when weakening is used properly.
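To make the cycle problem concrete, here is a toy model of the bookkeeping. It is written in Python only because it is compact; it is a sketch under my own assumptions, not how the Perl interpreter is actually implemented, and the names Obj, link, drop_ref, and build_pair are made up for illustration.

```python
class Obj:
    """A heap object in a toy reference-counting model (not Perl's internals)."""

    def __init__(self, name):
        self.name = name
        self.refcount = 0
        self.strong = []    # outgoing references that are counted
        self.weak = []      # outgoing weakened references, not counted
        self.freed = False


def add_ref(target):
    """Record one more counted reference to target."""
    target.refcount += 1


def drop_ref(target):
    """Drop one counted reference; reclaim the object when the count hits zero."""
    target.refcount -= 1
    if target.refcount == 0 and not target.freed:
        target.freed = True
        # Reclaiming an object releases everything it was holding on to.
        for child in target.strong:
            drop_ref(child)
        target.strong.clear()
        target.weak.clear()


def link(source, target, weak=False):
    """Store a reference from source to target, counted unless it is weak."""
    if weak:
        source.weak.append(target)
    else:
        source.strong.append(target)
        add_ref(target)


def build_pair(weaken_back_reference):
    """Create parent <-> child, then let both program variables go out of scope."""
    parent, child = Obj("parent"), Obj("child")
    add_ref(parent)     # the program variable holding parent
    add_ref(child)      # the program variable holding child
    link(parent, child)
    link(child, parent, weak=weaken_back_reference)
    drop_ref(child)     # variables go out of scope
    drop_ref(parent)
    return parent, child


leaked = build_pair(weaken_back_reference=False)
print([obj.freed for obj in leaked])    # [False, False]: the cycle leaks

fixed = build_pair(weaken_back_reference=True)
print([obj.freed for obj in fixed])     # [True, True]: both are reclaimed
```

In the first case each object is kept alive by the other, so both counts are stuck at one even though the program can no longer reach either of them. Weakening the back reference breaks the tie, which is the same idea as weakening a reference in Perl.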

Coming from C#, the simplicity of the Perl garbage collector was a bit of a surprise, because, quite frankly, I find the .NET garbage collector to be amazing. It uses generational collection, allowing for faster collection cycles that get most objects, with periodic longer collections to check older generations. The longer collection cycles use a simple mark and sweep algorithm. There are even different modes depending on whether you have a desktop or web application. It also allows you to weaken references, manage non-memory resources with finalizers (although you should be implementing the IDisposable pattern), induce collections, or get notifications when the runtime senses a full garbage collection is approaching. This is a massive topic, but if you want to learn more about .NET garbage collection than you ever wanted to, a good place to start is the blog of Maoni Stephens, who spends her days actually working on it.
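For contrast with reference counting, here is a rough sketch of the mark and sweep idea. Again this is a toy model in Python under my own assumptions, not the .NET collector: it ignores generations, compaction, and finalizers entirely, and the names Node, mark, sweep, and collect are invented for the example.

```python
class Node:
    """A heap object in a toy mark-and-sweep model."""

    def __init__(self, name):
        self.name = name
        self.refs = []        # outgoing references to other nodes
        self.marked = False


def mark(roots):
    """Mark every node reachable from the roots (iterative depth-first walk)."""
    stack = list(roots)
    while stack:
        node = stack.pop()
        if node.marked:
            continue          # already visited, so cycles terminate
        node.marked = True
        stack.extend(node.refs)


def sweep(heap):
    """Keep the marked nodes and clear their marks for the next cycle."""
    survivors = [node for node in heap if node.marked]
    for node in survivors:
        node.marked = False
    return survivors


def collect(heap, roots):
    """Run one full collection and return the new heap contents."""
    mark(roots)
    return sweep(heap)


live, kept = Node("live"), Node("kept")
live.refs.append(kept)                  # reachable from the root

a, b = Node("a"), Node("b")
a.refs.append(b)                        # a -> b
b.refs.append(a)                        # b -> a, an unreachable cycle

heap = collect([live, kept, a, b], roots=[live])
print([node.name for node in heap])     # ['live', 'kept']
```

Because the collector starts from the roots and only keeps what it can reach, the a/b cycle is thrown away without any weakening. That is the key difference from the reference counting approach.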

Put simply, the C# collector puts Perl’s to shame. Unfortunately, the programming language shootout uses the Mono implementation of C#, so a quick comparison against Perl isn’t feasible. Quick benchmarks most likely won’t exercise the generational collector very well anyway. If Perl 6 ever gets around to being ready (I’m not counting on it), Parrot will allow for pluggable garbage collection engines. I would fully expect something better than the current Perl 5 garbage collector to be an option. For now though, C# has a clear edge.

From C# To Perl: Code Analysis Tools

At my last job we ran a continuous integration server with CruiseControl.NET. As part of each build we did code analysis with FxCop. (We also ran StyleCop, but that was only to run one custom rule to address a problem we were having.) FxCop is a tool that analyzes .NET assemblies for potential design, localization, performance, and security problems. Most of these rules are pulled from Framework Design Guidelines: Conventions, Idioms, and Patterns for Reusable .NET Libraries, which serves as the de facto style guide in many .NET shops. (We used it to avoid petty formatting arguments. Whatever FDG said was the decider unless we unanimously disagreed.)

So, what are the options in Perl and how do they compare to FxCop? Luckily the Perl community has an excellent tool for code analysis as well, Perl::Critic. I am not aware of any other Perl code analysis tools. It too pulls many of its rules from a standard text, Damian Conway’s Perl Best Practices. Let’s look at how it stacks up.

One criticism of FxCop is its unwieldy XML files. Depending on the size of your code base and your configuration it is easy to end up with XML files running into the tens of megabytes. Perl::Critic, like most UNIX tools, disdains the use of XML (usually a plus) and instead outputs violations one to a line. Both tools let you configure which rules/policies to include with each run. FxCop throws these into the same massive .fxcop XML file while Perl::Critic looks for an INI-style file called .perlcriticrc in your home directory. Speaking of that massive XML file, FxCop also has a nasty habit of rewriting its XML results and configuration across runs, leading to some nasty conflicts if you have multiple developers committing changes simultaneously.

Why would you commit changes though? This is an area where FxCop comes out ahead. Both tools let you exclude violations. Perl::Critic does it through the use of “no critic” annotations. FxCop gives you more flexibility. It allows you to use attributes to suppress specific violations in any scope you can apply an attribute to. In addition, you can add exceptions directly to your XML through the GUI tool to avoid cluttering your code. You probably don’t want to try editing the XML config yourself. It’s a beast.

FxCop also has some advantages because it works on bytecode. Since all .NET languages compile to the same bytecode you can use FxCop on any .NET language. In Perl this would be similar to using Perl::Critic on Parrot bytecode to analyze any language that runs on Parrot. However, in practice this doesn’t matter too much. Most .NET code bases are homogeneous. It’s still pretty awesome though.

Any advantages that FxCop has though are blown away by the fact that Perl::Critic is open source and FxCop isn’t. FxCop doesn’t even have a published API. This means there is no supported way to write your own custom rules. Seriously. Someone explain to me why this is a good idea. There is someone trying to document the API, but it’s a poor substitute for the real thing. Perl::Critic on the other hand is open source with each policy neatly organized into its own module. Install it from CPAN and you’ll get a bunch of policies, all of which you can view the code for to learn how to write your own. And there are a lot more rules just waiting for you to try out.

In the interest of fairness, StyleCop is open source, but I don’t have a lot of experience with it. We briefly looked at using the built-in rules, but decided we didn’t really care for a lot of them. A number of these rules would fall into the type of style rules that Perl::Critic has though.

If Perl::Critic had the exclusion capabilities that FxCop has this wouldn’t even be close, but even with those limitations I still prefer it. Point Perl.

From C# To Perl: Introduction

After over three years as a C# developer I have made the jump to Perl. Judging from the dearth of information I have found comparing the two (all I could find is this awful auto-generated comparison), I don’t think this is a jump many programmers have made, so I wanted to make some comparisons myself. In future posts I will be touching on differences (there aren’t many similarities) between the two languages and the advantages and disadvantages of each. In addition I will compare the surrounding issues of culture, tooling, performance, debugging, etc. The focus will be on the C# developer coming to Perl.

Since my Perl-fu is still weak, feel free to point out any inaccuracies you see. To be fair to Perl, when comparing many language features I will be comparing C# against both bare Perl and Perl with CPAN. The malleability of Perl compared to C# makes this necessary. In addition there are some features of each language that have no direct comparison to the other, such as tied variables in Perl (I just sat here for ten minutes trying to think of some feature of C# that isn’t available in some form on CPAN. I couldn’t think of any. Point Perl.).