RESET Forums (homeservershow.com)

Compiler Performance in Azure



Note: As I generated more data for this post, I started to realize that a few of the problems I initially perceived may be less severe than I thought. See the end for some clarification.


I've been playing around with using Azure to host a build machine, hoping that I could get better performance than I was getting running it as a VM at home.  The main task I want to run is a set of scripts that builds MinGW.  I was getting reasonable performance out of my local VM, except that the last step of the build uploads a 60 MB file and I don't have the network bandwidth to do that well (my modem tells me 5 Mbps down and 896 kbps up).  Unfortunately, while I get much better internet bandwidth on Azure, the performance of extracting tarballs is not as great.


I'm running this on an A1 Basic instance.  I've done some tests to compare the OS disk and a mirror of two data disks, and while an artificial benchmark for the OS disk is much better, real-world performance for the download and extract step is about the same.  I guess when using a C compiler the most important metric for performance is IOPS, which unfortunately are somewhat low, especially on Basic VMs.  If I read correctly, I'm limited to 300 IOPS per disk, though the benchmark tool I was using (SQLIO) makes it look more like 500 IOPS per disk.


I grabbed some rough benchmarks comparing my desktop at home with the OS and data disks on the VM, using the time command.  I mostly care about wall-clock time in this case.  I tried to compare only like-for-like tasks.
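For anyone wanting to collect the same three numbers from a script instead of the shell, here is a minimal sketch of what the time command reports. The helper name `time_command` is made up for illustration, and it assumes a Unix-like environment: on Windows, `os.times()` reports zero for child-process CPU time, so only the wall-clock figure would be meaningful there.

```python
import os
import subprocess
import sys
import time


def time_command(argv):
    """Run argv and return (real, user, sys) in seconds, similar to `time`.

    Hypothetical helper for illustration. user/sys come from the CPU time
    of waited-for child processes, which os.times() only reports on
    Unix-like systems.
    """
    before = os.times()
    start = time.perf_counter()
    subprocess.run(argv, check=True)  # blocks until the command exits
    real = time.perf_counter() - start
    after = os.times()
    user = after.children_user - before.children_user
    system = after.children_system - before.children_system
    return real, user, system


if __name__ == "__main__":
    # Stand-in command; replace with the build step being measured.
    real, user, system = time_command([sys.executable, "-c", "pass"])
    print(f"real {real:.3f}s  user {user:.3f}s  sys {system:.3f}s")
```

The real figure is the wall-clock time I care about here; user and sys are CPU time spent in the command and in the kernel on its behalf.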


Extracting tarballs, applying patches, and running a few other miscellaneous commands similar to patching (this is a specific step the overall script can run); rough average of two runs:

VM OS Disk (A0 Basic) - 8:00

VM Data Disk (A0 Basic, 2 Disk Mirror) - 8:10

Desktop Data Disk (1TB WD Black) - 7:50


As a side note, the Azure VM appears to be running on a Xeon E5-2673v3 (2.4GHz).  My desktop is a Core i5-3330 (3.0GHz).


I wrote all this and then realized that maybe I'm not seeing much worse performance on Azure than I do locally; maybe I just didn't have a couple of things set up quite right.  I'm running a full build in both environments to compare now, and actually making sure I'm comparing apples to apples for once.  I usually run with -j 4 (4 compiles in parallel) on my local computer, which is not a fair comparison given that the VM only has 1 core.  I have also not been able to find a good way to monitor the performance of a virtual machine to see where the bottlenecks are.  I can see memory and CPU percentages and disk bandwidth, but I haven't found a way to watch what I think is the actual limiting parameter on the disk: IOPS.
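One rough way to watch IOPS is to sample the disk's cumulative read and write operation counters twice and divide the difference by the interval. The sketch below only shows that arithmetic; the `Snapshot` type and both function names are invented for illustration, and where the counters come from is left as an assumption.

```python
import time
from collections import namedtuple

# Hypothetical snapshot of cumulative disk operation counters.
Snapshot = namedtuple("Snapshot", ["read_count", "write_count"])


def iops(before, after, seconds):
    """Average I/O operations per second between two counter snapshots."""
    if seconds <= 0:
        raise ValueError("interval must be positive")
    reads = after.read_count - before.read_count
    writes = after.write_count - before.write_count
    return (reads + writes) / seconds


def sample(read_counters, interval=1.0):
    """Call read_counters() twice, `interval` seconds apart, and return IOPS.

    read_counters is any zero-argument callable returning an object with
    read_count and write_count attributes.
    """
    before = read_counters()
    time.sleep(interval)
    after = read_counters()
    return iops(before, after, interval)


if __name__ == "__main__":
    # Made-up counter values: 600 operations over 2 seconds.
    print(iops(Snapshot(100, 50), Snapshot(400, 350), 2.0))
```

If the third-party psutil package happens to be installed, `sample(psutil.disk_io_counters)` should work as the counter source, since its result has `read_count` and `write_count` fields. I have not verified how those counters behave on an Azure VM.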


I ran three full builds to compare, and the results aren't nearly as promising as the extraction step suggested.



Desktop

real  - 56m11.473s

user - 8m31.114s

sys   - 15m15.474s

user + sys = 23m46s - Total compute time

real - (user + sys) = 32m25s - Waiting time


VM OS Disk

real  - 209m26.616s

user - 55m45.619s

sys   - 44m37.606s

user + sys = 100m22s - Total compute time

real - (user + sys) = 109m4s - Waiting time


VM Data Disk

real  - 216m0.743s

user - 54m49.362s

sys   - 43m4.787s

user + sys = 97m54s - Total compute time

real - (user + sys) = 118m6s - Waiting time


So despite some fairly close numbers for extracting files, the full build takes roughly 3.7 to 3.8 times as long on the VM (209 to 216 minutes versus 56).  I'm still not quite sure how to interpret the user and sys times.  From what I understand, the difference between real and user + sys is the time spent waiting on other things: I/O, other tasks, and so on.  So the Azure VM spends a lot of time waiting for something, probably the disk if I had to guess.  What really confuses me is the difference in compute time.  My desktop does run at a higher frequency (3.0GHz vs. 2.4GHz), but I wouldn't expect that to make such a large difference.  I'm having trouble believing that a Core i5 (Ivy Bridge) is that much faster than a Xeon E5 v3 (Haswell), even with the frequency difference.  Actually, if both were clocked at the same speed, I'd expect the Xeon to be faster.
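To double-check the arithmetic, here is a small sketch that parses the time output above and recomputes compute time, waiting time, and the slowdown relative to the first run. The helper names are made up, and it takes the first block of timings to be the desktop run. It also treats real - (user + sys) as waiting time, which only holds for a serial build; with parallel jobs, user + sys can exceed real.

```python
import re

# Timings copied from the post: (real, user, sys).
# The first block is taken to be the desktop run (an assumption).
RUNS = {
    "Desktop": ("56m11.473s", "8m31.114s", "15m15.474s"),
    "VM OS Disk": ("209m26.616s", "55m45.619s", "44m37.606s"),
    "VM Data Disk": ("216m0.743s", "54m49.362s", "43m4.787s"),
}


def parse_time(text):
    """Convert a string like '56m11.473s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognized time format: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def fmt(seconds):
    """Format seconds as whole minutes and seconds (fractions dropped)."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}m{secs}s"


def summarize(real, user, system):
    """Return (real, compute, waiting) in seconds for one run."""
    r, u, s = parse_time(real), parse_time(user), parse_time(system)
    return r, u + s, r - (u + s)


if __name__ == "__main__":
    baseline_real, baseline_cpu, _ = summarize(*RUNS["Desktop"])
    for name, times in RUNS.items():
        real, cpu, waiting = summarize(*times)
        print(f"{name}: compute {fmt(cpu)}, waiting {fmt(waiting)}, "
              f"real x{real / baseline_real:.2f}, "
              f"compute x{cpu / baseline_cpu:.2f}")
```

Printing both ratios side by side makes it easier to see whether the slowdown is mostly in waiting time or in compute time.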


Granted, I could be misreading the numbers completely.  I think I'm going to try some parallel builds and see if overlapping some of the I/O and CPU time helps at all.


The only conclusion I've reached so far is that OS vs. Data disk doesn't matter too much, at least when the Data disk is actually a striped pair of disks.  I wonder if I'm not getting the full benefit of the striped disks, since a lot of the I/O is on fairly small files, well below the 64 KB stripe size I found mentioned online.
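To illustrate the stripe-size hunch, here is a minimal sketch of a simplified striping model: consecutive 64 KB chunks alternate between the two disks. This layout is an assumption for illustration only and may not match how the volume is really laid out. The function name `disks_touched` is made up.

```python
def disks_touched(offset, length, stripe_size=64 * 1024, n_disks=2):
    """Return the set of disk indices a single I/O request touches.

    Simplified model (an assumption): chunk k of the volume lives on
    disk k % n_disks, with chunks of stripe_size bytes.
    """
    if length <= 0:
        return set()
    first = offset // stripe_size
    last = (offset + length - 1) // stripe_size
    # After n_disks consecutive chunks every disk is already covered.
    last = min(last, first + n_disks - 1)
    return {chunk % n_disks for chunk in range(first, last + 1)}


if __name__ == "__main__":
    print(disks_touched(0, 4 * 1024))          # small file inside one chunk
    print(disks_touched(0, 128 * 1024))        # large read spanning two chunks
    print(disks_touched(60 * 1024, 8 * 1024))  # small read crossing a boundary
```

Under this model a small file usually sits entirely on one disk, so a single request to it gets no speedup from striping. The second disk only helps when several requests happen to land on different disks at the same time.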
