After reviewing the code for the simple YAML parser I wrote, I realised it was getting a little messy, so before continuing I decided to refactor it a little.
The simplest thing to do was to separate the serialisation and the deserialisation into separate classes, and simply call those from within the YamlConvert class in the existing methods. This approach tends to be what other JSON and YAML libraries do, with added functionality such as being able to control aspects of the serialisation/deserialisation process for specific types.
I currently don’t need, or want, to do that, as I’m taking a much more brute force approach - however it is something to consider for a future refactor. Maybe.
I ended up with the following for the YamlConvert class:
public static class YamlConvert
{
private static YamlSerialiser Serialiser;
private static YamlDeserialiser Deserialiser;
static YamlConvert()
{
Serialiser = new YamlSerialiser();
Deserialiser = new YamlDeserialiser();
}
public static string Serialise(YamlHeader header)
{
return Serialiser.Serialise(header);
}
public static YamlHeader Deserialise(string filePath)
{
if (!File.Exists(filePath)) throw new FileNotFoundException("Unable to find specified file", filePath);
var content = File.ReadAllLines(filePath);
return Deserialise(content);
}
public static YamlHeader Deserialise(string[] rawHeader)
{
return Deserialiser.Deserialise(rawHeader);
}
}
It works quite well, as it did before, and looks a lot better. There is no dependency configuration to worry about; as I mentioned above, I’m not worried about swapping out the serialisation/deserialisation process at any time.
Previously we left off with a method which could parse the YAML header in one of our markdown files, collecting each line between the --- header markers for further processing.
One of the main requirements for the overall BlogHelper9000 utility is to be able to standardise the YAML headers in each source markdown file for a post. Some of the posts had a mix of different tags that were essentially doing the same thing, so one of the aims is to be able to collect those, and transform the values into the correct tags.
In order to achieve this, we can specify a collection of the valid header properties up front, and also a collection of the ‘other’ properties that we find, which we can hold onto for later in the process, once we’ve written the code to handle those properties. The YamlHeader class has already been defined, and we can use a little reflection to load that class up and pick the properties out.
private static Dictionary<string, object?> GetYamlHeaderProperties(YamlHeader? header = null)
{
var yamlHeader = header ?? new YamlHeader();
return yamlHeader.GetType()
.GetProperties(BindingFlags.DeclaredOnly | BindingFlags.Public | BindingFlags.Instance)
.Where(p => p.GetCustomAttribute<YamlIgnoreAttribute>() is null)
.ToDictionary(p =>
{
var attr = p.GetCustomAttribute<YamlNameAttribute>();
return attr is not null ? attr.Name.ToLower() : p.Name.ToLower();
}, p => p.GetValue(yamlHeader, null));
}
We need to be careful to ignore properties that are not part of the YAML header in markdown files, but that we use on the YamlHeader when doing further processing - such as holding the ‘extra’ properties that we’ll need to match up with their valid counterparts in a later step. Thus we have the custom YamlIgnoreAttribute, which we can use to ensure we drop properties that we don’t care about. We also need to be able to match up C# property names with the actual YAML header names, so we also have the YamlNameAttribute to handle this.
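Neither attribute is shown in this post; as a minimal sketch (the real definitions live in the BlogHelper9000 repository, so treat these as assumptions), they could be as simple as:
using System;

// Assumed definitions - the real ones are in the repository.
[AttributeUsage(AttributeTargets.Property)]
public sealed class YamlIgnoreAttribute : Attribute
{
}

[AttributeUsage(AttributeTargets.Property)]
public sealed class YamlNameAttribute : Attribute
{
    // The YAML key to use in place of the C# property name.
    public string Name { get; }

    public YamlNameAttribute(string name)
    {
        Name = name;
    }
}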
Then we just need a way of parsing the individual lines and pulling the header name and the value out.
(string property, string value) ParseHeaderTag(string tag)
{
tag = tag.Trim();
var index = tag.IndexOf(':');
var property = tag.Substring(0, index);
var value = tag.Substring(index+1).Trim();
return (property, value);
}
Here we just return a simple tuple after doing some substring manipulation, which is greatly helped by the header and its value always being separated by ‘:’.
Then if we put all that together we can start to parse the header properties.
private static YamlHeader ParseYamlHeader(IEnumerable<string> yamlHeader)
{
var parsedHeaderProperties = new Dictionary<string, object>();
var extraHeaderProperties = new Dictionary<string, string>();
var headerProperties = GetYamlHeaderProperties();
foreach (var line in yamlHeader)
{
var propertyValue = ParseHeaderTag(line);
if (headerProperties.ContainsKey(propertyValue.property))
{
parsedHeaderProperties.Add(propertyValue.property, propertyValue.value);
}
else
{
extraHeaderProperties.Add(propertyValue.property, propertyValue.value);
}
}
return ToYamlHeader(parsedHeaderProperties, extraHeaderProperties);
All we need to do is set up some dictionaries to hold the header properties, get the dictionary of valid header properties, and then loop through each line, parsing the header tag and checking whether the property is a ‘valid’ one that we definitely know we want to keep, or one we need to hold for further processing. You’ll notice that the above code is missing an end brace: this is deliberate, because the ParseHeaderTag and ToYamlHeader methods are both nested methods.
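ToYamlHeader isn’t shown here; as a rough sketch of what it might do (an assumption based on GetYamlHeaderProperties and the attributes above - the real implementation is in the repository, and type conversion is glossed over):
private static YamlHeader ToYamlHeader(Dictionary<string, object> parsed, Dictionary<string, string> extras)
{
    var header = new YamlHeader();
    var properties = header.GetType()
        .GetProperties(BindingFlags.DeclaredOnly | BindingFlags.Public | BindingFlags.Instance)
        .Where(p => p.GetCustomAttribute<YamlIgnoreAttribute>() is null);

    foreach (var property in properties)
    {
        var name = property.GetCustomAttribute<YamlNameAttribute>()?.Name.ToLower() ?? property.Name.ToLower();

        if (parsed.TryGetValue(name, out var value) && value is not null)
        {
            // Naive conversion; the real code will need to handle arrays, booleans etc.
            property.SetValue(header, Convert.ChangeType(value, property.PropertyType));
        }
    }

    // Hypothetical property for the 'extra' tags we'll fix up in a later step.
    header.Extras = extras;
    return header;
}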
Reading through the code to write this post has made me realise that we can do some refactoring to make this look a little nicer.
So we’ll look at that next.
The next thing to do to get BlogHelper9000 functional is to write a command which provides some information about the posts in the blog. I want to know:
I also know that I want to introduce a command which will allow me to fix the metadata in the posts, which is a little messy. I’ve been inconsistently blogging since 2007, originally starting off on a self-hosted Python blog I’ve forgotten the name of, before migrating to Wordpress, then to a short-lived .net static site generator, before switching over to Jekyll.
Obviously, Markdown powered blogs like Jekyll have to provide non-markdown metadata in each post, and for Jekyll (and most markdown powered blogs) that means: YAML.
There are a couple of options when it comes to parsing YAML. One would be to use YamlDotNet which is a stable library which conforms with V1.1 and v1.2 of the YAML specifications.
But where is the fun in that?
I’ve defined a POCO called YamlHeader, which I’m going to use as the in-memory object to represent the YAML metadata header at the top of a markdown file.
If we take a leaf from different JSON converters, we can define a YamlConvert class like this:
public static class YamlConvert
{
public static string Serialise(YamlHeader header)
{
}
public static YamlHeader Deserialise(string filePath)
{
}
}
With this, we can easily serialise a YamlHeader into a string, and deserialise a file into a YamlHeader.
Deserialising is the slightly more complicated of the two, so let’s start with that.
Our first unit test looks like this:
[Fact]
public void Should_Deserialise_YamlHeader()
{
var yaml = @"---
layout: post
title: 'Dynamic port assignment in Octopus Deploy'
tags: ['build tools', 'octopus deploy']
featured_image: /assets/images/posts/2020/artem-sapegin-b18TRXc8UPQ-unsplash.jpg
featured: false
hidden: false
---
post content that's not parsed";
var yamlObject = YamlConvert.Deserialise(yaml.Split(Environment.NewLine));
yamlObject.Layout.Should().Be("post");
yamlObject.Tags.Should().NotBeEmpty();
}
This immediately requires us to add an overload for Deserialise to the YamlConvert class which takes a string[]. This means our implementation for the first Deserialise method is simply:
public static YamlHeader Deserialise(string filePath)
{
if (!File.Exists(filePath)) throw new FileNotFoundException("Unable to find specified file", filePath);
var content = File.ReadAllLines(filePath);
return Deserialise(content);
}
Now we get into the fun part. And a big caveat: I’m not sure if this is the best way of doing this, but it works for me and that’s all I care about.
Anyway. A YAML header block is identified by a single line containing only ---, followed by n lines of YAML, and is ended by another single line containing only ---. You can see this in the unit test above.
The algorithm I came up with goes like this:
For each line in lines:
if line is '---' then
if header start marker not found then
header start marker found
continue
break loop
store line
parse each line of found header
So in a nutshell, it loops through each line in the file, looks for the first --- to identify the start of the header, and then gathers the lines for further processing until it hits another ---.
Translated into C#, the code looks like this:
public static YamlHeader Deserialise(string[] fileContent)
{
var headerStartMarkerFound = false;
var yamlBlock = new List<string>();
foreach (var line in fileContent)
{
if (line.Trim() == "---")
{
if (!headerStartMarkerFound)
{
headerStartMarkerFound = true;
continue;
}
break;
}
yamlBlock.Add(line);
}
return ParseYamlHeader(yamlBlock);
}
This is fairly straightforward, and isn’t where I think some of the problems with the way it works actually are - all of that is hidden behind ParseYamlHeader, which is worth a post on its own.
In the introductory post to this series, I ended with issuing a command to initialise a new console project, BlogHelper9000. It doesn’t matter how you create your project, be it from Visual Studio, Rider or the terminal, the end result is the same, as the templates are all the same.
With the new .net 6 templates, the resulting Program.cs is somewhat sparse: if you discount the single comment, all you get in the file is a Console.WriteLine("Hello, World!");, thanks to all the new wizardry in the latest versions of the language and the framework.
Thanks to this new-fangled sorcery, the app still has a static Main method, you just don’t need to see it, and as such the args string array is still there. For very simple applications, this is all you really need. However, once you get past a few commands, with a few optional flags, things can get complicated, fast. This can turn into a maintenance headache.
In the past I’ve written my own command line parsing abstractions, I’ve used Mono.Options and other libraries, and I think I’ve finally settled on Oakton as my go-to library for quickly and easily adding command line parsing to a console application. It’s intuitive, easy to use and easy to maintain. This means you can easily introduce it into a team environment and have everyone understand it immediately.
After following Oakton’s getting started documentation, you can see how easy it is to get going with a basic implementation. I recommend introducing the ability to have both synchronous and asynchronous commands, and you can achieve this with a small tweak to the Program.cs, taking the top-level statements in .net 6 into consideration, like this:
using System.Reflection;
var executor = CommandExecutor.For(_ =>{
_.RegisterCommands(typeof(Program).GetTypeInfo().Assembly);
});
var result = await executor.ExecuteAsync(args);
return result;
In .net 5, or if you don’t like top-level statements and have a static int Main, you can make it static Task<int> Main instead and return the executor.ExecuteAsync call instead of awaiting it.
In some console applications, different commands can have the same optional flags, and I like to put mine in a class called BaseInput. Because I know I’m going to have several commands in this application, I’m going to add some base classes so that the different commands can share some of the same functionality. I’ve also used this in the past to, for example, create a database instance in the base class, which is then passed into each inheriting command. It’s also a good place to add some common argument/flag validation.
What I like to do is have an abstract base class which inherits from the Oakton command, and add an abstract Run method to it, and usually a virtual bool ValidateInput too; these can then be overridden in our actual Command implementations, giving us a lot of nice functionality automatically, in a way that can be used across all Commands.
Some of the details of these classes are elided to stop this from being a super long post; you can see all the details in the Github repo.
public abstract class BaseCommand<TInput> : OaktonCommand<TInput>
where TInput : BaseInput
{
public override bool Execute(TInput input)
{
return ValidateInput(input) && Run(input);
}
protected abstract bool Run(TInput input);
protected virtual bool ValidateInput(TInput input)
{
/* ... */
}
}
This ensures that all the Commands we implement can optionally decide to validate the inputs that they take in, simply by overriding ValidateInput.
The async version is exactly the same… except async:
public abstract class AsyncBaseCommand<TInput> : OaktonAsyncCommand<TInput>
where TInput : BaseInput
{
public override async Task<bool> Execute(TInput input)
{
return await ValidateInput(input) && await Run(input);
}
protected abstract Task<bool> Run(TInput input);
protected virtual Task<bool> ValidateInput(TInput input)
{
/* ... */
}
}
There is an additional class I’ve not yet shown, which adds some further reusable functionality shared between the base classes, and that’s the BaseHelper class. I’ve got a pretty good idea that any commands I write for the app are going to operate on posts or post drafts, which in jekyll are stored in _posts and _drafts respectively. Consequently, the commands need an easy way of having these paths to hand, so a little internal helper class is a good place to put this shared logic.
internal class BaseHelper<TInput> where TInput : BaseInput
{
public string DraftsPath { get; }
public string PostsPath { get; }
private BaseHelper(TInput input)
{
DraftsPath = Path.Combine(input.BaseDirectoryFlag, "_drafts");
PostsPath = Path.Combine(input.BaseDirectoryFlag, "_posts");
}
public static BaseHelper<TInput> Initialise(TInput input)
{
return new BaseHelper<TInput>(input);
}
public bool ValidateInput(TInput input)
{
if (!Directory.Exists(DraftsPath))
{
ConsoleWriter.Write(ConsoleColor.Red, "Unable to find blog _drafts folder");
return false;
}
if (!Directory.Exists(PostsPath))
{
ConsoleWriter.Write(ConsoleColor.Red, "Unable to find blog _posts folder");
return false;
}
return true;
}
}
This means that our base class implementations can now become:
private BaseHelper<TInput> _baseHelper = null!;
protected string DraftsPath => _baseHelper.DraftsPath;
protected string PostsPath => _baseHelper.PostsPath;
public override bool Execute(TInput input)
{
_baseHelper = BaseHelper<TInput>.Initialise(input);
return ValidateInput(input) && Run(input);
}
protected virtual bool ValidateInput(TInput input)
{
return _baseHelper.ValidateInput(input);
}
Note the use of null!, where I am telling the compiler to ignore the fact that _baseHelper is being initialised to null, as I know better.
This allows each command implementation to hook into this method and validate itself automatically.
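As an illustration (this command is hypothetical, not one from the repository), a command can layer its own checks on top of the base validation like so:
// Hypothetical command showing how ValidateInput can be extended.
public class ExampleCommand : BaseCommand<BaseInput>
{
    protected override bool Run(BaseInput input)
    {
        // ... the actual work ...
        return true;
    }

    protected override bool ValidateInput(BaseInput input)
    {
        // The base check ensures the _posts and _drafts folders exist.
        if (!base.ValidateInput(input)) return false;

        // Any command-specific validation goes here.
        return true;
    }
}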
Now that we have some base classes to work with, we can start to write our first command. If you check the history in the repo, you’ll see this wasn’t the first command I actually wrote… but it probably should have been. In any case, it only serves to illustrate our first real command implementation.
public class InfoCommand : BaseCommand<BaseInput>
{
public InfoCommand()
{
Usage("Info");
}
protected override bool Run(BaseInput input)
{
var posts = LoadPosts();
var blogDetails = new Details();
DeterminePostCount(posts, blogDetails);
DetermineDraftsInfo(posts, blogDetails);
DetermineRecentPosts(posts, blogDetails);
DetermineDaysSinceLastPost(blogDetails);
RenderDetails(blogDetails);
return true;
}
/**...*/
}
LoadPosts is a method in the base class which is responsible for loading the posts into memory, so that we can process them and extract meaningful details about the posts. We store this information in a Details class, which is what we ultimately use to render the details to the console. You can see the details of these methods in the github repository; they all boil down to simple Linq queries.
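To give a flavour of what that looks like (the shapes below are assumptions for illustration; the real Details class and queries are in the repository, and Post here is an assumed type with an IsDraft flag):
using System.Collections.Generic;
using System.Linq;

// Sketch of the kind of data Details might carry.
public class Details
{
    public int PostCount { get; set; }
    public int DraftCount { get; set; }
    public int DaysSinceLastPost { get; set; }
    public List<string> RecentPostTitles { get; set; } = new();
}

// Sketch of one of the 'simple Linq queries'.
private static void DeterminePostCount(IEnumerable<Post> posts, Details details)
{
    details.PostCount = posts.Count(p => !p.IsDraft);
    details.DraftCount = posts.Count(p => p.IsDraft);
}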
In this post we’ve seen how to set up Oakton, configure a base class to extend the functionality and give us more flexibility, and write an initial command. In subsequent posts, we’ll cover more commands, and I’ll start to use the utility to tidy up metadata across all the posts in the blog and fix things like images for posts.
Normally you can’t broadly stop someone from being able to send you mail. However, there is a loophole.
You can file a PS Form 1500 and say that the advertisement you received from them made you horny. No questions asked prohibitory order.
I just had to set up my vimrc and vimfiles on a new laptop for work, and had some fun with Vim, mostly as it’s been years since I last had to do it. I keep my vimfiles folder in my Github, so I can grab it wherever I need it.
To recap, one of the places that Vim will look for things is $HOME/vimfiles/vimrc, where $HOME is actually the same as %USERPROFILE%. In most corporate environments, %USERPROFILE% is actually stored in a networked folder location, to enable roaming profile support and help when a user gets a new computer.
So you can put your vimfiles there, but it’s a network folder - it’s slow to start an instance of Vim, especially if you have a few plugins.
Instead, what you can do is edit the _vimrc file in the Vim installation folder (usually in C:\Program Files (x86)\vim), delete the entire contents and replace it with:
set rtp+=C:\path\to\your\vimfiles
set viminfo+=nC:\path\to\your\vimfiles\or\whatever
source C:\path\to\your\vimfiles\vimrc
What this does is:
- add your vimfiles folder to Vim’s runtime path
- tell Vim where to keep its viminfo file
- source your vimrc file and use that
This post largely serves as a memory aid for myself, so that when I need to do this again in future I won’t spend longer than I need to googling how to do it, but I hope it helps someone else.
Recently I was inspired by @buhakmeh’s blog post, Supercharge Blogging With .NET and Ruby Frankenblog to write something similar, both as an exercise and excuse to blog about something, and as a way of tidying up the metadata on my existing blog posts and adding header images to old posts.
The initial high level requirements I want to support are:
The next series of posts will cover implementing the above requirements… not necessarily in that order. First I will go over setting up the project and configuring Oakton.
After that I will probably cover implementing fixes to the existing blog metadata, as I think that is going to be something that will be required in order for any sort of Info function to work properly, as all of the yaml metadata will need to be consistent.
Then I think I’ll tackle the image stuff, which should be fairly interesting, and should give a nice look to the existing posts, as having prominent images for posts is part of the theme for the blog, which I’ve not really taken full advantage of.
I’ll try to update this post with links to future posts, or else make it all a big series.
dotnet new console --name BlogHelper9000
At work, we have recently been porting our internal web framework into .net 6. Yes, we are late to the party on this, for reasons. Suffice it to say I currently work in an inherently risk averse industry.
Anyway, one part of the framework is responsible for getting reports from SSRS.
The way it did this was to use a wrapper class around a SOAP client generated from good old ReportService2005.asmx?wsdl, using our faithful friend svcutil.exe. The wrapper class used some TaskCompletionSource magic on the events in the client to make client.LoadReportAsync and the other *Async methods actually async, as the generated client was not truly async.
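For context, that ‘magic’ is the usual way of adapting the old event-based async pattern to Tasks; a rough sketch (the member names here are illustrative rather than the real generated client’s) looks like this:
// Illustrative only: wrapping an event + Begin-style call in a Task via
// TaskCompletionSource, which is what the old wrapper had to do per method.
public Task<ReportInfo> LoadReportAsync(string reportPath)
{
    var tcs = new TaskCompletionSource<ReportInfo>();

    _client.LoadReportCompleted += (sender, args) =>
    {
        if (args.Error != null) tcs.TrySetException(args.Error);
        else tcs.TrySetResult(args.Result);
    };

    _client.LoadReportAsync(reportPath);
    return tcs.Task;
}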
Fast forward to the modern times, and we need to upgrade it. How do we do that?
Obviously, Microsoft are a step ahead: svcutil has a dotnet version - dotnet-svcutil. We can install it and get going:
dotnet tool install --global dotnet-svcutil
Once installed, we can call it against the endpoint:
dotnet-svcutil http://server/ReportServer/ReportService2005.asmx?wsdl
In our wrapper class, the initialisation of the client has to change slightly, because the generated client is different to the original svcutil implementation. Looking at the diff between the two files, it’s because the newer version of the client uses more modern .net functionality.
The wrapper class constructor has to be changed slightly:
public Wrapper(string url, NetworkCredential credentials)
{
var binding = new BasicHttpBinding(BasicHttpSecurityMode.TransportCredentialOnly);
binding.Security.Transport.ClientCredentialType = HttpClientCredentialType.Ntlm;
binding.MaxReceivedMessageSize = 10485760; // this is a 10mb limit
var address = new EndpointAddress(url);
_client = new ReportExecutionServiceSoapClient(binding, address);
_client.ClientCredentials.Windows.AllowedImpersonationLevel = TokenImpersonationLevel.Impersonation;
_client.ClientCredentials.Windows.ClientCredential = credentials;
}
Then, the code which actually generates the report can be updated to remove all of the TaskCompletionSource, which actually simplifies it a great deal:
public async Task<byte[]> RenderReport(string reportPath, string reportFormat, ParameterValue[] parameterValues)
{
await _client.LoadReportAsync(null, reportPath, null);
await _client.SetExecutionParametersAsync(null, null, parameterValues, "en-gb");
var deviceInfo = @"<DeviceInfo><Toolbar>False</Toolbar></DeviceInfo>";
var request = new RenderRequest(null, null, reportFormat, deviceInfo);
var response = await _client.RenderAsync(request);
return response.Result;
}
You can then do whatever you like with the byte[], like return it in an IActionResult, or load it into a MemoryStream and write it to disk as the file.
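For example, from an ASP.NET Core controller it might look something like this (a sketch; _wrapper and the report path/parameters are assumptions):
// Sketch: returning the rendered report from a controller action as a PDF download.
[HttpGet("report")]
public async Task<IActionResult> GetReport()
{
    var bytes = await _wrapper.RenderReport("/Reports/MyReport", "PDF", Array.Empty<ParameterValue>());
    return File(bytes, "application/pdf", "report.pdf");
}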
who is eating cereal anymore? Literally don’t think I’ve seen someone eat a bowl of cereal in twenty years
Recently we realised that we had quite a few applications being deployed through Octopus Deploy, and that we had a number of Environments, and a number of Channels, and that managing the ports being used in Dev/QA/UAT across different servers/channels was becoming… problematic.
When looking at this problem, it’s immediately clear that you need some way of dynamically allocating a port number on each deployment. This blog post from Paul Stovell shows the way, using a custom Powershell build step.
As we’d lost track of what sites were using which ports, and we also have ad-hoc websites in IIS that aren’t managed by Octopus Deploy, we thought that asking IIS “Hey, what ports are the sites you know about using?” might be a way forward. We also had the additional requirement that on some of our servers we might have some arbitrary services also using a port, so we might bump into a situation where a port was chosen that was already being used by a non-IIS application/website.
Researching the first situation, it’s quickly apparent that you can do this in Powershell, using the WebAdministration module. Based on the answers to this question on Stackoverflow, we came up with this:
Import-Module Webadministration
function Get-IIS-Used-Ports()
{
$Websites = Get-ChildItem IIS:\Sites
$ports = foreach($Site in $Websites)
{
$Binding = $Site.bindings
[string]$BindingInfo = $Binding.Collection
[string]$Port = $BindingInfo.SubString($BindingInfo.IndexOf(":")+1,$BindingInfo.LastIndexOf(":")-$BindingInfo.IndexOf(":")-1)
$Port -as [int]
}
return $ports
}
To get the list of ports on a machine that are not being used is also fairly straightforward in Powershell:
function Get-Free-Ports()
{
$availablePorts = @(49000..65000)
$usedPorts = @(Get-NetTCPConnection | Select -ExpandProperty LocalPort | Sort -Descending | Where { $_ -ge 49000})
$unusedPorts = foreach($possiblePort in $availablePorts)
{
$unused = $possiblePort -notin $usedPorts
if($unused)
{
$possiblePort
}
}
return $unusedPorts
}
With those two functions in hand, you can work out what free ports are available to be used as the ‘next port’ on a server. It’s worth pointing out that if a site in IIS is stopped, then IIS won’t allow that port to be used in another website (in IIS), but the port also doesn’t show up as a used port in netstat -a, which is kind of what Get-NetTCPConnection does.
function Get-Next-Port()
{
$iisUsedPorts = Get-IIS-Used-Ports
$freePorts = Get-Free-Ports
$port = $freePorts | Where-Object { $iisUsedPorts -notcontains $_} | Sort-Object | Select-Object -First 1
Set-OctopusVariable -Name "Port" -Value "$port"
}
Then you just have to call it at the end of the script:
Get-Next-Port
You’d also want to have various Write-Host or other logging messages so that you get some useful output in the build step when you’re running it.
If you found this because you have a build server which is ‘offline’, without any external internet access because of reasons, and you can’t get your build to work because dotnet fails to restore the tool you require for your build process because of said lack of external internet access, then this is for you.
In hindsight, this may be obvious for most people, but it wasn’t for me, so here it is.
In this situation, you just need to shy away from local tools completely because, as of yet, I’ve been unable to find any way of telling dotnet not to try to restore them, and they fail every build.
Instead, I’ve installed the tool(s) as a global tool, in a specific folder, e.g. C:\dotnet-tools, which I’ve then added to the system path on the server. You may need to restart the build server for it to pick up the changes to the environment variable.
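For example, installing a tool into that folder looks something like this (dotnet-svcutil is just a stand-in for whatever tool your build needs):
dotnet tool install dotnet-svcutil --tool-path C:\dotnet-tools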
One challenge that remains is how to ensure the dotnet tools are consistent on both the developer machine, and the build server. I leave that as an exercise for the reader.
I’m leaving this here so I can find it again easily.
We had a problem updating the Visual Studio 2019 Build Tools on a server, after updating an already existing offline layout.
I won’t go into that here, because it’s covered extensively on Microsoft’s Documentation website.
The installation kept failing, even when using --noweb. It turns out that when your server is completely cut off from the internet, as was the case here, you also need to pass --noUpdateInstaller.
This is because (so it would seem) even though --noweb correctly tells the installer to use the offline cache, it doesn’t prevent the installer from trying to update itself, which will obviously fail in a totally disconnected environment.
Since a technical breakdown of how Betsy does texture compression was posted, I wanted to lay out how the compressors in Convection Texture Tools (CVTT) work, as well as provide some context of what CVTT's objectives are in the first place to explain some of the technical decisions.
First off, while I am very happy with how CVTT has turned out, and while it's definitely a production-quality texture compressor, providing the best compressor possible for a production environment has not been its primary goal. Its primary goal is to experiment with compression techniques to improve the state of the art, particularly finding inexpensive ways to hit high quality targets.
A common theme that wound up manifesting in most of CVTT's design is that encoding decisions are either guided by informed decisions, i.e. models that relate to the problem being solved, or are exhaustive. Very little of it is done by random or random-like searching. Much of what CVTT exists to experiment with is figuring out techniques which amount to making those informed decisions.
Anyway, CVTT's ParallelMath module is kind of the foundation that everything else is built on. Much of its design is motivated by SIMD instruction set quirks, and a desire to maintain compatibility with older instruction sets like SSE2 without sacrificing too much.
Part of that compatibility effort is that most of CVTT's ops use a UInt15 type. The reason for UInt15 is to handle architectures (like SSE2!) that don't support unsigned compares, min, or max, which means performing those operations on a 16-bit number requires flipping the high bit on both operands. For any number where we know the high bit is zero for both operands, that flip is unnecessary - and a huge number of operations in CVTT fit in 15 bits.
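To make the flip concrete, here’s a small scalar sketch (in C#, purely illustrative - CVTT itself is C++) of emulating an unsigned 16-bit max with only a signed compare; when both values are known to fit in 15 bits, the two flips can be skipped entirely:
// Bias both operands by flipping the high bit, compare as signed, flip back.
static ushort UnsignedMax(ushort a, ushort b)
{
    short biasedA = (short)(a ^ 0x8000);
    short biasedB = (short)(b ^ 0x8000);
    short biasedMax = biasedA > biasedB ? biasedA : biasedB;
    return (ushort)(biasedMax ^ 0x8000);
}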
The compare flag types are basically vector booleans, where either all bits are 1 or all bits are 0 for a given lane - There's one type for 16-bit ints, and one for 32-bit floats, and they have to be converted since they're different widths. Those are combined with several utility functions, some of which, like SelectOrZero and NotConditionalSet, can elide a few operations.
The RoundForScope type is a nifty dual-use piece of code. SSE rounding modes are determined by the CSR register, not per-op, so RoundForScope when targeting SSE will set the CSR, and then reset it in its destructor. For other architectures, including the scalar target, the TYPE of the RoundForScope passed in is what determines the operation, so the same code works whether the rounding is per-op or per-scope.
While the ParallelMath architecture has been very resistant to bugs for the most part, where it has run into bugs, they've mostly been due to improper use of AnySet or AllSet - Cases where parallel code can behave improperly because lanes where the condition should exclude it are still executing, and need to be manually filtered out using conditionals.
If you want some highlights:
The SDL variant ("AerofoilSDL") is also basically done, with a new OpenGL ES 2 rendering backend and SDL sound backend for improved portability. The lead version on Windows still uses D3D11 and XAudio2 though.
Unfortunately, I'm still looking for someone to assist with the macOS port, which is made more difficult by the fact that Apple discontinued OpenGL, so I can't really provide a working renderer for it any more. (Aerofoil's renderer is actually slightly complicated, mostly due to postprocessing.)
In the meantime, the Android port is under way! The game is fully playable so far, most of the work has to do with redoing the UI for touchscreens. The in-game controls use corner taps for rubber bands and battery/helium, but it's a bit awkward if you're trying to use the battery while moving left due to the taps being on the same side of the screen.
Most of the cases where you NEED to use the battery, you're facing right, so this was kind of a tactical decision, but there are some screens (like "Grease is on TV") where it'd be really nice if it was more usable facing left.
I'm also adding a "source export" feature: The source code package will be bundled with the app, and you can just use the source export feature to save the source code to your documents directory. That is, once I figure out how to save to the documents directory, which is apparently very complicated...
Anyway, I'm working on getting this into the Google Play Store too. There might be some APKs posted to GitHub as pre-releases, but there may (if I can figure out how it works) be some Internal Testing releases via GPS. If you want to opt in to the GPS tests, shoot an e-mail to codedeposit.gps@gmail.com
Maybe, but there are two obstacles:
The game is GPL-licensed and there have reportedly been problems with Apple removing GPL-licensed apps from the App Store, and it may not be possible to comply with it. I've heard there is now a way to push apps to your personal device via Xcode with only an Apple ID, which might make satisfying some of the requirements easier, but I don't know.
Second, as with the macOS version, someone would need to do the port. I don't have a Mac, so I don't have Xcode, so I can't do it.
As part of modernising, updating and generally overhauling my blog, I thought it would be nice to add some consistency to the Yaml front matter used by Jekyll. For those who do not know, Jekyll uses Yaml front matter blocks to process any file which contains one as a special file. The front matter can contain variables in the form foo: value. Jekyll itself defines some predefined global variables and variables for posts, but anything else is valid and can be used in Liquid tags.
I wondered if I could write some F# to:
Fairly straightforward requirements.
I’m using YamlDotNet to do most of the heavy lifting. I think I could also have used the FSharp.Configuration Type Provider, but I’m not sure that it would have done exactly what I wanted.
I’m just writing this in an F# script, hosted in a project. After adding the YamlDotNet NuGet package, we can reference it and get to work:
#r "../../.nuget/packages/YamlDotNet/8.1.2/lib/netstandard2.1/YamlDotNet.dll"
open System.IO
open System.Text.RegularExpressions
open YamlDotNet.Serialization
open YamlDotNet.Serialization.NamingConventions
let path = "../sgrassie.github.io/_posts"
Here, we reference the package, and then open various namespaces for use later on. The code for my blog is kept in a separate folder, relative to the project which contains the F# scripts I’m writing about. This is nice and easy.
type FrontMatter() =
member val Title = "" with get, set
member val Description = "" with get, set
member val Layout = "" with get, set
member val Tags = [|""|] with get, set
member val Published = "" with get, set
member val Category = "" with get, set
member val Categories = "" with get, set
member val Metadescription = "" with get, set
member val Series = "" with get, set
member val Featured = false with get, set
member val Hidden = false with get, set
member val Image = "" with get, set
[<YamlMember(Alias = "featured_image", ApplyNamingConventions = false)>]
member val FeaturedImage = "" with get, set
[<YamlMember(Alias = "featured_image_thumbnail", ApplyNamingConventions = false)>]
member val FeaturedImageThumbnail = "" with get, set
[<YamlIgnore>]
member val MarkdownFilePath = "" with get, set
This is a class with auto-implemented properties. You can see three attributes in use. The YamlMember attribute allows us to alias a property in Yaml which doesn’t follow the CamelCase convention we configured the deserialiser with. I think that a C# version of this would look pretty much the same.
let deserializer = DeserializerBuilder()
.WithNamingConvention(CamelCaseNamingConvention.Instance)
.Build()
This initialises the YamlDotNet deserialiser, and is pretty much exactly how you would do this in C#. To deserialise something, we need some Yaml. When I was testing this, I got an error in YamlDotNet that was pretty weird and essentially means that it can’t parse the file; it turns out it’s all the other stuff outside the Yaml front matter that was upsetting it.
let expression = "(?:---)(?<yaml>[\\s\\S]*?)(?:---)"
Oh regex, I do love thee.
Very simply, this regex will capture everything in a file between two --- blocks into a named yaml group. We now have the actual front matter; we still need to parse it into an object.
let extractFrontmatter filePath =
let file = File.ReadAllText(filePath)
let result = Regex.Match(file, expression).Groups.["yaml"].Value
let frontMatter =
let frontMatter = deserializer.Deserialize<FrontMatter>(result)
frontMatter.MarkdownFilePath <- filePath
frontMatter
frontMatter
This is a bit more complex, so let’s unpack it:
- read all of the text from the file given by filePath
- run the regex over the contents and take the value of the named yaml group
- deserialise that value with the deserializer, and return it. Here, we also keep track of the file path (we will need this later).
We also need to load all of the markdown files:
let loadMarkdownFiles path = Directory.EnumerateFiles(path, "*.md", SearchOption.AllDirectories)
Notice how those last couple of functions are using ‘currying’. It lets us do all of the work in one pipeline:
path |> loadMarkdownFiles |> Seq.map extractFrontmatter |> Seq.iter (fun x -> printfn "%s - %s" x.MarkdownFilePath x.Title)
This gives us a dataset to work with. Next time we’ll continue with the rest of the requirements.
Many years ago, after working in my first programming job for a couple of years the company was taken over, and coding tests for new hires were introduced. The incumbent developers all decided to take the test, and it was seen as a fun diversion for a couple of hours.
I don’t have access to the actual wording of the requirements given to candidates, but the test required a text file containing around 100k words to be loaded and sorted into the largest set of the longest anagram. For example, in the words file I’m using in this blog post there are 466544 words, 406627 of which are anagrams. The largest set is for a 7 letter anagram, of which there are 15 words. There are smaller sets of longer anagrams; we’re not interested in those. And it had to run in less than a second. They had three hours to write it, on a computer not connected to the internet. They had access to Java through Eclipse, C/C++/C# through Visual Studio, and Delphi through Embarcadero Studio.
I don’t know where the test originally came from - I think it originated in a different company which had been acquired by the same company I now worked for, but I’m not sure. I think the intent of the test was in part to gauge how the candidate reacted to the deadline pressure, in part how well they could understand the requirements given to them, and lastly what sort of code they wrote.
As it has been a long time and the company no longer recruits after moving most development overseas, so, I’m going to present my solution.
First we have to load the file, figure out how to generate the anagram key, and keep track of how many instances of that anagram there are. It turned out that this was the bit most candidates taking the test got stuck on - specifically, the short mental leap it took to work out that you needed to sort the letters of the word alphabetically to create the key.
private static string CreateKey(string word)
{
var lowerCharArray = word.ToLowerInvariant().ToCharArray();
Array.Sort(lowerCharArray);
return new string(lowerCharArray);
}
private static void LoadWords(string filePath, Dictionary<string, List<string>> words)
{
using (var streamReader = File.OpenText(filePath))
{
string s;
while ((s = streamReader.ReadLine()) != null)
{
var key = CreateKey(s);
if (words.TryGetValue(key, out var set))
{
set.Add(s);
}
else
{
var newSet = new List<string> {s};
words.Add(key, newSet);
}
}
}
}
words is a Dictionary<string, List<string>>, which we use to track the count of anagrams. The rest of the file loading is a fairly standard while loop over the reader’s ReadLine method, checking the dictionary to see if the anagram has already been found and, if so, adding the new word to the set; otherwise, adding the anagram and creating a new list to hold the word(s).
Once we have all the words loaded and matched into sets of anagrams, we can process them to work out which is the largest set with the longest word.
private static KeyValuePair<string, List<string>> ProcessAnagrams(Dictionary<string, List<string>> words)
{
var largestSet = 0;
var longestWord = 0;
var foundSet = new KeyValuePair<string, List<string>>();
foreach (var set in words)
{
if (set.Value.Count >= largestSet)
{
largestSet = set.Value.Count;
if (set.Key.Length > longestWord)
{
longestWord = set.Key.Length;
foundSet = set;
}
else
{
longestWord = 0;
}
}
}
return foundSet;
}
Here we simply bruteforce check all of the entries in the dictionary to find the answer. It’s not elegant, but it gets the job done. Running it on my Macbook Pro gives:
406627 anagrams processed from 466544 in 00:02:850
File read and key generation in 00:02:829
Anagrams searched in: 00:00:021
Found:
Key: AEINRST (7), Count: 15
aeinrst
antsier
asterin
eranist
nastier
ratines
resiant
restain
retains
retinas
retsina
stainer
starnie
stearin
Tersina
There are lots of blog comment systems, and this blog has used Disqus as the comment system for a long time. I’m not going to go into all the reasons to move away from Disqus, but page load times and wanting more control over your data and being able to respect your readers privacy figure highly.
Also, this blog is a technical blog focused on software development and associated topics, and this means that anyone who wants to comment on my blog is almost certain to be familiar with Github and have an account, and also be as uncomfortable using Disqus as I have been.
I did investigate rolling my own code based on examples from other blogs, who have used some jekyll liquid templates and javascript to pull from the Github API and use it to post comments back to the repo hosting the blog. This has some attraction, but also has a big drawback, which is the authorisation situation to the Github API, as you don’t really want your client id and client secret exposed in the repo.
You can get around this by hosting an app in Heroku to use as the postback url so that you can hide the client id and client secret, and there is also staticman, but none of these seemed as simple as just using utteranc.es.
To configure utteranc.es, head over to the website and follow the instructions, and fill out the form to suit you. For the blog post to issue mapping, I chose ‘Issue title contains page title’, and I also chose to have utteranc.es add a ‘Comment’ label to the issue it creates in the blog repository. After you do that, you’ll get a code snippet generated for you that looks somewhat like this:
<script src="https://utteranc.es/client.js"
repo="sgrassie/sgrassie.github.io"
issue-term="title"
label="Comment"
theme="github-light"
crossorigin="anonymous"
async>
</script>
Add this to a jekyll include, for example utterances.html, and then include it in your post.html layout at the position you want the blog comments to appear. Most jekyll blog templates have Disqus support, so it will probably just be a simple case of finding where in the layout Disqus is included, and replacing it.
If your existing comments are not important to you, then at this point you can stop and enjoy your new Github powered comment system. Personally for me, it’s the principle of the thing, and the fact that the comments on my blog belong to me, and the author of the comment. So, we can do something about it.
Disqus allows you to export your comments, and once you do so, you will get your comments emailed to the email registered with your Disqus account. I’ve done a lot of work with XML in a previous role, and I think that the Disqus XML export looks… odd. The reason I say that is that each post on your blog appears to be mapped to a <thread> element, which contains a bunch of expected metadata about the blog post. I would expect each individual comment to be nested in a <comments> element, but this is not the case. Instead, each individual comment has an entry as a <post> element at the same level as the <thread>, and they are mapped to each other using an attribute id. I don’t think that makes any sense; I’m sure there must be good reasons, I just can’t think what they might be.
A comment then, looks like this:
<thread dsq:id="1467739952">
<id>218 http://temporalcohesion.co.uk/?p=218</id>
<forum>temporalcohesion</forum>
<category dsq:id="2467491" />
<link>http://temporalcohesion.co.uk/2010/10/25/lets-write-an-api-library-for-github/</link>
<title>Let&#8217;s write an API library for Github</title>
<message />
<createdAt>2010-10-25T12:00:24Z</createdAt>
<author>
<name>Stuart Grassie</name>
<isAnonymous>false</isAnonymous>
<username>stuartgrassie</username>
</author>
<isClosed>false</isClosed>
<isDeleted>false</isDeleted>
</thread>
An actual comment on this post looks like:
<post dsq:id="952258229">
<id>wp_id=25</id>
<message><![CDATA[<p>Great post Stu!</p>]]></message>
<createdAt>2010-10-25T22:47:44Z</createdAt>
<isDeleted>false</isDeleted>
<isSpam>false</isSpam>
<author>
<name>John Sheehan</name>
<isAnonymous>true</isAnonymous>
</author>
<thread dsq:id="1467739952" />
</post>
You can see the way that the post element is mapped back to the containing thread using the dsq:id attribute.
The strange structure of the XML makes it less straightforward to parse, as it means we’ll have to do a little bit of work in matching up blog posts and the comments on them. Also very annoying is the fact that a thread element doesn’t know whether it actually has any associated post comments.
We can accomplish this fairly easily with a little bit of F# and the FSharp.Data XmlProvider. Setting the provider up is straightforward; here I’m just using a direct reference to the assembly, which I’d previously added via NuGet.
#r "../../.nuget/packages/fsharp.data/3.3.3/lib/netstandard2.0/FSharp.Data.dll"
open FSharp.Data
type Disqus = XmlProvider<"/Users/stuart/Downloads/temporalcohesion-2020-07-13T20 27 09.014136-all.xml">
type Comment = { Author: string; Message: string; Created: System.DateTimeOffset; ParentThreadId: int64; }
type BlogPost = { Title: string; Url: string; Author: string; ThreadId: int64; Comments : Comment list }
let data = Disqus.Load("/Users/stuart/Downloads/temporalcohesion-2020-07-13T20 27 09.014136-all.xml")
If you are new to F# (and I’m still fairly new) this might look scary, but it really isn’t. After referencing the assembly in the script, we open the FSharp.Data namespace, and then initialise an XmlProvider by passing it the XML file we’re going to parse.
That enables the XmlProvider to infer a lot of things about the XML in the file, and then the XmlProvider loads the actual data from the file. Two records are also defined to hold the details about the threads/posts that are going to be imported, and how multiple comments refer to a single blog post. These records are analogous to simple C# POCO classes with getters and setters.
With these types ready, we can define a couple of functions to convert the XML into them, and thus do away with a lot of the extraneous noise from the XML that we don’t really care about.
let toComments posts =
posts
|> Seq.filter (fun (post : Disqus.Post) -> not post.IsSpam || not post.IsDeleted)
|> Seq.map (fun (post : Disqus.Post) -> {Author = post.Author.Name; Message = post.Message; Created = post.CreatedAt; ParentThreadId = post.Thread.Id})
|> Seq.toArray
let toBlogPosts posts =
posts
|> Seq.filter (fun (thread : Disqus.Thread) -> not thread.IsDeleted)
|> Seq.map (fun (thread : Disqus.Thread) -> {Title = thread.Title; Url = thread.Link.Substring(0, thread.Link.Length - 1); Author = thread.Author.Name; ThreadId = thread.Id; Comments = [] })
These functions use currying, which as a longtime C# developer I’m still getting the hang of, and that will come in handy shortly. They map the Disqus types generated by the XmlProvider into the custom types I defined, taking care to filter out comments we don’t want to import and not importing any blog posts which Disqus says have been deleted.
I’m not entirely sure the Seq.filter in the toComments function worked correctly, as I still had to go and manually delete a couple of comments that were marked as spam from the Github Issues.
With those functions defined, we need a way of mapping the comments to the correct blog post.
let mapBlogToComments(post, comments) =
let commentsOnPost = comments
|> Array.filter (fun comment -> comment.ParentThreadId = post.ThreadId)
|> Array.toList
{post with Comments = commentsOnPost}
Here we take a single post, and all of the comments, and then use a nested function to grab the set of comments associated with that post, by way of the ThreadId. With that written, we can use some more currying to create another function that will do a lot of the hard work for us:
let addCommentsToTheirPosts comments = data.Threads |> toBlogPosts |> Seq.map (fun post -> mapBlogToComments(post, comments))
This function will take the threads, use the toBlogPosts function to turn them into BlogPost records, and then map each blog post to the correct comments using the function we’ve just defined. But where do the comments come from? Well, it turns out this currying thing is really quite useful, as it enables all this magic-looking |>, or ‘piping’, to happen.
let toImport = data.Posts
|> toComments
|> addCommentsToTheirPosts
|> Seq.filter (fun x -> x.Comments.Length > 0)
Take all the posts data, turn them all into comments, pipe that to the addCommentsToTheirPosts function, and then filter out blog posts which don’t have any comments, as importing those is pointless. All for around 24 lines of code. I know full well the C# it would take to do all that, and whilst with C# 8 you could probably get close, I doubt you’d equal 24 lines.
Just to be on the safe side, it’s probably a good idea to look through each of the posts and comments that we’ve now got, to see if things are matching up correctly.
toImport |> Seq.iter (fun post -> printfn "%s - %s - comments: %d" post.Title post.Url post.Comments.Length)
Running that will give you an idea of what blog posts are going to be imported, and the number of comments. The first time I ran this, I found some of the blog posts in the Disqus XML export did not have the post title set, so I was getting duplicated post titles. As there were only three instances of this error, I just manually corrected the XML and re-ran the script to check I had everything correct.
So far, so good. Now comes the fun part and something I’ve yet to do in F#, which is interop with a C# library. It turns out that it’s not so hard, but that makes perfect sense when you understand that F# is a .net language, just like C#. A long time ago I started to write an API library for GitHub, but I gave it up in favour of Octokit.net.
We can easily reference Octokit and open the namespace as before:
#r "../../.nuget/packages/octokit/0.48.0/lib/netstandard2.0/Octokit.dll"
open Octokit
Then we just need to setup a few variables:
let repo = "sgrassie.github.io"
let githubApp = "foo"
let token = "<your-personal-access-token-here>"
let credentials = Credentials(token)
let header = ProductHeaderValue(githubApp)
let client = GitHubClient(header, Credentials = credentials)
These just get us a client to work with, and all I did was register a new Personal Access Token on my account to use as the password. Notice how with F# you don’t need to new anything, even though they are classes from a C# assembly. These can then be used in the following function, which I’m gonna prefix with this warning: it does work though, so just… use it at your own risk.
let exportToGithub posts =
for post in posts do
System.Threading.Thread.Sleep(2000)
let issuebody = sprintf "Comment thread for the post [%s](%s)" post.Title post.Url
printfn "%s" issuebody
let newIssue = NewIssue(post.Title, Body = issuebody)
let issue = client.Issue.Create("sgrassie", repo, newIssue) |> Async.AwaitTask |> Async.RunSynchronously
printfn "New issue created for %s" post.Title
for comment in post.Comments do
System.Threading.Thread.Sleep(2000)
let message = sprintf "Comment by **%s** on **%s** (imported from Disqus):\r\n\r\n%s" comment.Author (comment.Created.ToString("f")) comment.Message
let newComment = client.Issue.Comment.Create("sgrassie", repo, issue.Number, message) |> Async.AwaitTask |> Async.RunSynchronously
printfn " New comment created for %s" comment.Author
toImport |> exportToGithub
I’m sure that a more experienced F# person is going to look at that and be like “WTF”, but as I said, it does work. I left the printfn log messages in, but essentially it loops over each post, waits a couple of seconds, creates the new issue, and then loops over all of the comments for that post and adds them as comments to the issue. I put the Thread.Sleep’s in there just so I didn’t hammer the Github API; honestly there were so few to import I doubt it would have triggered the rate limit, but I imagine a more popular blog with more comments on its posts would.
I’ve been upgrading part of our build infrastructure to handle the ongoing upgrade to .net core, and as part of that, I’ve had to update the Cake build script to handle doing the restore in an offline environment, on the build server.
There is a great post on the Octopus blog about writing a Cake build script for .net core, I encourage you to check that out, I’m not going to repeat too much of that.
My specific requirement is that the DotNetCoreRestore needs to succeed on an ‘offline’ build server, that is, a build server that has no access to the internet.
In order for this to succeed, you are going to need to provide a way for NuGet to get the packages; usually this is done by maintaining an offline NuGet cache which you can point NuGet at, or even checking the packages into the repository. I’d always recommend going with the first option, although there are scenarios where the second option might be required.
However you do it, you need to tell NuGet where they are. The easiest thing to do is to use a NuGet.Config local to the .sln, but it is possible to code a location into the script.
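As an illustration, a NuGet.Config sitting next to the .sln might look something like this (the source names and the offline folder path are placeholders; the feed URL matches the one commented in the task below):
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <packageSources>
    <!-- Placeholder sources: an internal feed and/or an offline folder of .nupkg files -->
    <add key="InternalFeed" value="http://someinternalfeed/nuget" />
    <add key="OfflinePackages" value="C:\nuget-offline" />
  </packageSources>
</configuration>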
Here is the restore task:
Task("Restore")
.IsDependentOn("Clean")
.Does(() =>
{
var settings = new DotNetCoreRestoreSettings();
if(BuildSystem.IsRunningOnTeamCity)
{
settings.PackagesDirectory = "./packages";
settings.IgnoreFailedSources = true;
//optionally
//settings.Sources = new[] { "http://someinternalfeed/nuget" }
}
foreach(var project in projects)
{
DotNetCoreRestore(project.FullPath, settings);
}
});
This project uses a NuGet.config to add the paths of internal package sources, and sets the location of the packages folder to be local to the .sln - we’ve found this cuts down on conflicts on developer machines.
In the previous post, I displayed my fledgling understanding of F# by writing a script which can parse the CSV set of results of the English Premier League to generate the league table. The script does this primarily by using a mutable BCL Dictionary. F# is immutable by default, and whilst mutability is available, you have to go out of your way to enable it. I’ll try to avoid repeating Scott Wlaschin.
There are some improvements that can be made to the script. I’ll highlight them here and then link to the full script as a gist.
First a note on pattern matching. In the previous post I mentioned that I thought I could use pattern matching in a particular place, and obviously I can:
let fullTimeResult =
match row.FTR with
| "H" -> Home
| "A" -> Away
| _ -> Draw
Rather than if/then/else. Here, the _ is equivalent to the default in a C# switch statement: if it’s not a Home or Away (win), then it must be a draw.
To start making things immutable, we can update the updateTeam function from the previous post, and pass in a Map<string, LeagueRow>:
let updateTeam (league : Map<string, LeagueRow>, team : string, points : int, forGoals : int, againstGoals: int, won : int, drawn, lost: int) =
if league.ContainsKey team then
let existing = league.[team]
let updated = {existing with Played = existing.Played + 1; Won = existing.Won + won; Drawn = existing.Drawn + drawn; Lost = existing.Lost + lost; For = existing.For + forGoals; Against = existing.Against + againstGoals; Points = existing.Points + points}
league.Add(team, updated)
else
let leagueRow = {Team = team; Played = 1; Won = won; Drawn = drawn; Lost = lost; GD = 0; For = forGoals; Against = againstGoals; Points = points}
league.Add(team, leagueRow)
The code is almost the same as the previous version, except that we no longer use the <- operator to update the mutable dictionary. What’s going on instead is that F# creates a new instance of the LeagueRow, with updated values, and adds that to the Map by key, which has the side-effect of creating a new instance of the whole Map, with the league row identified by the key replaced with the updated version.
The updateHomeWin function becomes:
let updateHomeWin (league : Map<string, LeagueRow>, result : MatchResult) =
let league = updateTeam(league, result.HomeTeam, 3, result.HomeGoals, result.AwayGoals, 1, 0, 0)
let league = updateTeam(league, result.AwayTeam, 0, result.AwayGoals, result.HomeGoals, 0, 0, 1)
league
This again replaces the BCL Dictionary with the Map, simply passes the league map through each updateTeam call, and then returns the updated league object.
processMatchResult is also updated to pass in a Map, and calling the fold with it and a default map is straightforward:
|> Seq.fold processMatchResult (Map<string, LeagueRow> [])
This makes the script much more ‘the way of things’ in F#, which is to say it’s using an immutable data structure.