Minification vs the GPL

A not-entirely-theoretical question about open source software licensing came up at work the other day. I thought it was interesting enough to warrant a quick dive into the philosophy of minification, and how it relates to copyleft open source licenses. Specifically: does distributing (only) minified source code violate the GPL?

If you’ve come here looking for a legally-justifiable answer to that question, you’re out of luck. But what I can give you is a (fictional) story:

TheseusJS is slow

TheseusJS is a (fictional) Javascript library designed to be run in a browser. It’s released under the GPLv3 license. This license allows you to download and use TheseusJS for any purpose you like, including making money off it, modifying it, or redistributing it to others… but it requires that if you redistribute it you have to do so under the same license and include the source code. As such, it forces you to share with others the same freedoms you enjoy for yourself, which is highly representative of some schools of open-source thinking.

Screenshot showing TheseusJS's GitHub page. The project hasn't been updated in a year, and that was just to add a license: no code has been changed in 12 years.
It’s a cool project, but it really needs some maintenance this side of 2010.

It’s a great library and it’s used on many websites, but its performance isn’t great. It’s become infamous for the impact it has on the speed of the websites it’s used on, and it’s often the butt of jokes by developers: “Man, this website’s slow. Must be running Theseus!”

The original developer has moved on to his new project, Moralia, and seems uninterested in handling the growing number of requests for improvements. So I’ve decided to fork it and make my own version, FastTheseusJS, and work on improving its speed.

FastTheseusJS is fast

I do some analysis and discover the single biggest problem with TheseusJS is that the Javascript file itself is enormous. The original developer kept all of the copious documentation in comments in the file itself, and for some reason it doesn’t even compress well. When you use TheseusJS on a website it takes a painfully long time for a browser to download it, if it’s not precached.

Screenshot showing a website for the TheseusJS API. It's pretty labyrinthine (groan).
Nobody even uses the documentation in the comments: there’s a website with a fully-documented API.

My first release of FastTheseusJS, then, removes virtually all of the comments, replacing them with a single comment at the top pointing developers to a website where the API is fully documented. While I’m in there anyway, I also fix a minor bug that’s been annoying me for a while.

v1.1.0 changes

  • Forked from TheseusJS v1.0.4
  • Fixed issue #1071 (running mazeSolver() without first connecting <String> component results in endless loop)
  • Removed all comments: improves performance considerably
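To make that first step concrete, here is a minimal sketch of comment removal. TheseusJS is fictional, so the helper name `stripComments` and the sample source are both invented for illustration. The sketch copies quoted strings through untouched but makes no attempt to understand regex literals or nested template expressions, so treat it as a toy rather than a real minifier pass.

```javascript
// Hypothetical helper: strips // line comments and /* block */ comments.
// It copies quoted strings verbatim so that "//" inside a string survives,
// but it does not understand regex literals or ${} nesting in templates.
function stripComments(source) {
  let out = "";
  let i = 0;
  while (i < source.length) {
    const ch = source[i];
    const next = source[i + 1];
    if (ch === '"' || ch === "'" || ch === "`") {
      // Copy a string literal verbatim, honouring backslash escapes.
      const quote = ch;
      out += ch;
      i++;
      while (i < source.length && source[i] !== quote) {
        if (source[i] === "\\") {
          out += source[i];
          i++;
        }
        if (i < source.length) {
          out += source[i];
          i++;
        }
      }
      if (i < source.length) {
        out += source[i]; // closing quote
        i++;
      }
    } else if (ch === "/" && next === "/") {
      // Line comment: skip up to the newline, which is kept.
      while (i < source.length && source[i] !== "\n") i++;
    } else if (ch === "/" && next === "*") {
      // Block comment: skip past the closing */.
      const end = source.indexOf("*/", i + 2);
      i = end === -1 ? source.length : end + 2;
    } else {
      out += ch;
      i++;
    }
  }
  return out;
}

// Invented sample input standing in for a slice of the library.
const original = [
  "/* Solves the maze. See the docs. */",
  "function mazeSolver(maze) {",
  "  // start at the entrance",
  '  return maze.start + "//not a comment";',
  "}",
].join("\n");

const stripped = stripComments(original);
```

The behaviour of the function is unchanged; only the text a human would read has gone.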

I discover another interesting fact: the developer of TheseusJS used a really random mixture of tabs and spaces for indentation, sometimes in the same line! It looks… okay if you set your editor up just right, but it’s pretty hideous otherwise. That whitespace is unnecessary anyway: the codebase is sprawling but it seldom goes more than two levels deep, so indentation levels don’t add much readability. For my second release of FastTheseusJS, then, I remove this extraneous whitespace, as well as removing the in-line whitespace inside parameter lists and the components of for loops. Every little helps, right?

v1.1.1 changes

  • Standardised whitespace usage
  • Removed unnecessary whitespace
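The indentation part of that release could be sketched roughly like this. The helper name `stripIndentation` and the sample input are made up for illustration, and the approach assumes the code contains no multi-line template literals, whose contents it would alter.

```javascript
// Hypothetical helper: drops indentation, trailing whitespace, and blank lines.
// Assumes no multi-line template literals, whose contents would be changed.
function stripIndentation(source) {
  return source
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .join("\n");
}

// Invented sample with a mixture of tabs and spaces, sometimes on one line.
const messy =
  "function solve(maze) {\n \t  if (maze.done) {\n\t    return true;\n  \t}\n\n  return false;\n}";

const tidy = stripIndentation(messy);
```

Line breaks are kept here, so nothing depends on how the interpreter inserts semicolons; collapsing lines together would need more care.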

Some of the simpler functions now fit onto just a single line, and it doesn’t even inconvenience me to see them this way: I know the codebase well enough by now that it’s no disadvantage for me to edit it in this condensed format.

Screenshot of a block of JavaScript code indented using semicolons rather than tabs or spaces.
Personally, I’ve given up on the tabs-vs-spaces debate and now I indent my code using semicolons. (That’s clearly a joke. Don’t flame me.)

In the next version, I shorten the names of variables and functions in the code.

For some reason, the original developer used epic rambling strings for function names, like the well-known function dedicateIslandTempleToTheImageOfAGodBeforeOrAfterMakingASacrificeWithOrWithoutDancing( boolBeforeMakingASacrifice, objectImageOfGodToDedicateIslandTempleTo, stringNameOfPersonMakingDedication, stringOrNullNameOfLocalIslanderDancedWith). That one gets called all the time internally and isn’t exposed via the external API so it might as well be shortened to d=(i,j,k,l)=>. Now all the internal workings of the library are each represented with just one or two letters.

v1.1.2 changes

  • Shortened/standardised non-API variable and function names – improves performance
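As a rough illustration of what renaming does and doesn’t change, here is a sketch with an invented internal helper. `countDeadEndsInMazeGrid`, its body, and the sample grid are all hypothetical stand-ins, not anything from a real library.

```javascript
// Hypothetical internal helper with a descriptive name.
function countDeadEndsInMazeGrid(mazeGridRows) {
  let deadEndCount = 0;
  for (const mazeGridRow of mazeGridRows) {
    for (const mazeCell of mazeGridRow) {
      if (mazeCell === "x") deadEndCount++;
    }
  }
  return deadEndCount;
}

// The same logic after hand-shortening every internal name.
const c=(g)=>{let n=0;for(const r of g)for(const e of r)if(e==="x")n++;return n};

// Invented sample data: each string is a row, each "x" a dead end.
const grid = ["..x", "x.x"];

const sameResult = countDeadEndsInMazeGrid(grid) === c(grid);
const charsSaved = countDeadEndsInMazeGrid.toString().length - c.toString().length;
```

The interpreter treats both versions identically; the only thing lost is whatever the names told a human reader.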

I’ve shaved several kilobytes off the monstrous size of TheseusJS and I’m very proud. The original developer says nice things about my fork on social media, resulting in a torrent of downloads and attention. Within a certain archipelago of developers, I’m slightly famous.

But did I violate the license?

But then a developer says to me: you’re violating the license of the original project because you’re not making the source code available!

A man in a suit sits outdoors with a laptop and a cup of coffee. He is angry and frustrated, and a bubble shows that he is thinking:"why can't people respect the fucking license?!"
This happens every day. Probably not to this same guy every time though, but you never know. Original photo by Andrea Piacquadio.

They claim that my bugfix in the first version of FastTheseusJS represents a material change to the software, and that the changes I’ve made since then are obfuscation: efforts short of binary compilation that aim to reduce the accessibility of the source code. This fails to meet the GPL’s definition of source code as “the preferred form of the work for making modifications to it”. I counter that this condensed view of the source code is my “preferred” way of working with it, and moreover that my output is not the result of some build step that makes the code harder to read: the code is just hard to read as a result of the optimisations I’ve made. In ambiguous cases, whose “preference” wins?

Did I violate the license? My gut feeling is that no, all of my changes were within the spirit and the letter of the GPL (they’re a terrible way to write code, but that’s not what’s in question here). Because I manually condensed the code, did so with the intention that this condensing was a feature, and continue to work directly with the code after condensing it because I prefer it that way… that feels like it’s “okay”.

But if I’d just run the code through a minification tool, my opinion changes. Suppose I’d run minify --output fasttheseus.js theseus.js and then deleted my copy of theseus.js. Then, making changes to fasttheseus.js and redistributing it feels like a violation to me… even if the resulting code is the same as I’d have gotten via the “manual” method!

I don’t know the answer (IANAL), but I’ll tell you this: I feel hypocritical for saying one piece of code would not violate the license but another identical piece of code would, based only on the process the developer followed to produce it. If I replace one piece of code at a time with less-readable versions the license remains intact, but if I replace them all at once it doesn’t? That feels neither concrete nor satisfying.

Screenshot showing highly-minified HTML code (for this page) which is still reasonably readable.
Sure, I can write a blog post in just one line of code. It’ll just be a really, really, really long line… (Still perfectly readable, though!)

This isn’t an entirely contrived example

This example might seem highly contrived, and that’s because it is. But the grey area between the extremes is where the real questions are. If you agree that redistribution of (only) minified source code violates the GPL, you’re left asking: at what point does the change occur? Code isn’t necessarily minified or not-minified: there are many intermediate steps.

If I use a correcting linter to standardise indentation and whitespace – switching multiple spaces for the appropriate number of tabs, removing excess line breaks etc. (or do the same tasks manually) – I’m sure you’d agree that’s fine. If I have it replace whole-function if-blocks with hoisted return statements, that’s probably fine too. If I replace if blocks with ternary operators or remove or shorten comments… that might be fine, but probably depends upon context. At some point though, some way along the process, minification goes “too far” and feels like it’s no longer within the limitations of the license. And I can’t tell you where that point is!
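To show how small each individual step along that road is, here is one invented function in three behaviourally identical forms. The names `describeWrapped`, `describeHoisted`, and `describeTernary`, and the `hasMinotaur` property, are hypothetical and exist only for this sketch.

```javascript
// Original shape: the whole body lives inside one if-block.
function describeWrapped(room) {
  if (room) {
    if (room.hasMinotaur) {
      return "danger";
    } else {
      return "safe";
    }
  }
}

// Step 1: hoist the guard into an early return.
function describeHoisted(room) {
  if (!room) return;
  if (room.hasMinotaur) {
    return "danger";
  }
  return "safe";
}

// Step 2: swap the remaining if-blocks for ternary operators.
const describeTernary = (room) =>
  room ? (room.hasMinotaur ? "danger" : "safe") : undefined;

// All three forms agree on every input tried here.
const inputs = [null, { hasMinotaur: true }, { hasMinotaur: false }];
const allAgree = inputs.every(
  (room) =>
    describeWrapped(room) === describeHoisted(room) &&
    describeHoisted(room) === describeTernary(room)
);
```

Each rewrite is defensible in isolation, which is exactly why it’s hard to name the step at which the result stops being “source”.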

This issue’s even more-complicated with some other licenses, e.g. the AGPL, which extends the requirement to share source code to hosted applications. Suppose I implement a web application that uses an AGPL-licensed library. The person who redistributed it to me only gave me the minified version, but they gave me a web address from which to acquire the full source code, so they’re in the clear. I need to make a small patch to the library to support my service, so I edit it right into the minified version I’ve already got. A user of my hosted application asks for a copy of the source code, so I provide it, including the edited minified library… am I violating the license for not providing the full, unminified version, even though I’ve never even seen it? It seems absurd to say that I would be, but it could still be argued to be the case.

Diagram showing how permissive software licenses are generally compatible for use in LGPL or MPL licensed software, which are then compatible for use (except MPL) in GPL licensed software, which are in turn compatible for use (except GPL 2) with AGPL licensed software.
I love diagrams like this, which show license compatibility of different open source licenses. Adapted from a diagram by Carlo Daffara, in turn adapted from a diagram by David E. Wheeler, used under a CC-BY-SA license.

99% of the time, though, the answer’s clear, and the ambiguities shown above shouldn’t stop anybody from choosing to open-source their work under GPL, AGPL (or any other open source license depending on their preference and their community). Perhaps the question of whether minification violates the letter of a copyleft license is one of those Potter Stewart “I know it when I see it” things. It certainly goes against the spirit of the thing to do so deliberately or unnecessarily, though, and perhaps it’s that softer, more-altruistic goal we should be aiming for.

Exploiting vulnerabilities in Cellebrite UFED and Physical Analyzer from an app’s perspective

This article is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Cellebrite makes software to automate physically extracting and indexing data from mobile devices. They exist within the grey – where enterprise branding joins together with the larcenous to be called “digital intelligence.” Their customer list has included authoritarian regimes in Belarus, Russia, Venezuela, and China; death squads in Bangladesh; military juntas in Myanmar; and those seeking to abuse and oppress in Turkey, UAE, and elsewhere. A few months ago, they announced that they added Signal support to their software.

Their products have often been linked to the persecution of imprisoned journalists and activists around the world, but less has been written about what their software actually does or how it works. Let’s take a closer look. In particular, their software is often associated with bypassing security, so let’s take some time to examine the security of their own software.

Moxie Marlinspike (Signal)

Recently Moxie, co-author of the Signal Protocol, came into possession of a Cellebrite Extraction Device (phone cracking kit used by law enforcement as well as by oppressive regimes who need to clamp down on dissidents) which “fell off a truck” near him. What an amazing coincidence! He went on to report, this week, that he’d partially reverse-engineered the system, discovering copyrighted code from Apple – that’ll go down well! – and, more-interestingly, unpatched vulnerabilities. In a demonstration video, he goes on to show that a carefully crafted file placed on a phone could, if attacked using a Cellebrite device, exploit these vulnerabilities to take over the forensics equipment.

Obviously this is a Bad Thing if you’re depending on that forensics kit! Not only are you now unable to demonstrate that the evidence you’re collecting is complete and accurate, because it potentially isn’t, but you’ve also got to treat your equipment as untrustworthy. This basically makes any evidence you’ve collected inadmissible in many courts.

Moxie goes on to announce a completely unrelated upcoming feature for Signal: a minority of functionally-random installations will create carefully-crafted files on their devices’ filesystem. You know, just to sit there and look pretty. No other reason:

In completely unrelated news, upcoming versions of Signal will be periodically fetching files to place in app storage. These files are never used for anything inside Signal and never interact with Signal software or data, but they look nice, and aesthetics are important in software. Files will only be returned for accounts that have been active installs for some time already, and only probabilistically in low percentages based on phone number sharding. We have a few different versions of files that we think are aesthetically pleasing, and will iterate through those slowly over time. There is no other significance to these files.

That’s just beautiful.

Hey ONS: This Is Not A Mistake

Hi, ONS! I know we haven’t really spoken since you ghosted me in 2011, but I just wanted to clear something up for you –

This is not a mistake (except for the missing last names):

(Specimen) 2021 census form on which Ruth declares that she cohabits with both a husband AND a partner.
It’s perfectly possible for somebody to live with multiple partners, even if they’re forbidden from marrying more than one.

Back in 2011 you thought it was a mistake, and this prevented my partner, her husband and I from filling out the digital version of the census. I’m sure it’s not common for somebody to have multiple cohabiting romantic relationships (though it’s possibly more common than some other things you track…), but surely an “Are you sure?” would be better than a “No you don’t!”

Clippy says "It looks like you've got a husband AND a partner. Is that right?" with possible answers "Yes, and it's awesome." or "No, but I can dream!"
For all I know, you already fixed it. If not: I mocked-up a UI for you.

We worked around it in 2011 by using the paper forms. Apparently this way you still end up “correcting” our relationship status for us (gee, thanks!) but at least – I gather – the originals are retained. So maybe in a more-enlightened time, future statisticians might be able to ask about the demographics of domestic nonmonogamy and have at least some data to work with from the early 21st century.

I know you’re keen for as many people as possible to do the census digitally this year. But unless you’ve fixed your forms then my family and I – and thousands of others like us – will either have to use the paper copies you’re trying to phase out… or else knowingly lie on the digital versions. Which would you prefer?

Santander to Accept Homemade Deeds Poll

For most of the last decade, one of my side projects has been FreeDeedPoll.org.uk, a website that helps British adults to change their name for free and without a solicitor. Here’s a little known fact: as a British citizen, you have the right to be known by virtually any name you like, and for most people the simplest way to change it is to write out a deed poll: basically a one-person contract on which you promise that you’re serious about adopting your new name and you’re not committing fraud or anything.

FreeDeedPoll.org.uk
This web design looked dated when I made it and hasn’t gotten any younger, but the content remains valid as ever.

Over that time, I’ve helped thousands of people to change their names. I don’t know exactly how many because I don’t keep any logs, but I’ve always gotten plenty of email from people about the project. Contact spiked in 2013 after the Guardian ran an article about it, but I still correspond with two or three people in a typical week.

These people have lots of questions that come up time and time again, and if I had more free time I’d maintain an FAQ of them or something. In any case, a common one is people asking for advice when their high street bank, almost invariably either Nationwide or Santander, disputes the legitimacy of a “home made” deed poll and refuses to accept it.

Abbey National and Abbey (former names of Santander) crossed out and replaced with Santander.
You’d think that Santander of all people would appreciate how important it is to have your legitimate change of name respected. Hang on… haven’t I joked about their rebranding before?

When such people contact me, I advise them of a number of solutions and workarounds. Going to a different branch can work (training at these high street banks is internally inconsistent, I guess?). Getting your government-issued identity documents sorted and then threatening to move your account elsewhere can sometimes work. For applicants willing to spend a little money, paying a solicitor a couple of quid to be one of your witnesses can work. I often don’t hear back from people who email me about these banks: maybe they find success by one of these routes, or maybe they give up and go down an unnecessarily-expensive avenue.

But one thing I always put on the table is the possibility of fighting. I provide a playbook of strategies to try to demonstrate to their troublemaking bank that the bank is in the wrong, along with all of the appropriate legal citations. Recent years put a new tool in the box: the GDPR/DPA2018, which contains clauses prohibiting companies from knowingly retaining incorrect personal data about an individual. I’ve been itching for a chance to use these new weapons… and over this last month, I finally had the opportunity.

A man signs a document.
Print this. Sign here. That’s pretty-much all there is to it.

I was recently contacted by a student (who, as you might expect, has more free time than they do spare money!) who was having trouble with Santander refusing to accept their deed poll. They were willing to go all-out to prove their bank wrong. So I gave them the toolbox and they worked through it and… Santander caved!

Not only have Santander accepted that they were wrong in the case of this student, but they’ve also committed to retraining their staff. Oh, and they’ve paid compensation to the student who emailed me.

Even from my position on the sidelines, I couldn’t help but cheer at this news, and not just because I’ll hopefully have fewer queries to deal with.

G7 Comes Out in Favor of Encryption Backdoors

This article is a repost promoting content originally published elsewhere. See more things Dan's reposted.

From a G7 meeting of interior ministers in Paris this month, an “outcome document”:

Encourage Internet companies to establish lawful access solutions for their products and services, including data that is encrypted, for law enforcement and competent authorities to access digital evidence, when it is removed or hosted on IT servers located abroad or encrypted, without imposing any particular technology and while ensuring that assistance requested from internet companies is underpinned by the rule law and due process protection. Some G7 countries highlight the importance of not prohibiting, limiting, or weakening encryption;

There is a weird belief amongst policy makers that hacking an encryption system’s key management system is fundamentally different than hacking the system’s encryption algorithm. The difference is only technical; the effect is the same. Both are ways of weakening encryption.

The G7’s proposal to encourage encryption backdoors demonstrates two unsurprising things about the politicians in attendance, including that:

  • They’re unwilling to attempt to force Internet companies to add backdoors (e.g. via legislation, fines, etc.), making their resolution functionally toothless, and
  • More-importantly: they continue to fail to understand what encryption is and how it works.

Somehow, then, this outcome document simultaneously manages to both go too-far (for a safe and secure cryptographic landscape for everyday users) and not-far-enough (for law enforcement agencies that are in favour of backdoors, despite their huge flaws, to actually gain any benefit). Worst of both worlds, then.

Needless to say, I favour not attempting to weaken encryption, because such measures (a) don’t work against foreign powers, terrorist groups, and hardened criminals and (b) do weaken the personal security of law-abiding citizens and companies (who can then become victims of the former group). “Backdoors”, however phrased, are a terrible idea.

I loved Schneier’s latest book, by the way. You should read it.

Mark Zuckerberg asks governments to help control internet content

This article is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Mark Zuckerberg

Mark Zuckerberg says regulators and governments should play a more active role in controlling internet content.

In an op-ed published in the Washington Post, Facebook’s chief says the responsibility for monitoring harmful content is too great for firms alone.

He calls for new laws in four areas: “Harmful content, election integrity, privacy and data portability.”

It comes two weeks after a gunman used the site to livestream his attack on a mosque in Christchurch, New Zealand.

“Lawmakers often tell me we have too much power over speech, and frankly I agree,” Mr Zuckerberg writes, adding that Facebook was “creating an independent body so people can appeal our decisions” about what is posted and what is taken down.

An interesting move which puts Zuckerberg in a parallel position to Bruce Schneier, who’s recently (and especially in his latest book) stood in opposition to a significant number of computer security experts (many of whom are of the “crypto-anarchist” school of thought) and also pushed for greater regulation on the Internet. My concern with both figureheads’ proposals comes from the inevitable difficulty in enforcing Internet-wide laws: given that many countries simply won’t enact, or won’t effectively enforce, legislation of the types that either Zuckerberg or Schneier suggest, either (a) companies intending to engage in unethical behaviour will move to – and profit in – those countries, as we already see with identity thieves in Nigeria, hackers in Russia, and patent infringers in China… or else (b) countries that do agree on a common framework will be forced to curtail Internet communications with those countries, leading to a fragmented and ultimately less-free Internet.

Neither option is good, but I still back these proposals in principle. After all: we don’t enact other internationally-relevant laws (like the GDPR, for example) because we expect to achieve 100% compliance across the globe – we do so because they’re the right thing to do to protect individuals and economies from harm. Little by little, Internet legislation in general (possibly ignoring things like the frankly silly EU cookie regulation and parts of the controversial new EU directives on copyright) makes the Internet a safer place for citizens of Western countries. There are still a huge number of foreign threats like scammers and malware authors as well as domestic lawbreakers, but increasing the accountability of large companies is, at this point, a far bigger concern.

German chat app slacking on hashing fined €20k

This article is a repost promoting content originally published elsewhere. See more things Dan's reposted.

by Richard Chirgwin (The Register)

German chat platform Knuddels.de (“Cuddles”) has been fined €20,000 for storing user passwords in plain text (no hash at all? Come on, people, it’s 2018).

The data of Knuddels users was copied and published by malefactors in July. In September, someone emailed the company warning them that user data had been published at Pastebin (only 8,000 members affected) and Mega.nz (a much bigger breach). The company duly notified its users and the Baden-Württemberg data protection authority.

Interesting stuff: this German region’s equivalent of the ICO applied a fine to this app for failing to hash passwords, describing them as personal information that was inadequately protected following their theft. That’s interesting because it sets a German, and to a lesser extent a European, precedent that plaintext passwords can be considered personal information, therefore allowing the (significant) weight of the GDPR to be applied to their misuse.
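For contrast with storing passwords in plain text, here is a minimal sketch of salted password hashing using Node’s built-in `crypto` module. The choice of scrypt, the 16-byte salt, the 32-byte key, and the `salt:key` storage format are all assumptions made for this example; nothing here describes what Knuddels did or should have done specifically.

```javascript
const { randomBytes, scryptSync, timingSafeEqual } = require("node:crypto");

// Store a random salt plus the derived key, never the password itself.
// The "salt:key" hex format is an arbitrary choice for this sketch.
function hashPassword(password) {
  const salt = randomBytes(16);
  const key = scryptSync(password, salt, 32);
  return `${salt.toString("hex")}:${key.toString("hex")}`;
}

// Re-derive the key from the supplied password and compare in constant time.
function verifyPassword(password, stored) {
  const [saltHex, keyHex] = stored.split(":");
  const expected = Buffer.from(keyHex, "hex");
  const actual = scryptSync(password, Buffer.from(saltHex, "hex"), expected.length);
  return timingSafeEqual(actual, expected);
}

const record = hashPassword("correct horse battery staple");
```

If a table of records like `record` leaks, the attacker still has to guess each password individually rather than simply reading them.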

How My Stupid Bloody Name Finally Paid For Itself

Since changing my surname 11½ years ago to the frankly-silly (albeit very “me”) Q, I’ve faced all kinds of problems, from computer systems that don’t accept my name to a mocking from the Passport Office to getting banned from Facebook. I soon learned to work around systems that insisted that surnames were at least two characters in length. This is a problem which exists mostly because programmers don’t understand how names work in the real world (or titles, for that matter, as I’ve also discovered).
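The kind of validation that trips over a one-letter surname probably looks something like the first function below. Both validators are hypothetical, written for this sketch rather than taken from any real system, and the lenient one is only one possible alternative.

```javascript
// Hypothetical over-strict validator: ASCII letters only, two or more of them.
const tooStrict = (surname) => /^[A-Za-z]{2,}$/.test(surname);

// A more forgiving check: any string that isn't empty once trimmed.
const lenient = (surname) =>
  typeof surname === "string" && surname.trim().length > 0;

// Sample surnames that real people have.
const samples = ["Q", "O’Neill", "Ng", "de la Cruz"];
const rejectedByStrict = samples.filter((s) => !tooStrict(s));
const rejectedByLenient = samples.filter((s) => !lenient(s));
```

The strict version turns away three of the four sample names; the lenient one accepts them all while still refusing a blank field.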

It’s always been a bit of an inconvenience to have to do these things, but it’s never been a terrible burden: even when I fly internationally – which is probably the hardest part of having my name – I’ve learned the tricks I need to minimise how often I’m selected for an excessive amount of unwanted “special treatment”.

Airport
I plan to make my first trip to the USA since my name change, next year. Place bets now on how that’ll go.

This year, though, for the very first time, my (stupid bloody) unusual name paid for itself. And not just in the trivial ways I’m used to, like being able to spot my badge instantly on the registration table at conferences I go to or being able to fill out paper forms way faster than normal people. I mean in a concrete, financially-measurable way. Wanna hear?

So: I’ve a routine of checking my credit report with the major credit reference agencies every few years. I’ve been doing so since long before doing so became free (thanks GDPR); long even before I changed my name: it just feels like good personal data housekeeping, and it’s interesting to see what shows up.

Message to Equifax asking them to correct the details on my Credit Report.
It started out with the electoral roll. How did it end up like this? It was only the electoral roll. It was only the electoral roll.

And so I noticed that my credit report with Equifax said that I wasn’t on the electoral roll. Which I clearly am. Given that my credit report’s pretty glowing, I wasn’t too worried, but I thought I’d drop them an email and ask them to get it fixed: after all, sometimes lenders take this kind of thing into account. I wasn’t in any hurry, but then, it seems: neither were they –

  • 2 February 2016 – I originally contacted them
  • 18 February 2016 – they emailed to say that they were looking into it and that it was taking a while
  • 22 February 2016 – they emailed to say that they were still looking into it
  • 13 July 2016 – they emailed to say that they were still looking into it (which was a bit of a surprise, because after so long I’d almost forgotten that I’d even asked)
  • 14 July 2016 – they marked the issue as “closed”… wait, what?
Equifax close my request
Given that all they’d done for six months was email me occasionally to say that it was taking a while, it was a little insulting to then be told they’d solved it.

I wasn’t in a hurry, and 2017 was a bit of a crazy year for me (for Equifax too, as it happens), so I ignored it for a bit, and then picked up the trail right after the GDPR came into force. After all, they were storing personal information about me which was demonstrably incorrect, and continued to store and process it even after they’d been told that it was incorrect (it’d have been a violation of principle 4 of the DPA 1998, too, but the GDPR’s got bigger teeth: if you’re going to sic the law on somebody, it’s better that it has bark and bite).

My message instructing Equifax to fix their damn data about me.
Throwing the book tip-of-the-day: don’t threaten, just explain what you require and under what legal basis you’re able to do so. Let lawyers do the tough stuff.

My anticipation was that my message of 13 July 2018 would get them to sit up and fix the issue. I’d assumed that it was probably related to my unusual name and that bugs in their software were preventing them from joining-the-dots between my credit report and the Electoral Roll. I’d also assumed that this nudge would have them either fix their software… or failing that, manually fix my data: that can’t be too hard, can it?

Apparently it can:

Equifax suggest that I change my name ON THE ELECTORAL ROLL to match my credit report, rather than the other way around.
You want me to make it my problem, Equifax, and you want me to change my name on the Electoral Roll to match the incorrect name you use to refer to me in your systems?

Equifax’s suggested solution to the problem on my credit report? Change my name on the Electoral Roll to match the (incorrect) name they store in their systems (to work around a limitation that prevents them from entering single-character surnames)!

At this point, they turned my send-a-complaint-once-every-few-years project into a full-blown rage. It’s one thing if you need me to be understanding of the time it can take to fix the problems in your computer systems – I routinely develop software for large and bureaucratic organisations, I know the drill! – but telling me that your bugs are my problems and telling me that I should lie to the government to work around them definitely isn’t okay.

Actually, Equifax: no. No no no no no. No.
Dear Equifax: No. No no no. No. Also, no. Now try again. Love Dan.

At this point, I was still expecting them to just fix the problem: if not the underlying technical issue then instead just hack a correction into my report. But clearly they considered this, worked out what it’d cost them to do so, and decided that it was probably cheaper to negotiate with me to pay me to go away.

Which it was.

This week, I accepted a three-figure sum from Equifax as compensation for the inconvenience of the problem with my credit report (which now also has a note of correction, not that my alleged absence from the Electoral Roll has ever caused my otherwise-fine report any trouble in the past anyway). Curiously, they didn’t attach any strings to the deal, such as not courting publicity, so it’s perfectly okay for me to tell you about the experience. Maybe you know somebody who’s similarly afflicted: that their “unusual” name means that a credit reference company can’t accurately report on all of their data. If so, perhaps you’d like to suggest that they take a look at their credit report too… just saying.

Cash!
You can pay for me to go away, but it takes more for me to shut up. (A lesson my parents learned early on.)

Apparently Equifax think it’s cheaper to pay each individual they annoy than it is to fix their database problems. I’ll bet that, in the long run, that isn’t true. But in the meantime, if they want to fund my recent trip to Cornwall, that’s fine by me.

After Section 702 Reauthorization

This article is a repost promoting content originally published elsewhere. See more things Dan's reposted.

After Section 702 Reauthorization – Schneier on Security (schneier.com)

For over a decade, civil libertarians have been fighting government mass surveillance of innocent Americans over the Internet. We’ve just lost an important battle. On January 18, President Trump signed the renewal of Section 702, domestic mass surveillance became effectively a permanent part of US law. Section 702 was initially passed in 2008, as an…

When One Library Steals From Another

When I first started working at the Bodleian Libraries in 2011, their websites were looking… a little dated. I’d soon spend some time working with a vendor (whose premises mysteriously caught fire while I was there, freeing me up to spend my birthday in a bar) to develop a fresh, modern interface for our websites that, while not the be-all and end-all, was a huge leap forwards and has served us well for the last five years or so.

The Bodleian Libraries website as it appeared in 2011.
The colour scheme, the layout, the fact that it didn’t remotely work on mobiles… there was a lot wrong with the old design of the Bodleian Libraries’ websites.

Fast-forward a little: in about 2015 we noticed a few strange anomalies in our Google Analytics data. For some reason, web addresses were appearing that didn’t exist anywhere on our site! Most of these resulted from web visitors in Turkey, so we figured that some Turkish website had probably accidentally put our Google Analytics user ID number into their code rather than their own. We filtered out the erroneous data – there wasn’t much of it; the other website was clearly significantly less-popular than ours – and carried on. Sometimes we’d speculate about the identity of the other site, but mostly we didn’t even think about it.

Bodleian Library & Radcliffe Camera website
How a Bodleian Libraries’ website might appear today. Pay attention, now: there’ll be a spot-the-difference competition in a moment.

Earlier this year, there was a spike in the volume of the traffic we were having to filter-out, so I took the time to investigate more-thoroughly. I determined that the offending website belonged to the Library of Bilkent University, Turkey. I figured that some junior web developer there must have copy-pasted the Bodleian’s Google Analytics code and forgotten to change the user ID, so I went to the website to take a look… but I was in for an even bigger surprise.

Bilkent University Library website, as it appears today.
Hey, that looks… basically identical!

Whoah! The web design of a British university was completely ripped-off by a Turkish university! Mouth agape at the audacity, I clicked my way through several of their pages to try to understand what had happened. It seemed inconceivable that it could be a coincidence, but perhaps it was supposed to be more of an homage than a copy-paste job? Or perhaps they were ripped-off by an unscrupulous web designer? Or maybe it was somebody on the “inside”, like our vendor, acting unethically by re-selling the same custom design? I didn’t believe it could be any of those things, but I had to be sure. So I started digging…

Bodleian and Bilkent search boxes, side-by-side.
Our user research did indicate that putting the site and catalogue search tools like this was smart. Maybe they did the same research?

 

Bodleian and Bilkent menus side-by-side.
Menus are pretty common on many websites. They probably just had a similar idea.

 

Bodleian and Bilkent opening hours, side-by-side.
Tabs are a great way to show opening hours. Everybody knows that. And this is obviously just a popular font.

 

Bodleian and Bilkent sliders, side-by-side.
Oh, you’ve got a slider too. With circles? And you’ve got an identical Javascript bug? Okay… now that’s a bit of a coincidence…

 

Bodleian and Bilkent content boxes, side-by-side.
Okay, I’m getting a mite suspicious now. Surely we didn’t independently come up with this particular bit of design?

 

Bodleian and Bilkent footers, side-by-side.
Well these are clearly different. Ours has a copyright notice, for example…

 

Copyright notice on Bilkent University Library's website.
Oh, you DO have a copyright notice. Hang on, wait: you’ve not only stolen our design but you’ve declared it to be open-source???

I was almost flattered as I played this spot-the-difference competition, until I saw the copyright notice: stealing our design was galling enough, but then relicensing it in such a way that they specifically encourage others to steal it too was another step entirely. Remember that we’re talking about an academic library, here: if anybody ought to have a handle on copyright law then it’s a library!

I took a dive into the source code to see if this really was, as it appeared to be, a copy-paste-and-change-the-name job (rather than “merely” a rip-off of the entire graphic design), and, sure enough…

HTML source code from Bilkent University Library.
In their HTML source code, you can see both the Bodleian’s Google Analytics code (which they failed to remove) and their own. And a data- attribute related to a project I wrote and that means nothing to their site.
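If you wanted to check a page for this kind of leftover automatically, something like the following rough Python sketch would probably do. It assumes the classic Universal Analytics ID format (UA-account-property); the function names are my own invention and both of the IDs in the sample are made up for illustration, not the real ones from either site.

```python
import re

# Classic (Universal Analytics) property IDs look like "UA-<account>-<property>".
GA_ID_PATTERN = re.compile(r"\bUA-\d{4,10}-\d{1,4}\b")


def find_tracking_ids(html: str) -> set[str]:
    """Return every distinct Universal Analytics property ID found in the page source."""
    return set(GA_ID_PATTERN.findall(html))


def foreign_tracking_ids(html: str, own_ids: set[str]) -> set[str]:
    """Return tracking IDs present in the page that don't belong to the site's owner."""
    return find_tracking_ids(html) - own_ids


# Hypothetical page source: both IDs below are made up for illustration.
sample_html = """
<script>
  ga('create', 'UA-11111111-1', 'auto');  // left over from the copied site
  ga('create', 'UA-22222222-1', 'auto');  // the copier's own property
</script>
"""

leftovers = foreign_tracking_ids(sample_html, own_ids={"UA-22222222-1"})
print(sorted(leftovers))
```

Anything this reports is a tracking ID that's sending somebody else your visitor data, which is exactly the anomaly that tipped us off in the first place.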

It looks like they’d just mirrored the site and done a search-and-replace for “Bodleian”, replacing it with “Bilkent”. Even the code’s spelling errors, comments, and indentation were intact. The CSS was especially telling (as well as being chock-full of redundant code relating to things that appear on our website but not on theirs)…

CSS code from Bilkent University.
The search-replace resulted in some icky grammar, like “the Bilkent” appearing in their code. And what’s this? That’s MY NAME in the middle of their source code!

So I reached out to them with a tweet:

Tweet: Hey @KutphaneBilkent (Bilkent University Library): couldn't help but notice your website looks suspiciously like those of @bodleianlibs...?
My first tweet to Bilkent University Library contained a “spot the difference” competition.

I didn’t get any response, although I did attract a handful of Turkish followers on Twitter. Later, they changed their Twitter handle and I thought I’d take advantage of the then-new capability for longer tweets to have another go at getting their attention:

Tweet: I see you've changed your Twitter handle, @librarybilkent! Your site still looks like you've #stolen the #webdesign from @bodleianlibs, though (and changed the license to a #CreativeCommons one, although the fact you forgot to change the #GoogleAnalytics ID is a giveaway...).
This time, I was a little less-sarcastic and a little more-aggressive. Turns out that’s all that was needed.

Clearly this was what it took to make the difference. I received an email from the personal email account of somebody claiming to be Taner Korkmaz, Systems Librarian with Bilkent’s Technical Services team. He wrote (emphasis mine):

Dear Mr. Dan Q,

My name is Taner Korkmaz and I am the systems librarian at Bilkent. I am writing on behalf of Bilkent University Library, regarding your share about Bilkent on your Twitter account.

Firstly, I would like to explain that there is no any relation between your tweet and our library Twitter handle change. The librarian who is Twitter admin at Bilkent did not notice your first tweet. Another librarian took this job and decided to change the twitter handle because of the Turkish letters, abbreviations, English name requirement etc. The first name was @KutphaneBilkent (kutuphane means library in Turkish) which is not clear and not easy to understand. Now, it is @LibraryBilkent.

About 4 years ago, we decided to change our library website, (and therefore) we reviewed the appearance and utility of the web pages.

We appreciated the simplicity and clarity of the user interface of University of Oxford Bodlien Library & Radcliffe Camera, as an academic pioneer in many fields. As a not profit institution, we took advantage of your template by using CSS and HTML, and added our own original content.

We thought it would not create a problem the idea of using CSS codes since on the web page there isn’t any license notice or any restriction related to the content of the template, and since the licenses on the web pages are mainly more about content rather than templates.

The Library has its own Google Analytics and Search Console accounts and the related integrations for the web site statistical data tracking. We would like to point out that there is a misunderstanding regarding this issue.

In 2017, we started to work on creating a new web page and we will renew our current web page very soon.

Thank you in advance for your attention to this matter and apologies for possible inconveniences.

Yours sincerely,

Or to put it another way: they decided that our copyright notice only applied to our content and not our design and took a copy of the latter.

Do you remember when I pointed out earlier that librarians should be expected to know their way around copyright law? Sigh.

They’ve now started removing evidence of their copy-pasting such as the duplicate Google Analytics code fragment and the references to LibraryData, but you can still find the unmodified code via archive.org, if you like.

That probably ends my part in this little adventure, but I’ve passed everything on to the University of Oxford’s legal team in case any of them have anything to say about it. And now I’ve got a new story to tell where web developers get together over a pint: the story of the time that I made a website for a university… and a different university stole it!

Data-hucksters beware: online privacy is returning

This article is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Next year, 25 May looks like being a significant date. That’s because it’s the day that the European Union’s general data protection regulation (GDPR) comes into force. This may not seem like a big deal to you, but it’s a date that is already keeping many corporate executives awake at night. And for those who are still sleeping soundly, perhaps it would be worth checking that their organisations are ready for what’s coming down the line.

First things first. Unlike much of the legislation that emerges from Brussels, the GDPR is a regulation rather than a directive. This means that it becomes law in all EU countries at the same time; a directive, in contrast, allows each country to decide how its requirements are to be incorporated in national laws…

TIL that the first ever speeding fine was given to Walter Arnold of Kent, UK, in January 1896. His speed: 8mph in a 2mph zone. He was caught by a policeman on a bicycle.

This link was originally posted to /r/todayilearned. See more things from Dan's Reddit account.

The original link was: http://www.nationalmotormuseum.org.uk/motoring_firsts

Walter Arnold of East Peckham, Kent, had the dubious honour of being the first person in Great Britain to be successfully charged with speeding on 28 January 1896. Travelling at approximately 8mph/12.87kph, he had exceeded the 2mph/3.22kph speed limit for towns. Fined one shilling and costs, Arnold had been caught by a policeman who had given chase on a bicycle.

Buying a House, Part 3

This blog post is the third in a series about buying our first house. If you haven’t already, you might like to read the first part. In the second post in the series, we’d put an offer on a house which had been accepted… but of course that’s still early days in the story of buying a house…

We hooked up with Truemans, a local solicitor, after discovering that getting our conveyancing services from a local solicitor is only marginally more-expensive than going with one of the online/phone/post based national ones, and you get the advantage of being able to drop in and harass them if things aren’t going as fast as you’d like. Truemans were helpful from day one, giving us a convenient checklist of all of the steps in the process of buying a house. I’m sure we could have got all the same information online, but by the time I was thinking about offers and acceptance and moving and mortgages and repayments and deposits and everything else, it was genuinely worth a little extra money just to have somebody say “next, this needs to happen,” in a reassuring voice.

A 22-page form; each page is double-sided for added insanity.
This gargantuan beast is our mortgage application form. All of those pages are double-sided, by the way.

Meanwhile, we got on with filling out our mortgage application form. Our choice of lenders – which Stefan, who I’d mentioned in the last post, had filtered for us – was limited slightly by the fact that we wanted a mortgage for three people, not for one or two; but it wasn’t limited by as much as you might have thought. In practice, it was only the more-exotic mortgage types (e.g. Option ARMs, some varieties of interest-only mortgage) that we were restricted from, and these weren’t particularly appealing to us anyway. One downside of there being three of us, though, was that while our chosen lender had computerised their application process, the computerised version wasn’t able to handle more than two applicants, so we instead had to fill out a mammoth 22-page paper form in order to apply. At least it weeds out people who aren’t serious, I suppose.

A front door with a hole, boarded up with plywood.
The front door of our intended new home had recently sustained some… damage. That didn’t bode well.

I revisited the house to check out a few things from the outside: in particular, I was interested in the front door, which had apparently been broken during a… misunderstanding… by the current owners, who are in the middle of what seems like a complicated divorce. The estate agent had promised that it would be repaired before the sale, but when I went to visit I found that this hadn’t happened yet. Of course, now we had lawyers on our side, so it was a quick job to ask them to send a letter to the seller’s solicitor, setting the repair of the door as a condition upon which the sale was dependent.

A page from our Environmental Search, indicating some of the past uses of the land around the house we hope to buy.
The results of our Environmental Search were perhaps the most-interesting. But I’ll understand if you don’t think it’s as interesting as I do.

Our solicitors had also gotten started with the requisite local searches. One of the first things a conveyancing solicitor will do for you is a little research to ensure that the property really is owned by the people who are selling it, that there’s no compulsory purchase order so that a motorway can be built through the middle of it, that it’s actually connected to mains water and sewers, that planning permission was correctly obtained for any work that’s been done on it, and that kind of thing. One of the first of these searches to produce results was the environmental search.

A map of the area around our new house, as it was about a century ago.
A map of the area around our new house, as it was about a century ago, unearthed by our convenient tame librarian.

One of the things that was revealed by the environmental search was that the area was at a significantly higher-than-average risk of subsidence, had the construction not been done in a particular way – using subsidence-proof bricks, or something, I guess? I theorised that this might be related to the infill activities that (the environmental search also reported) had gone on over the last hundred and fifty years. The house is near a major waterway, in an area that was probably once lower-lying and wetter, but many of the small ponds in the area were filled in in the early part of the 20th century (and then, of course, the area was developed as the suburbs of central Oxfordshire expanded, in the 1980s). Conveniently, we have a librarian on our house-buying team, and he was able to pull up a stack of old OS maps showing the area, and we were able to find our way around this now almost-unidentifiable landscape.

A map showing a field, hedgerows, water course and – highlighted in blue – a pond. The second highlighting in blue (bottom left) is a letter ‘O’, not a pond. I got carried away highlighting things, okay?

Sure enough, there were ponds there, once, but that’s as far as our research took us. Better, we thought, to just pass on the environmental search report to a qualified buildings surveyor, and have them tell us whether or not it was made out of subsidence-proof bricks or shifting-ready beams or whatever the hell it is that you do when you’re building a house to make it not go wonky. Seriously, I haven’t a clue, but I know that there are experts who do.

Three-panel diagram, showing a low-lying lake being pumped to allow house construction, but in the third panel - OH NOES! - the houses have gone lop-sided because without the water in it, the ground becomes unstable.
In this highly-realistic diagram, which wouldn’t look out of place in a geography textbook, houses go wonky because they’re built on ground that became more-compressible after it was drained. This is what I want to avoid.

Given that the house we’re looking at is relatively new, I don’t anticipate there being any problems (modern building regulations are a lot more stringent than their historical counterparts), but when you’re signing away six figures, you learn to pay attention to these kinds of things.

Hopefully, the fourth blog post in this series will be about exchanging contracts and getting ready to move in to our new home: fingers crossed!

Jury Duty, Part 4

This is the last in a series of four blog posts about my experience of being called for jury duty in 2013.

And just like that, it was over. The courts service kept me “on the hook” for a day or two, but after that: when I called the answerphone from which I receive my instructions, I was told that I’d been cleared. My jury service was over.

Scene from 12 Angry Men. Henry Fonda explains his vote of "not guilty".
12 Angry Men is an awesome film. The behaviour of some of the characters would certainly be illegal in a contemporary UK case, so we can’t consider them to be role models for a real jury, but it’s a great film nevertheless.

I filled in my expenses form. £5.71 for lunch (where do they get these numbers?) each day. 8.9 pence per mile cycled to and from the courthouse. Given that they give a mileage bonus to car shares, I wonder if they’d have given me a top-up if I’d shared a tandem with another juror?

I heard the outcome of the trial second-hand, a few days later, on a local radio station. It somehow reminded me that the real world was connected to my time on a jury: something I’d sort-of forgotten at the time. Being pulled out from your daily routine and put onto jury duty feels sometimes surreal, and – like the blind spot in your eye that fills-in what you see with the colours around it – it’s hard to remember now that just last week I wasn’t just following my normal pattern. So when I heard about the result of a trial in which my ‘alter ego’ – Dan the juror! – took part, it was strangely jarring. For a moment, I said to myself: “Oh yeah; that happened.”

My jury service was a really interesting experience. I’d have appreciated less sitting around and being shuffled from place to place, and more-certainty about when I would and wouldn’t be needed, but that’s only a small issue. I got to see the wheels of justice turning from within the machine, and to take part in an important process of our society. And that’s great.
