<h1>What it was like working for GitLab</h1><p>2024-02-08</p><p>
I joined GitLab in October 2015, and left in December 2021 after working there
for a little more than six years.</p><p>While I previously wrote <a href="/articles/im-leaving-gitlab-to-work-on-inko-full-time/">about leaving
GitLab</a> to work on
<a href="https://inko-lang.org/">Inko</a>, I never discussed what it was like working for
GitLab between 2015 and 2021. There are two reasons for this:</p><ol><li>I was suffering from burnout, and didn't have the energy to revisit the last
six years of my life (at that time)</li><li>I was still under an NDA for another 24 months, and I wasn't sure how much I
could discuss without violating it, even though it probably wouldn't have
caused any problems</li></ol><p>The NDA expired last December, and while I suspect I'm still dealing with the
fallout of a burnout, I have a bit more energy to revisit my time at GitLab.</p><p>I'll structure this article into two main sections: an overview of my time at
GitLab based on what I can still remember, and a collection of things I've
learned as a result of my work and experiences.</p><h2 id="table-of-contents">Table of contents</h2><ul class="toc"><li><a href="#before-gitlab">Before GitLab</a></li><li><a href="#2015-2017">2015-2017</a></li><li><a href="#2017-2018">2017-2018</a></li><li><a href="#2019-2021">2019-2021</a></li><li><a href="#what-ive-learned">What I've learned</a><ul><li><a href="#scalability-needs-to-be-part-of-a-companys-culture">Scalability needs to be part of a company's culture</a></li><li><a href="#make-teams-more-data-and-developer-driven">Make teams more data and developer driven</a></li><li><a href="#you-cant-determine-what-is-minimal-viable-without-data">You can't determine what is "minimal viable" without data</a></li><li><a href="#a-saas-and-self-hosting-dont-go-well-together">A SaaS and self-hosting don't go well together</a></li><li><a href="#more-people-doesnt-equal-better-results">More people doesn't equal better results</a></li><li><a href="#im-conflicted-on-the-use-of-ruby-on-rails">I'm conflicted on the use of Ruby on Rails</a></li><li><a href="#the-time-it-takes-to-deploy-code-is-vital-to-the-success-of-an-organization">The time it takes to deploy code is vital to the success of an organization</a></li><li><a href="#location-based-salaries-are-discriminatory">Location based salaries are discriminatory</a></li></ul></li><li><a href="#conclusion">Conclusion</a></li></ul><h2 id="before-gitlab">Before GitLab</h2><p>Before joining GitLab, I was working for a small startup based in Amsterdam.
Like many startups, the company started to run out of money in the months
leading up to my departure, and had to resort to desperate measures, such as
renting out part of the office space to cover the costs. At the same time, I
felt I had done all the things I wanted to and could do at a technical level.</p><p>In parallel to this, I was also working on
<a href="https://github.com/rubinius/rubinius">Rubinius</a> in my spare time, and we had
considered using it on various occasions, going as far as making sure all our
code ran on it without any problems. This also led to the creation of
<a href="https://github.com/yorickpeterse/oga">Oga</a>, an XML/HTML parsing library acting
as an alternative to Nokogiri.</p><p>Unfortunately, the lack of funding combined with various technical problems
meant that we never pursued the use of Rubinius further. Because of all these
factors, I started looking for a job where I could spend at least some more time
working on Rubinius in hopes of making it stable enough for people to use in a
production environment.</p><p>During this time I attended various Ruby meetups in Amsterdam, and helped out
with a few <a href="https://railsgirls.com/">Rails Girls</a> workshops. At one of these
workshops I ran into <a href="https://sytse.com/">Sytse</a> and his wife, and I met them again at
a later workshop or meetup (I can't quite remember which, as it's been a long
time). Through this I learned about GitLab, and developed an interest in working
there.</p><p>Some time in the summer of 2015 I sent Sytse an email, stating I wanted to work
for GitLab and asking if they were willing to sponsor me working on Rubinius one
day per week. The conversation and interviews that followed resulted in me
starting at GitLab in October 2015 as employee #28. My task was to improve
GitLab's performance, and the arrangement allowed me to spend 20% of my time on
Rubinius.</p><p>During my time I was a part of various teams, had a lot of autonomy, reported to
10 different managers over the years, nearly wiped the company out of existence,
built various critical components that GitLab still uses to this day, saw the
company grow from 30-something employees to around 2000 employees, and ended up
with a burnout. Or as the Dutch saying goes: "Lekker gewerkt, pik!" (roughly
"nice work, mate!", though the crude tone doesn't quite translate).</p><h2 id="2015-2017">2015-2017</h2><p>My last day at my previous company was Wednesday, September 30; I started at
GitLab the next day. This meant I went from working in an office five days per
week to working remotely five days per week. While I had worked from
home before, mainly when the trains weren't running due to a tiny amount of snow
or leaves on the tracks, it took a bit of adjusting to the new setup.</p><p>A particular memory from this time that stands out is carrying a bag of
groceries home during the day, and realizing how nice it is to do that during the
day instead of in the evening after coming home from work.</p><p>Another memory is taking a nap on my sofa with my cat, of which I took this
picture at the time:</p><p><img src="/images/what-it-was-like-working-for-gitlab/sofa_cat.jpg" alt="My cat judging me while I try to take a nap" /></p><p>Yes, those are Homer Simpson slippers.</p><p>The apartment I was renting at the time wasn't large and only had a small
kitchen area, a small living room, and a similarly small attic. This meant that
my living room functioned as my bedroom, living room, and office all at once. It
wasn't a great setup, but it was all I could afford at the time. Perhaps the
expensive Aeron chair had something to do with it.</p><p>In spite of being an all-remote company, GitLab was a social one, with
frequent meetups and events taking place over the years. For example, a few
weeks after I joined there was a company gathering in Amsterdam, involving
various activities during the day and dinners in the evening:</p><p><img src="/images/what-it-was-like-working-for-gitlab/amsterdam_dinner.jpg" alt="A dinner with everybody at GitLab" /></p><p>Back then you could still fit the entire company in one corner of a restaurant.</p><p>Not long after, GitLab had its first growth spurt, resulting in somewhere around
100 new employees (I think? My memories are a bit fuzzy at this point). At the
next company gathering in Austin in 2016, a single corner in a restaurant was no
longer enough:</p><p><img src="/images/what-it-was-like-working-for-gitlab/austin_gathering.jpg" alt="The company gathering in Austin, Texas" /></p><p>During this time there were also plenty of negative experiences. GitLab suffered
from terrible performance, frequent outages (almost weekly at some point), poor
management, and many other problems that startups face. This led to "GitLab is
slow" being the number one complaint voiced by users. Especially on Hacker News
people just <em>loved</em> to complain about it, no matter what the original topic
(e.g. some new feature announcement) might've been. Of course GitLab was aware
of this, and in fact one of the reasons GitLab hired me was to resolve these
problems.</p><p>Resolving these problems proved a real challenge, in particular because GitLab
had no adequate performance monitoring infrastructure. That's not an
exaggeration by the way: the only service running at the time was a New Relic
trial account that only allowed monitoring of one, <em>maybe</em> two servers out of
the (I think) total of 15-20 servers we had at the time. This meant that
whatever data did come in wasn't an accurate representation, and made measuring
and solving performance a challenge.</p><p>What made solving these problems extra difficult was GitLab's requirement that
whatever tooling we'd use had to be available to self-hosted customers, and
preferably be open source (or perhaps this was even a hard requirement, I can't
remember). This meant I had to not only improve performance, but also build the
tools to improve performance in the first place. At the same time, writing
performant code (or code that at least isn't horribly slow) wasn't at all
considered a priority for the rest of the company. GitLab also had a tendency to
listen more to complaints on Hacker News than internal complaints. This led to
an internal running joke that if you wanted something to change, you'd have
better luck complaining about it anonymously on Hacker News instead of bringing
it up through the proper channels.</p><p>What followed was several months of me trying to improve performance, build the
tooling necessary for this, try to change the company culture/attitude towards
performance such that things would actually improve over time, and deal with
GitLab not being happy with the improvements made. I distinctly remember
there being at least several video calls in which I was close to yelling at
Sytse, though it fortunately never came to that.</p><p>In spite of these challenges I did manage to build the necessary tooling, and
improve performance in various parts (some of which were significant, others not
so much). This tooling became an official GitLab feature known as <a href="https://docs.gitlab.com/ee/administration/monitoring/performance/">"GitLab
Performance Monitoring"</a>,
though it has changed quite a bit over the years. Another tool I built was
<a href="https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/1749">"Sherlock"</a>, a
heavy-weight profiler meant to be used in a development environment.</p><p>During this time, GitLab started to realize you can't solve this sort of
problem by just hiring one person, especially if performance isn't a priority
for the rest of the company. One of the changes this led to was that instead of
reporting directly to Sytse, I would report to a dedicated manager as part of
the new "Performance" team, and the team had a budget to hire more people. I
don't remember exactly what the budget was, but it wasn't much: two, <em>maybe</em>
three people I think. This wasn't nearly enough given the total size of the
company and its primary focus on producing as many features as possible, but
it was a start.</p><p>Much of my second year I spent as part of this team, now with a bit more room to
breathe. I continued campaigning for more resources and making good performance
a requirement for new code, but with mixed results, and of course I and the team
as a whole continued improving performance.</p><p>During this time GitLab also saw its first wave of lay-offs and people leaving
of their own accord, mainly as a result of GitLab hiring the wrong people in the
first place. This meant that GitLab grew from 30-something to (I think)
130-something people, only to shrink back to 80-something people, only to start
growing again in the months to come.</p><p>As for Rubinius: while we tried to get GitLab to work on Rubinius, we never
succeeded. Combined with the maintainer of Rubinius wanting to take the project
in a different direction and the controversies this led to within the Ruby
community, we ultimately decided to give up on Rubinius, and I stopped working
on it entirely. It's unfortunate, as Rubinius had a lot going for it over the
years but was ultimately held back by the maintainers running the project in a
way different from what was necessary for it to become successful.</p><h2 id="2017-2018">2017-2018</h2><p><img src="/images/what-it-was-like-working-for-gitlab/sytse_giraffe.jpg" alt="South Africa summit in 2018" /></p><p>After the first rocky one and a half years, things started to improve. Performance had
improved greatly, and GitLab was starting to take it more seriously. Hiring
processes were much improved, and like a game of chess GitLab started moving the
right people into the right places. The scope of the performance team also
changed: instead of focusing on performance in general, the team would focus on
database performance, and as part of this it was renamed to the creatively
named "Database team". With this change also came a bigger budget for hiring
people, and infrastructure engineers assigned to help us out with e.g. setting
up new databases.</p><p>A critically important feature I built during this time is <a href="https://docs.gitlab.com/ee/administration/postgresql/database_load_balancing.html">GitLab's database
load balancer</a>
(<a href="https://about.gitlab.com/blog/2017/10/02/scaling-the-gitlab-database/">announced here</a>).
This feature allowed developers to continue to write their database queries as
usual, while the load balancer took care of directing these queries to
either a replica or the primary. After performing a write, the load balancer
ensures the primary database is used until the written changes are available to
all replicas, an act commonly referred to as "sticking". The introduction of the
load balancer had a significant and positive impact on performance, and I'm
certain GitLab would've been in a lot of trouble if it wasn't for this load
balancer. What I'm most proud of is being able to introduce this system
transparently. To date I've not seen a database load balancer (let alone for
Ruby on Rails) that you can just add to your project and you're good to go.
Instead, existing solutions are more like frameworks that only provide a small
piece of the puzzle, requiring you to glue everything together yourself, often
without any form of sticking support. It's a shame we never got to extract it to
a standalone library.</p>
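<p>To illustrate, here's a heavily simplified sketch of the sticking idea in
Ruby. This is not GitLab's actual implementation, and <code>caught_up_to?</code> and
<code>current_wal_location</code> are hypothetical helpers that would wrap PostgreSQL's
<code>pg_last_wal_replay_lsn()</code> and <code>pg_current_wal_lsn()</code> functions:</p><div class="highlight"><pre class="highlight"><code># Reads go to a replica, unless the session wrote recently; after a write,
# the primary is used until the chosen replica has replayed that write.
class LoadBalancer
  def initialize(primary, replicas)
    @primary = primary
    @replicas = replicas
    @last_write_location = {} # session ID => WAL position of the last write
  end

  def read(session_id)
    position = @last_write_location[session_id]
    replica = @replicas.sample

    if position.nil? || replica.caught_up_to?(position)
      yield replica
    else
      # "Sticking": the replica hasn't replayed the write yet.
      yield @primary
    end
  end

  def write(session_id)
    result = yield @primary

    # Remember how far the primary got, so later reads can check replicas.
    @last_write_location[session_id] = @primary.current_wal_location
    result
  end
end
</code></pre></div><p>A production version needs per-replica bookkeeping, timeouts, and thread
safety on top of this, but the core idea is small enough to hide behind the
existing query interface, which is what made the transparent rollout
possible.</p><p>This period wasn't just one of incredible productivity and improvements, it also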
marked the lowest and scariest moment of my time at GitLab and my career as a
whole: on January 31st, after a long and stressful day of dealing with many
issues that continued well into the late evening, I <del>solved GitLab's performance
problems</del> <a href="https://about.gitlab.com/blog/2017/02/01/gitlab-dot-com-database-incident/">removed GitLab's production database by
accident</a>.
This then led to the discovery that we didn't have any backups: the backup
system hadn't been working for a long time, and the system meant to notify us
of backup errors wasn't working either. In the end we did recover, as I had
copied the production data to our staging environment about six hours prior as
part of the work I was doing that day, though the recovery process took around
24 hours. While about six hours of data loss is by all accounts terrible, I'm
not sure what would've happened if I hadn't made that backup. Suffice it to say, my
heart skipped a few beats that day, and I'm certain I instantly grew a few extra
grey hairs.</p><p>A recurring source of frustration during this time was GitLab's desire to shard
the database, even after the introduction of the database load balancer. Not
only did I, the other engineers, and my manager believe this to be the wrong
solution to our problems, we also had the data to back this up. For example,
sharding is useful if writes heavily outnumber reads, but in the case of GitLab
reads dominated writes by a ratio along the lines of 10:1. Further, the amount
of data we were storing wasn't nearly enough to justify the complexity of
introducing sharding. I distinctly remember a consultant we'd hired saying
something along the lines of "No offence, but we have customers with several
orders of magnitude more load and storage, and even for them sharding is
overkill". In spite of this, GitLab would continue to push for this over the
years, until management made the decision to leave it be, only for GitLab to
bring it up <em>again</em> (just using a slightly different name and idea this time)
towards the end of my time at GitLab.</p><h2 id="2019-2021">2019-2021</h2><p><img src="/images/what-it-was-like-working-for-gitlab/new_orleans.jpg" alt="New Orleans summit in 2019" /></p><p>Some time in 2018-2019 I transitioned from the database team into a newly
founded "Delivery" team, as I had grown tired of working on performance and
reliability for the last four years. Furthermore, multiple people were now
working on performance and reliability, so I felt it was the right time
for me to move on to something new. The goal of this new team was to improve the
release process and tooling of GitLab, as the state of this was at the time best
described as messy.</p><p>For example, we looked into how much time there was between a commit landing in
the main branch and deploying the change to GitLab.com. The resulting data showed
that on average it would take several days, but in the worst cases it could take
up to <em>three weeks</em>. The main bottleneck here was the split between GitLab
Community Edition and GitLab Enterprise Edition, both existing as separate Git
repositories, requiring manual merges and conflict resolution on a regular
basis. This led to a multi-month effort to <a href="https://about.gitlab.com/blog/2019/02/21/merging-ce-and-ee-codebases/">merge the two projects into
one</a>.
While we divided the work into frontend and backend work, and made various teams
responsible for contributing their share towards the effort, I ended up
implementing most of the backend related changes, with another colleague taking
care of most of the frontend work.</p><p>Together with the rest of the team we made significant improvements to the
release process during this period, and we reached a point where we could deploy
changes in a matter of hours. While this is nowhere near as quick as it
should've been, going from a worst-case of three weeks to a worst-case of
<em>maybe</em> a day is a <em>massive</em> improvement.</p><p>Like the previous periods, this period was not free of turmoil and changes.</p><p>2018 was the last year we had a GitLab summit focused on employees, with 2019
and later years adopting a format more like a traditional conference, aimed
more at customers and less at employees. From a financial perspective this was
understandable as organizing a gathering of 2000+ people is incredibly
expensive. From a social perspective it was a loss, as the more corporate
setting of the summits wasn't nearly as fun as the old format. I have fond
memories of <a href="https://youtu.be/39chczWRKws?feature=shared&t=1751">Sytse dancing on stage in response to a team winning a
contest</a>, or Sytse and his
wife giving a fitness class while Sytse wore a giraffe costume. These sorts
of goofy events wouldn't happen any more in the following years.</p><p>Then there was the issue of laptop management: people would request a company
Mac laptop and were more or less free to use it how they saw fit, or you'd use
your own hardware, as I did. Over the years GitLab's management started
discussions about using software to be able to remotely manage the laptops. A
recurring problem in these discussions was that the proposed tools were invasive
(e.g. they could be used to record user activity), didn't contain any guarantees
against abuse, and feedback from employees (of which there was <em>a lot</em>) would be
ignored until key employees started applying pressure on management. The plans
would then be shelved, only for the discussion to start all over again months
later.</p><p>What stood out the most was not the proposed changes, but rather the way
management handled the feedback, and how the changes in general gave off a vibe
of solutions in search of problems to justify their existence. It's worth
mentioning that most people involved in these discussions (myself included)
understood the need for some form of laptop management (e.g. against theft), but
felt that the invasive solutions proposed went too far.</p><p>GitLab did settle on a laptop management solution using
<a href="https://nl.sentinelone.com/">SentinelOne</a>. While GitLab made it a requirement
for employees to install this software on hardware used to access GitLab
resources, including your personal hardware (or at least was considering
requiring that), I (using my own desktop computer) somehow managed to stay under
the radar and was never asked to install the software in question. Perhaps
because I wasn't using a company issued laptop, GitLab just forgot to check up
on me.</p><p>These cultural changes combined with various changes in my personal life
resulted in a loss of motivation and productivity, an increase in stress, and
less consistent working hours. The team's manager (whom I'd consider the best
manager I've ever had) also transitioned to a different role, with a newly hired
manager now leading the team. I didn't get along well with this manager, and
the resulting conflict led to a "performance enablement plan", a procedure meant to
get things back on track before the need for a "performance improvement plan"
(PIP). A PIP is meant to be used as a last attempt at improving the relationship
between an employee, their work, and their employer.</p><p>What rubbed me the wrong way was how GitLab handled the PEP: I acknowledged
there were areas I needed to improve upon, but I felt that part of the problem
was the new manager's way of working. Management assured me that the PEP was meant
to improve the state of things on both ends, i.e. it wouldn't just focus on <em>me</em>
improving but also the manager. That didn't happen, and the PEP focused solely
on what <em>I</em> needed to do differently. The PEP was also a bit vague about what
requirements had to be met. The original plan was for the PEP to last one month,
but by the end of the first month my manager decided to extend the PEP by
another month because they felt this to be necessary, the reasons for which
weren't well specified. I decided to just go along with it, and after two months
passed I completed the PEP and management deemed the results satisfactory.</p><p>The optimist in me likes to believe I was just the first employee to be put on a
PEP and thus management had to figure things out as we went along. The pessimist
in me has a far more negative opinion on this series of events, but I'll keep
that to myself.</p><p>After this experience I realized that perhaps it was time for me to leave, as
both GitLab and I were heading in different directions, and I was unhappy with
the state of things at the time.</p><p>The opportunity for this presented itself towards the end of 2021: GitLab was
going public, and taking into account the time I had to wait before I could
exercise my stock options, I'd be able to leave in December 2021. I
couldn't leave earlier due to how stock option taxes worked in The Netherlands
at the time: exercising stock options meant having to pay full income tax
(52%) on the difference between the exercise price and the valuation, even if the
stock isn't liquid. In my case the amount of taxes would be so high I wouldn't
be able to afford it, forcing me to wait until GitLab went public. A few months
later the law changed, and you can now choose to pay the taxes either at the
time of exercise, or when the stock is liquid. The caveat is that if you defer
taxes until the stock is liquid, you pay taxes based on the value at that time,
not based on the value at the time of exercising your stock options. This
certainly isn't ideal and presents a huge financial risk, but at least you have
a choice.</p>
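<p>To illustrate why waiting was the only realistic option, here's a
back-of-the-envelope calculation with made-up numbers (mine were different):</p><div class="highlight"><pre class="highlight"><code># Hypothetical numbers, purely for illustration.
options        = 10_000
exercise_price = 1.0  # EUR paid per share when exercising
valuation      = 20.0 # EUR per share according to the latest valuation
tax_rate       = 0.52 # Dutch top income tax rate at the time

paper_gain = options * (valuation - exercise_price) # => 190000.0
tax_due    = paper_gain * tax_rate                  # => 98800.0

# Nearly EUR 100 000 in taxes, due immediately, on stock you can't sell yet.
</code></pre></div><p>And so with my stocks acquired, I left in December 2021 to work on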
<a href="https://inko-lang.org/">Inko</a> full-time, using my savings to cover my bills.</p><h2 id="what-ive-learned">What I've learned</h2><p>With the history out of the way, let's take a look at some things I've learned
from my time at GitLab. One thing to keep in mind is that I'm basing these
findings on my personal experiences, and as such it's not unlikely I'm wrong in
some areas.</p><h3 id="scalability-needs-to-be-part-of-a-companys-culture">Scalability needs to be part of a company's culture</h3><p>A mistake GitLab made, and continued to make when I left, was not caring enough
about scalability. Yes, directors would say it was important and improvements
were certainly made, but it was never as much of a priority as other goals. At
the heart of this problem lies the way GitLab makes money: it primarily earns
money from customers self-hosting GitLab Enterprise Edition, not GitLab.com. In
fact, GitLab.com always cost <em>much</em> more money than it brought in. This
naturally resulted in a focus on the self-hosted market, and many of the
performance problems we ran into on GitLab.com didn't apply to many self-hosted
customers.</p><p>What was even more frustrating was that many developers in fact <em>wanted</em> to
improve performance, but weren't given the time and resources to do so.</p><h3 id="make-teams-more-data-and-developer-driven">Make teams more data and developer driven</h3><p>Another factor is GitLab's product manager driven nature. While some key
developers may have had the ability to influence product decisions (given enough
screaming and kicking), it was mainly product managers and directors deciding
what needed to be implemented. Sometimes these decisions made a lot of sense,
other times they seemed to be based solely on the equivalent of "I read on
Hacker News this is a good idea, so we have to build it".</p><p>I believe GitLab would've been able to perform better as a company if it adopted
a simpler hierarchy early on, instead of the traditional multi-layer hierarchy
it has today. In particular, I think the idea of product managers needs to go in
favour of giving team leads more power and having them interact more with users.
To me, that's ultimately what a "product manager" should do: help build the
product at a technical level, but also act as a liaison between the team and its
users.</p><h3 id="you-cant-determine-what-is-minimal-viable-without-data">You can't determine what is "minimal viable" without data</h3><p>One of GitLab's core principles is to always start with a "minimal viable
change". The idea is to deliver the smallest possible unit of work that delivers
value to the user. On paper that sounds great, but in practice the definition of
"minimal" is inconsistent between people. The result is that one team might
consider performance or good usability a requirement for something to be viable,
while another team couldn't care less.</p><p>In practice this led to GitLab building many features over the years that just
weren't useful: a serverless platform nobody asked for and that was ultimately
killed off, support for managing Kubernetes clusters that didn't work for three
weeks without anybody noticing, a chatops solution we had to build on top of our
CI offering (thus introducing significant latency) instead of using existing
solutions, or a requirements management feature that only supported creating and
viewing data (not even updating or deleting); these are just a few examples from
recent years.</p><p>To determine what makes something viable, you need a deep understanding of the
desires of your target audience. While GitLab does perform <a href="https://handbook.gitlab.com/handbook/product/ux/performance-indicators/paid-nps/">user surveys every
quarter</a>,
and some teams have access to data about user engagement, from what I remember
and learned from talking to other former colleagues it seems this data was more
incidentally used, instead of being a core part of each team's workflow.</p><h3 id="a-saas-and-self-hosting-dont-go-well-together">A SaaS and self-hosting don't go well together</h3><p>GitLab offers two types of product: self-hosted installations and a software as
a service (SaaS) offering. I believe most companies won't be able to effectively
offer such a setup, including GitLab. Not only do you get a conflict of interest
based on what earns you the most money (as mentioned above), but the two types
of setups also come with different requirements and ways of applying updates.</p><p>For example, for a SaaS you want to be able to deploy quickly and have to
handle large amounts of data and workloads taking place on a centralized
infrastructure. Given most self-hosted instances tend to be tiny in comparison
to the SaaS offering, many of the problems you encounter as a SaaS, and their
corresponding solutions, just don't apply to self-hosted
installations. This effectively results in two code paths in many parts of your
platform: one for the SaaS version, and one for the self-hosted version. Even if
the code is physically the same (i.e. you provide some sort of easy to use
wrapper for self-hosted installations), you still need to think about the
differences.</p><p>In contrast, when you focus on <em>either</em> a SaaS or self-hosted setup you get to
dedicate all your attention to providing the best experience for the setup in
question. There are of course exceptions, but they are exactly that: exceptions,
and exceptions are rare.</p><h3 id="more-people-doesnt-equal-better-results">More people doesn't equal better results</h3><p>Like many other companies before it, GitLab hired large numbers of people over
the years and today employs over 2000 people. I don't know how many of those are
developers today, but I'm guessing at least a few hundred based on a quick
glance at their team page.</p><p>It's well known that adding more people to a project doesn't necessarily improve
productivity and results (see also "The Mythical Man-Month"), and yet almost
every western startup with venture capital seems to ignore this, hiring hundreds
of developers even if the product doesn't need nearly that many developers.</p><p>I don't have any data to back this up, but I suspect that most companies don't
need more than 20 developers, with some needing 20 to 50 developers, and only a
handful needing between 50 and 100 developers. Once you cross the 100 developer
mark, I think you need to start thinking about whether the scope of your
product(s) is getting out of hand before hiring even more people.</p><p>Note that I'm specifically talking about software developers here. For example,
if you're building custom hardware, you'll probably need more people to scale up
the production process. Sales and support are also two areas where you generally
do benefit from having more people, as these types of work require less
synchronisation between people.</p><h3 id="im-conflicted-on-the-use-of-ruby-on-rails">I'm conflicted on the use of Ruby on Rails</h3><p>GitLab is built using Ruby and Ruby on Rails, and this is a big part of what
allowed it to reach the success it enjoys today. At the same time, this
combination presents its challenges when the project reaches a large size with
many contributors of different experience levels. Rails in particular makes it
too easy to introduce code that doesn't perform well.</p><p>For example, if you want to display a list of projects along with a counter
showing the number of project members, it's far too easy to introduce the <a href="https://stackoverflow.com/questions/97197/what-is-the-n1-selects-problem-in-orm-object-relational-mapping">N+1
query problem</a>
by accident. While Rails (or more specifically, ActiveRecord) provides
functionality to solve this, it's an opt-in mechanism, and developers
inevitably forget to use it. Many of the performance problems I solved during
my first few years at GitLab were N+1 query problems.</p>
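<p>To make this concrete, here's what both variants look like with a
hypothetical <code>Project</code> model that has a <code>members</code> association (the model names
are illustrative, not GitLab's actual schema):</p><div class="highlight"><pre class="highlight"><code># The N+1 version: one query to fetch the projects, then one extra COUNT
# query for every single project.
projects = Project.limit(20)
projects.each { |project| puts project.members.count }

# The opt-in fix: eager load the association up front, then use #size, which
# counts the records already in memory instead of issuing more queries.
projects = Project.includes(:members).limit(20)
projects.each { |project| puts project.members.size }
</code></pre></div><p>Other frameworks have learned from this over the years and provide better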
alternatives. The usual approach is that instead of being able to arbitrarily
query associated data, you have to pass in the data ahead of time. The benefit
here is that if you were to forget passing the data in, you'd run into some sort
of error rather than the code querying the data for you on a per-row basis,
introducing performance problems along the way.</p>
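<p>Sketched in the same hypothetical Ruby terms, that approach might look
something like this (an illustration of the pattern, not any specific
framework's API):</p><div class="highlight"><pre class="highlight"><code># The rendering code takes the member counts as an explicit argument instead
# of querying them per project.
def render_projects(projects, member_counts)
  projects.each do |project|
    # Default to zero for projects without any members.
    puts "#{project.name}: #{member_counts.fetch(project.id, 0)} members"
  end
end

# One aggregate query producing a hash of project ID => member count.
counts = Member.group(:project_id).count

# Forgetting to compute and pass the counts is an immediate ArgumentError,
# instead of a silent query per project.
render_projects(Project.limit(20), counts)
</code></pre></div><p>Ruby itself is also a choice I have mixed opinions on. On the one hand, it's a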
wonderful language I enjoyed using for a little under 10 years. On the other
hand, its heavy use of metaprogramming makes it difficult to use in large
projects, even with the introduction of optional typing. I'm not just saying
that for the sake of saying it, I experienced it first hand when writing <a href="https://github.com/yorickpeterse/ruby-lint">a
static analysis tool for Ruby</a> years
ago.</p><p>In spite of all this, I'm not sure what alternative I would recommend instead of
the combination of Ruby and Ruby on Rails. Languages and runtimes such as Go, Rust or Node.js
might be more efficient than Ruby, but none have a framework as capable as Ruby
on Rails. Python and Django <em>might</em> be an option, but I suspect you'll run into
similar problems as with Ruby and Ruby on Rails, at least to some degree. It would
probably help if new web frameworks stopped obsessing over how to define your
routing tree, and instead focused more on productivity as a whole.</p><p>I have some vague ideas on how I'd approach this with
<a href="https://inko-lang.org/">Inko</a>, but there's a lot of other work that needs doing
before I can start writing a web framework in Inko.</p><h3 id="the-time-it-takes-to-deploy-code-is-vital-to-the-success-of-an-organization">The time it takes to deploy code is vital to the success of an organization</h3><p>This is something I already knew before joining GitLab, having spent a
significant amount of time setting up good deployment and testing pipelines at
my previous job, but working for GitLab reinforced this belief: you need to be
able to deploy your code <em>fast</em>, i.e. within at most an hour of pushing the
changes to whatever branch/tag/thing you deploy from. At GitLab it took
somewhere around four years for us to get even close to that, and we still had a
long way to go.</p><p>Apart from the obvious benefits, such as being able to respond to incidents more
efficiently (without having to hot-patch code to account for your deploys taking
hours), there's also a motivational benefit: being able to see your changes live
is <em>nice</em> because you actually get to see and make use of your work. Nothing is
more demotivating than spending weeks on a set of changes, only for it to take
another two weeks for them to be deployed.</p><p>For this to work, you need to be incredibly aggressive about cutting down deploy
times and the time it takes to run your test suite as part of your deployments.
Depending on the type of application and the types of services you're testing,
you may inherently need a certain amount of time to run the tests. The point
here is not "tests and deployments must never take more than X minutes", but
rather to (as an organization) make it a priority to be able to deploy as fast
as your business requirements allow you to. As obvious as this may seem, I
suspect many organizations aren't doing nearly as good a job in this area as
they could.</p><h3 id="location-based-salaries-are-discriminatory">Location based salaries are discriminatory</h3><p>The salary you earn at GitLab is influenced by various variables, one of which
is location. The influence of your location isn't insignificant either. When you
are a company with a physical office and have a need to hire people in specific
areas, it might make sense to adjust pay based on location as you otherwise
might not be able to hire the necessary people in the required areas. But for an
all-remote company without a physical office, and legal entities across the
whole world, there's no legitimate reason to pay two different people with the
same experience and responsibilities different salaries purely based on where
they live.</p><p>To illustrate: when I left GitLab my salary was around €120 000 per year, or
around €8500 per month, before taxes. For The Netherlands this is a good salary,
and you'll have a hard time finding companies that offer better <em>and</em> let you
work from home full time. But if I had instead lived in the Bay Area, I would've
earned at least twice that amount, possibly even more. Not because I am somehow
able to do my job better in the Bay Area, or because of any other valid
reason for that matter, but because I would be living in the Bay Area instead of
in The Netherlands.</p><p>No matter how you try to spin this, it's by all accounts an act of
discrimination to pay one person less than another purely based on where they
live. Think about it: if a company pays a person less because of the color of
their skin or their gender, the company would be in big trouble. But somehow
it's OK to pay a person less based on their location?</p><p>As for how to solve this, for companies it's easy: just pay based on the
position's requirements, not the location of the applicant. It doesn't matter
whether you're paying somebody in the Bay Area $100 000 per year, or somebody in
the Philippines, because the cost for you as a business is the same. For
employees my only advice is to try and negotiate a better salary, but this may
prove difficult as companies paying based on locations also tend to be stubborn
about it. I hope one day our laws catch up with this practice.</p><p>A company that seems to do a good job at this is the <a href="https://oxide.computer/">Oxide Computer
Company</a>. Instead of paying employees based on their
location, Oxide pays employees the same amount (see <a href="https://oxide.computer/blog/compensation-as-a-reflection-of-values">this
post</a> for
more details), something I deeply admire and believe more companies should do.</p><h2 id="conclusion">Conclusion</h2><p>Looking back, my time at GitLab is a mix of both incredibly positive and
negative experiences. I'm incredibly proud of the achievements the various teams
I was on made, and the people I got to work with, but I'm also saddened by the
last year or so souring an otherwise great experience. I don't have any regrets
working for GitLab, and would do it all over again if I could, just a little
differently thanks to the benefit of hindsight. I also still recommend it as a
company to work for, because in spite of its flaws I think it does <em>much</em> better
than many other companies.</p><h1>A decade of developing a programming language</h1><p>2023-11-14</p><p>
In 2013, I had an idea: "what if I were to build my own programming language?". Back
then my idea came down to "an interpreted language that mixes elements from Ruby
and Smalltalk", and not much more.</p><p>Between 2013 and 2015 I spent time on and off trying different languages (C,
C++, <a href="https://dlang.org/">D</a> and various others I can't remember) to see which
one I would use to build my language in. While this didn't help me find a
language I <em>did</em> want to use, it did help me eliminate others. For example, C
proved to be too difficult to work with. D seemed more interesting and I managed
to implement something that vaguely resembled a virtual machine, but I
ultimately decided against using it. I don't remember exactly why, but I believe
it was due to the rift caused by the differences between D version 1 and 2, the
general lack of learning resources and packages, and the presence of a garbage
collector.</p><p>Somewhere towards the end of 2014 I discovered
<a href="https://www.rust-lang.org/">Rust</a>. While the state Rust was in at the time is
best described as "rough", and learning it (especially at the time with the lack
of guides) was difficult, I enjoyed using it; much more so than the other
languages I had experimented with until that point.</p><p>2015 saw the release of <a href="https://blog.rust-lang.org/2015/05/15/Rust-1.0.html">Rust 1.0</a>,
and that same year I committed the <a href="https://github.com/inko-lang/inko/commit/f8cf2530e26042515ed4a6b06eabf46c425bc87e">first few lines of Rust
code</a>
for <a href="https://inko-lang.org/">Inko</a>, though it would take another two months or
so before the code started to (vaguely) resemble that of a programming language.</p><p>Fast-forward to 2023, and Inko is in a state where one can write meaningful
programs in it (e.g. <a href="https://github.com/yorickpeterse/openflow">HVAC automation
software</a>, a <a href="https://github.com/yorickpeterse/inko-markdown">Markdown
parser</a>, a <a href="https://github.com/yorickpeterse/clogs">changelog
generator</a> and more). Inko has also
changed considerably over the years: whereas it was once a gradually typed
interpreted language, it's now statically typed and compiles to machine code
using <a href="https://llvm.org/">LLVM</a>. And whereas Inko used to draw inspiration
heavily from Ruby and Smalltalk, these days it's closer to Rust,
<a href="https://www.erlang.org/">Erlang</a> and <a href="https://www.ponylang.io/">Pony</a> than it
is to Ruby or Smalltalk.</p><p>Given it's been 10 years since I first started working towards Inko, I'd like to
highlight (in no particular order) a few of the things I've learned about
building a programming language since first starting work on Inko. This is by no
means an exhaustive list, rather it's what I can remember at the time of
writing.</p><div class="admonition discuss"><i class="icon"></i><div class="text"><p>You can find discussions about this article on Reddit
<a href="https://www.reddit.com/r/ProgrammingLanguages/comments/17v05xd/a_decade_of_developing_a_programming_language/">here</a>
and <a href="https://www.reddit.com/r/programming/comments/17v0avj/a_decade_of_developing_a_programming_language/">here</a>,
on <a href="https://news.ycombinator.com/item?id=38261982">Hacker News</a>, and on
<a href="https://lobste.rs/s/wyeffq/decade_developing_programming_language">Lobsters</a>.</p></div></div><h2 id="table-of-contents">Table of contents</h2><ul class="toc"><li><a href="#avoid-gradual-typing">Avoid gradual typing</a></li><li><a href="#avoid-self-hosting-your-compiler">Avoid self-hosting your compiler</a></li><li><a href="#avoid-writing-your-own-code-generator-linker-etc">Avoid writing your own code generator, linker, etc</a></li><li><a href="#avoid-bike-shedding-about-syntax">Avoid bike shedding about syntax</a></li><li><a href="#cross-platform-support-is-a-challenge">Cross-platform support is a challenge</a></li><li><a href="#compiler-books-arent-worth-the-money">Compiler books aren't worth the money</a></li><li><a href="#growing-a-language-is-hard">Growing a language is hard</a></li><li><a href="#the-best-test-suite-is-a-real-application">The best test suite is a real application</a></li><li><a href="#dont-prioritize-performance-over-functionality">Don't prioritize performance over functionality</a></li><li><a href="#building-a-language-takes-time">Building a language takes time</a></li></ul><h2 id="avoid-gradual-typing">Avoid gradual typing</h2><p>A big change I made was to switch Inko from being a gradually typed language to
a statically typed language. The idea behind gradual typing was that it would
allow you to build a prototype or simple scripts in a short amount of time using
dynamic typing, then over time turn the program into a statically typed program
(where beneficial).</p><p>In reality, gradual typing ends up giving you the worst of both dynamic and
static typing: you get the uncertainty and lack of safety (in dynamically typed
contexts) of dynamic typing, and the cost of trying to fit your ideas into a
statically typed type system. I also found that the use of gradual typing didn't
actually make me more productive compared to using static typing. The result was
that I found myself avoiding dynamic typing in both Inko's standard library and
the programs I wrote. In fact, the few places where dynamic typing <em>was</em> used in
the standard library were due to the type system not being powerful enough to
provide a better alternative.</p><p>Gradual typing also has performance implications. Consider this example using
keyword arguments:</p><div class="highlight"><pre class="highlight"><code><span class="k">let</span> x: Any = some_value
x.foo(b: <span class="mi">42</span>, a: <span class="mi">10</span>)
</code></pre></div><p>Here <code>x</code> is typed as <code>Any</code>, which used to mean the value is dynamically typed.
Because we don't know the type of <code>x</code> in <code>x.foo(...)</code>, we can't resolve the
keyword arguments to positional arguments at compile-time. This meant Inko's
virtual machine had to provide a runtime fallback, and the keyword arguments had
to be encoded into the bytecode. While the cost wasn't significant, in a
statically typed language the cost is zero because we can resolve the arguments
at compile-time.</p><p>Another issue is that the presence of dynamic types can inhibit compile-time
optimizations, such as compile-time inlining (and all the optimizations that
depend on it). If a language uses a Just In Time (JIT) compiler, such as
JavaScript (and by extension <a href="https://www.typescriptlang.org/">TypeScript</a>), you
can optimize the code at runtime, but that means having to write a JIT compiler
which itself is a massive undertaking.</p><p>The presence of dynamic types also means that even statically typed code may
be incorrect, though this depends on how you approach casting dynamically typed
values to statically typed values. If such a cast doesn't require a runtime
check, you may end up passing incorrectly typed data to statically typed code.
If you do perform some sort of runtime check, this may affect performance when
such casts are common.</p><p><strong>Recommendation:</strong> either make your language statically typed or
dynamically typed (preferably statically typed, but that's a different topic),
as gradual typing just doesn't make sense for new languages.</p><div class="admonition info"><i class="icon"></i><div class="text"><p>The emphasis here is on <em>new</em> languages, as applying gradual typing to an
existing language <em>can</em> be useful, especially as an intermediate step towards
the language becoming fully statically typed.</p></div></div><h2 id="avoid-self-hosting-your-compiler">Avoid self-hosting your compiler</h2><p>Early in the development of Inko, I decided that I wanted to write the compiler
in Inko itself, commonly referred to as a "self-hosted compiler". The idea was
that by doing so, the compiler could be exposed through the standard library,
and to have a sufficiently complicated program to test everything Inko has to
offer.</p><p>While this seems great on paper, in practice it turns into a real challenge.
Maintaining a single compiler is already a challenge, but maintaining two
compilers (one to bootstrap your self-hosted compiler, and the self-hosted
compiler itself) is even more difficult. The process of building the compiler is
also more complicated: first you have to build the bootstrapping compiler, then
you can use that to build the self-hosted compiler. Ideally you then use that
self-hosted compiler to compile itself a second time, so you can ensure the
behaviour doesn't subtly change depending on what compiler (the bootstrapping or
self-hosted compiler) is used to compile your self-hosted compiler.</p><p>Because of these challenges, I abandoned this idea in favour of writing the
compiler in Rust, and keeping it that way for the foreseeable future.</p><p><strong>Recommendation:</strong> defer writing a self-hosted compiler until you have a solid
language and ecosystem. A solid language and ecosystem is infinitely more useful
to your users than a self-hosted compiler.</p><h2 id="avoid-writing-your-own-code-generator-linker-etc">Avoid writing your own code generator, linker, etc</h2><p>When writing a language, it's tempting to take on more than you can or probably
should handle. In particular, it may be tempting to write your own native code
generator, linker, C standard library, and so on (i.e. what languages such as
<a href="https://ziglang.org/">Zig</a> and <a href="https://www.roc-lang.org/">Roc</a> are doing).</p><p>My general recommendation is to avoid this unless you have established a clear
need for this. And when you do think there's a need, I'd still avoid it. Writing
a language is hard enough as-is and can easily take years. For every such
component (a linker, a code generator, etc) you add on top, it will take
several more years before the stack as a whole becomes useful. That's ignoring
the painful fact that such bespoke components are highly unlikely to outperform
the established alternatives.</p><p><strong>Recommendation:</strong> there are many developers who think they can write a
better linker, code generator, and so on, but few developers who actually
succeed in doing so. As harsh as it may sound, you are probably not one of them.
Of course once you have an established language, you're free to reinvent as many
of these wheels as you see fit.</p><div class="admonition info"><i class="icon"></i><div class="text"><p>If you're writing an interpreted language, it's fine and probably even needed to
write your own (byte)code generator (unless you target an existing virtual
machine such as the JVM), as bytecode generators are typically not that
complicated to implement.</p></div></div><h2 id="avoid-bike-shedding-about-syntax">Avoid bike shedding about syntax</h2><p>The syntax of a language and how it's parsed is one of the most boring aspects of
building a language. Writing parsers in general is pretty dull, and there's not
a lot you can innovate upon.</p><p>And yet, it's a subject many developers building their own language seem to
spend <em>way</em> too much time on. There are also plenty of articles titled something
along the lines of "How to build your own programming language", only covering
the basics of writing a parser and nothing more.</p><p>For Inko I took a different approach in its early days: I used an
<a href="https://en.wikipedia.org/wiki/S-expression">S-expression</a> syntax, instead of
designing my own syntax and writing a parser for it. This meant I was able to
experiment with the semantics and virtual machine of the language, instead of
worrying over what keyword to use for function definitions.</p>
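<p>To give an idea of how little work this requires, here's a toy S-expression
reader in Ruby (a deliberately minimal sketch, not the parser Inko actually
used):</p><div class="highlight"><pre class="highlight"><code>def tokenize(source)
  # Pad the parentheses with spaces, then split on whitespace.
  source.gsub('(', ' ( ').gsub(')', ' ) ').split
end

def parse(tokens)
  token = tokens.shift

  if token == '('
    list = []
    list.push(parse(tokens)) until tokens.first == ')'
    tokens.shift # consume the closing ')'
    list
  else
    # Integers become Integer objects, everything else becomes a symbol.
    token.match?(/\A\d+\z/) ? Integer(token) : token.to_sym
  end
end

# (def add (a b) (+ a b)) parses to [:def, :add, [:a, :b], [:+, :a, :b]]
p parse(tokenize('(def add (a b) (+ a b))'))
</code></pre></div><p>With a reader this small, all the interesting time goes into what the lists
<em>mean</em>, which is exactly where it should go early on.</p><p><strong>Recommendation:</strong> use an existing syntax and parser when prototyping your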
language, allowing you to focus on the semantics instead of the syntax. Once you
develop a better understanding of your language you can switch to your own
syntax.</p><h2 id="cross-platform-support-is-a-challenge">Cross-platform support is a challenge</h2><p>This shouldn't be entirely surprising, but supporting different platforms
(Linux, macOS, Windows, etc) is <em>hard</em>. For example, Inko used to support
Windows when it used an interpreter. When switching to a compiled language, I
had to drop support for Windows as I couldn't get certain things to work (e.g.
the assembly used for switching thread stacks).</p><p>Running tests on different platforms is also not nearly as easy as it should be.
Take <a href="https://github.com/features/actions">GitHub Actions</a>: you can use it to
run tests on Linux, macOS, and Windows. Unfortunately, the free tier (at the
time of writing) only supports AMD64 runners, and while it <em>does</em> support macOS
ARM64 runners, these cost $0.16 per minute.</p><p>The cost isn't even the biggest problem here, because depending on how often
tests run it may not be that big. Rather, the problem is that paid runners
typically aren't available for forks, meaning pull requests from third-party
contributors won't be able to run the tests using these runners.</p><p>And this is ignoring the problem of supporting platforms not supported by your
continuous integration platform (e.g. GitHub Actions) of choice. FreeBSD is a
good example of this: GitHub Actions just doesn't support it, so you need to use
<a href="https://www.qemu.org/">qemu</a> or similar software to run FreeBSD in a VM.</p><p>Even if you <em>just</em> support Linux, you still have to deal with the differences
between Linux distributions. For example, Inko uses a Rust wrapper for LLVM
(<a href="https://github.com/TheDan64/inkwell">Inkwell</a>), but the low-level LLVM wrapper
(<a href="https://gitlab.com/taricorp/llvm-sys.rs">llvm-sys</a>) it uses <a href="https://gitlab.com/taricorp/llvm-sys.rs/-/issues/44">doesn't compile
on Alpine Linux</a>, and so
Inko doesn't support Alpine Linux for the time being.</p><p>The extent to which this is a problem depends on the language you're trying to
build. For example, if you're building an interpreter written in Rust it
probably won't be that bad (though Windows is always going to be a challenge),
but it <em>is</em> something you need to be prepared for.</p><p><strong>Recommendation:</strong> if you're uncertain about supporting a certain platform, err
on the side of not supporting it and document this, instead of
sort-of-but-not-quite supporting it.</p><h2 id="compiler-books-arent-worth-the-money">Compiler books aren't worth the money</h2><p>While there are plenty of books on compiler development, they tend to not be
that useful. In particular, such books tend to dedicate a significant amount of
time to parsing, arguably the most boring part of a compiler, then only briefly
cover the more interesting topics such as optimizations. Oh, and good luck
finding a book that explains how to write a type-checker, let alone
one that covers more practical topics such as supporting sub-typing, generics,
and so on.</p><p><strong>Recommendation:</strong> start with reading <a href="https://craftinginterpreters.com/">Crafting
Interpreters</a>, and read through
<a href="https://www.reddit.com/r/ProgrammingLanguages/">/r/ProgrammingLanguages</a> on
Reddit. If you're interested in learning more about pattern matching, <a href="https://github.com/yorickpeterse/pattern-matching-in-rust">this Git
repository may prove
useful</a>.</p><h2 id="growing-a-language-is-hard">Growing a language is hard</h2><p>Building a language is a significant challenge on its own. Growing the number of
people using your language and the libraries written in it? That's
even more difficult. In particular, it seems languages either explode in terms
of popularity/interest, even if that may not be warranted (looking at you,
<a href="https://vlang.io/">V</a>), or it takes <em>years</em> for them to get even a handful of
users.</p><p>Making a living off a programming language is <em>exceptionally</em> difficult, as the
number of people willing to donate money is even smaller than those willing to
try out your new language. This means either dedicating a lot of spare time
towards building your language, or quitting your job and funding the development
yourself (e.g. using your savings). This is what I did by the end of 2021 and
while I don't regret doing so, it's a bit painful to watch your wallet shrink
over time.</p><p>As far as advice goes, I'm not sure how to approach this as I'm still figuring
that out myself. What I do know is that a lot of existing advice isn't helpful
at all, as it amounts to "Just get more users, LOL". Perhaps in another 10 years
from now I'll know the answer.</p><h2 id="the-best-test-suite-is-a-real-application">The best test suite is a real application</h2><p>This one is a bit obvious, but worth highlighting regardless: writing unit tests
for your language (e.g. for the standard library functions) is important and
useful, but nowhere near as useful as writing a real application in the
language. For example, I wrote a program to <a href="https://github.com/yorickpeterse/openflow">control my house's HVAC
system</a> in Inko, revealing various
bugs and areas of improvement in the process. Such applications also act as a
showcase for your language, making it easier for potential users to develop an
understanding of what an average project in your language might look like.</p><p><strong>Recommendation:</strong> write a few sufficiently complicated programs that are
actually useful in your language, then use these as a way of testing
functionality and stability of your language. If you can't think of any programs
to write, consider porting <a href="https://github.com/yorickpeterse/clogs">this changelog generator written in
Inko</a>, as it's complex enough to
act as a good stress test for your language, but not so complex it will take
weeks to port.</p><h2 id="dont-prioritize-performance-over-functionality">Don't prioritize performance over functionality</h2><p>When building a language, it can be tempting to focus heavily on providing a
fast implementation, such as a fast and memory efficient compiler, and one can
easily spend months working on this. Potential users of your language may care
about performance to some degree, but what they care about more is being able to
<em>use</em> your language, write libraries in it, and not having to reimplement every
basic feature themselves because of a lacking standard library.</p><p>To put it differently: the value of good performance is proportional to the
amount of meaningful code (= real applications) written in a language.</p><p><strong>Recommendation:</strong> as the saying goes: first make it work, then make it fast.
This doesn't mean you shouldn't care about performance at all; rather, 70-80% of
your energy should be directed towards functionality, with the remaining 20-30%
directed towards making the language not unreasonably slow.</p><h2 id="building-a-language-takes-time">Building a language takes time</h2><p>To wrap things up, here's another observation that should be obvious but is
worth bringing up regardless: building a simple language for yourself in a short
amount of time is doable. Building a language meant to be used by many for many
years to come is going to take a <em>long</em> time. To illustrate, here are some
examples of a few languages and when they released their first stable release (a
<code>?</code> indicates no stable release is available at the time of writing):</p><table><thead><tr><th>Language</th><th>Started in</th><th>Release of 1.0.0</th></tr></thead><tbody><tr><td>Python</td><td>1989</td><td>1994</td></tr><tr><td>Ruby</td><td>1993</td><td>1996</td></tr><tr><td>Scala</td><td>2001</td><td>2004</td></tr><tr><td>Rust</td><td>2006</td><td>2015</td></tr><tr><td>Go</td><td>2007</td><td>2012</td></tr><tr><td>Elixir</td><td>2011</td><td>2014</td></tr><tr><td>Crystal</td><td>2011</td><td>2021</td></tr><tr><td>Vale</td><td>2012</td><td>?</td></tr><tr><td>Inko</td><td>2013</td><td>?</td></tr><tr><td>Gleam</td><td>2016</td><td>?</td></tr><tr><td>V</td><td>2019</td><td>?</td></tr></tbody></table><p>On top of that, there can be significant time between a language becoming stable
and it becoming popular. Ruby 1.0 released in 1996, but it wouldn't be until
2005 or so that Ruby became popular with the release of Ruby on Rails. Rust in
turn saw a rise in popularity following its first stable release, but it would
still take a few years for the language to take off. Scala released version
1.0.0 in 2004, but didn't see widespread adoption until some time between 2010
and 2015.</p><p>Based on these patterns, I suspect that most languages will need at least 5-10
years of development before reaching their first stable release, followed by
another 5 years or so before they start to take off. That's all assuming you end
up lucky enough for it to actually take off, as there are many languages that
instead fade into obscurity.</p><p><strong>Recommendation:</strong> if you want your language to succeed, be prepared for it to
take at least 10-15 years. If you expect it to take the world by storm in just a
year, you'll be sorely disappointed.</p>https://yorickpeterse.com/articles/switching-to-fedora-silverblue/Switching to Fedora Silverblue2023-03-02T23:43:38Z2023-03-02T23:43:38Z<p>
For the last 10 years or so, <a href="https://archlinux.org/">Arch Linux</a> has been my
Linux distribution of choice. The early years were a bit rough, and the process
of moving to systemd wasn't without its challenges either, though the experience
has improved dramatically since then. In spite of these improvements, certain
issues persisted, such as having to manually perform update-related steps every
now and then, fixing broken packages after an update, updating packages in a
particular order (e.g. <code>archlinux-keyring</code> requiring an update before you can
update other packages), and more.</p><p>Arch being a rolling release distribution also means that you're not supposed to
install a new package without first updating your existing packages (at least
for libraries). That is, <code>sudo pacman -S some-package</code> <em>may</em> lead to problems,
so it's recommended to use <code>sudo pacman -Syu some-package</code> instead (see <a href="https://wiki.archlinux.org/title/System_maintenance#Partial_upgrades_are_unsupported">this
section</a>
for more details). It's not a deal breaker, but it's yet another thing to keep
in mind.</p><p>Perhaps the most annoying part is that package updates aren't tested all that
well, if at all; or at least it feels that way. Linux kernel updates in
particular had a tendency to cause issues on my laptop. I remember one
particular instance where a bug in the Intel drivers (or something in the kernel
itself, I can't quite remember) resulted in weird screen flickering/artifacts,
requiring a rollback to a previous kernel version. Pinning packages using
<code>IgnorePkg</code> was the usual workaround, but it's not a suitable long-term solution
as updated packages may not work with older versions of the packages you're
ignoring/pinning.</p><p>Long story short, over the years I realised I care more for a reliable and easy
(or easier) to use distribution, instead of a distribution that gives you
maximum control.</p><p>This is where <a href="https://getfedora.org/">Fedora</a> comes in, and specifically
<a href="https://silverblue.fedoraproject.org/">Fedora Silverblue</a>. Fedora has been
around for years, and I'd been keeping an eye on it for a while. Some time ago
I built a tiny computer to run some home automation software, and I decided to
use <a href="https://getfedora.org/en/server/">Fedora Server</a> for it. This gave me the
chance to try Fedora without it getting in the way.</p><p>I ended up enjoying this enough that I decided to move my Linux installations to
Fedora. As I mainly work on my desktop (still running Arch Linux at the time of
writing), I decided to migrate my laptop first. I decided to go with Silverblue
as I like the idea of an immutable desktop and the ability to roll back updates
<em>without</em> leaving behind a dirty state.</p><p>The first step was to do some research into potential issues I might encounter.
Through this I found the following issues/challenges to deal with:</p><ul><li>Fedora ships a mirror of Flathub instead of using Flathub directly. You can
and probably should disable this. I found and used <a href="https://www.reddit.com/r/Fedora/comments/z2kk88/fedora_silverblue_replace_the_fedora_flatpak_repo/">this Reddit
post</a>
as a reference to do so.</li><li>Fedora ships with
<a href="https://man.archlinux.org/man/systemd-oomd.8">systemd-oomd</a>, and apparently
this has a tendency to cause more problems than it solves
(see <a href="https://www.reddit.com/r/Fedora/comments/tcsen3/is_there_a_way_to_permanently_disable_systemdoomd/">here</a>
and <a href="https://www.reddit.com/r/Fedora/comments/10s06fd/why_is_systemdoomd_still_a_thing/">here</a>).
I ended up disabling it using
<code>sudo systemctl stop systemd-oomd && sudo systemctl disable systemd-oomd && sudo systemctl mask systemd-oomd</code>.</li><li><a href="https://www.reddit.com/r/Fedora/comments/l944bb/is_it_possible_to_install_silverblue_on_an/gnmktzx/">Apparently TRIM support isn't handled properly when using full disk
encryption</a>
on Silverblue. The solution is to add <code>rd.luks.options=discard</code> to your kernel
arguments.</li><li>A few packages I needed aren't available in the official repositories or
<a href="https://copr.fedorainfracloud.org/coprs/">copr</a>, more on that later.</li><li>I read something about Flatpak (and thus the Firefox Flatpak) not supporting
<a href="https://en.wikipedia.org/wiki/Universal_2nd_Factor">U2F</a>, meaning I wouldn't
be able to use my YubiKey with Firefox. This turned out to work just fine.</li></ul><p>Having determined these issues had workarounds that I could live with, I
proceeded with the installation. The installation itself was easy and ran
without any issues.</p><p>After the installation finished I applied the necessary workarounds/fixes for
the above issues, such as disabling <code>systemd-oomd</code>. Unfortunately, this is where
I ran into some new and unexpected problems, discussed below in no particular
order; not all of these are exclusive to
Silverblue.</p><h2 id="getting-my-keyboard-layout-to-work">Getting my keyboard layout to work</h2><p>For my desktop I use a split keyboard that uses the <a href="https://colemakmods.github.io/mod-dh/">Colemak
Mod-DH</a> ortholinear layout. On my laptop
I use the same layout, through a combination of a custom
<a href="https://gitlab.freedesktop.org/xkeyboard-config/xkeyboard-config">xkb</a> keyboard
layout and remapping the keycaps on my keyboard:</p><p><img src="/images/switching-to-fedora-silverblue/keyboard.jpg" alt="Laptop keyboard" /></p><p>While the xkb project includes support for the Colemak Mod-DH layout, it only
supports the variant where the bottom-left keys are XCDVZ, whereas the
ortholinear version uses ZXCDV. I don't quite remember why the ZXCDV version
isn't included, but I recall the reason being along the lines of "the XCDVZ
layout is better for staggered keyboards". I guess I'm the only person wanting
to use the same layout everywhere? Either way, my solution was to create a
custom layout and be done with it.</p><p>For the Arch installation I just created the necessary files (based on <a href="http://who-t.blogspot.com/2020/09/user-specific-xkb-configuration-putting.html">this
article</a>)
in the right place. I then performed the necessary magical incantations (which I
of course couldn't remember) to get this working everywhere.</p><p>For Silverblue I started off with the same setup, placing the files in
<code>~/.config/xkb</code> instead of placing them in <code>/usr/share/X11/xkb</code>. While GNOME
picked up the files just fine, I couldn't get this to work for the LUKS unlock
screen or when using a console/TTY started using <code>Alt</code> and a function key. I
also wasn't able to get GDM to use the layout. Placing the files in <code>/usr/share</code>
wasn't an option either, as it's read-only on Silverblue.</p><p>Getting this to work took an entire evening, and required a few distinct steps.
First, I built an <a href="https://copr.fedorainfracloud.org/coprs/yorickpeterse/colemak-dh-ortho/">RPM
package</a>
to move these files into the right place in <code>/usr/share</code>. I then used
<code>rpm-ostree</code> to <a href="https://docs.fedoraproject.org/en-US/iot/add-layered/">layer the
package</a> onto the base
image.</p><p>To get the console working I set <code>KEYMAP</code> in <code>/etc/vconsole.conf</code> to
<code>colemak_dh_ortho</code>. The default initramfs of Silverblue ignores changes to this
file, so to get this working I had to run <code>rpm-ostree initramfs --enable</code>. This
enables regenerating the initramfs every time you create a new rpm-ostree
deployment, ensuring the necessary files are part of the initramfs. The downside
is that commands such as <code>rpm-ostree install</code> and <code>rpm-ostree update</code> take quite
a bit longer to finish. I also added <code>vconsole.keymap=colemak_dh_ortho</code> to my
kernel arguments for good measure, but I'm not sure this is necessary.</p>
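<p>For reference, the resulting <code>/etc/vconsole.conf</code> is a single line:</p><pre><code>KEYMAP=colemak_dh_ortho
</code></pre><p>The final piece of the puzzle was to get GDM working, which for some reason just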
<em>refused</em> to use this layout. I'm still not sure what exactly solved it, but I
think it was running <code>gsettings set org.gnome.libgnomekbd.keyboard layouts '["colemak_dh_ortho","us"]'</code>
followed by another reboot.</p><p>All of that took well over six hours.</p><h2 id="getting-rid-of-gnome-software">Getting rid of GNOME Software</h2><p>GNOME Software is the primary way of installing software through a GUI on
Fedora. I ran into two issues with it, though neither is that big of a deal.</p><p>First, it's quite clunky to use when it comes to uninstalling software: when you
remove a program, the list of installed programs is refreshed a few seconds
after the removal finishes, showing a spinner while doing so. This made
removing multiple programs a pain, as the spinner would typically show up just
as I was about to click on the "remove" button of the next program I wanted to
remove.</p><p>The second problem is that GNOME Software leaks memory like a sieve, and after
several hours of using my laptop (I wasn't even using GNOME Software during that
time) I found it had eaten up close to 1 GiB of memory.</p><p><a href="https://grugbrain.dev/">grug</a> tired of software leak memory. grug want reach
for club, but grug remember easier just remove GNOME software and use terminal,
so grug run <code>rpm-ostree remove gnome-software gnome-software-rpm-ostree</code>. Memory
leak not worth grug's time and energy.</p><h2 id="rpm-ostree-and-dnf-are-slow">rpm-ostree and dnf are slow</h2><p>DNF being slow is well known in the Fedora community. While DNF5 is supposed to
improve this, I'll believe it when I see it. For me the process of installing
and removing packages is fast enough, but refreshing mirror/package metadata is
frustratingly slow.</p><p>What I didn't expect is for rpm-ostree to also be as slow as a snail. While you
can stage updates in the background and will do most of your package-related
work in a container, you still have to interact with rpm-ostree every now and
then. Coming from Arch Linux where <code>pacman</code> is super fast, the experience leaves
a lot to be desired. To illustrate, for this article I ran <code>rpm-ostree update</code>
and it took just over two minutes to upgrade a mere two packages. Of course I'm
aware rpm-ostree does more than just upgrading two packages, but I'm not
convinced this can't be done any faster.</p><h2 id="building-packages-for-fedora-is-frustrating">Building packages for Fedora is frustrating</h2><p>A few packages I needed were missing: <a href="https://github.com/LuaLS/lua-language-server">Lua language
server</a>,
<a href="https://github.com/JohnnyMorganz/StyLua">Stylua</a>, the Source Code Pro fonts
with support for Nerd Fonts, <a href="https://github.com/Lyude/neovim-gtk/">neovim-gtk</a>,
and an up-to-date <a href="https://github.com/postmodern/ruby-install/">ruby-install</a>.</p><p>Wanting to do the right thing I decided to read up on creating RPM packages and
setting up a copr repository; something I had to do for my keyboard layout
anyway. The experience was deeply frustrating: documentation on RPM packages is
scattered across different websites, some new and some ancient. These websites
also manage to somehow present you with a <em>ton</em> of text, without actually explaining
anything useful at all.</p><div class="admonition info"><i class="icon"></i><div class="text"><p>The following is a brief rant on RPM packaging. If you're not interested
in reading it, the summary is this:</p><p>The process of building an RPM is confusing and frustrating, especially compared
to how easy it is to build a package for Arch Linux. This only affects those
actually interested in building packages.</p></div></div><p>To illustrate how frustrating this process is: through reading some tutorials I
came across the RPM <code>%package</code> macro, but finding out what it did was near
impossible. If you search for "RPM package macro" on Google, the first result
<a href="https://docs.fedoraproject.org/en-US/packaging-guidelines/RPMMacros/">points to this
page</a> that
doesn't mention the macro at all. The <a href="https://rpm-software-management.github.io/rpm/manual/macros.html">second
result</a>
doesn't mention it either. In fact, none of the results seem to mention this
macro, and searching for "RPM %package macro" doesn't work either as the <code>%</code> is
ignored. At some point I found <a href="https://rpm-software-management.github.io/rpm/manual/spec.html">this
page</a> which
briefly mentions what it does, but to do that I had to:</p><ol><li>Go to <a href="https://rpm.org/index.html">https://rpm.org/index.html</a></li><li>Click on "Documentation" and end up at <a href="https://rpm.org/documentation.html">https://rpm.org/documentation.html</a></li><li>Click on "RPM Reference Manual" and end up at
<a href="https://rpm-software-management.github.io/rpm/manual/">https://rpm-software-management.github.io/rpm/manual/</a></li><li>Click on "Spec Syntax" and end up at
<a href="https://rpm-software-management.github.io/rpm/manual/spec.html">https://rpm-software-management.github.io/rpm/manual/spec.html</a></li><li>Search for <code>%package</code> on the page</li></ol><p>While this may seem like a weirdly specific issue to mention, I ran into issues
like this <em>constantly</em> while trying to figure out the idiomatic/modern
way of building an RPM.</p><p>Of course it gets worse. What would make sense is having just one tool to build
a package, and <em>maybe</em> a separate tool to upload it to copr and start a build.
Of course there isn't just one tool: this is Linux where people disagree on
just about everything.</p><p>Building RPM packages involves two low-level programs: <code>spectool</code> and
<code>rpmbuild</code>. <code>spectool</code> is used for just listing and downloading sources from an
RPM <code>.spec</code> file, which describes how to build a package. Of course in typical
Linux fashion it only downloads external sources, so if you list a local file as
a source (e.g. an icon to install), you'll need to move it into the right
place yourself. <code>rpmbuild</code> only concerns itself with building a package, and
straight up ignores any sources listed in your spec file.</p><p>Of course people using these tools realised this isn't nice and decided to fix
it by unifying the two into one program that everybody uses. Right? No, of
course not, that would make too much sense.</p><p>First we have <a href="https://pagure.io/rpkg-util">rpkg-util</a>, which builds
upon the two aforementioned tools and adds some templating capabilities. It's the
default build strategy for copr when building from a VCS repository, so you'd
think it's <em>the</em> way to build a package. But of course it's no longer maintained
per their README, and looking at existing packages on copr it seems it's not
used a lot. Oh and it also spits out the most useless error messages I've ever
seen, such as this:</p><div class="highlight"><pre class="highlight"><code>$ rpkg <span class="k">local</span> --spec ~/path/to/spec/outside/of/the/current/dir
git_dir_version failed with value <span class="mi">1</span>
</code></pre></div><p>Then there's <a href="https://github.com/rpm-software-management/tito">tito</a>, which
tries to do a whole bunch of things related to packaging and releasing, but
somehow doesn't actually make the process easier. Its default output is
incredibly verbose and makes debugging build errors near impossible, it <a href="https://github.com/rpm-software-management/tito/issues/446">doesn't
handle patch files</a>,
and its documentation is sorely lacking. Similar to rpkg-util I also wasn't able
to find any big projects that use it, even though tito has been around for over
a decade.</p><p>For the record, I understand how one ends up with a situation like this, and I
have nothing against the people working on these tools, but having gone through
this process I think I now understand why RPM packages are less commonly
available compared to those for other distributions.</p><p>As for my own packages, I resorted to using <code>spectool</code> and <code>rpmbuild</code> directly
through a <code>Makefile</code>. For example, for lua-language-server I use the following
<code>Makefile</code>:</p><div class="highlight"><pre class="highlight"><code>SPEC := lua-language-server.spec
TOP := ${PWD}/build
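# prepare: download remote sources and copy local ones into build/SOURCES
# srpm:    build a source RPM, for uploading to copr
# rpm:     build a binary RPM locally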
prepare:
	rm -rf build
	spectool --define <span class="s">"_topdir ${TOP}"</span> -gR ${SPEC}
	cp -p sources/* build/SOURCES/
srpm: prepare
	rpmbuild --define <span class="s">"_topdir ${TOP}"</span> -bs ${SPEC}
rpm: prepare
	rpmbuild --define <span class="s">"_topdir ${TOP}"</span> -bb ${SPEC}
clean:
	rm -rf build
<span class="k">.PHONY</span>: srpm prepare rpm
</code></pre></div><p>The <code>--define</code> flags are there so the RPM files and directories end up in
<code>./build</code> instead of in your home directory. This way you can build multiple
packages without their source files potentially conflicting.</p><p>To publish a new package I then update the <code>.spec</code> file by hand (e.g. adjusting
the version), run <code>make srpm</code>, followed by
<code>copr build lua-language-server path/to/the/built/srpm</code>. It's not too bad, but
it's still worse than just running <code>makepkg -s</code> on Arch Linux. If you're looking
into building a package for Fedora, I'd suggest doing something similar to the
above and just avoid rpkg-util and tito entirely <em>unless</em> you are certain you
need these tools.</p><h2 id="selinux-can-be-frustrating">SELinux can be frustrating</h2><p>Before installing Silverblue I made a backup of my
<a href="https://linrunner.de/tlp/index.html">TLP</a> configuration. While Fedora ships
with
<a href="https://gitlab.freedesktop.org/hadess/power-profiles-daemon">power-profiles-daemon</a>,
I've read a little too much about it not doing much more than just throttling
your CPU, so I decided to stick with TLP. After all, TLP works fine so why
bother replacing it. I installed TLP, replaced the default
<code>/etc/tlp.conf</code> configuration file with my own, and reset its ownership to
<code>root:root</code>. When I tried to start TLP using <code>sudo systemctl start tlp</code>, it
failed. Of course when I ran it manually it worked just fine.</p><p>After a while I found out this was a SELinux problem, probably due to certain
SELinux settings/permissions getting lost when I replaced the default file. To
fix this I ran <code>sudo fixfiles restore /etc/tlp.conf</code>, after which TLP started up
without issue.</p><p>While SELinux does log when there are errors (assuming you even remember that it
does and where they're stored), the logs themselves aren't helpful. For example:</p><pre><code>type=AVC msg=audit(1677382357.686:651): avc: denied { read } for
pid=16822 comm="tlp-readconfs" name="tlp.conf" dev="dm-0" ino=533021
scontext=system_u:system_r:tlp_t:s0
tcontext=system_u:object_r:dosfs_t:s0 tclass=file permissive=0
</code></pre><p>While this log line includes a ton of information, it does nothing to help me
understand what I need to do to fix the actual problem.</p><h2 id="fonts-issues-with-firefox">Font issues with Firefox</h2><p>While using the Firefox Flatpak, I noticed the text was a little fuzzy and hard
to read. Upon closer inspection I noticed it was applying <a href="https://en.wikipedia.org/wiki/Subpixel_rendering">subpixel
rendering</a>, even though this
is turned off system-wide (as it should be). I found out this is due to Flatpak
not allowing access to <code>$XDG_CONFIG_HOME/fontconfig</code>, which seems to result in
Firefox (incorrectly) guessing what to do.</p><p>The solution is to use <a href="https://github.com/tchx84/Flatseal">Flatseal</a> to give
Firefox access to the <code>xdg-config/fontconfig:ro</code> filesystem subset, then
restart Firefox.</p><h2 id="locale-errors-when-using-distrobox">Locale errors when using Distrobox</h2><p>I'm using <a href="https://github.com/89luca89/distrobox/">Distrobox</a> instead of
<a href="https://github.com/containers/toolbox">Toolbox</a>, though this issue may also
apply to Toolbox: when running certain commands in the container, I was getting
a "Failed to set locale, defaulting to C.UTF-8" error. Per <a href="https://github.com/89luca89/distrobox/issues/258">this
issue</a> the fix is to run <code>sudo
dnf install glibc-langpack-en</code> in your container, changing the package name
according to the language you are using.</p><h2 id="what-went-well-and-some-tips">What went well, and some tips</h2><p>There may have been more issues I ran into, but these are the ones I can
remember. Most of these are specific to my setup though. For example, if you use
a QWERTY keyboard then getting started is easier. Figuring out how to build an
RPM package is a one-time cost, and wouldn't apply to most users of
Silverblue. In fact, I suspect most users would only run into the Firefox font
problem, the Distrobox locale errors (assuming they're using Distrobox in the
first place), and the slowness of rpm-ostree and DNF.</p><p>Apart from these issues, I'm enjoying Silverblue so far. I also like how the
immutable nature of Silverblue forces you to rethink certain workflows or
decisions, such as building a proper (reusable) package instead of just dumping
some files in <code>/usr</code> or <code>/etc</code>, or using containers more actively. Not having to
worry about updates breaking your system (or at least not as easily as on Arch
Linux) is of course also great.</p><p>As far as tips and tricks go, there are a few that I can recommend.</p><h3 id="put-the-container-name-in-your-prompt">Put the container name in your prompt</h3><p>Because you'll be using containers when using Silverblue (at least when using
the terminal), I recommend putting the name of the current container in your
shell prompt. I use <a href="https://fishshell.com/">Fish</a> and have my prompt configured
as follows:</p><div class="highlight"><pre class="highlight"><code><span class="k">function</span> <span class="k">fish_prompt</span>
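    # Show the basename of the working directory, prefixed with the name
    # of the container when inside one.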
    <span class="k">if</span> [ $PWD = $HOME ]
        <span class="k">set</span> directory <span class="s">'~'</span>
    <span class="k">else</span>
        <span class="k">set</span> directory (basename $PWD)
    <span class="k">end</span>
    <span class="k">if</span> <span class="k">test</span> -n <span class="s">"$CONTAINER_ID"</span>
        <span class="k">echo</span> -n <span class="s">"[$CONTAINER_ID] "</span>
    <span class="k">end</span>
    <span class="k">set_color</span> $fish_color_cwd
    <span class="k">echo</span> -n $directory
    <span class="k">set_color</span> normal
    <span class="k">echo</span> -n <span class="s">" \$ "</span>
<span class="k">end</span>
</code></pre></div><p>Outside a container this results in a prompt like this:</p><div class="highlight"><pre class="highlight"><code>Downloads $ input-here
</code></pre></div><p>And inside a container:</p><div class="highlight"><pre class="highlight"><code>[fedora] Downloads $ input-here
</code></pre></div><h3 id="use-gnome-terminal-profiles-for-your-containers">Use GNOME terminal profiles for your containers</h3><p>Distrobox can create <code>.desktop</code> files for your containers, making it easier to
start/enter them. If you open a new tab in that terminal, it will start in
the default shell, not in the container; at least when using GNOME terminal. To
work around this I adjusted the generated <code>.desktop</code> file to instead start GNOME
terminal with a dedicated profile like so:</p><pre><code>[Desktop Entry]
Name=Fedora
GenericName=Terminal entering Fedora
Comment=Terminal entering Fedora
Categories=Distrobox;System;Utility;
Exec=gnome-terminal --profile Fedora -- /usr/bin/distrobox enter --no-workdir fedora
Icon=/var/home/yorickpeterse/.local/share/icons/distrobox/fedora.svg
Keywords=distrobox;
NoDisplay=false
Terminal=false
TryExec=/usr/bin/distrobox
Type=Application
</code></pre><p>Here <code>--profile Fedora</code> specifies the GNOME terminal profile to use.
The <code>--no-workdir</code> option ensures the new terminal process always starts in the
container's home directory.</p><p>The GNOME terminal profile in turn is configured as follows:</p><ul><li>Command → "Custom command" is set to <code>distrobox enter --name fedora -- fish</code></li><li>Command → "Preserve working directory" is set to "Always"</li></ul><p>This way opening new tabs results in them entering the container, while
preserving the working directory of the previous tab.</p><h3 id="give-your-containers-a-custom-home-directory">Give your containers a custom home directory</h3><p>This isn't necessary if you only intend to use a single container, but if you
use multiple containers it's a must: when creating a container using Distrobox,
the <code>--home</code> flag is used to specify a custom home directory. This way the
container won't pollute your actual home directory, and two different containers
using the same files in your home directory won't conflict. For example:</p><div class="highlight"><pre class="highlight"><code>mkdir $HOME/homes
distrobox create --name fedora --image fedora:latest --home $HOME/homes/fedora
</code></pre></div><p>This creates a new container called "fedora" with its home directory set to
<code>~/homes/fedora</code>.</p><p>Inside the container you still have access to the real home directory. As all my
projects are in <code>~/Projects</code> (in my real home directory), I created a symbolic
link to this folder from the container's home directory (running this inside the
container):</p><div class="highlight"><pre class="highlight"><code>ln -s /var/home/yorickpeterse/Projects $HOME/Projects
</code></pre></div><p>This way inside the container's home directory I can just run <code>cd Projects</code>,
instead of <code>cd ../../Projects</code>.</p><h3 id="automatically-stage-rpm-ostree-updates">Automatically stage rpm-ostree updates</h3><p>I'm not sure how well this works if you still have GNOME Software installed (or
if it's even necessary), but I have rpm-ostree set up to automatically stage
updates. This is done in two steps:</p><ol><li>Add <code>AutomaticUpdatePolicy=stage</code> to <code>/etc/rpm-ostreed.conf</code> under the
<code>[Daemon]</code> section, as shown below.</li><li>Run <code>sudo systemctl reload rpm-ostreed</code> followed by <code>sudo systemctl enable
--now rpm-ostreed-automatic.timer</code>.</li></ol>
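<p>The relevant part of <code>/etc/rpm-ostreed.conf</code> then looks like this:</p><pre><code>[Daemon]
AutomaticUpdatePolicy=stage
</code></pre><p>You can then verify if it's enabled by running <code>rpm-ostree status</code>. If enabled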
you should see a message at the top along the lines of:</p><pre><code>AutomaticUpdates: stage; rpm-ostreed-automatic.timer: last run 24h ago
</code></pre><h3 id="layer-adw-gtk3">Layer adw-gtk3</h3><p>GTK3 applications look different from GTK4 applications, which is annoying. We
can fix this by using the <a href="https://github.com/lassekongo83/adw-gtk3">adw-gtk3</a>
project as follows:</p><ol><li>Run <code>sudo wget -P /etc/yum.repos.d/ https://copr.fedorainfracloud.org/coprs/nickavem/adw-gtk3/repo/fedora-37/nickavem-adw-gtk3-fedora-37.repo</code>.</li><li>Run <code>rpm-ostree install gnome-tweaks adw-gtk3 --apply-live</code>.</li><li>Open Tweaks and go to "Appearance", then under "Legacy Applications" choose
"Adw-gtk3". If the theme isn't listed there try rebooting first.</li></ol><h2 id="conclusion">Conclusion</h2><p>To conclude, I like Silverblue, in spite of the issues I ran into. In the coming
weeks I'll also move my desktop over to Silverblue, and at some point in the
future I'll also move my Windows gaming desktop to Silverblue. Most of my issues
are specific to my setup and probably won't apply to most users, though I
wouldn't recommend Silverblue to those not familiar with a terminal just yet; at
least not until GNOME Software is less clunky and stops hogging memory.</p>https://yorickpeterse.com/articles/im-leaving-gitlab-to-work-on-inko-full-time/I'm leaving GitLab to work on Inko full-time2021-12-14T15:54:48Z2021-12-14T15:54:48Z<p>
Back in October 2015 I joined <a href="https://about.gitlab.com/">GitLab</a>. I think I was
employee #28 at the time, with the total number of employees being somewhere
between 30 and 40 if I'm not mistaken. Fast-forward to today, and GitLab has
grown to almost 1600 employees.</p><p>While I enjoyed my time at GitLab, after a little over six years I feel it's
time for something new. In particular, I want to be able to dedicate more time
to <a href="https://inko-lang.org/">Inko</a>. With that in mind, I resigned from GitLab
with my last day being December 31st 2021. Starting January 1st 2022, I'll be
working on Inko full-time. The roadmap for 2022 is as follows:</p><ol><li>Finish the new compiler written in Rust, which also implements a new memory
management strategy for Inko.</li><li>Build a decentralised package manager</li><li>Grow the community</li></ol><p>For now I'm not adding more to the roadmap, as I'm not yet sure how productive
I'll be once I start working on Inko full-time.</p><p>The new memory management strategy is something I'm most excited about. This
strategy combines the efficient heap layout from
<a href="https://www.cs.utexas.edu/users/speedway/DaCapo/papers/immix-pldi-2008.pdf">Immix</a>
with a single-ownership model, but without the lifetime complexity found in
Rust. The ownership model is based on the paper <a href="https://researcher.watson.ibm.com/researcher/files/us-bacon/Dingle07Ownership.pdf">Ownership You Can Count On: A
Hybrid Approach to Safe Explicit Memory
Management</a>,
though I intend to extend it with additional compile-time analysis and support
for generic data structures that work with both owned and borrowed values. Of
course this approach comes with its own trade-offs, but I feel these trade-offs
are worth making, and will make Inko a compelling alternative to languages such
as Python, Ruby, and Erlang.</p><p>If you'd like to support the project financially, you can do so <a href="https://github.com/sponsors/YorickPeterse/">through GitHub
Sponsors</a>. And if you'd like to
follow progress made on Inko, consider joining the <a href="https://matrix.to/#/#inko-lang:matrix.org">Matrix
channel</a>, as I'll post short updates
there from time to time. I also intend to start recording videos on the
development of Inko and maybe start streaming, but I think it will take a bit of
time before I have the courage to do so.</p>https://yorickpeterse.com/articles/friendship-ended-with-the-garbage-collector/Friendship ended with the garbage collector2021-08-24T18:00:00Z2021-08-24T18:00:00Z<p>
It's been a while since the last update about my work on the <a href="https://inko-lang.org/">Inko programming
language</a>. Not because there hasn't been any progress,
but because I've been busy making changes. A <em>lot</em> of changes.</p><p>For the past two years or so I have been toying with the idea of replacing
Inko's garbage collector with something else. The rationale for this is that at
some point, all garbage collected languages run into the same issue: the
workload is too great for the garbage collector to keep up.</p><p>The solutions to such problems vary. Sometimes one has to spend hours tweaking
garbage collection settings. Such settings often lack good documentation, and
are highly dependent on the infrastructure used to run the software. Other times
one has to use hacks such as <a href="https://blog.twitch.tv/en/2019/04/10/go-memory-ballast-how-i-learnt-to-stop-worrying-and-love-the-heap-26c2462549a2/">allocating a 10 GB byte
array</a>.</p><p>This got me thinking: what if for Inko I got rid of the garbage collector
entirely, preventing users from running into these problems? After spending some
time looking into this (see <a href="https://gitlab.com/inko-lang/inko/-/issues/207">this
issue</a> for more details), I
decided to postpone the idea. I wasn't able to come up with a good solution at
the time, so I decided to take another look at it in the future.</p><p>Earlier this year I read the paper
<a href="https://researcher.watson.ibm.com/researcher/files/us-bacon/Dingle07Ownership.pdf">"Ownership You Can Count On: A Hybrid Approach to Safe Explicit Memory Management"</a>.
This paper is from 2006, and describes a single ownership model for managing
memory. The approach outlined is pretty straightforward: you have owned values,
and references. When an owned value goes out of scope, it's deallocated. When
creating a reference, you increment a counter stored in the owned value the
reference points to. When the reference goes out of scope, the count is reduced.
When an owned value goes out of scope and its reference count is not zero, the
program terminates with an error (which I'll refer to as a "panic").</p>
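<p>To make the mechanics a bit more concrete, here's a minimal sketch of the
scheme in Rust. The names are mine and the code is purely illustrative; it's
not taken from the paper:</p><pre><code>// An owned value, storing the number of references that point to it.
struct Owned {
    value: i64,
    references: usize,
}

// Creating a reference increments the owner's count.
fn new_reference(owned: &mut Owned) {
    owned.references += 1;
}

// Dropping a reference decrements the count again.
fn drop_reference(owned: &mut Owned) {
    owned.references -= 1;
}

// Dropping an owned value panics if any references remain.
fn drop_owned(owned: Owned) {
    if owned.references > 0 {
        panic!("the value still has {} references", owned.references);
    }

    // The value itself is deallocated here.
}
</code></pre><p>Of course this approach has its own downside: a program may panic when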
dropping an owned value, if it still has one or more references pointing to it.
This is something you can prevent from happening (at least as much as possible)
using compiler analysis. Since you still have a runtime mechanism to fall back
to, this analysis doesn't have to be perfect. The result is that you can decide
how you want to balance developer productivity, correctness, and the complexity
of the implementation.</p><p>In contrast, Rust has a strict and complex ownership model. This model ensures
that if your program compiles (and you don't use unsafe code), you won't run
into memory related issues such as dangling references or use-after-free errors.
The trade-off here is extra complexity, not being able to implement certain
patterns in safe code (e.g. linked lists), and possibly more.</p><p>The approach outlined here was compelling enough for me to take another look at
using a single ownership model for Inko. Along the way, I found out about a
language called <a href="https://vale.dev/">Vale</a>, which draws inspiration from the same
paper.</p><h2 id="the-current-status">The current status</h2><p>Replacing the garbage collector with a single ownership model (amongst other
changes I'm making) is what I have been working on since March 2021. The
progress is tracked in the merge request <a href="https://gitlab.com/inko-lang/inko/-/merge_requests/120">"Single ownership, move semantics, and
a new memory layout"</a>.
Besides introducing a single ownership model, the merge request introduces
changes such as (but not limited to):</p><ul><li>Throwing errors is much cheaper, with the cost being similar to a regular
function return.</li><li>Defining processes is done similarly to defining classes, and sending messages
looks like regular method calls.</li><li>A new compiler written in Rust, replacing the Ruby compiler. When our
self-hosting compiler is mature enough, the Rust compiler will be used to
bootstrap the self-hosting compiler.</li><li>A greatly improved allocator. We still use the Immix heap layout, and heaps
are now thread-local instead of process-local.</li><li>Method calls and field lookups no longer use hashing, and instead use regular
index lookups.</li><li>Dynamic dispatch is handled using a hashing approach inspired by <a href="https://thume.ca/2019/07/29/shenanigans-with-hash-tables/">Shenanigans
With Hash Tables</a>.
Using this approach we allow reopening of classes and implementing of traits
after defining a class, without the need for fat pointers. The compiler will
generate code such that collisions are rare, and that the cost of handling
collisions is as small as possible; a rough sketch of such a lookup follows
after this list.</li></ul>
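<p>Here's what such a lookup can look like (this is my own illustration using
made-up names, not Inko's actual implementation):</p><pre><code>type Method = fn();

struct Entry {
    hash: u32,
    method: Method,
}

struct Class {
    // The number of entries is a power of two, so the modulo can be done
    // using a bitwise AND. Real code needs a sentinel for empty slots.
    methods: [Entry; 8],
}

fn lookup(class: &Class, hash: u32) -> Method {
    let mask = class.methods.len() - 1;
    let mut index = hash as usize & mask;

    // The compiler picks the hashes such that the first probe succeeds
    // in the vast majority of cases.
    loop {
        let entry = &class.methods[index];

        if entry.hash == hash {
            return entry.method;
        }

        index = (index + 1) & mask;
    }
}
</code></pre><h3 id="processes-and-messages">Processes and messages</h3><p>A big change that is the direct result of the single ownership model is how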
processes send messages to each other. The released version of Inko takes an
approach similar to Erlang: each process has its own heap, and messages are deep
copied when sent. This removes the need for sharing memory, which in turn
removes the need for synchronisation. The cost is having to deep copy objects.
This can be time consuming, and handling circular objects is a challenge.
Copying of some objects can also fail at runtime (e.g. sockets), but there
wasn't a nice way of handling this.</p><p>When you use a single ownership model, you don't need copying. Instead, you just
transfer ownership to the receiving process. This also means you don't have to
maintain a heap per process. Instead, you can maintain a heap per OS thread (to
allow for fast thread-local allocations), as the ownership model guarantees no
two processes can access the same object concurrently. The result is a nicer
language, type-safe message passing, a reduction in memory usage due to
processes being smaller, and lots of other improvements.</p><p>To illustrate this, here is a simple example of implementing a distributed
counter:</p><div class="highlight"><pre class="highlight"><code><span class="k">async</span> <span class="k">class</span> Counter {
  <span class="vi">@number</span>: UnsignedInt

  <span class="k">async</span> def increment {
    <span class="vi">@number</span> += <span class="mi">1</span>
  }

  <span class="k">async</span> def get -> UnsignedInt {
    <span class="vi">@number</span>
  }
}

def main {
  <span class="k">let</span> counter = Counter { <span class="vi">@number</span> = <span class="mi">0</span> }

  counter.increment
  counter.increment
  counter.get <span class="c"># => 2</span>
}
</code></pre></div><p>Defining processes is done using <code>async class</code>. When you create an instance of
an async class, a lightweight process (not an OS process) is spawned that owns
the instance. The process that created the instance is given a value of type
<code>async T</code>, or <code>async Counter</code> in the above example. This type acts as the
client, with the process acting as a server. Clients can be copied and sent to
other processes.</p><p>Messages are essentially remote procedure calls, and look like regular method
calls. When you create a process with one or more fields, or pass arguments
along with your message, the ownership of the values is transferred to the
receiving process. A few types can't be sent to different processes, such as
references, closures, and generators.</p><p>Message processing happens in FIFO order. When all clients disconnect, and the
process has no more messages to process, the process runs its destructor and
terminates.</p><p>When you send a message, the sender waits for a result to be produced, without
blocking the OS thread the process is running on. If you instead want a future
to resolve later, you can use the <code>async</code> keyword (<code>async counter.get</code>
instead of <code>counter.get</code>).</p><h3 id="circular-types">Circular types</h3><p>In languages with single ownership, circular types such as doubly linked lists
can be tricky to implement, typically requiring unsafe code such as raw
pointers. In Inko, such types are easy to implement:</p><div class="highlight"><pre class="highlight"><code><span class="k">class</span> DoublyLinkedList[T] {
  <span class="vi">@head</span>: ?Node[T]
}

<span class="k">class</span> Node[T] {
  <span class="vi">@value</span>: T
  <span class="vi">@next</span>: ?Node[T]
  <span class="vi">@prev</span>: ?<span class="k">ref</span> Node[T]
}
</code></pre></div><p>Here <code>?T</code> is syntax sugar for <code>Option[T]</code>, meaning it's an optional value. <code>ref
T</code> is a reference to an owned value of type <code>T</code>.</p><p>We don't need destructors, as Inko drops fields in reverse lexical order. For
our linked list example with nodes A and B (with B coming after A), the drop
order is as follows:</p><pre><code>1. A @prev
2. A @next --> 3. B @prev
4. B @next
5. B @value
6. B
7. A @value
8. A
</code></pre><p>When we reach step 8, the reference from B to A is dropped, so no error is
produced.</p><p>For more complex types a custom destructor may be needed to drop fields in a
different order, though such cases should be rare. Even then, you won't need any
unsafe code.</p><h3 id="generics-support-both-owned-values-and-references">Generics support both owned values and references</h3><p>A challenge identified in the ownership paper is allowing generic types to
support both owned values and references. The paper doesn't provide a solution,
and instead mentions implementing different types (so one Array type for owned
values, and one for references).</p><p>Inko will start by using pointer tagging to differentiate between owned values
and references. We already use pointer tagging for immediate values, and had an
extra bit to spare anyway. Any generic code that isn't inlined will use a
runtime check of this bit when dropping a generically typed value.</p>
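<p>As a rough sketch (again my own illustration, not Inko's actual
implementation), such tagging boils down to a few bit operations:</p><pre><code>// The lowest bit of an aligned pointer is unused, so we can use it to
// record whether the pointer is an owned value (0) or a reference (1).
const REF_BIT: usize = 1;

fn as_reference(pointer: *mut u8) -> *mut u8 {
    (pointer as usize | REF_BIT) as *mut u8
}

fn is_reference(pointer: *mut u8) -> bool {
    (pointer as usize & REF_BIT) != 0
}

fn drop_value(pointer: *mut u8) {
    if is_reference(pointer) {
        // Dropping a reference: decrement the owner's reference count.
    } else {
        // Dropping an owned value: run its destructor and free the memory.
    }
}
</code></pre><p>I decided against the use of monomorphisation for several reasons:</p><ul><li>We don't have (and I can't think of any) optimisations that can take advantage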
of it.</li><li>It increases compile times, and I want to keep these as low as possible.</li><li>Through inlining most generic types can be removed.</li><li>It increases memory usage.</li><li>The Array type is built into the VM, and the VM uses it in several different
places. If we monomorphise generic types (including Array), the VM needs to be
refactored such that it doesn't use the Array type directly. If we don't, the
VM won't know which implementation of the Array type to use.</li></ul><p>In the future Inko may use a different approach, but for the time being pointer
tagging should be good enough.</p><h3 id="heap-layout">Heap layout</h3><p>A benefit of garbage collected languages is that they can allocate and reclaim
memory such that allocations are fast, and fragmentation is kept low. Inko
retains the
<a href="https://www.cs.utexas.edu/users/speedway/DaCapo/papers/immix-pldi-2008.pdf">Immix</a>
heap layout and bump allocator. To reuse memory and combat fragmentation, Inko
threads scan a chunk of their heap before running a process. When a reusable
block of memory is found, it's moved to the end of the heap after the allocation
position. Scanning is done incrementally, ensuring that each scan takes a fixed
maximum amount of time. Objects are never moved around, as doing so requires
traversing all live objects (or read barriers) to update pointers to the moved
objects.</p><p>While this approach doesn't fully mitigate fragmentation, I believe it should be
good enough for the foreseeable future.</p><h2 id="remaining-work">Remaining work</h2><p>While work on the new virtual machine is finished, I'm still working on the new
compiler. As part of this I'll also need to rewrite parts of the self-hosting
compiler code written thus far. I suspect it will take a few more months before
the work is finished. I'm <em>super</em> excited about these changes, and I hope they
will make Inko a more compelling language to use. They will also make Inko a
much faster language.</p><p>If you'd like to stay up to date on the progress made, I recommend joining
Inko's <a href="https://matrix.to/#/#inko-lang:matrix.org">Matrix channel</a>, or
subscribing to <a href="https://www.reddit.com/r/inko/">/r/inko</a> on Reddit.</p>https://yorickpeterse.com/articles/libffi-rs-100/libffi-rs 1.0.0 is released2020-10-25T00:09:55Z2020-10-25T00:09:55Z<p>
</p><p><a href="https://crates.io/crates/libffi">libffi-rs</a> (<a href="https://github.com/tov/libffi-rs/">GitHub
repository</a>) is a Rust crate that provides
bindings to <a href="https://sourceware.org/libffi/">libffi</a>. I've been using the crate
for about two years now for <a href="https://inko-lang.org/">Inko</a>, and it works great.</p><p>Development of the crate slowed down in recent years, as the author <a href="https://github.com/tov/">Jesse A.
Tov</a> has been busy. To help the author out, I joined as
a maintainer, and earlier today I released version 1.0.0 of the libffi crate.</p><h2 id="whats-new">What's new</h2><p>Version 1.0.0 does not introduce any API changes compared to previous versions.
What it does introduce is the removal of the dependency on the
<a href="https://crates.io/crates/bindgen">bindgen</a> crate.</p><p>Previous versions of the crate use bindgen to generate libffi bindings at
build-time, which requires libclang to be installed. While installing libclang
is not a problem on Linux, on macOS and Windows it's a bit more tricky. The
bindgen crate also introduces quite the list of build-time Rust dependencies: 37
direct and indirect ones, to be exact.</p><p>Starting with version 1.0.0, these dependencies are no longer necessary. The
removal of these dependencies means installing the crate is both easier and
faster, while providing the same functionality as before.</p><p>For more information about these changes, take a look at <a href="https://github.com/tov/libffi-sys-rs/pull/37">this pull
request</a>.</p><h2 id="upgrading">Upgrading</h2><p>Existing users should have no trouble updating to the latest version, as the
public API remains unchanged compared to the previous version (0.9.0). To
upgrade, change your dependency definition to the following:</p><div class="highlight"><pre class="highlight"><code>[dependencies]
libffi = <span class="s">"1.0.0"</span>
</code></pre></div><h2 id="future-plans">Future plans</h2><p>There is <a href="https://github.com/tov/libffi-rs/pull/14">an open pull request that improves ARMv7
support</a>, which I would like to
include in a future release. Apart from that there are no big plans at this
time, as the crate works well enough in its current state.</p>https://yorickpeterse.com/articles/10-years-of-software-development/10 years of software development2020-06-30T00:00:00Z2020-06-30T00:00:00Z<p>
</p><p>June 2020 marks my 10-year anniversary of becoming a software developer. When I
first started out I had just turned 18, and yet somehow thought I knew
everything better than everybody else. Of course this wasn't true, quite the
opposite in fact.</p><p>Fast forward 10 years, and I have learned quite a bit. I have learned new tools,
gained new skills, learned how to sit properly behind a desk, the list goes on.
But as I have gained more knowledge and experience, for some reason I have also
become increasingly more critical of myself.</p><p>Take for example <a href="https://inko-lang.org/">Inko</a>. Developing a programming
language is hard work, and requires a certain amount of knowledge. While I don't
consider myself an expert by any means, what I have achieved with Inko thus far
shows I have a certain amount of knowledge and experience. And yet with every
step I question myself. "Why would anybody use this?", "This isn't fast enough",
"That other language does it much better", "You are unlikely to succeed", the
list of questions and comments in my head goes on. Worse, the more progress I
make, the more critical I seem to become.</p><p>I don't know how to solve this, nor do I have any good advice for others dealing
with the same problem. All I can say is this: know that you are not alone. In
fact, it's probably safe to assume everybody you know suffers from the same
problem to a certain degree.</p><p>Maybe in another 10 years I'll have a solution. Or perhaps by then I have
learned to just live with it.</p>https://yorickpeterse.com/articles/deciding-when-to-collect-garbage/Deciding when to collect garbage2019-12-02T17:15:00Z2019-12-02T17:15:00Z<p>
</p><p>How to perform garbage collection is a widely explored topic, and there are all
sorts of different techniques. Sequential collectors, parallel collectors,
concurrent collectors, incremental collectors, real-time collectors, the list
goes on. There are also different techniques for allocators used, ranging from
free list allocators to bump allocators.</p><p>Deciding <em>when</em> to perform garbage collection appears to be written
about less frequently. I suspect the reason for this is that deciding when
to collect is specific to a programming language's behaviour. For example,
languages using immutable objects will allocate a lot and thus more frequent
collections may be desired.</p><p>Let's illustrate this using the best book one can buy to learn more about
garbage collection: <a href="http://gchandbook.org/">The Garbage Collection Handbook, 2nd
Edition</a>. This book consists of 416 pages, excluding the
preface, table of contents, glossary, etc. These 416 pages cover pretty much
everything there is to know about garbage collectors, how to implement them,
what their trade-offs are, and so on.</p><p>Of these 416 pages, I could not find any that focus specifically on when to
collect garbage. I do vaguely recall it's discussed somewhere in the book, but I
was unable to find this by looking at the table of contents and skimming through
several chapters.</p><p>In this article we'll take a look at the different techniques that can be used
to decide when to collect garbage, how to implement such a technique, and what
techniques a few programming languages out there use.</p><h2 id="table-of-contents">Table of contents</h2><ul class="toc"><li><a href="#deciding-when-to-collect">Deciding when to collect</a><ul><li><a href="#collecting-based-on-object-allocation-counts">Collecting based on object allocation counts</a></li><li><a href="#collecting-based-on-object-sizes">Collecting based on object sizes</a></li><li><a href="#collecting-based-on-object-weights">Collecting based on object weights</a></li><li><a href="#collecting-based-on-the-number-of-memory-blocks">Collecting based on the number of memory blocks</a></li><li><a href="#collecting-based-on-the-usage-percentage-of-a-fixed-size-heap">Collecting based on the usage percentage of a fixed-size heap</a></li><li><a href="#collecting-between-web-requests">Collecting between web requests</a></li><li><a href="#collecting-after-a-certain-time-has-passed">Collecting after a certain time has passed</a></li><li><a href="#collecting-when-the-system-runs-out-of-memory">Collecting when the system runs out of memory</a></li><li><a href="#collecting-based-on-past-collection-statistics">Collecting based on past collection statistics</a></li></ul></li><li><a href="#the-flaw-of-collecting-based-on-allocations">The flaw of collecting based on allocations</a></li><li><a href="#deciding-when-to-collect-using-rust">Deciding when to collect using Rust</a></li><li><a href="#languages-and-what-techniques-they-use">Languages and what techniques they use</a><ul><li><a href="#inko">Inko</a></li><li><a href="#java">Java</a></li><li><a href="#lua">Lua</a></li><li><a href="#ruby">Ruby</a></li></ul></li><li><a href="#conclusion">Conclusion</a></li></ul><h2 id="deciding-when-to-collect">Deciding when to collect</h2><p>Let's start by taking a look at the different ways a collector can determine if
garbage collection is necessary, in no particular order.</p><h3 id="collecting-based-on-object-allocation-counts">Collecting based on object allocation counts</h3><p>This approach is the simplest, and a commonly used one. When a certain
number of objects has been allocated since the last collection, we trigger a
collection. At the end of a collection we reset this counter. This is repeated
until the program terminates.</p><p>Most collectors using this approach will increase the threshold as the program
runs, if needed. For example, a collector may decide to double the threshold if
it could not release enough memory during a collection. This ensures that
garbage collections don't happen too frequently.</p>
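<p>A minimal sketch of such a trigger, including the doubling of the threshold
(the numbers here are made up):</p><pre><code>struct Trigger {
    allocated: usize,
    threshold: usize,
}

impl Trigger {
    fn new() -> Trigger {
        Trigger { allocated: 0, threshold: 1024 }
    }

    // Called for every allocation. Returns true if it's time to collect.
    fn on_allocation(&mut self) -> bool {
        self.allocated += 1;
        self.allocated >= self.threshold
    }

    // Called at the end of a collection, with the number of objects that
    // were reclaimed.
    fn on_collection(&mut self, reclaimed: usize) {
        // If we couldn't release enough memory, double the threshold so
        // collections don't happen too frequently.
        if reclaimed < self.threshold / 2 {
            self.threshold *= 2;
        }

        self.allocated = 0;
    }
}
</code></pre><h3 id="collecting-based-on-object-sizes">Collecting based on object sizes</h3><p>A refinement of collecting based on object <em>counts</em> is to trigger a collection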
after allocating a certain number of <em>bytes</em>. This is useful when you have
objects of different sizes. Imagine a system where we collect based on object
counts, and we allocate lots of large objects, but not enough to cross the
allocation count threshold. Because we collect based on counts and not sizes, we
may end up wasting more memory than necessary.</p><p>Obtaining the size of an object may not always be easy, or cheaper than just
counting the number of objects. It's also not helpful if all objects are the
same size, as counting sizes would then be the same as counting the number of
objects.</p><h3 id="collecting-based-on-object-weights">Collecting based on object weights</h3><p>Just allocating memory is not always all that needs to be done to initialise an
object. Fields need to be filled in, synchronisation may be needed based on what
kind of object is allocated, and so on. Instead of collecting based on the
number of allocated objects, a collector may decide to assign a weight to every
object, triggering a collection when the total weight exceeds a certain
threshold.</p><h3 id="collecting-based-on-the-number-of-memory-blocks">Collecting based on the number of memory blocks</h3><p>Counting individual object allocations may get expensive if allocations happen
frequently. Allocators in turn commonly divide memory into blocks, such as a
block of 8 KB. A collector can then decide to not count the number of allocated
objects, but the number of blocks in use. If a block can contain 100 objects,
this means we only need to increment and check our statistics once every 100
allocations, instead of doing so on every allocation. This may improve
performance, but can also delay garbage collection.</p><h3 id="collecting-based-on-the-usage-percentage-of-a-fixed-size-heap">Collecting based on the usage percentage of a fixed-size heap</h3><p>Instead of collecting based on a counter crossing a threshold, we assign a fixed
size to our heap. When a certain percentage of this heap is used we trigger a
collection. When the heap is full, we trigger a collection and/or error if no
additional memory is available.</p>
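<p>The check itself is trivial; a sketch collecting at 90% usage (a made-up
number) could look like this:</p><pre><code>struct Heap {
    capacity: usize, // the fixed size of the heap, in bytes
    used: usize,     // the number of bytes currently in use
}

impl Heap {
    // Returns true when 90% or more of the heap is in use, using integer
    // arithmetic to avoid floating point comparisons.
    fn should_collect(&self) -> bool {
        self.used * 10 >= self.capacity * 9
    }
}
</code></pre><p>This approach allows us to enforce an upper limit on the size of the heap, which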
can be useful in memory-constrained environments. The downside is that consuming
the entire heap may lead to the program terminating (depending on what the
collector does in this case), even when the system has memory available.</p><p>This approach also may not work well if tasks (lightweight processes, threads,
and so on) have their own heap, as preallocating memory for these heaps may be
expensive and end up consuming a lot of (virtual) memory.</p><h3 id="collecting-between-web-requests">Collecting between web requests</h3><p>A less common approach sometimes employed by web applications is to disable
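<p>A sketch of this technique; the 90% trigger point is an arbitrary example
value, not taken from a real collector:</p><div class="highlight"><pre class="highlight"><code>pub struct FixedHeap {
    /// The number of bytes of the fixed-size heap currently in use.
    used: usize,
    /// The total size of the heap in bytes.
    capacity: usize,
}

impl FixedHeap {
    /// Trigger a collection once 90% of the heap is in use.
    pub fn should_collect(&self) -> bool {
        self.used as f64 / self.capacity as f64 >= 0.9
    }

    /// When the heap is entirely full we must collect, and produce an
    /// error if the collection doesn't free up any space.
    pub fn is_full(&self) -> bool {
        self.used >= self.capacity
    }
}
</code></pre></div>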
<h3 id="collecting-between-web-requests">Collecting between web requests</h3><p>A less common approach, sometimes employed by web applications, is to disable
garbage collection by default, and manually run it after completing a web
request. The idea is to defer any garbage collection pauses
until after a request, preventing garbage collections from negatively affecting
the user experience.</p><p>In practice I think this won't work as well as one might expect. While an
accepted request won't be interrupted by a collection, future requests may take
longer to be handled due to a collection running between requests. With that
said, this can be influenced by the application's behaviour, so perhaps there
are cases where this does help.</p>
<h3 id="collecting-after-a-certain-time-has-passed">Collecting after a certain time has passed</h3><p>Instead of collecting based on some incremented counter, a collector may decide
to collect after a certain amount of time has passed. To the best of my
knowledge this approach is not commonly used on its own. Instead, it's sometimes
used as a backup of sorts to ensure collections run periodically, even when
only a small number of objects is allocated.</p><p>Using this approach on its own is unlikely to work well, as there is no
correlation between the time elapsed and the need to collect garbage. That is,
just because five minutes have passed does not mean a collection is needed.</p><p><a href="https://golang.org/">Go</a> appears to use (or at least has used) this approach to
force a garbage collection if no collection has taken place for more than two
minutes. I have not been able to confirm if Go still does this as of Go 1.13.</p>
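<p>A sketch of such a backup trigger; the two-minute interval mirrors the Go
behaviour described above, and everything else is made up for illustration:</p><div class="highlight"><pre class="highlight"><code>use std::time::{Duration, Instant};

pub struct TimedHeap {
    /// When the last collection finished.
    last_collection: Instant,
    /// How much time may pass before a collection is forced.
    interval: Duration,
}

impl TimedHeap {
    pub fn new() -> Self {
        Self {
            last_collection: Instant::now(),
            interval: Duration::from_secs(120),
        }
    }

    /// Meant to be combined with a primary trigger, e.g.
    /// `counter_exceeded || heap.should_collect()`.
    pub fn should_collect(&self) -> bool {
        self.last_collection.elapsed() >= self.interval
    }
}
</code></pre></div>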
<h3 id="collecting-when-the-system-runs-out-of-memory">Collecting when the system runs out of memory</h3><p>When the operating system runs out of memory, we may want to trigger a
collection in an attempt to release memory back to the operating system. If
used, this approach works best on top of another technique that triggers
regular collections.</p><p>The effectiveness of this is debatable. When collecting garbage we may need to
allocate some memory for temporary data structures (e.g. a queue to track
objects to scan), but this may result in the operating system terminating the
program as no memory is available. Since there is also no guarantee that a
collector is able to release memory back to the operating system, this may
result in collections wasting time.</p><h3 id="collecting-based-on-past-collection-statistics">Collecting based on past collection statistics</h3><p>This is another technique that may be applied on top of a previously mentioned
technique: adjust when a collection is triggered based on statistics gathered from a
previous collection cycle. For example, a collector may decide to delay a
collection if the previous collection spent too much time tracing objects. By
delaying the collection, the collector may need to trace fewer objects the next
time it runs.</p><h2 id="the-flaw-of-collecting-based-on-allocations">The flaw of collecting based on allocations</h2><p>Triggering collections based on allocations comes with a flaw: allocations and
the amount of garbage are not necessarily related. This means that in some
cases a collection may be triggered too soon, while other times collections may
be triggered too late.</p><p>Since tracing collectors operate on the live objects, there's not much that can
be done about this. Reference counting collectors operate on dead objects and
thus would have a better view of how much garbage there is, but efficient
reference counting collectors are complex and come with their own drawbacks.
High-performance reference counting collectors also behave similarly to tracing
collectors, but may be much more complex to implement, meaning you may be better
off just using a tracing collector.</p><p>There may be some sort of hybrid approach where a tracing collector keeps track
of (an estimate of) dead objects, without using a full-blown reference counting
system. These statistics (perhaps in combination with other statistics) could
then be used to decide when a collection is needed. I am not aware of any
collectors that use this technique, and I have my doubts about the benefits
being greater than the drawbacks this technique would introduce.</p><h2 id="deciding-when-to-collect-using-rust">Deciding when to collect using Rust</h2><p>With that all covered, let's implement a simple strategy to determine when to
collect by counting the allocated objects. For these examples I'll use Rust.
First we'll start with some boilerplate:</p><div class="highlight"><pre class="highlight"><code><span class="k">use</span> std::alloc::{alloc, handle_alloc_error, Layout};
<span class="k">pub</span> <span class="k">struct</span> Heap {
<span class="c">/// The number of objects allocated since the last collection.</span>
allocations: usize,
<span class="c">/// The number of objects to allocate to trigger a collection.</span>
threshold: usize,
<span class="c">/// The factor to grow the threshold by (2.0 means a growth of 2x).</span>
growth_factor: f64,
<span class="c">/// The percentage of the threshold (0.0 is 0% and 1.0 is 100%) that should</span>
<span class="c">/// still be in use after a collection before increasing the threshold.</span>
resize_threshold: f64,
<span class="c">/// The number of objects marked during a collection.</span>
marked: usize,
}
<span class="k">impl</span> Heap {
<span class="k">pub</span> <span class="k">fn</span> new() -> <span class="k">Self</span> {
<span class="k">Self</span> {
allocations: <span class="mi">0</span>,
threshold: <span class="mi">32</span>,
growth_factor: <span class="mf">2.0</span>,
resize_threshold: <span class="mf">0.9</span>,
marked: <span class="mi">0</span>
}
}
}
</code></pre></div><p>The <code>Heap</code> type would be used for storing heap information (e.g. a pointer to a
block of memory to allocate into), and the number of allocations. For the sake
of this article we keep this implementation as simple as possible. We use an
arbitrary growth factor of 2.0, stored as a float to allow for more precise
growth factors such as 1.5 or 2.3. Other values, such as the threshold and resize
threshold are also arbitrary.</p><p>Let's add a method to allocate objects:</p><div class="highlight"><pre class="highlight"><code><span class="k">impl</span> Heap {
<span class="k">pub</span> <span class="k">fn</span> allocate(<span class="k">&</span><span class="k">mut</span> <span class="k">self</span>, size: usize) -> *<span class="k">mut</span> u8 {
<span class="k">let</span> layout = Layout::from_size_align(size, <span class="mi">8</span>)
.expect(<span class="s">"The size and/or alignment is invalid"</span>);
<span class="k">let</span> pointer = <span class="k">unsafe</span> { alloc(layout) };
<span class="k">if</span> pointer.is_null() {
handle_alloc_error(layout);
}
<span class="k">self</span>.allocations += <span class="mi">1</span>;
pointer
}
}
</code></pre></div><p>Our <code>Heap::allocate()</code> method takes the number of bytes to allocate as an
argument, returning a raw pointer to the allocated memory. For the sake of
simplicity we align memory to 8 bytes. If an allocation fails (a null pointer is
returned), we let Rust's <code>handle_alloc_error()</code> handle this for us.</p><p>Now that we have the method to allocate memory, let's add two methods: one to
check if a collection is needed, and one to increase the threshold if needed:</p><div class="highlight"><pre class="highlight"><code><span class="k">impl</span> Heap {
<span class="k">pub</span> <span class="k">fn</span> should_collect(<span class="k">&</span><span class="k">self</span>) -> bool {
<span class="k">self</span>.allocations > <span class="k">self</span>.threshold
}
<span class="k">pub</span> <span class="k">fn</span> increase_allocation_threshold(<span class="k">&</span><span class="k">mut</span> <span class="k">self</span>) {
<span class="k">let</span> threshold = <span class="k">self</span>.threshold <span class="k">as</span> f64;
<span class="k">if</span> (<span class="k">self</span>.marked <span class="k">as</span> f64 / threshold) < <span class="k">self</span>.resize_threshold {
<span class="k">return</span>;
}
<span class="k">self</span>.threshold = (threshold * <span class="k">self</span>.growth_factor).ceil() <span class="k">as</span> usize;
}
}
</code></pre></div><p><code>Heap::should_collect()</code> is simple and should not need any explaining.
<code>Heap::increase_allocation_threshold()</code> checks if the number of marked objects
(this value would be updated by the collector while tracing objects) is too
great, increasing the threshold (using the growth factor) if needed.</p><p>That's all there is to it. Well, almost: a real collector probably needs to
store more data, update the statistics in the right places, and so on; but <em>just</em>
the code for deciding when to collect is straightforward.</p>
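<p>As a hypothetical example of how these pieces might fit together, a driver
could look like the code below. Here <code>trace_and_mark()</code> is a stand-in for the
actual tracing phase, which is outside the scope of this article, and the code
assumes it lives in the same module as <code>Heap</code>:</p><div class="highlight"><pre class="highlight"><code>/// Placeholder for the tracing phase, which would mark all live objects
/// and return the number of objects marked.
fn trace_and_mark(_heap: &mut Heap) -> usize {
    0
}

fn allocate_object(heap: &mut Heap, size: usize) -> *mut u8 {
    if heap.should_collect() {
        heap.marked = trace_and_mark(heap);
        heap.increase_allocation_threshold();
        // Reset the counter so we start counting the next cycle from zero.
        heap.allocations = 0;
    }
    heap.allocate(size)
}
</code></pre></div>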
<h2 id="languages-and-what-techniques-they-use">Languages and what techniques they use</h2><p>Now let's take a look at some programming languages out there, and what approach
they use to determine when a collection is needed.</p><h3 id="inko">Inko</h3><p><a href="https://inko-lang.org">Inko</a> uses lightweight processes, each with its
own heap, and the collector collects each process independently. A process
heap consists of one or more 8 KB blocks. After a collection, the collector
returns any free blocks to a global allocator for later use. Blocks that are
still full are put aside so they won't be used for new allocations. Any
blocks with space available can be reused once the process resumes.</p><p>Every time a block is requested from the global allocator, a block allocation
counter is incremented. This is done for both the young and mature generation.
When this counter exceeds a certain threshold, a collection is triggered. If
after a collection the collector determines not enough blocks could be returned
to the global allocator, it will increase the threshold for the next collection.
The various settings used for this (the initial thresholds, growth factors,
etc.) can all be configured using environment variables.</p><p>The current block thresholds are 8 MB for the young generation, and 16 MB for
the mature generation. These thresholds are arbitrary, and they will probably
change in the future. The mature generation threshold in particular seems rather
high, as 16 MB of blocks translates to around half a million objects; far too
many for a single lightweight process.</p><h3 id="java">Java</h3><p>The JVM enforces a maximum heap size that is configured when starting the JVM.
Due to all the different collectors the JVM supports it's hard to determine what
triggers a garbage collection. I suspect it's based on a variety of
statistics, such as how much of the (fixed-size) heap is in use, previous
collection timings, and so on.</p><h3 id="lua">Lua</h3><p>Per <a href="http://www.lua.org/manual/5.4/manual.html#2.5">this document</a>, Lua 5.4 has
two garbage collection modes: an incremental collector, and a generational
collector. Both collectors seem to use a similar approach to deciding when to
collect: when the number of bytes allocated grows beyond a certain value, a
collection is triggered.</p><h3 id="ruby">Ruby</h3><p>Ruby uses several statistics to determine when to collect, and if a minor or
full collection should be performed. The article <a href="https://www.speedshop.co/2017/03/09/a-guide-to-gc-stat.html">Understanding Ruby GC through
GC.stat</a> covers
these various statistics pretty well.</p><p>When a certain number of objects has been allocated, Ruby runs a minor collection.
Full collections can be triggered if the number of promoted objects exceeds a
threshold, or if one of several other conditions (which we won't cover here) is
met. Ruby will also increase these thresholds if needed, though I can't remember
if the collector always increases these thresholds, or only in certain cases.</p><h2 id="conclusion">Conclusion</h2><p>While this article is not the most in-depth overview of deciding when to trigger
a garbage collection, I hope it's useful enough to give a better understanding
of what may trigger a collection, and what impact the various techniques will
have.</p>https://yorickpeterse.com/articles/not-beating-c-with-96-lines-of-inko/Not beating C with 96 lines of Inko2019-11-22T12:00:00Z2019-11-22T12:00:00Z<p>
The article <a href="https://chrispenner.ca/posts/wc">"Beating C with 80 Lines of
Haskell"</a> discusses writing a simplified
version of <code>wc</code> using Haskell, and how it performs compared to the C
implementation. This resulted in various other people writing the same program
in different languages, and writing about doing so. At the time of writing,
there are implementations for:</p><ul><li><a href="http://verisimilitudes.net/2019-11-11">Ada</a></li><li><a href="https://github.com/expr-fi/fastlwc/">C</a></li><li><a href="http://verisimilitudes.net/2019-11-12">Common Lisp</a></li><li><a href="https://ummaycoc.github.io/wc.apl/">Dyalog APL</a></li><li><a href="https://futhark-lang.org/blog/2019-10-25-beating-c-with-futhark-on-gpu.html">Futhark</a></li><li><a href="https://ajeetdsouza.github.io/blog/posts/beating-c-with-70-lines-of-go/">Go</a></li><li><a href="https://chrispenner.ca/posts/wc">Haskell</a></li><li><a href="https://medium.com/@martinmroz/beating-c-with-120-lines-of-rust-wc-a0db679fe920">Rust</a></li></ul><p>Today we will be taking a look at writing a similar program in
<a href="https://inko-lang.org/">Inko</a>.</p><h2 id="benchmarking-setup">Benchmarking & setup</h2><p>Several articles mentioned above include some benchmarking data, such as how
long it takes to count the words of a file with a certain size (e.g. 1GB).
While we will also discuss some benchmarking data, it's important not to focus
on the numbers too much. Instead, they should be treated as
rough estimates at best.</p><p>For this article we will be comparing the Inko implementation to GNU <code>wc</code>
version 8.31, running on a 7th generation ThinkPad X1 Carbon. The CPU is an Intel
Core i5-8265U. The CPU governor used is the "performance" governor, and the
clock speed is 3.8 GHz. The OS is Arch Linux running Linux kernel version
5.3.11. The storage device is an NVMe SSD.</p><h2 id="implementation">Implementation</h2><p>Like the other implementations, our implementation expects ASCII input. We also
won't implement any command-line options, or other features of <code>wc</code>. Our input
set will be <a href="https://github.com/ChrisPenner/wc/blob/master/data/big.txt">this file</a>
from the Haskell implementation. The file size is 6.2 MB.</p><p>For our Inko implementation we will take an approach to counting words similar
to the Go (and other) implementations: we read our input into a byte array, in
chunks of 64 KB. When we encounter a whitespace character, we set a flag,
incrementing the line count if that character is a newline. When we reach a
non-whitespace character and the flag is set, we increment the word count and
unset the flag. We repeat this until we have consumed all input bytes.</p>
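<p>Before we dive into the Inko code, here is a compact sketch of this state
machine, written in Rust purely for illustration (it is not part of the Inko
implementation that follows):</p><div class="highlight"><pre class="highlight"><code>/// Counts lines and words in a chunk of ASCII input. `previous_is_space`
/// records whether the byte just before this chunk was whitespace.
fn count(bytes: &[u8], mut previous_is_space: bool) -> (usize, usize) {
    let (mut lines, mut words) = (0, 0);

    for &byte in bytes {
        // Bytes 9-13 and 32 are the ASCII whitespace characters.
        if matches!(byte, 9..=13 | b' ') {
            if byte == b'\n' {
                lines += 1;
            }
            previous_is_space = true;
        } else if previous_is_space {
            words += 1;
            previous_is_space = false;
        }
    }

    (lines, words)
}
</code></pre></div>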
<h2 id="importing-our-dependencies">Importing our dependencies</h2><p>Let's start by importing the types and modules we need:</p><div class="highlight"><pre class="highlight"><code><span class="k">import</span> std::byte_array::ByteArray
<span class="k">import</span> std::env
<span class="k">import</span> std::fs::file
<span class="k">import</span> std::pair::Pair
<span class="k">import</span> std::process
<span class="k">import</span> std::stdio::stdout
<span class="k">import</span> std::string_buffer::StringBuffer
</code></pre></div><p>ByteArray stores a sequence of bytes, as actual bytes and not as (signed)
integers. This means a ByteArray of 4 values needs 1 byte per value, instead of
the 8 bytes each value would need when stored as an integer. This type is not imported by default, so we have
to explicitly import it.</p><p>The module <code>std::fs::file</code> provides file IO types and methods. Inko uses
different types for files based on the open mode, such as <code>ReadOnlyFile</code> for
read-only files. We will see this in action later.</p><p>Pair is a binary tuple. We will use this so we don't have to define our own
types in several places.</p><p>Unlike in languages such as Ruby, operations using STDERR, STDOUT, and STDIN
require you to import the appropriate modules, instead of relying on global
methods or types. The module <code>std::stdio::stdout</code> is used for writing to STDOUT.</p><p>Our last import is the <code>StringBuffer</code> type. Inko does not have
string interpolation or formatting, so concatenating strings together (without
producing intermediate strings) requires the use of the <code>StringBuffer</code> type.
This is a bit clunky, but it's good enough for now.</p><h2 id="constants">Constants</h2><p>Next we will define several constants that we need to access in several methods:</p><div class="highlight"><pre class="highlight"><code><span class="k">let</span> CONCURRENCY = <span class="mi">8</span>
<span class="k">let</span> MAIN = process.current
<span class="k">let</span> NEWLINE = <span class="mi">10</span>
<span class="k">let</span> SINGLE_SPACE = <span class="mi">32</span>
<span class="k">let</span> SPACE_RANGE = <span class="mi">9</span>..<span class="mi">13</span>
<span class="k">let</span> CHUNK_SIZE = <span class="mi">64</span> * <span class="mi">1024</span>
</code></pre></div><p>The <code>CONCURRENCY</code> constant controls the number of processes we will spawn to
count words. The simplest approach would be to spawn one process for every
chunk, but since the work is purely CPU bound, spawning more processes than
there are CPU cores doesn't improve performance.</p><p>The <code>MAIN</code> constant stores an object containing information about the current
process. All processes we spawn for counting words will send their results to
this process.</p><p>The next three constants define some byte values: byte 10 is the Unix newline
separator, byte 32 is a single space, and the range <code>9..13</code> covers all ASCII
whitespace characters (newlines, tabs, etc). In Inko <code>A..B</code> creates an inclusive
range from A to B. Lastly, <code>CHUNK_SIZE</code> defines the size (64 KB) of the chunks we read.</p><h2 id="counting-words">Counting words</h2><p>It's time to define the methods and types we need to count the words in a
<code>ByteArray</code>, starting with two methods: <code>space?</code> and <code>worker_loop</code>:</p><div class="highlight"><pre class="highlight"><code>def space?(byte: Integer) -> Boolean {
SPACE_RANGE.cover?(byte).<span class="k">or</span> { byte == SINGLE_SPACE }
}
def worker_loop {
<span class="k">let</span> chunk = process.receive <span class="k">as</span> Chunk
MAIN.send(chunk.count)
worker_loop
}
</code></pre></div><p>The <code>space?</code> method returns <code>True</code> if the input byte is a whitespace character,
such as a single space or a newline. Inko has no if/else/or/and statements;
instead it uses messages, methods, and closures. Instead of writing <code>A || B</code>,
you would write <code>A.or { B }</code>, where <code>or</code> is a message sent to <code>A</code>. The curly
braces <code>{ B }</code> denote a closure, which in this case returns whatever <code>B</code> is.</p><p>The <code>worker_loop</code> method is a tail-recursive method called by the processes that
count words. On each iteration the process waits for an incoming message using
<code>process.receive</code>. Sending messages to processes uses dynamic typing, and Inko
is pretty strict about dynamic typing. For example, passing a dynamic type
(<code>Dynamic</code>) as an argument does not work if a non-dynamic type (e.g. <code>Integer</code>)
is expected. Sending messages to a dynamic type is fine, and will produce a new
dynamic type. This means we could condense this method to the following:</p><div class="highlight"><pre class="highlight"><code>def worker_loop {
MAIN.send(process.receive.count)
worker_loop
}
</code></pre></div><p>The reason we don't do this is to make it clearer what input we expect in
this method, and to prevent us from using the wrong method(s).</p><p>Inko supports tail call elimination, so our <code>worker_loop</code> method will not
overflow the stack. We could also use a closure and send the <code>loop</code> message to
it:</p><div class="highlight"><pre class="highlight"><code>def worker_loop {
{
<span class="k">let</span> chunk = process.receive <span class="k">as</span> Chunk
MAIN.send(chunk.count)
}.<span class="k">loop</span>
}
</code></pre></div><p>This achieves the same result; in fact, <code>loop</code> is implemented using tail
recursion. Since using tail recursion ourselves requires a little
less code, we just use that instead of <code>loop</code>.</p><p>Now it's time to create an object used for counting words, which we will call
<code>Chunk</code>. This type will hold some state, such as the bytes to process and the
number of lines counted so far. We use a dedicated type so it's a bit easier to
send input to the word counting processes, and so we can use tail recursion when
iterating over the bytes to process. We define objects using the <code>object</code>
keyword:</p><div class="highlight"><pre class="highlight"><code>object Chunk {
}
</code></pre></div><p>Object attributes need to be defined explicitly when we define the object, so
let's do that:</p><div class="highlight"><pre class="highlight"><code>object Chunk {
<span class="vi">@previous_is_space</span>: Boolean
<span class="vi">@bytes</span>: ByteArray
<span class="vi">@lines</span>: Integer
<span class="vi">@words</span>: Integer
<span class="vi">@index</span>: Integer
}
</code></pre></div><p>In Inko we define and refer to attributes using the syntax <code>@NAME</code>. The <code>@</code> is
part of the name, so it's valid to define both an attribute <code>@foo</code> and a method
<code>foo</code>. When defining attributes we must also specify the type, such as <code>Integer</code>
for the attribute <code>@index</code>. The attribute <code>@previous_is_space</code> is used to record
if a previously processed byte was a whitespace character.</p><p>Now we need to define our initialiser method, which is always called <code>init</code>:</p><div class="highlight"><pre class="highlight"><code>def init(previous_is_space: Boolean, bytes: ByteArray) {
<span class="vi">@previous_is_space</span> = previous_is_space
<span class="vi">@bytes</span> = bytes
<span class="vi">@lines</span> = <span class="mi">0</span>
<span class="vi">@words</span> = <span class="mi">0</span>
<span class="vi">@index</span> = <span class="mi">0</span>
}
</code></pre></div><p>This method just sets the attributes to the right value. If we forget to set an
attribute in the <code>init</code> method, the compiler will produce an error.</p><p>We can now define a method to count words and lines, which we will creatively
name "count":</p><div class="highlight"><pre class="highlight"><code>def count -> Pair!(Integer, Integer) {
<span class="k">let</span> byte = <span class="vi">@bytes</span>[<span class="vi">@index</span>]
byte.nil?.if_true {
<span class="k">return</span> Pair.new(<span class="vi">@lines</span>, <span class="vi">@words</span>)
}
space?(byte!).<span class="k">if</span>(
<span class="k">true</span>: {
(byte == NEWLINE).if_true {
<span class="vi">@lines</span> += <span class="mi">1</span>
}
<span class="vi">@previous_is_space</span> = True
},
<span class="k">false</span>: {
<span class="vi">@previous_is_space</span>.if_true {
<span class="vi">@words</span> += <span class="mi">1</span>
<span class="vi">@previous_is_space</span> = False
}
}
)
<span class="vi">@index</span> += <span class="mi">1</span>
count
}
</code></pre></div><p>That's quite a lot to take in, so let's break it down. We start by obtaining the
current byte, and checking if it's <code>Nil</code>. Accessing an out of bounds index in a
<code>ByteArray</code> is valid, and returns <code>Nil</code>. When this is the case we have consumed
all input, and we can return the number of lines and words we have counted.
Instead of creating a custom object to store the lines and words, we use the
<code>Pair</code> type.</p><p>Remember that Inko does not have <code>if</code> statements, and instead uses messages and
method calls. Here <code>if_true</code> is sent to the result of <code>byte.nil?</code>, and the
closure passed as its argument will only be run if <code>byte.nil?</code> produced boolean
true.</p><p>Next up we have the code that determines what to do with the current byte:</p><div class="highlight"><pre class="highlight"><code>space?(byte!).<span class="k">if</span>(
<span class="k">true</span>: {
(byte == NEWLINE).if_true {
<span class="vi">@lines</span> += <span class="mi">1</span>
}
<span class="vi">@previous_is_space</span> = True
},
<span class="k">false</span>: {
<span class="vi">@previous_is_space</span>.if_true {
<span class="vi">@words</span> += <span class="mi">1</span>
<span class="vi">@previous_is_space</span> = False
}
}
)
</code></pre></div><p>We use the <code>space?</code> method we defined earlier on, and pass it the current byte.
We use <code>byte!</code> instead of just <code>byte</code>, as the type of <code>byte</code> is <code>?Integer</code> (an
Integer or Nil). Since <code>space?</code> expects an <code>Integer</code>, we have to cast our <code>byte</code>
variable to the right type. Doing this by hand gets tedious, so Inko offers the
<code>!</code> postfix operator to do just that.</p><p>Once we have obtained the result of <code>space?</code>, we send the <code>if</code> message to it and
pass two arguments: a closure to run when the receiver is true, and a closure
for when the receiver is false. Here <code>true:</code> and <code>false:</code> are just keyword
arguments used to clarify the purpose of the closures.</p><p>The last two lines are pretty simple: we just increment the byte index by 1,
then tail recurse back into the <code>count</code> method.</p><h2 id="scheduling-work">Scheduling work</h2><p>Now that we have our methods and types in place, we can start scheduling the
work. We'll start by opening the file in read-only mode, making sure a file is
actually provided:</p><div class="highlight"><pre class="highlight"><code>env.arguments[<span class="mi">0</span>].nil?.if_true {
process.panic(<span class="s">'You must specify a file to process'</span>)
}
<span class="k">let</span> path = env.arguments[<span class="mi">0</span>]!
<span class="k">let</span> input = <span class="k">try</span>! file.read_only(path)
</code></pre></div><p><code>env.arguments[<span class="mi">0</span>]</code> returns the first command-line argument, or <code>Nil</code> if
there are no arguments. If this happens, we exit the program with a
<a href="https://inko-lang.org/manual/getting-started/error-handling/#header-panics">panic</a>.</p><p>Our file is opened using <code>file.read_only(path)</code>, which opens the file <code>path</code>
points to in read-only mode. We use <code>try!</code> to cause a panic if the file could
not be opened, since there isn't much we can do without being able to open the
file.</p><p>Bored yet? No? Good, we're almost there!</p><p>Now it's time to start our worker processes, and to start scheduling work:</p><div class="highlight"><pre class="highlight"><code><span class="k">let</span> workers =
CONCURRENCY.times.map do (_) { process.spawn { worker_loop } }.to_array
<span class="k">let</span> <span class="k">mut</span> bytes = <span class="mi">0</span>
<span class="k">let</span> <span class="k">mut</span> words = <span class="mi">0</span>
<span class="k">let</span> <span class="k">mut</span> lines = <span class="mi">0</span>
<span class="k">let</span> <span class="k">mut</span> previous_is_space = True
<span class="k">let</span> <span class="k">mut</span> jobs = <span class="mi">0</span>
<span class="k">let</span> buffer = ByteArray.new
</code></pre></div><p>The <code>workers</code> assignment is the most interesting. The bit
<code>CONCURRENCY.times.map</code> creates an iterator that runs 8 times (since we set
<code>CONCURRENCY</code> to 8), mapping the input value (an integer ranging from 0 to 7) to
the result of <code>process.spawn</code>. Since we don't care about the input integer, we
define the argument name as <code>_</code>. We then collect the results into an <code>Array</code>
using the <code>to_array</code> message. Each spawned process runs the <code>worker_loop</code>
method, until the program is finished. The other variables are not interesting,
so let's skip those.</p><p>We will divide work across the processes in a round-robin fashion, until we run
out of bytes to read. Every process is given a chunk of equal size:</p><div class="highlight"><pre class="highlight"><code>{
<span class="k">try</span>! input.read_bytes(bytes: buffer, size: CHUNK_SIZE).positive?
}.while_true {
workers[jobs % workers.length]
.send(Chunk.new(previous_is_space: previous_is_space, bytes: buffer))
previous_is_space = space?(buffer[-<span class="mi">1</span>]!)
bytes += buffer.length
jobs += <span class="mi">1</span>
buffer.clear
}
</code></pre></div><p>We create a closure that returns the result of
<code>input.read_bytes(...).positive?</code>, which is a boolean. The result of
<code>input.read_bytes(...)</code> is an integer signaling the number of bytes read. If the
operation fails, we panic (by using the <code>try!</code> keyword). The method <code>read_bytes</code>
reads bytes <em>into</em> a provided <code>ByteArray</code>, instead of returning a <code>ByteArray</code>.</p><p><code>while_true</code> is a message sent to this closure, and will run its argument (also
a closure) as long as the receiver returns boolean true.</p><p>Work is balanced across processes by sending the chunks to processes:</p><div class="highlight"><pre class="highlight"><code>workers[jobs % workers.length]
.send(Chunk.new(previous_is_space: previous_is_space, bytes: buffer))
</code></pre></div><p>The expression <code>jobs % workers.length</code> produces an integer/index between zero
and the last index in the <code>workers</code> array. Since the <code>workers</code> <code>Array</code> stores
<code>Process</code> objects, we can send the <code>send</code> message to them to have the message (a
<code>Chunk</code> object in this case) delivered to the process.</p><p>Since we perform work in parallel, we have to determine if a chunk follows
whitespace when scheduling them. We do this using <code>previous_is_space =
space?(buffer[-1]!)</code>. Inko allows you to access negative indexes of <code>Array</code> and
<code>ByteArray</code> types, which translate to indexes from the end of the list. In other
words, the index -1 accesses the last element in the list.</p><p>After this we just increment the number of bytes read, the number of jobs
scheduled, and we clear our buffer. We reuse the same <code>ByteArray</code> so we don't
have to create a new one for every 64 KB of bytes that we read.</p><p>Now we can wait for all the results to be sent back from our workers, then
present them:</p><div class="highlight"><pre class="highlight"><code>{ jobs.positive? }.while_true {
<span class="k">let</span> count = process.receive <span class="k">as</span> Pair!(Integer, Integer)
lines += count.first
words += count.second
jobs -= <span class="mi">1</span>
}
stdout.print(StringBuffer.new(
<span class="s">' '</span>,
lines.to_string,
<span class="s">' '</span>,
words.to_string,
<span class="s">' '</span>,
bytes.to_string,
<span class="s">' '</span>,
path
))
</code></pre></div><p>Here we wait for incoming messages, cast them to the right type (a Pair
of the number of lines and words), then add the results to the total number of
lines and words. Lastly, we present the results by writing them to STDOUT.</p><p>Our final version looks like this:</p><div class="highlight"><pre class="highlight"><code><span class="k">import</span> std::byte_array::ByteArray
<span class="k">import</span> std::env
<span class="k">import</span> std::fs::file
<span class="k">import</span> std::pair::Pair
<span class="k">import</span> std::process
<span class="k">import</span> std::stdio::stdout
<span class="k">import</span> std::string_buffer::StringBuffer
<span class="k">let</span> CONCURRENCY = <span class="mi">8</span>
<span class="k">let</span> MAIN = process.current
<span class="k">let</span> NEWLINE = <span class="mi">10</span>
<span class="k">let</span> SINGLE_SPACE = <span class="mi">32</span>
<span class="k">let</span> SPACE_RANGE = <span class="mi">9</span>..<span class="mi">13</span>
<span class="k">let</span> CHUNK_SIZE = <span class="mi">64</span> * <span class="mi">1024</span>
def space?(byte: Integer) -> Boolean {
SPACE_RANGE.cover?(byte).<span class="k">or</span> { byte == SINGLE_SPACE }
}
def worker_loop {
<span class="k">let</span> chunk = process.receive <span class="k">as</span> Chunk
MAIN.send(chunk.count)
worker_loop
}
object Chunk {
<span class="vi">@previous_is_space</span>: Boolean
<span class="vi">@bytes</span>: ByteArray
<span class="vi">@lines</span>: Integer
<span class="vi">@words</span>: Integer
<span class="vi">@index</span>: Integer
def init(previous_is_space: Boolean, bytes: ByteArray) {
<span class="vi">@previous_is_space</span> = previous_is_space
<span class="vi">@bytes</span> = bytes
<span class="vi">@lines</span> = <span class="mi">0</span>
<span class="vi">@words</span> = <span class="mi">0</span>
<span class="vi">@index</span> = <span class="mi">0</span>
}
def count -> Pair!(Integer, Integer) {
<span class="k">let</span> byte = <span class="vi">@bytes</span>[<span class="vi">@index</span>]
byte.nil?.if_true {
<span class="k">return</span> Pair.new(<span class="vi">@lines</span>, <span class="vi">@words</span>)
}
space?(byte!).<span class="k">if</span>(
<span class="k">true</span>: {
(byte == NEWLINE).if_true {
<span class="vi">@lines</span> += <span class="mi">1</span>
}
<span class="vi">@previous_is_space</span> = True
},
<span class="k">false</span>: {
<span class="vi">@previous_is_space</span>.if_true {
<span class="vi">@words</span> += <span class="mi">1</span>
<span class="vi">@previous_is_space</span> = False
}
}
)
<span class="vi">@index</span> += <span class="mi">1</span>
count
}
}
env.arguments[<span class="mi">0</span>].nil?.if_true {
process.panic(<span class="s">'You must specify a file to process'</span>)
}
<span class="k">let</span> path = env.arguments[<span class="mi">0</span>]!
<span class="k">let</span> input = <span class="k">try</span>! file.read_only(path)
<span class="k">let</span> workers =
CONCURRENCY.times.map do (_) { process.spawn { worker_loop } }.to_array
<span class="k">let</span> <span class="k">mut</span> bytes = <span class="mi">0</span>
<span class="k">let</span> <span class="k">mut</span> words = <span class="mi">0</span>
<span class="k">let</span> <span class="k">mut</span> lines = <span class="mi">0</span>
<span class="k">let</span> <span class="k">mut</span> previous_is_space = True
<span class="k">let</span> <span class="k">mut</span> jobs = <span class="mi">0</span>
<span class="k">let</span> buffer = ByteArray.new
{
<span class="k">try</span>! input.read_bytes(bytes: buffer, size: CHUNK_SIZE).positive?
}.while_true {
workers[jobs % workers.length]
.send(Chunk.new(previous_is_space: previous_is_space, bytes: buffer))
previous_is_space = space?(buffer[-<span class="mi">1</span>]!)
bytes += buffer.length
jobs += <span class="mi">1</span>
buffer.clear
}
{ jobs.positive? }.while_true {
<span class="k">let</span> count = process.receive <span class="k">as</span> Pair!(Integer, Integer)
lines += count.first
words += count.second
jobs -= <span class="mi">1</span>
}
stdout.print(StringBuffer.new(
<span class="s">' '</span>,
lines.to_string,
<span class="s">' '</span>,
words.to_string,
<span class="s">' '</span>,
bytes.to_string,
<span class="s">' '</span>,
path
))
</code></pre></div><h2 id="performance">Performance</h2><p>Let's start by running GNU <code>wc</code> to see how it performs:</p><pre><code>$ time -f "%es %MKB" wc big.txt
128457 1095695 6488666 big.txt
0.03s 2136KB
</code></pre><p>This only took 0.03 seconds (30 milliseconds), and used a peak RSS of 2.08 MB.
Not bad!</p><p>Now let's see how our Inko implementation performs:</p><pre><code>$ time -f "%es %MKB" inko wc.inko big.txt
128457 1095695 6488666 big.txt
8.34s 260272KB
</code></pre><p>Ouch! Our implementation uses a peak RSS of 254 MB, and takes 8.34 seconds to
count the words and lines. What's going on here? Is our implementation bad, or
is Inko just slow?</p><p>Mostly the latter. Our implementation isn't bad at all. Maybe it would be a bit
nicer if we didn't have to use the <code>StringBuffer</code> type, but apart from that
there is not a lot worth changing. Instead, the problem is Inko. More precisely,
the complete lack of optimisations applied by Inko's compiler.</p><h3 id="optimisations-or-lack-thereof">Optimisations, or lack thereof</h3><p>When creating a programming language you need a compiler to compile your
language. The first compiler thus needs to be written in a different language.
For Inko I opted to use Ruby since it's widely available, and a language I have
worked with for almost ten years. The goal is to rewrite Inko's compiler in Inko
itself, something that is actively worked on.</p><p>Because we want to replace the Ruby compiler with a compiler written in Inko, we
spent little time on adding optimisations to the Ruby compiler. In fact, the
only optimisations it applies are:</p><ol><li>Tail call elimination</li><li>Replacing keyword arguments passed in-order with positional arguments</li></ol><p>Other languages typically perform some form of method inlining, constant
folding, optimising certain method calls into specialised instructions (e.g.
translating <code>A + B</code> into something that doesn't require a method call), etc.
Inko's current compiler does none of that, producing code that does not perform
as well as it should.</p><h3 id="closure-allocations">Closure allocations</h3><p>This brings us to the main problem of our implementation: closure allocations.
Specifically, the use of closures instead of statements such as <code>if</code> and
<code>while</code>. Allocating a closure is not that expensive, but in our implementation
of <code>wc</code> we are allocating a lot. Our <code>count</code> method alone will create at least
five closures for every byte. For a 64 KB chunk that results in a total of 327
680 closures. More allocations also means more garbage collections. While we can
reuse memory after a collection, collections still take up time.</p><p>To combat this we plan to add an optimisation pass to the self-hosting compiler
that will eliminate closure allocations where possible. For example, cases such
as <code>if_true</code> and <code>if_false</code> can be optimised to not use closures at all. It's
hard to say how big the impact of this would be on our <code>wc</code> implementation, but
I would not be surprised if we can cut the runtime in half; or maybe reduce it
even more.</p><h3 id="garbage-collection-performance">Garbage collection performance</h3><p>Another problem we are running into is that Inko's garbage collector is spending
far more time tracing objects than should be necessary. Under normal
circumstances Inko's garbage collector is able to trace lots of objects in less
than one millisecond, but for our <code>wc</code> implementation it can take several
milliseconds to trace 20-30 objects. We can see this by running our <code>wc</code>
implementation while setting the environment variable <code>INKO_PRINT_GC_TIMINGS</code> to
<code>true</code> (some output is removed to keep things readable):</p><pre><code>$ env INKO_PRINT_GC_TIMINGS=true time -f "%es %MKB" inko wc.inko big.txt
[0x7fb240004ec0] GC in 2.528122ms, 28 marked, 0 promoted, 0 evacuated
[0x7fb240004670] GC in 15.437073ms, 28 marked, 0 promoted, 0 evacuated
[0x7fb240005630] GC in 28.714244ms, 28 marked, 0 promoted, 0 evacuated
[0x7fb240007440] GC in 30.711002ms, 28 marked, 0 promoted, 0 evacuated
</code></pre><p>This even happens when we limit the number of tracing threads to 1, instead of
the default of half the number of CPU cores:</p><pre><code>$ env INKO_TRACER_THREADS=1 \
INKO_PRINT_GC_TIMINGS=true time -f "%es %MKB" inko wc.inko big.txt
[0x7fdbfc005dd0] GC in 581.006µs, 28 marked, 0 promoted, 0 evacuated
[0x7fdbfc005630] GC in 2.047803ms, 28 marked, 0 promoted, 0 evacuated
[0x7fdbfc007bb0] GC in 918.097µs, 28 marked, 0 promoted, 0 evacuated
[0x7fdbfc004ec0] GC in 1.104836ms, 28 marked, 0 promoted, 0 evacuated
</code></pre><p>The timings may be a bit better, but they are still pretty bad given we end up
only marking a small number of objects. Take the following program as an
example:</p><div class="highlight"><pre class="highlight"><code>object Thing {}
<span class="k">let</span> things = <span class="mi">28</span>.times.map do (_) { Thing.new }.to_array
<span class="mi">1_000_000</span>.times.each do (integer) {
integer.to_float
}
</code></pre></div><p>Here we create an array containing 28 <code>Thing</code> instances, which we keep around.
We then create one million float objects, which are heap allocated. If we run
this with the <code>INKO_PRINT_GC_TIMINGS</code> variable set, the output is as follows:</p><pre><code>$ env INKO_PRINT_GC_TIMINGS=true inko foo.inko
[0x5620ad17df70] GC in 523.047µs, 44 marked, 0 promoted, 0 evacuated
[0x5620ad17df70] GC in 480.612µs, 46 marked, 0 promoted, 0 evacuated
[0x5620ad17df70] GC in 493.339µs, 63 marked, 43 promoted, 0 evacuated
[0x5620ad17df70] GC in 552.766µs, 9 marked, 0 promoted, 0 evacuated
</code></pre><p>These timings are much closer to what one would expect.</p><p>It's not quite clear yet what is causing this slowdown. Based on some profiling
using Valgrind I suspect the
<a href="https://github.com/crossbeam-rs/crossbeam">crossbeam</a> library (which we use in
the garbage collector) is to blame, as Valgrind's data suggests most time is
spent in crossbeam code; even though the code should be fast. The crossbeam
types we use rely on an epoch based garbage collection mechanism, and per <a href="https://github.com/crossbeam-rs/rfcs/blob/master/text/2017-05-23-epoch-gc.md#oversubscription">this
crossbeam
RFC</a>
it seems this may not work too well when spawning lots of short-lived threads;
as is done when tracing objects.</p><p>A possible solution would be to use a fixed-size thread pool for tracing
objects, instead of spawning tracing threads on-demand. We do not use this
approach at the moment because the current approach is easier to implement. An
approach I have been thinking of is to give each collector thread its own pool
of tracing threads, spawned when the collector thread first starts up. This
approach means a tracing pool only ever collects a single process at a time,
allowing us to pass certain data around once (when tracing starts),
instead of having to pass it around with every new job that is scheduled. This
is something I will have to take a look at in the coming weeks.</p><h2 id="wrapping-up">Wrapping up</h2><p>We did not manage to beat C with Inko, but that was never the goal of this
exercise. Instead, I merely wanted to showcase how one would approach the
problem using Inko, and get more people interested in Inko as a result.</p><p>The optimisations discussed will be applied over time, gradually improving
performance of Inko. One day we will also add a JIT, though I suspect it will
take several years before we will have a JIT. The potential crossbeam bottleneck
is also worth investigating.</p><p>I doubt a dynamic language such as Inko will be able to beat C, but if we can at
least beat other dynamic languages (e.g. Ruby) that is good enough.</p><p>For more information about Inko, take a look at the <a href="https://inko-lang.org/">Inko
website</a> or the <a href="https://gitlab.com/inko-lang/inko">Git
repository</a>. If you would like to sponsor the
development of Inko with a small monthly contribution, please take a look at the
<a href="https://inko-lang.org/sponsors/">sponsors page</a> for more information.</p>https://yorickpeterse.com/articles/writing-a-self-hosting-compiler-for-inko/Writing a self-hosting compiler for Inko2019-06-08T00:00:00Z2019-06-08T00:00:00Z<p>
About a year ago I wrote <a href="/articles/inko-a-brief-introduction/">"Inko: a brief introduction"</a>, and
later published the <a href="https://inko-lang.org">Inko website</a>. Since then, I have made a lot of
progress towards making it useful for everyday use. Some recent milestones
include:</p><ul><li>A Foreign Function Interface.</li><li>A new process scheduler that is easier to maintain, and performs better.</li><li>Non-blocking sockets, without the need for callbacks.</li><li>Reduced memory usage per process.</li></ul><p>The next milestone for Inko is having a self-hosting compiler. But why would one
want to write a self-hosting compiler? Why not use an already established
language? What are the benefits of writing a self-hosting compiler? Let's find
out!</p><h2 id="the-first-compiler">The first compiler</h2><p>When creating a language, you need a way to compile its source code. But we
can't use our own language, since we are still developing it. To deal with this,
developers use a different language for the first compiler. Two examples of this
are Rust and Go. The first compiler for Rust was written in OCaml, and the first
compiler for Go was written in C.</p><p>For Inko's current compiler we use Ruby. Before writing the compiler in Ruby I
made an attempt at writing it in Rust. Inko's Virtual Machine is also written in
Rust, so using Rust for the compiler made sense at the time. Writing the
compiler in Rust turned out to be frustrating, as I kept running into minor
issues along the way. After about a month, I decided to cut my losses and use
Ruby instead. Using Ruby allowed me to deliver a working compiler faster.</p><p>There were also two other reasons for using Ruby instead of Rust:</p><ol><li>The compiler would one day be rewritten in Inko. This meant that quality was
not the focus of the first compiler. Instead, it had to focus on getting enough
done so I could start building the standard library.</li><li>Ruby is closer to Inko than Rust is, which makes it easier to port code to
the new compiler.</li></ol><p>Rust tends to be an unforgiving language, or at least it feels that way. This makes
sense when you are writing production-ready software, but can slow you down when
trying to prototype a compiler.</p><h2 id="benefits-of-a-self-hosting-compiler">Benefits of a self-hosting compiler</h2><p>If we have to use a different language for our first compiler, why not keep
using this compiler? Why should one spend the extra time and effort on making
their compiler self-hosting?</p><p>A typical compiler consists of different components, such as:</p><ul><li>A lexer.</li><li>A parser (sometimes the parser also takes care of lexing the input).</li><li>Type checking.</li><li>Optimisation passes.</li><li>Code generation.</li></ul><p>To write our compiler in our own language, the language must provide the
necessary features. Such features might be:</p><ul><li>String slicing.</li><li>Concurrency primitives.</li><li>A unit testing framework.</li><li>APIs for working with the filesystem.</li></ul><p>Adding these features to the standard library benefits all users of the
language. We could come up with a list of features to add, without a reference
program. But it can be difficult to come up with every possible feature before
there is a use case for it. Worse, we may end up adding features that turn
out not to be useful once actually used.</p><p>Performance is also important for a programming language. Your language can have
all the features in the world, but users will not use it if the language is too
slow. To ensure our language performs well, we need a way to measure and improve
its performance. One way of doing this is by writing synthetic benchmarks.
While useful for measuring specific sections of code, they are not useful for
determining the impact of a change on a larger program.</p><p>A more realistic way of measuring performance is using a program with users.
Compilers are an excellent reference. For example, a lexer operates on
sequences of characters or bytes, executing code for every value in the
sequence. Without any optimisations, such code could be slow. By writing our
compiler in our own language, we have a program to measure the performance
impact of changes made to the language.</p><p>While not a benefit per se, making the compiler self-hosting is a way of showing
the capabilities of the language. If you can write the language's compiler in
the language itself, you can write any other program in the language.</p><h2 id="towards-a-self-hosting-inko-compiler">Towards a self-hosting Inko compiler</h2><p>The first step towards a self-hosting compiler was to simplify the syntax in
various places. For example, Inko allowed you to implement a trait in two
different ways: when defining an object, or separately. Implementing a trait
when defining an object looked like this:</p><div class="highlight"><pre class="highlight"><code>object Person <span class="k">impl</span> ToString {
<span class="c"># ...</span>
}
</code></pre></div><p>The alternative is to implement the trait separately:</p><div class="highlight"><pre class="highlight"><code><span class="k">impl</span> ToString <span class="k">for</span> Person {
<span class="c"># ...</span>
}
</code></pre></div><p>I added support for both so that object definitions and trait implementations
were closer together. This complicated various parts of the compiler. In
practice I also found it not to be as useful as anticipated.</p><p>Another syntax change is the removal of support for Unicode identifiers. Being
able to use Unicode identifiers could be useful, but it complicates the
lexer. I also doubt it will see much use in the coming years.</p><p>With the syntax simplified, I started implementing the lexer. The merge
request tracking progress is <a href="https://gitlab.com/inko-lang/inko/merge_requests/59">"Implement Inko's lexer in Inko itself"</a>.</p><h2 id="implementing-inkos-lexer-in-inko">Implementing Inko's lexer in Inko</h2><p>As I work on the compiler I will write about the progress made, starting with
the lexer. After all, talking about the compiler and not showing anything would
be boring.</p><p>The basic idea of a lexer is simple: take a sequence of bytes or characters, and
produce one or more "tokens". A token is an object containing at least
two values: a type indicator, and a value. The type indicator could
be a string, integer, enum, or something else. The value is typically a string.</p><p>Inko uses an object called <code>Token</code> for tokens, defined as follows (excluding
methods not relevant for this example):</p><div class="highlight"><pre class="highlight"><code>object Token {
def init(type: String, value: String, location: SourceLocation) {
<span class="k">let</span> <span class="vi">@type</span> = type
<span class="k">let</span> <span class="vi">@value</span> = value
<span class="k">let</span> <span class="vi">@location</span> = location
}
}
</code></pre></div><p>For those unfamiliar with Inko's syntax, this defines an object called <code>Token</code>
and its constructor method <code>init</code>. The <code>init</code> method takes three arguments:</p><ol><li><code>type</code>: the type name of the token, such as <code>'integer'</code> or <code>'comma'</code>.</li><li><code>value</code>: the value of the token, such as <code>'10'</code> for an integer.</li><li><code>location</code>: an object that contains source location information, such as the
line range and column number.</li></ol><p>The <code>init</code> method sets three instance attributes: <code>@type</code>, <code>@value</code>, and
<code>@location</code>.</p><p>For the lexer, Inko uses an object called <code>Lexer</code>. Showing all the lexer's
source code would be a bit much. Instead, we'll highlight some interesting
parts. The constructor of the lexer is as follows:</p><div class="highlight"><pre class="highlight"><code>object Lexer {
def init(input: ToByteArray, file: ToPath) {
<span class="k">let</span> <span class="vi">@input</span> = input.to_byte_array
<span class="k">let</span> <span class="vi">@file</span> = file.to_path
<span class="c"># ...</span>
}
}
</code></pre></div><p><code>ToByteArray</code> is a trait that provides the method <code>to_byte_array</code>, for
converting a type to a <code>ByteArray</code>. When reading data from a file, Inko will
read it into a <code>ByteArray</code>. Converting this to a <code>String</code> requires allocating an
extra object, and twice the memory. The type <code>ByteArray</code> also implements the
<code>ToByteArray</code> trait. This allows lexing of files, without allocating a <code>String</code>:</p><p><code>ToPath</code> is a trait that provides the method <code>to_path</code>, for converting a type to
a <code>Path</code>. <code>Path</code> is a type that represents file paths, providing a more
pleasant interface compared to using a <code>String</code>. Using this trait allows one to
supply either a <code>String</code> or a <code>Path</code> as the <code>file</code> argument:</p><div class="highlight"><pre class="highlight"><code><span class="k">import</span> std::compiler::lexer::Lexer
<span class="k">import</span> std::fs::path::Path
Lexer.new(input: <span class="s">'10'</span>, file: <span class="s">'test.inko'</span>)
Lexer.new(input: <span class="s">'10'</span>, file: Path.new(<span class="s">'test.inko'</span>))
</code></pre></div><p>The <code>Lexer</code> type is an iterator, allowing the user to retrieve tokens one by
one:</p><div class="highlight"><pre class="highlight"><code><span class="k">import</span> std::compiler::lexer::Lexer
<span class="k">let</span> lexer = Lexer.new(input: <span class="s">'10'</span>, file: <span class="s">'test.inko'</span>)
<span class="k">let</span> token = lexer.<span class="k">next</span>
token.type <span class="c"># => 'integer'</span>
token.value <span class="c"># => '10'</span>
</code></pre></div><p>To determine what token to produce, a <code>Lexer</code> will look at the current byte in
the input. Based on the current byte, <code>next</code> sends different messages to the
<code>Lexer</code>. The implementation of <code>next</code> is a bit much to cover, but more or less
looks as follows:</p><div class="highlight"><pre class="highlight"><code>def <span class="k">next</span> -> ?Token {
<span class="k">let</span> current = current_byte
current == A
.if_true {
<span class="k">return</span> foo
}
current == B
.if_true {
<span class="k">return</span> bar
}
Nil
}
</code></pre></div><p>The return type here is <code>?Token</code>, meaning it may return a <code>Token</code> or <code>Nil</code>.</p><p>Inko does not have a <code>match</code> or <code>switch</code> statement; instead we compare objects
for equality and use block returns. In the above example, if <code>current == A</code>
evaluates to true we return the result of <code>foo</code>, skipping the code that follows
it. Reading the above code, one might think that the code is incorrect. In most
languages, this code:</p><div class="highlight"><pre class="highlight"><code>A == B
.foo
</code></pre></div><p>Is parsed as this:</p><div class="highlight"><pre class="highlight"><code>A == (B.foo)
</code></pre></div><p>In Inko this is not the case. <em>If</em> the message that follows a binary operation
(<code>A == B</code>) is on a new line, it's sent to the <em>result</em>. This means it's parsed
as follows:</p><div class="highlight"><pre class="highlight"><code>(A == B).foo
</code></pre></div><p>This allows one to write this:</p><div class="highlight"><pre class="highlight"><code>A == B
.<span class="k">and</span> { C }
.if_true {
<span class="c"># ...</span>
}
</code></pre></div><p>Instead of this:</p><div class="highlight"><pre class="highlight"><code>(A == B)
.<span class="k">and</span> { C }
.if_true {
<span class="c"># ...</span>
}
</code></pre></div><p>For certain tokens we need to perform more complex checks. For example, for
integers we can not compare for equality because an integer can start with
different values (<code>0</code>, <code>1</code>, etc). Instead, we use Inko's range type like so:</p><div class="highlight"><pre class="highlight"><code>INTEGER_DIGIT_RANGE.cover?(current).if_true {
<span class="k">return</span> number
}
</code></pre></div><p>Here <code>INTEGER_DIGIT_RANGE</code> is a range (using the <code>Range</code> type) covering the
digits 0 to 9. The method <code>cover?</code> checks if its argument is contained in the
range, without evaluating all values in the range.</p>
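<p>For comparison, Ruby's <code>Range#cover?</code> behaves the same way: it only compares its
argument against the boundaries of the range, never enumerating the values in
between. A small Ruby sketch of the same check:</p><div class="highlight"><pre class="highlight"><code>INTEGER_DIGIT_RANGE = 48..57 # the bytes for '0' through '9'

INTEGER_DIGIT_RANGE.cover?('7'.ord) # => true
INTEGER_DIGIT_RANGE.cover?('a'.ord) # => false
</code></pre></div><p>The implementations of the methods that produce tokens vary. Some are simple,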
others are more complex. Strings in particular are tricky, as they can contain
escaped quotes and escape sequences (<code>\n</code>, <code>\r</code>, etc).</p><p>Numbers are also tricky, as there are different number types and formats:</p><ul><li>Regular integers: <code>123</code>.</li><li>Hexadecimal integers: <code>0x123abc</code>, <code>0X123ABC</code>.</li><li>Floats: <code>10.23</code>, <code>1e2</code>, <code>1E2</code>, <code>1e+2</code>, <code>1E+2</code>, <code>1e-2</code>, <code>1E-2</code>.</li></ul><p>The difficulty here is that the type is not known until reaching a certain
character, such as <code>.</code> or <code>x</code>.</p>
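<p>A simplified Ruby sketch of this problem (the helper below is hypothetical, not
the lexer's actual code):</p><div class="highlight"><pre class="highlight"><code># '0x123abc' only turns out to be a hexadecimal integer once we reach the
# 'x', and '10.23' only becomes a float once we reach the '.'.
def number_type(input)
  return :hexadecimal if input[1] == 'x' || input[1] == 'X'
  return :float if input.include?('.') || input.downcase.include?('e')

  :integer
end

number_type('0x123abc') # => :hexadecimal
number_type('1E+2')     # => :float
number_type('123')      # => :integer
</code></pre></div><p>Covering all this would be far too much, so I recommend taking a closer look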
at the merge request <a href="https://gitlab.com/inko-lang/inko/merge_requests/59">"Implement Inko's lexer in Inko itself"</a>.</p><h2 id="work-after-the-lexer">Work after the lexer</h2><p>After finishing work on the lexer, the parser is next. After that, I will
have to spend some time planning the steps that follow. I would like the
compiler to be parallel and incremental, but I do not yet have an idea of how to
implement this. I also need to revisit the type system, as certain parts feel a
bit hacky.</p><p>Determining how long all this takes is difficult. After implementing the parser
I will have a better estimate. I expect it will take between three and six
months. I do have a three-week vacation in a couple of weeks, and I tend to be
productive during my vacations. Perhaps a bit too productive.</p>https://yorickpeterse.com/articles/inko-a-brief-introduction/Inko: a brief introduction2018-05-02T22:00:00Z2018-05-02T22:00:00Z<p><a href="https://gitlab.com/inko-lang/inko">Inko</a> is a programming language I started working on in early 2015. The
goal of the project is to create a gradually typed, object-oriented programming
language with a focus on safety and concurrency. Inko draws inspiration from
various other languages, such as Smalltalk, Erlang, Rust, and Ruby. Like any
other language, it is not perfect, but the more time I spend working on it, the
more I believe it could turn out to be a useful programming language.</p><p>While the language is still quite far from being usable, I have been making a
lot of progress with both the compiler and the standard library. As a result, I
think it's time to start writing a bit more about the language, starting with a
brief introduction of what Inko is all about.</p><p>Keep in mind that the exact syntax is subject to change and that some
topics/features discussed in this article might not yet be available. In
particular, large parts of the compiler's type system and syntax are being
rewritten as part of the <a href="https://gitlab.com/inko-lang/inko/merge_requests/1">"Rewrite the Ruby compiler's type system"</a> merge
request.</p><h2 id="table-of-contents">Table of contents</h2><ul class="toc"><li><a href="#history">History</a></li><li><a href="#object-model">Object model</a></li><li><a href="#message-passing">Message passing</a></li><li><a href="#type-system">Type system</a></li><li><a href="#booleans-and-nil">Booleans and Nil</a></li><li><a href="#error-handling">Error handling</a><ul><li><a href="#error-handling-principles">Error handling principles</a><ul><li><a href="#method-signatures-must-include-the-error-type">Method signatures must include the error type</a></li><li><a href="#only-a-single-type-can-be-thrown">Only a single type can be thrown</a></li><li><a href="#methods-that-define-a-throw-type-must-actually-throw-it">Methods that define a throw type must actually throw it</a></li><li><a href="#sending-a-message-that-may-throw-requires-explicit-error-handling">Sending a message that may throw requires explicit error handling</a></li><li><a href="#the-try-keyword-only-supports-a-single-expression">The "try" keyword only supports a single expression</a></li></ul></li><li><a href="#bugs-are-not-recoverable">Bugs are not recoverable</a></li></ul></li><li><a href="#concurrency">Concurrency</a></li><li><a href="#memory-management">Memory management</a></li><li><a href="#portable-bytecode">Portable bytecode</a></li><li><a href="#examples">Examples</a><ul><li><a href="#checking-if-a-string-starts-with-another-string">Checking if a String starts with another String</a></li><li><a href="#loops-and-tail-call-elimination">Loops and tail call elimination</a></li><li><a href="#processes-and-communication">Processes and communication</a></li><li><a href="#file-operations">File operations</a></li></ul></li><li><a href="#trying-it-out">Trying it out</a></li></ul><h2 id="history">History</h2><p>The idea of building my own programming language dates to early 2013. Back then,
I knew little about programming languages, parsers, virtual machines, and so
on. I also wasn't quite sure what I was looking for in this language. It wasn't
until early 2015 that I started writing code for the project, starting with the
virtual machine. It was also around this time that I started to have a better
understanding of what I was looking for: a language with a strong
object-oriented model and excellent support for concurrency, borrowing various
features from languages I admire, such as Smalltalk and Erlang.</p><p>I ended up writing the virtual machine in Rust, though Rust wasn't my first
choice. At the time Rust was still new and unstable, with both the syntax
and functionality changing frequently. So I first looked into other languages
such as C, C++, and D. While I made quite a bit of progress with using D, I felt
that using a garbage collected language for a virtual machine was less than
ideal. Ultimately I decided to go with Rust since it seemed to be the most
suitable. At first this was quite frustrating, but as Rust settled down, the
frustration fortunately went away.</p><p>Today I'm quite happy with the choice of using Rust for the VM. Rust certainly
has its flaws, but I find it much easier and much more pleasant to use than
languages such as C/C++ and similar low-level programming languages.</p><h2 id="object-model">Object model</h2><p>Inko is a <a href="https://en.wikipedia.org/wiki/Prototype-based_programming">prototype-based</a> object-oriented programming language,
though the use of prototypes is mostly hidden from the user. Instead of
inheritance, Inko uses composition through <a href="https://en.wikipedia.org/wiki/Trait_(computer_programming)">traits</a>. I never really
enjoyed the use of inheritance as I feel it couples objects too tightly, and
composition through traits feels like the right answer to this problem. While
Inko supports the creation of class-like objects using an <code>object</code> keyword, we
simply call these "objects". This may seem odd but it helps clarify that these
aren't traditional classes that support inheritance. For example, if we want to
define a "Person" object of sorts, you could do so as follows in Ruby:</p><div class="highlight"><pre class="highlight"><code><span class="k">class</span> Person
<span class="k">def</span> initialize(name)
<span class="vi">@name</span> = name
<span class="k">end</span>
<span class="k">end</span>
</code></pre></div><p>The equivalent Inko code would be:</p><div class="highlight"><pre class="highlight"><code>object Person {
def init(name: String) {
<span class="k">let</span> <span class="vi">@name</span> = name
}
}
</code></pre></div><p>Here <code>let @name = name</code> defines an instance attribute called <code>@name</code> set to the
value of the <code>name</code> argument, with the type of <code>name</code> being a <code>String</code>. If we
wanted to use dynamic typing, we would simply leave out the type signature:</p><div class="highlight"><pre class="highlight"><code>object Person {
def init(name) {
<span class="k">let</span> <span class="vi">@name</span> = name
}
}
</code></pre></div><h2 id="message-passing">Message passing</h2><p>Inko uses message passing for pretty much everything, including constructs such
as "if" and "while"., allowing objects to decide how such constructs should
behave, instead of the language dictating what evaluates to be true and false,
for example. This means that instead of using an "if statement", you would use
the "if" <em>message</em>.</p><p>Say you want to check if <code>x</code> is greater than 10. In Ruby (and many other
programming languages) you may write such code as follows:</p><div class="highlight"><pre class="highlight"><code><span class="k">if</span> x > <span class="mi">10</span>
do_something
<span class="k">else</span>
do_something_else
<span class="k">end</span>
</code></pre></div><p>In Inko we would instead write:</p><div class="highlight"><pre class="highlight"><code>x > <span class="mi">10</span>
.<span class="k">if</span> <span class="k">true</span>: {
do_something
}, <span class="k">false</span>: {
do_something_else
}
</code></pre></div><p>Here <code>if</code> is a message sent to the result of <code>x > 10</code> (this relies on some
special syntax support so you don't have to write <code>(x > 10).if</code>). <code>true:</code> and
<code>false:</code> are simply keyword arguments sent to the <code>if</code> message, and the curly
braces are closures. The object the <code>if</code> message is sent to determines which of
the two closures is executed.</p><p>Methods can be defined using the <code>def</code> keyword, take an optional argument list,
and may specify the throw and return type:</p><div class="highlight"><pre class="highlight"><code>def example(argument: Type) -> ReturnType {
<span class="c"># ...</span>
}
</code></pre></div><p>If you leave out the argument types or the return type Inko will use a dynamic
type instead:</p><div class="highlight"><pre class="highlight"><code>def example(argument: Type) {
<span class="c"># This method can return values of any type since its return type is inferred</span>
<span class="c"># as a dynamic type.</span>
}
</code></pre></div><h2 id="type-system">Type system</h2><p>Inko is a gradually typed programming language. Gradual typing gives you the
benefits of a statically typed language while still allowing you to trade
type safety for flexibility where necessary. Gradual typing is also useful when
prototyping or when building a simple program that won't really benefit from
static typing (e.g. a quick script to manage some music files).</p><p>To ensure type safety, Inko uses static typing by default, requiring you to
opt-in to dynamic typing where desired. Using dynamic typing is straightforward:
simply leave out the type signature in various places and Inko will treat the
types as dynamic types.</p><p>Like any other reasonable statically typed language, Inko supports generic
programming. For example, we can define a generic "List" type like so:</p><div class="highlight"><pre class="highlight"><code>object List!(T) {
<span class="c"># ...</span>
}
</code></pre></div><p>Here <code>!(T)</code> defines the list of type parameters of the "List" type. The type
parameter syntax is taken from <a href="https://dlang.org/">D</a>. While unusual, it removes the need for
additional syntax when explicitly passing type parameters with a message. For
example, Rust uses <code><T></code> and requires you to write <code>foo::<T>()</code> when explicitly
passing a type parameter, as <code>foo<T></code> would be parsed as <code>(foo) < (T>)</code>.</p><p>Using <code>!(T)</code> means that we can instead write <code>foo!(T)</code>, which is much easier on
the fingers. Scala uses <code>[]</code> (e.g. <code>List[T]</code>), and while easier to type (on
QWERTY it doesn't require the use of the shift key) Inko isn't able to use this
syntax because <code>[]</code> is a valid message name. For example: <code>foo[10]</code> translates
to <code>foo.[](10)</code>.</p><p>Generics can be used in objects, traits, and methods. For example:</p><div class="highlight"><pre class="highlight"><code>object Person {
def ==(other: Self) -> Boolean {
<span class="c"># ...</span>
}
}
</code></pre></div><p>Here <code>other</code> uses the "Self" type which tells the compiler that <code>other</code> is of
the same type as the enclosing object ("Person" in this case).</p><h2 id="booleans-and-nil">Booleans and Nil</h2><p>In many languages, the boolean values <code>true</code> and <code>false</code> are some kind of
primitive value instead of a structure or object. In Inko, they are just regular
objects like any other. The type <code>Boolean</code> in turn is just a trait implemented
by the Boolean objects <code>True</code> and <code>False</code>.</p><p>The absence of a value can be indicated using a <code>Nil</code>. <code>Nil</code> is just a regular
object like any other, but there's only one instance of this object. <code>Nil</code> is
set up in such a way that any message sent to it returns <code>Nil</code>, except for a few
messages that have a custom implementation. For example, <code>Nil.foo</code> would return
<code>Nil</code> but <code>Nil.to_integer</code> would return <code>0</code>. This greatly simplifies code as we
no longer need to constantly check if we're dealing with a value of type <code>T</code> or
<code>Nil</code>, though of course we still can if necessary.</p><p>Optional types can be used to indicate that something can be either of type <code>T</code>
or <code>Nil</code>. For example, to define an optional return value we would write:</p><div class="highlight"><pre class="highlight"><code>def example -> ?Integer {
Nil
}
</code></pre></div><p>It is an error to pass a <code>Nil</code> to a regular type (e.g. <code>String</code>), but it's
perfectly fine to pass a <code>Nil</code> to an optional type (e.g. <code>?String</code>).</p><p>One example of where this is useful is when retrieving an array value by its
index. Like Ruby, an array will return a <code>Nil</code> when there is no value for a
given index. In Ruby, this means you may need to check what type of value you
are dealing with, for example:</p><div class="highlight"><pre class="highlight"><code>user = list_of_users[<span class="mi">4</span>]
<span class="k">if</span> user
user.username
<span class="k">else</span>
<span class="s">''</span>
<span class="k">end</span>
</code></pre></div><p>In Inko, we can instead write the following:</p><div class="highlight"><pre class="highlight"><code>list_of_users[<span class="mi">4</span>].username.to_string
</code></pre></div><p>Should <code>list_of_users[4]</code> return a <code>Nil</code> then sending <code>username</code> will produce
another <code>Nil</code>. Sending <code>to_string</code> to <code>Nil</code> will produce an empty <code>String</code> since
<code>Nil</code> defines its own implementation of this method.</p>
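<p>A loose Ruby analogy is the safe navigation operator, though unlike Inko's
<code>Nil</code> we still have to convert a possible <code>nil</code> to a <code>String</code> ourselves:</p><div class="highlight"><pre class="highlight"><code>list_of_users = []

# nil&.username evaluates to nil, and nil.to_s produces an empty String.
list_of_users[4]&.username.to_s # => ''
</code></pre></div><p>In short, by having <code>Nil</code> return a new <code>Nil</code> for unknown messages we can greatly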
reduce the amount of code necessary to deal with values that might be absent
(but we can still check for a <code>Nil</code> where necessary).</p><h2 id="error-handling">Error handling</h2><p>Inko uses exceptions for error handling, drawing inspiration from an article
titled <a href="http://joeduffyblog.com/2016/02/07/the-error-model/">"The Error Model"</a> by Joe Duffy. The article is quite
long but definitely worth the read.</p><p>I went with exception handling, since the happy path of the code should not be
slowed down by error handling code. For example, when using a more functional
approach, such as using a <code>Result</code> type, you always need to check what you're
dealing with and "unwrap" the underlying value. When using exceptions, on the
other hand, you just use the code as if it didn't throw an error, automatically
jumping to a different region of code when it does.</p><h3 id="error-handling-principles">Error handling principles</h3><p>The basic principles of Inko's error handling system are that it should be clear
when something throws, what it throws, and most important of all that code
doesn't lie about any of this. To achieve this, Inko has a set of rules that
must be followed when working with errors.</p><h4 id="method-signatures-must-include-the-error-type">Method signatures must include the error type</h4><p>A method that throws an error must include the error type in its signature. This
can be done using the <code>!!</code> keyword in the method signature:</p><div class="highlight"><pre class="highlight"><code>def foo !! SomeError {
<span class="c"># ...</span>
}
</code></pre></div><p>This ensures that by just looking at the method (signature) we immediately know
what errors we have to deal with.</p><p>A method that does not define an error type to throw <em>can not</em> throw. This means
the following method would not compile:</p><div class="highlight"><pre class="highlight"><code>def foo {
<span class="k">throw</span> <span class="mi">10</span>
}
</code></pre></div><h4 id="only-a-single-type-can-be-thrown">Only a single type can be thrown</h4><p>A method can only throw an error of a single type, though you can specify the
type to be a trait and throw any value that implements this trait. By
restricting the number of possible types to just one, we remove the need
to catch many different error types. It also simplifies the syntax.</p><h4 id="methods-that-define-a-throw-type-must-actually-throw-it">Methods that define a throw type must actually throw it</h4><p>A method that specifies a type to throw must actually throw this type at some
point; not doing so results in a compiler error. This means that the following
method would not compile since it never throws a value:</p><div class="highlight"><pre class="highlight"><code>def foo !! Integer -> Integer {
<span class="mi">10</span>
}
</code></pre></div><h4 id="sending-a-message-that-may-throw-requires-explicit-error-handling">Sending a message that may throw requires explicit error handling</h4><p>When sending a message that may throw, we <em>must</em> wrap the send in a <code>try</code>
expression:</p><div class="highlight"><pre class="highlight"><code><span class="k">try</span> foo
</code></pre></div><p>This makes it clear to the reader that <code>foo</code> may throw, without requiring them
to first find the implementation of the method to figure this out.</p><p>By default, the <code>try</code> expression will just re-throw the error type, but you can
explicitly handle the error by using an <code>else</code> expression:</p><div class="highlight"><pre class="highlight"><code><span class="k">try</span> foo <span class="k">else</span> (error) bar(error)
</code></pre></div><p>Here we would run <code>foo</code> and if it succeeds, we'd return whatever <code>foo</code> returned.
If <code>foo</code> threw an error, we'd run <code>bar</code> instead. Here the <code>error</code> variable would
contain the object that was thrown. The type of <code>error</code> is inferred by the
compiler.</p><p>The <code>else</code> expression supports multi-line expressions as well, which can be
useful when your error handling logic is more complex:</p><div class="highlight"><pre class="highlight"><code><span class="k">try</span> foo <span class="k">else</span> (error) {
bar(error)
baz(error)
}
</code></pre></div><p>Sometimes we just want to terminate the program if an operation failed. In this
case, we can use <code>try!</code> instead of <code>try</code>:</p><div class="highlight"><pre class="highlight"><code><span class="k">try</span>! foo
</code></pre></div><h4 id="the-try-keyword-only-supports-a-single-expression">The "try" keyword only supports a single expression</h4><p>To prevent one from wrapping hundreds of lines of code in a single "try"
expression, the syntax simply doesn't support this; instead you can only use a
single expression with "try" expression. This means that the following code
would produce a syntax error:</p><div class="highlight"><pre class="highlight"><code><span class="k">try</span> {
foo
bar
}
</code></pre></div><p>This however is perfectly fine:</p><div class="highlight"><pre class="highlight"><code><span class="k">try</span> {
foo
}
</code></pre></div><p>Curly braces can still be used in case the expression doesn't fit on a single
line, or when curly braces simply make the code more readable.</p><h3 id="bugs-are-not-recoverable">Bugs are not recoverable</h3><p>Many languages that use exceptions make the mistake of using exceptions for
errors caused by bugs. In Ruby, dividing by zero will result in a
<code>ZeroDivisionError</code> error being thrown. Inko instead uses "panics". When a panic
occurs, the virtual machine will print a stacktrace of the panicking process and
<em>terminate the entire program</em>. This ensures that bugs are caught as early as
possible, and more importantly can't be hidden by simply catching and ignoring
the exception. Some examples of operations that may panic:</p><ol><li>Dividing by zero.</li><li>Formatting a time object using an incorrect string format.</li><li>Trying to allocate memory when no system memory is available.</li></ol><p>The general idea is fairly straightforward: if an error is the result of a bug
or <em>shouldn't</em> happen then it should be a panic. If an error is likely to occur
frequently (e.g. a network timeout), it should be an exception.</p>
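<p>To illustrate the problem with recoverable bug errors, this Ruby snippet happily
hides a division bug, which is exactly what a panic is designed to prevent:</p><div class="highlight"><pre class="highlight"><code>begin
  1 / 0
rescue ZeroDivisionError
  # The bug is silently swallowed and the program continues as if nothing
  # happened.
end
</code></pre></div><h2 id="concurrency">Concurrency</h2><p>Inko's concurrency model is heavily inspired by Erlang. Instead of using OS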
threads directly, Inko provides lightweight processes. These processes have their
own heap and are garbage collected independently.</p><p>Communication between these processes happens through message passing, with the
messages being deep copied. Certain permanent objects (e.g. modules) are
allocated on a separate permanent heap and processes can access these objects
without copying. While deep copying comes with a performance penalty (depending
on the size of the data being copied) it ensures that a process can never refer
to the memory of another process. This in turn ensures that the garbage
collector only has to suspend the process that it has to garbage collect,
instead of also having to suspend any processes that use this process' memory.</p>
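<p>A rough Ruby illustration of why deep copying provides this isolation: the
receiver gets its own copy, so mutating it never affects the sender's object:</p><div class="highlight"><pre class="highlight"><code>message = { 'name' => 'Alice' }
copy = Marshal.load(Marshal.dump(message)) # a deep copy

copy['name'] = 'Bob'
message['name'] # => 'Alice', the original is untouched
</code></pre></div><p>Processes use preemptive multitasking using a reduction system similar to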
Erlang. In short: every process has a number of "reductions" it can perform.
Once this value reaches 0, it is reset and the process is suspended. The
virtual machine provides two thread pools for executing processes: one for
regular processes, and one for processes that may perform blocking operations
(e.g. reading from a file).</p>
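<p>A hypothetical sketch of such a reduction counter (the names <code>step</code> and
<code>reschedule</code> are made up for illustration; this is not the VM's actual code):</p><div class="highlight"><pre class="highlight"><code>REDUCTIONS = 1000

def run(process, scheduler)
  REDUCTIONS.times do
    # Perform one unit of work, stopping if the process has finished.
    return unless process.step
  end

  # Out of reductions: suspend the process and resume it later.
  scheduler.reschedule(process)
end
</code></pre></div><p>Inko provides the means to move a process between these two pools whenever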
necessary. This means that when performing a blocking operation we don't need to
spawn a separate process in a separate thread pool; instead we just move our
process from one pool to another, moving it back once our blocking operation has
been completed.</p><p>Sending and receiving messages uses dynamic typing, as Inko's type system cannot
be used to specify the types of messages a process may support. To work around
this, Inko will eventually support a type-safe API. The exact semantics are not
yet defined, but if you're curious you can read more about this in the issue
<a href="https://gitlab.com/inko-lang/inko/issues/99">"Type safe actor API"</a>.</p><h2 id="memory-management">Memory management</h2><p>Inko is a garbage collected language. The garbage collector is a parallel,
generational garbage collector based on <a href="http://www.cs.utexas.edu/users/speedway/DaCapo/papers/immix-pldi-2008.pdf">Immix</a>. Fun fact: to the best of
my knowledge Inko's garbage collector is the only full implementation of Immix
apart from the one provided by <a href="https://github.com/JikesRVM/JikesRVM">JikesRVM</a>. There are a few other
implementations of Immix, but the ones that I know of typically don't implement
evacuation or other parts of Immix.</p><p>The garbage collector can collect each process independently, though a process will
be suspended during garbage collection. The collector being parallel means it
will use multiple threads to garbage collect the memory of a process.</p><p>How well the garbage collector performs is hard to say as I have only run a few
basic benchmarks. These benchmarks usually involved garbage collecting a few
million objects, and off the top of my head this would usually take only a few
milliseconds. Once Inko matures a bit more I'll most likely spend more time
writing (and publishing) benchmarks.</p><h2 id="portable-bytecode">Portable bytecode</h2><p>The bytecode of the virtual machine is portable between CPU architectures and
operating systems. This means that bytecode compiled on a 64-bit CPU can be run
on a 32-bit CPU. This may seem like a minor feature, but it makes it easier to
distribute bytecode files as you no longer need to compile your program for
every architecture.</p><p>In the future Inko may support a way of bundling such bytecode files similar to
<a href="https://en.wikipedia.org/wiki/JAR_(file_format)">JAR</a>, though this isn't supported at the moment.</p><h2 id="examples">Examples</h2><p>With all of that out of the way let's take a look at some examples of Inko
source code. The examples discussed below are all taken from the standard
library, which can be found <a href="https://gitlab.com/inko-lang/inko/tree/master/runtime/std">here</a>.</p><h3 id="checking-if-a-string-starts-with-another-string">Checking if a String starts with another String</h3><p>Checking if one <code>String</code> starts with another <code>String</code> can be done using the
method <code>String#starts_with?</code> in the <code>std::string</code> module. The implementation of
this method is pretty straightforward:</p><div class="highlight"><pre class="highlight"><code>def starts_with?(prefix: String) -> Boolean {
prefix.length > length
.if_true {
<span class="k">return</span> False
}
slice(<span class="mi">0</span>, prefix.length) == prefix
}
</code></pre></div><p>The argument <code>prefix</code> is the <code>String</code> we are looking for, and our return value
is a <code>Boolean</code>. In the method we start with the following:</p><div class="highlight"><pre class="highlight"><code>prefix.length > length
.if_true {
<span class="k">return</span> False
}
</code></pre></div><p>This is a simple optimisation: if the <code>String</code> we are looking for is longer
than the <code>String</code> we are checking, then we can just return <code>False</code> right away
("hello" can never start with "hello world" for example). In Ruby you would
write this as follows:</p><div class="highlight"><pre class="highlight"><code><span class="k">if</span> prefix.length > length
<span class="k">return</span> <span class="k">false</span>
<span class="k">end</span>
<span class="c"># Alternatively:</span>
<span class="k">return</span> <span class="k">false</span> <span class="k">if</span> prefix.length > length
</code></pre></div><p>Next up we have the actual comparison:</p><div class="highlight"><pre class="highlight"><code>slice(<span class="mi">0</span>, prefix.length) == prefix
</code></pre></div><p>This operation is pretty straightforward: first we generate a new <code>String</code>
starting at character 0 and including <code>prefix.length</code> characters. We then simply
check if this equals the given prefix <code>String</code>. Note that string slicing
operates on characters, not bytes.</p><h3 id="loops-and-tail-call-elimination">Loops and tail call elimination</h3><p>Loops are created using closures, instead of using a special <code>while</code> or <code>loop</code>
keyword. A loop using a conditional is created by sending <code>while_true</code> or
<code>while_false</code> to a closure:</p><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="k">mut</span> number = <span class="mi">0</span>
{ number < <span class="mi">10</span> }.while_true {
number += <span class="mi">1</span>
}
</code></pre></div><p>Here we create a loop that runs as long as the result of the closure <code>{ number <
10 }</code> evaluates to true. As long as this is the case we execute the closure
passed to the <code>while_true</code> message.</p><p>An infinite loop is created by sending <code>loop</code> to a closure:</p><div class="highlight"><pre class="highlight"><code>{
<span class="c"># This will run forever</span>
}.<span class="k">loop</span>
</code></pre></div><p>The <code>while_true</code> method is implemented as follows:</p><div class="highlight"><pre class="highlight"><code>def while_true(block: do) -> Nil {
call.if_false { <span class="k">return</span> }
block.call
while_true(block)
}
</code></pre></div><p>Let's start with the signature. This method takes one argument <code>block</code>, which
has its type set to <code>do</code>. In this context <code>do</code> is used to specify that we expect
a closure with no arguments and a dynamic return type. If we required an
argument we would instead write <code>do (Integer)</code>. If we wanted to also include a
return type we could write <code>do (Integer) -> Integer</code>. We can also use the
<code>lambda</code> keyword to create a lambda. The difference between the two is simple: a
closure can capture outer local variables; a lambda cannot. When the type
signature requires a closure you can also pass a lambda, but not the other way
around. Closures and lambdas are collectively referred to as "blocks".</p><p>Now let's look at the body of this method:</p><div class="highlight"><pre class="highlight"><code>call.if_false { <span class="k">return</span> }
block.call
while_true(block)
</code></pre></div><p>First we run the receiving block, returning early if it returned something that
evaluates to false. If it evaluates to true we'll simply execute the block
passed in the <code>block</code> argument, then we will call ourselves again. Inko supports
tail call elimination so we can simply keep calling <code>while_true</code> indefinitely
without blowing up the call stack.</p>
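<p>For comparison, here is the same construct built out of blocks in Ruby. Ruby
doesn't perform tail call elimination by default, so unlike Inko's version this
sketch is only safe for short loops:</p><div class="highlight"><pre class="highlight"><code>def while_true(condition, body)
  return unless condition.call

  body.call
  while_true(condition, body)
end

number = 0
while_true(-> { number < 10 }, -> { number += 1 })
number # => 10
</code></pre></div><p>The <code>loop</code> method is a simple method that also relies on tail call elimination:</p><div class="highlight"><pre class="highlight"><code>def <span class="k">loop</span> -> Nil {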
call
<span class="k">loop</span>
}
</code></pre></div><p>Here <code>call</code> will run the receiving block; then we simply recurse into <code>loop</code> to
repeat this process.</p><p>Because Inko uses preemptive multitasking, loops such as those shown above will
never block an OS thread indefinitely. Instead, the virtual machine will suspend
the process once it has consumed all of its reductions, resuming execution of
the process some time later.</p><h3 id="processes-and-communication">Processes and communication</h3><p>To start a process, we first need to import the <code>std::process</code> module like so:</p><div class="highlight"><pre class="highlight"><code><span class="k">import</span> std::process
</code></pre></div><p>Next we can start a process like so:</p><div class="highlight"><pre class="highlight"><code><span class="k">import</span> std::process
<span class="k">let</span> pid = process.spawn {
<span class="c"># This runs in a separate process</span>
}
</code></pre></div><p>We can send messages to a process using <code>process.send</code> and receive them using
<code>process.receive</code>:</p><div class="highlight"><pre class="highlight"><code><span class="k">import</span> std::process
<span class="k">let</span> pid = process.spawn {
process.receive <span class="c"># This would produce 'hello'</span>
}
process.send(pid, <span class="s">'hello'</span>)
</code></pre></div><p>If <code>process.receive</code> is used when no messages are available, the process
will be suspended until a new message arrives.</p>
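<p>Ruby's thread-safe <code>Queue</code> behaves much the same way: <code>pop</code> blocks the calling
thread until a value arrives:</p><div class="highlight"><pre class="highlight"><code>queue = Queue.new
consumer = Thread.new { queue.pop } # suspends until a message is available

queue.push('hello')
consumer.value # => 'hello'
</code></pre></div><h3 id="file-operations">File operations</h3><p>For our last example, we'll look at a simple file operation: reading a file. In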
a typical language, you would open the file with a specific mode, then read from
it. For example, in Ruby you would do the following:</p><div class="highlight"><pre class="highlight"><code>file = File.open(<span class="s">'example.txt'</span>, <span class="s">'r'</span>)
file.read
</code></pre></div><p>Many languages will use the same data types for files opened in different file
modes. This means that the following Ruby code is accepted, but produces a
runtime error (since the file is not opened for writing):</p><div class="highlight"><pre class="highlight"><code>file = File.open(<span class="s">'example.txt'</span>, <span class="s">'r'</span>)
file.write(<span class="s">'hello'</span>)
</code></pre></div><p>Inko uses different types for files opened in different modes. For example, a
file opened in read-only mode is a <code>ReadOnlyFile</code>, while a file opened in
write-only mode is a <code>WriteOnlyFile</code>. This means our first example is written as
follows:</p><div class="highlight"><pre class="highlight"><code><span class="k">import</span> std::fs::file
<span class="k">let</span> file = file.read_only(<span class="s">'example.txt'</span>)
<span class="k">try</span>! file.read <span class="c"># This will terminate the program if we couldn't read the data</span>
</code></pre></div><p>Our second example would be as follows:</p><div class="highlight"><pre class="highlight"><code><span class="k">import</span> std::fs::file
<span class="k">let</span> file = file.read_only(<span class="s">'example.txt'</span>)
<span class="k">try</span>! file.write(<span class="s">'hello'</span>)
</code></pre></div><p>This code however will not compile since a <code>ReadOnlyFile</code> does not respond to
the <code>write</code> message. I really like this API because it's straightforward to
implement and removes the need to worry about using the wrong file
mode for your operations.</p><h2 id="trying-it-out">Trying it out</h2><p>If you're curious about Inko, you can give it a try yourself, but keep in mind
that with Inko being a young language this process is a bit painful.</p><p>To try things out you need to have three things installed:</p><ol><li>Ruby 2.4 or newer.</li><li>Bundler (<code>gem install bundler</code>).</li><li>Rust 1.10 or newer using a nightly build (stable Rust is unfortunately not
supported at the moment).</li></ol><p>Once these requirements are met you can clone the Git repository:</p><div class="highlight"><pre class="highlight"><code>git clone git@gitlab.com:inko-lang/inko.git
<span class="k">cd</span> inko
</code></pre></div><p>To build the compiler, you need to run:</p><div class="highlight"><pre class="highlight"><code><span class="k">cd</span> compiler
bundle install
</code></pre></div><p>To build the virtual machine, you need to run (from the root directory):</p><div class="highlight"><pre class="highlight"><code><span class="k">cd</span> vm
make release
</code></pre></div><p>Once done you can compile a program (from the root directory) as follows:</p><div class="highlight"><pre class="highlight"><code>./compiler/bin/inkoc /tmp/<span class="k">test</span>.inko -i ./runtime/ -t /tmp/inkoc-build
</code></pre></div><p>This will compile the program located at <code>/tmp/test.inko</code> and store all the
bytecode files in <code>/tmp/inkoc-build</code>. Once compiled the compiler will print the
file path of the bytecode file that belongs to the input file (<code>/tmp/test.inko</code>
in this case).</p><p>To run your program you start the VM as follows:</p><div class="highlight"><pre class="highlight"><code>./vm/target/release/ivm \
-I /tmp/inkoc-build \
/tmp/inkoc-build/path/to/bytecode.inkoc
</code></pre></div><p>These two commands can be merged into a single one as follows:</p><div class="highlight"><pre class="highlight"><code>./vm/target/release/ivm \
-I /tmp/inkoc-build \
$(./compiler/bin/inkoc /tmp/<span class="k">test</span>.inko -i ./runtime/)
</code></pre></div><p>Of course this is far from ideal and in the future this will be greatly
simplified, but for now running a program sadly requires some additional work.</p><p>In the future I will be writing more about Inko's internals such as the garbage
collector and the allocator. If you want to stay up to date on the latest Inko
news the easiest ways of doing so are:</p><ol><li>Star the project on <a href="https://gitlab.com/inko-lang/inko">GitLab.com</a>.</li><li>Subscribe to my website's <a href="/feed.xml">Atom feed</a>.</li><li>Follow me on <a href="https://archive.is/6LWOm">Twitter</a>.</li></ol>https://yorickpeterse.com/articles/compiling-xpath-to-ruby/Compiling XPath to Ruby2015-09-06T22:45:00Z2015-09-06T22:45:00Z<p>The process of evaluating a programming or query language is typically broken up
into 3 steps:</p><ol><li>The lexing phase, which turns raw text into a sequence of "tokens". Tokens
are usually a pair (e.g. an array or tuple) of a type and a value.</li><li>The parsing phase, which turns a sequence of tokens into an
<a href="https://en.wikipedia.org/wiki/Abstract_syntax_tree">Abstract Syntax Tree (AST)</a>.</li><li>An evaluation phase, producing a set of instructions a machine should execute
based on an AST.</li></ol><p>For the third step there are two ways of doing things:</p><ol><li>Instructions are executed on the fly.</li><li>Instructions are generated and executed separately.</li></ol><p>Both options have their benefits and drawbacks. A system that executes
instructions on the fly is typically easier to implement. However, these systems
tend to be slower, as there's little to no room for optimizations when
execution depends directly on the input AST. Directly evaluating ASTs also makes
it very hard (if not downright impossible) to perform
<a href="https://en.wikipedia.org/wiki/Just-in-time_compilation">Just In Time (JIT) compilation</a>.</p><p>A system that first generates instructions and <em>then</em> executes them can be
harder to implement, but with the benefit of allowing better optimizations.</p><p>An example of the first method would be Ruby 1.8, while an example of the second
method is your average C compiler (e.g. gcc).</p><h2 id="xpath-evaluation-in-oga">XPath Evaluation in Oga</h2><p>Up until version 1.3.0, Oga used to evaluate XPath queries on the fly. While the
code was fairly easy to work with, performance left a lot to be desired. The
setup of this evaluator was as follows:</p><p>Every type of AST node would have a corresponding handler method called <code>on_X</code>
where <code>X</code> would be the type of the AST node. For example, an <code>int</code> AST node
would be handled by <code>on_int</code>. Each of these handlers would take their input,
operate on it, and return the result. The usual return value would be an
instance of <code>Oga::XML::NodeSet</code>, an Array-like object used for storing XML
nodes.</p>
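<p>In Ruby, that kind of dispatch boils down to something like the following
(a simplified sketch, not Oga's exact code):</p><div class="highlight"><pre class="highlight"><code># The node's type determines which handler method processes it.
def process(node)
  send(:"on_#{node.type}", node)
end

# An `int` node is handled by on_int, which simply returns its value.
def on_int(node)
  node.value
end
</code></pre></div><p>The performance impact of this setup depends on two things: the size of the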
input document, and the size and complexity of the given XPath query. For small
documents the performance wasn't too bad, but for larger documents (e.g. the
<a href="https://github.com/YorickPeterse/oga/blob/ac5cb3d24f407a6ed8d8b583e59fa89084e9acb5/benchmark/fixtures/big.xml.gz">10 MB test file</a> used for benchmarks) this could result in even
simple queries taking seconds to complete.</p><p>In short, if I wanted to improve performance I would need to come up with a
radically different way of evaluating XPath queries.</p><h2 id="compiling-xpath">Compiling XPath</h2><p>The alternative I started looking into was compiling XPath to some kind of
format that could be executed in a more efficient way. One option would be to
compile to some custom <a href="https://en.wikipedia.org/wiki/Bytecode">bytecode</a> format and evaluate that. However,
ideally the target format would be something that could take advantage of
optimizations already provided by Ruby implementations. That way I wouldn't have
to write my own optimization passes or maybe even some sort of JIT compiler.</p><p>Compiling to Ruby bytecode would be an option, if it weren't for every
implementation using its own bytecode format. Also, no implementation to date
actually considers the bytecode part of their public API (as far as I'm aware),
meaning it could change at any given point.</p><p>Ruby source code on the other hand works across implementations, is stable, and
can take advantage of all performance optimizations a Ruby implementation might
have to offer.</p><p>Starting with version 1.3.0, Oga compiles XPath expressions to Ruby source code.
The result is a Proc that takes an input document (or element) and returns the
result of the XPath expression it was compiled from. The compiled Procs are
cached on a per-expression basis. This means that if you run the same query in a
loop, Oga only has to compile it once.</p>
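<p>The caching boils down to something like this (a sketch using made-up names;
<code>compile_to_ruby</code> stands in for the real XPath-to-Ruby compiler):</p><div class="highlight"><pre class="highlight"><code>COMPILED = {}

# Returns the Proc for an expression, generating and eval-ing its Ruby
# source at most once per unique expression.
def compiled_query(expression)
  COMPILED[expression] ||= eval(compile_to_ruby(expression))
end

def compile_to_ruby(expression)
  # Stand-in: the real compiler generates source based on the XPath AST.
  'lambda { |document| [] }'
end
</code></pre></div><p>Code-wise the setup is fairly similar to the old evaluator. There are still AST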
node type specific handlers (<code>on_int</code>, <code>on_axis_following_sibling</code>, etc).
However, instead of returning <code>Oga::XML::NodeSet</code> instances they return AST
nodes used to produce Ruby source code.</p><h2 id="performance-improvements">Performance Improvements</h2><p>The new compiler setup yields significant performance improvements over the old
evaluator setup. In certain cases performance is even better than Nokogiri,
which uses C for its XPath evaluation.</p><p>Of course any performance claim is meaningless without a benchmark to back it
up. Oga has several benchmarks for the new compiler; these reside in the
<a href="https://github.com/YorickPeterse/oga/tree/ac5cb3d24f407a6ed8d8b583e59fa89084e9acb5/benchmark/xpath/compiler">benchmark/xpath/compiler</a> directory of the repository.</p><p>Benchmarks were run on a Thinkpad T520 running Linux 4.1 with a bunch of
applications in the background, while listening to the
<a href="https://www.youtube.com/watch?v=83jWwQfK-f8">Metal Gear Solid 5: The Phantom Pain soundtrack</a> on YouTube.
In other words, treat these numbers with a grain of salt. For best results you
should run these benchmarks yourself. To do so, clone the Git repository of Oga,
run <code>rake generate fixtures</code> and then run one of the benchmark files like any
other Ruby script.</p><p>First, let's look at the benchmark <code>big_xml_average_bench.rb</code>. This benchmark
takes a <a href="https://github.com/YorickPeterse/oga/blob/ac5cb3d24f407a6ed8d8b583e59fa89084e9acb5/benchmark/fixtures/big.xml.gz">10 MB test file</a> and runs the query
<code>descendant-or-self::location</code> 10 times, measuring the execution time for every
iteration. Using Oga 1.2.3 we get the following output:</p><pre><code>Iteration: 1: 3.493
Iteration: 2: 2.868
Iteration: 3: 2.934
Iteration: 4: 2.965
Iteration: 5: 2.926
Iteration: 6: 2.928
Iteration: 7: 3.008
Iteration: 8: 2.977
Iteration: 9: 2.938
Iteration: 10: 2.993
Iterations: 10
Average: 3.003 sec
</code></pre><p>Using Oga 1.3.0 the output is as follows instead:</p><pre><code>Iteration: 1: 0.432
Iteration: 2: 0.448
Iteration: 3: 0.522
Iteration: 4: 0.453
Iteration: 5: 0.44
Iteration: 6: 0.494
Iteration: 7: 0.448
Iteration: 8: 0.431
Iteration: 9: 0.432
Iteration: 10: 0.437
Iterations: 10
Average: 0.454 sec
</code></pre><p>Here Oga 1.3.0 is about 6.6 times faster.</p><p>Next, let's look at the benchmark <code>concurrent_time_bench.rb</code>. This benchmark uses
the XML file <a href="https://github.com/YorickPeterse/oga/blob/master/benchmark/fixtures/kaf.xml.gz">kaf.xml</a> and runs the query <code>KAF/terms/term</code> 10 times in
parallel using 5 threads. The idea of this benchmark is to measure performance
as the number of threads increases. A higher number of threads can result in more
pressure on the garbage collector (GC), depending on the code being benchmarked.
More pressure on the GC can in turn result in poorer performance due to the GC
having to stop all threads more often.</p><p>Using Oga 1.2.3 the results of this benchmark are as follows:</p><pre><code>Preparing...
Starting threads...
Samples: 50
Average: 0.2316 seconds
</code></pre><p>Using Oga 1.3.0:</p><pre><code>Preparing...
Starting threads...
Samples: 50
Average: 0.0342 seconds
</code></pre><p>Here Oga 1.3.0 is also around 6.6 times faster.</p><p>Finally, let's look at the benchmark <code>comparing_gems_bench.rb</code>. This benchmark
uses the XML document <code><root><number>10</number></root></code> and retrieves all text
nodes of all <code><number></code> nodes. This benchmark uses
<a href="https://github.com/evanphx/benchmark-ips">benchmark-ips</a>.</p><p>The benchmark runs this query for the following libraries:</p><ul><li>Ox: 2.2.0</li><li>Nokogiri: 1.6.6.2</li><li>REXML: MRI 2.2.1 was used (as REXML is bundled in Ruby's standard library)</li><li>Oga</li></ul><p>Note that Ox doesn't actually support XPath, it instead offers its own querying
language. As a result it's not entirely fair to compare it with the other
libraries. However, for the sake of showing the performance difference of Ox's
query language versus the rest, I've included it anyway.</p><p>Using these Gems and Oga 1.2.3, the results are as follows:</p><pre><code>Calculating -------------------------------------
Ox 14.548k i/100ms
Nokogiri 3.879k i/100ms
Oga 2.681k i/100ms
REXML 1.114k i/100ms
-------------------------------------------------
Ox 197.284k (± 3.9%) i/s - 989.264k
Nokogiri 46.701k (± 9.7%) i/s - 232.740k
Oga 28.293k (± 2.0%) i/s - 142.093k
REXML 11.901k (± 2.8%) i/s - 60.156k
Comparison:
Ox: 197284.2 i/s
Nokogiri: 46701.1 i/s - 4.22x slower
Oga: 28292.6 i/s - 6.97x slower
REXML: 11900.5 i/s - 16.58x slower
</code></pre><p>And using Oga 1.3.0:</p><pre><code>Calculating -------------------------------------
Ox 15.227k i/100ms
Nokogiri 3.966k i/100ms
Oga 13.874k i/100ms
REXML 1.168k i/100ms
-------------------------------------------------
Ox 201.044k (± 1.5%) i/s - 1.005M
Nokogiri 47.338k (± 8.6%) i/s - 237.960k
Oga 166.485k (± 9.8%) i/s - 832.440k
REXML 11.693k (± 5.3%) i/s - 58.400k
Comparison:
Ox: 201044.3 i/s
Oga: 166485.5 i/s - 1.21x slower
Nokogiri: 47338.3 i/s - 4.25x slower
REXML: 11692.7 i/s - 17.19x slower
</code></pre><p>Here Oga 1.3.0 is about 5.8 times faster compared to version 1.2.3. Using 1.3.0
Oga outperforms not only REXML but also Nokogiri.</p><p>Please keep in mind that performance will vary depending on the size of the
input document and the query being used. There will be cases where Oga
outperforms others, but there will (probably) also be cases where it performs
worse.</p><h2 id="wrapping-up">Wrapping Up</h2><p>The source code for the compiler can be found in
<a href="https://github.com/YorickPeterse/oga/blob/b07c75e96495f06c0914d135d30b26e55bcbb483/lib/oga/xpath/compiler.rb">lib/oga/xpath/compiler.rb</a>. The source code used for the Ruby AST
and code generation can be found in <a href="https://github.com/YorickPeterse/oga/tree/b07c75e96495f06c0914d135d30b26e55bcbb483/lib/oga/ruby">lib/oga/ruby</a>. There are still
plenty of parts in the compiler that could be optimized further as the current
code is largely ported from the old evaluator.</p><p>Those who wish to take advantage of the new compiler can simply update to Oga
1.3.0. A full list of changes can be found in <a href="https://github.com/YorickPeterse/oga/blob/b07c75e96495f06c0914d135d30b26e55bcbb483/CHANGELOG.md#130---2015-09-06">the changelog</a>.</p>https://yorickpeterse.com/articles/hello-gitlab/Hello, GitLab!2015-08-31T20:49:00Z2015-08-31T20:49:00Z<p>I'm excited to announce that I will be joining <a href="https://about.gitlab.com/">GitLab</a> starting October
1st. I greatly enjoyed my time at <a href="http://www.olery.com">Olery</a>, but after almost 3 years I
felt it was time for a new adventure. If you're based in Amsterdam and love
working with Ruby you should definitely send your details over to
<a href="jobs@olery.com">jobs@olery.com</a>.</p><p>At GitLab my time will be broken up in to two chunks. 80% of my time (4 days)
will be spent on improving performance and stability of the platform. This will
include things such as improving the response time of web pages, cutting down
memory usage, decreasing the time it takes to process Git repository data, etc.</p><p>The other 20% of my time (1 day) will be spent on improving Rubinius. Initially
I'll start with wrapping up some existing work such as
<a href="https://github.com/rubysl/rubysl-socket/pull/9">updating rubysl-socket</a>, <a href="https://github.com/rubinius/rubinius/pull/3356">pull request #3356</a>,
<a href="https://github.com/rubinius/rubinius/pull/3372">pull request #3372</a> and finishing
<a href="https://github.com/rubinius/rubinius/issues/3264">the work needed to support Ruby 2.2</a>.</p><p>Once this has been taken care of I plan to work on two things:</p><ol><li>Improving performance of Rubinius itself.</li><li>Building tools to help improve Rubinius and applications using Rubinius.</li></ol><p>One idea I'm already toying with is adding the ability of tracing object
allocations using Ruby itself. Tracing allocations should have a very low
overhead and should not require disabling the garbage collector for accurate
statistics. This in turn would allow one to run a tracer in their production
application (e.g. using something like New Relic's Ruby agent) <em>without</em> having
to worry about slowing the application down to a crawl.</p><p>Another idea is to add a way of tracing constant/method cache invalidations. In
particular, constant cache invalidations can be tricky to debug, even when using
Rubinius' <code>-Xic.debug</code> and <code>-Xserial.debug</code> options. For more information about
this idea you can refer to <a href="https://github.com/rubinius/rubinius/issues/3490">issue #3490</a>.</p><p>Adding support for LLVM 3.6/MCJIT (<a href="https://github.com/rubinius/rubinius/pull/3367">pull request #3367</a>) is something I
will sadly not be working on any time soon. In order to do so I would first have
to learn about all the nitty-gritty details of LLVM, which in itself can easily
take months. As such I'm leaving this up to Brian Shirai, who already started
working on the various parts needed to support LLVM 3.6.</p><p>Finally, I'd like to thank GitLab for this opportunity. While 1 day a week might
not seem like much, it's <em>a lot</em> better than the 1 or 2 hours a week (if I'm
lucky) I can currently dedicate to Rubinius. Hopefully in the future I can
dedicate even more time to Rubinius, but only time will tell (no pun intended).</p>https://yorickpeterse.com/articles/oga-1-0-released/Oga 1.0 Released2015-05-20T21:00:00Z2015-05-20T21:00:00Z<p>Until now if one wanted to parse XML and/or HTML in Ruby the most common choice
would be <a href="http://www.nokogiri.org/">Nokogiri</a>. Nokogiri however is not without its problem,
<a href="/articles/oga-a-new-xml-and-html-parser-for-ruby/">as I have discussed in the past</a>. Other existing alternatives
usually only focus on XML (such as Ox and REXML), making them unsuitable for
those in need of HTML support.</p><p>Starting today, Ruby developers have a solid alternative: I'm
happy to announce that, 449 days after the very <a href="https://github.com/YorickPeterse/oga/commit/6326bdd8c943299e9adc4d2cb6de00934da3609b">first commit</a>,
Oga 1.0 has finally been released.</p><p>Version 1.0 of Oga will be the first version to be considered stable per
<a href="http://semver.org/spec/v2.0.0.html">semantic versioning 2.0</a>. This doesn't mean it will be bug free, it
just means the API is not meant to change in backwards incompatible ways between
minor releases. While Oga has already been used in production for a while, I was
reluctant to increment the version to 1.0 until at least proper HTML5 support
was introduced.</p><p>A lot has changed over the last 16 months. The old Racc parsers have been
replaced by LL(1) parsers using <a href="https://gitlab.com/yorickpeterse/ruby-ll">ruby-ll</a>, support was added for HTML5,
XML/HTML entity conversion, handling of invalid XML/HTML, better SAX parsing,
Windows support and much more.</p><p>The exact list of changes can be found in the <a href="http://code.yorickpeterse.com/oga/latest/file.CHANGELOG.html">changelog</a>. If you
want to jump straight to trying out Oga you can install it from RubyGems:</p><pre><code>gem install oga
</code></pre><p>Oga doesn't depend on libxml so the installation process should only take a few
seconds.</p><p>Oga's Git repository is located at <a href="https://gitlab.com/yorickpeterse/oga">https://gitlab.com/yorickpeterse/oga</a>, the
documentation can be found at <a href="http://code.yorickpeterse.com/oga/latest/">http://code.yorickpeterse.com/oga/latest/</a>. Those
interested in migrating from Nokogiri can refer to the guide
<a href="http://code.yorickpeterse.com/oga/latest/file.migrating_from_nokogiri.html">"Migrating From Nokogiri"</a>.</p>https://yorickpeterse.com/articles/oga-a-new-xml-and-html-parser-for-ruby/Oga: a new XML/HTML parser for Ruby2014-09-12T14:45:00Z2014-09-12T14:45:00Z<p>In the Ruby ecosystem there are plenty of HTTP libraries. Net::HTTP, HTTParty,
HTTPClient, Patron, Curb, Excon, Typhoeus, just to name a few. There are so many
of them it's almost as if it's required that one writes an HTTP client in order
to call themselves a Ruby developer.</p><p>When it comes to XML/HTML parsing on the other hand the options are quite
limited. The two most common libraries are Nokogiri and REXML. Both these
libraries however have various flaws that make working with them less than
pleasant. REXML is generally quite slow, only supports XML and can use quite a
chunk of memory when parsing data.</p><p>Nokogiri on the other hand is quite fast, but in turn is not thread-safe and in
certain places has a bit of an odd API. Nokogiri also vendors its own copy of
libxml which greatly increases install sizes and times. Most important of all,
Nokogiri simply doesn't work on Rubinius.</p><p>So what exactly is the problem with Nokogiri and Rubinius? Well, on MRI and
Rubinius Nokogiri will use a C extension. This extension in turn uses libxml.
Due to MRI having a GIL everything might appear to be working as expected,
however on Rubinius all hell breaks loose. To be exact, at certain points in
time bogus data (e.g. null pointers) is sent to the garbage collector, which in
turn crashes Rubinius. Both Brian Shirai (<a href="https://github.com/brixen">brixen</a>) and I have spent
quite some time trying to figure out what the heck is going on, without any
success so far. The exact details of all this can be found in the following
Nokogiri issue: <a href="https://github.com/sparklemotion/nokogiri/issues/1047">https://github.com/sparklemotion/nokogiri/issues/1047</a>.</p><p>This particular problem is thus severe that some of the production applications
I've tested (that use Nokogiri heavily) consistently crash around 30 seconds
into the process' lifetime. As a result it's impossible for me to run these
applications on Rubinius. If a process were to crash once every few days I might
be able to live with it while searching for a solution; every 30 seconds however
is just not an option.</p><p>All of this prompted me to start working on an alternative, an alternative that
doesn't require complicated system libraries or Ruby implementation specific
codebases. For the past 8 months I've been working on exactly that. I've called
the project Oga, and it can be found on GitLab.com:
<a href="https://gitlab.com/yorickpeterse/oga">https://gitlab.com/yorickpeterse/oga</a>. Today, 199 days after the first Git
commit, I'll be releasing the first version on RubyGems.</p><p>Oga is primarily written in Ruby (91% Ruby according to GitHub), with a small
native extension for the XML lexer. It supports parsing of XML and HTML, comes
with support for XPath expressions, XML namespaces and much more. It works on
MRI, Rubinius and JRuby and doesn't require large system libraries. This in turn
means smaller Gem sizes and <em>much</em> faster installation times. For more
information, see the <a href="https://gitlab.com/yorickpeterse/oga/blob/master/README.md">Oga README</a>.</p><p>Oga can be installed from RubyGems as follows (the installation process should
only take a few seconds):</p><pre><code>gem install oga
</code></pre><p>Once installed you can start parsing XML and HTML documents. For example, let's
parse the Reddit frontpage and get all article titles:</p><div class="highlight"><pre class="highlight"><code>require <span class="s">'oga'</span>
require <span class="s">'net/http'</span>
body = Net:<span class="ss">:HTTP</span>.get(URI.parse(<span class="s">'http://www.reddit.com/'</span>))
document = Oga.parse_html(body)
titles = document.xpath(<span class="s">'//div[contains(@class, "entry")]/p[@class="title"]/a/text()'</span>)
titles.each <span class="k">do</span> |title|
puts title.text
<span class="k">end</span>
</code></pre></div>
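<p>As the example above focuses on HTML, here is a brief, hypothetical sketch of
the XML namespace support mentioned earlier (the document and prefix below are
made up purely for illustration):</p><pre><code>require 'oga'

# A made-up XML document using a namespace prefix; the same "x:" prefix
# can then be used in XPath queries to select the namespaced element.
xml      = '<root xmlns:x="http://example.com"><x:item>Hello</x:item></root>'
document = Oga.parse_xml(xml)

document.xpath('root/x:item/text()').each do |text|
  puts text.text # => "Hello"
end
</code></pre><p>Because Oga is a very young library there is a big chance you'll bump into bugs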
or other issues (I'm going to be honest here). For example, HTML parsing is not
yet as solid as it should be (<a href="https://gitlab.com/yorickpeterse/oga/issues/20">https://gitlab.com/yorickpeterse/oga/issues/20</a>),
and Oga does not yet honor the encoding set in the document itself
(<a href="https://gitlab.com/yorickpeterse/oga/issues/29">https://gitlab.com/yorickpeterse/oga/issues/29</a>). If you happen to run into
any problems/bugs, please report these at the <a href="https://gitlab.com/yorickpeterse/oga/issues/new">issue tracker</a>.
Feedback and questions are also more than welcome.</p><p>Personally I'm really excited about what Oga currently is and what it will
become (it also seems others share that sentiment). I was not expecting it to
take nearly 8 months to write such a library, but looking back at everything it
was more than worth the effort.</p><p>And last, I'd like to thank the following people:</p><ul><li><a href="https://github.com/whitequark">Peter Zotov</a>: for helping me out with Ragel numerous times</li><li><a href="https://github.com/brixen">Brian Shirai</a> for debugging the initial problems with Nokogiri as
well as his support of the project in general</li><li><a href="https://github.com/headius">Charles Nutter</a> for helping me out with getting a new version of
Racc released, his interest in profiling/benchmarking Oga, and his support of
the project in general</li><li>Countless other people who have shown great interest ever since I started
working on Oga</li></ul>https://yorickpeterse.com/articles/hacking-extconf-rb/Hacking extconf.rb2013-06-08T23:00:00Z2013-06-08T23:00:00Z<div class="admonition info"><i class="icon"></i><div class="text"><p>As it turns out you can make the process discussed in this article easier by
using a Rakefile instead of an extconf.rb file. See the bottom of this article
for more information.</p></div></div><p>In Ruby land <a href="http://rubygems.org/">RubyGems</a> is the de facto package manager. RubyGems
allows you to easily distribute your Ruby packages (known as "Gems"). These
packages come in two flavours:</p><ul><li>Pure Ruby Gems</li><li>Gems that include C code (or any other compiled code for that matter) that
is compiled upon installation</li></ul><p>The latter is commonly used to create Ruby bindings for C libraries such as
<a href="http://www.xmlsoft.org/">libxml2</a>. The benefit of using C bindings is that they generally
perform better than their pure Ruby equivalents.</p><p>To install a C extension RubyGems executes a Ruby file called "extconf.rb"
(though you can change the name) to generate a Makefile and then runs <code>make</code>
and <code>make install</code> to build and install the extension. To get this done you'll
have to tell RubyGems where it can find the required files; this is done in
your Gem specification as follows:</p><div class="highlight"><pre class="highlight"><code>Gem:<span class="ss">:Specification</span>.new <span class="k">do</span> |gem|
<span class="c"># ...</span>
<span class="c"># These files are used to generate Makefile files which in turn are used</span>
<span class="c"># to build and install the C extension.</span>
gem.extensions = [<span class="s">'ext/my_extension/extconf.rb'</span>]
<span class="c"># ...</span>
<span class="k">end</span>
</code></pre></div><p>Here the configuration file is located in <code>ext/my_extension/extconf.rb</code>. These
files typically look something like the following:</p><div class="highlight"><pre class="highlight"><code>require <span class="s">'mkmf'</span>
have_header(<span class="s">'some_header'</span>)
find_executable(<span class="s">'some_required_executable'</span>)
$CFLAGS << <span class="s">' -Wextra -Wall -pedantic '</span>
create_makefile(<span class="s">'my_extension/my_extension'</span>)
</code></pre></div><p>Because all of this is executed upon Gem installation (and thus on the end
user's computer) this opens up interesting possibilities. For example, you
could check if specific files are available in a certain directory or, as is
more commonly done, check for headers and such. It also allows you to execute
arbitrary commands (which can potentially be dangerous).</p>
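<p>As a quick illustration, a hypothetical extconf.rb (the header, directory and
command below are made up) could do something like this:</p><pre><code>require 'mkmf'

# Check for a header the extension needs.
have_header('zlib.h')

# Check for an arbitrary file or directory on the end user's machine.
abort 'missing /opt/my_app/data' unless File.directory?('/opt/my_app/data')

# Run an arbitrary command; this is exactly why installing Gems with C
# extensions from untrusted sources can be dangerous.
system('echo "this runs on the end user machine"')

create_makefile('my_extension/my_extension')
</code></pre><p>For a project at <a href="http://olery.com/">Olery</a> we had to wrap code written in various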
languages (Java, Python and Perl to be exact) in Ruby and distribute it. This
introduces a problem though: how do you ensure that all the dependencies of
both the Ruby and underlying code (e.g. Python) are installed? How do you
ensure that the right versions are available? In other words: dependency
management.</p><p>To give an example, one of the underlying code bases was written in Perl and
vendored the dependencies in the Git repository of the project. Normally Perl
is easy to use: you just run it. However, this particular project used one Perl
package that had a C binding and thus had to be compiled upon installation.</p><p>In Perl you normally install packages using CPAN (or CPAN Minus). However, CPAN
is rolling release and thus only keeps track of the most recent version of each
package. This means that a package could break at any given time without us
knowing about it beforehand. Another problem is that CPAN might not always be
available or configured, or might require root access to install packages (this
depends on the configuration though). In other words, relying on CPAN would
probably make things too painful to deal with.</p><p>We decided to go down a different route: manually compile the package upon
installation. Since it was vendored and packaged along with the Ruby code this
in theory should not be too hard.</p><p>To achieve this we had to find a way to tap into the installation process of a
Gem. The only way to do this without requiring the user to run extra commands
after installing the Gem is to tap into the C extension build process. Since
this process is executed on the user's machine it allows you to inject
arbitrary actions. In other words, we had to hijack extconf.rb to compile the
Perl code.</p><p>To recap, building a C extension happens as follows:</p><ol><li>Download the Gem</li><li>Run the extconf.rb file(s) of the Gem to generate the Makefile(s)</li><li>Run <code>make</code> and <code>make install</code> for each Makefile to build and install the
corresponding extensions.</li><li>Move the generated extension file (e.g. <code>my_extension.so</code>) to the lib
directory of the Gem so that it becomes available in the load path.</li></ol><p>Our solution was as follows: use extconf.rb to compile the Perl code and use
a dummy Makefile to trick RubyGems into believing that the C extension was
built successfully. Without a valid Makefile RubyGems would otherwise just
abort the process.</p><p>As an example we'll build a Gem called "wat". The first step is to create a
basic Gem specification (only relevant code is shown here):</p><div class="highlight"><pre class="highlight"><code>Gem:<span class="ss">:Specification</span>.new <span class="k">do</span> |gem|
gem.name = <span class="s">'wat'</span>
gem.extensions = [<span class="s">'ext/wat/extconf.rb'</span>]
<span class="k">end</span>
</code></pre></div><p>In our case the extconf.rb file had to do two things: check for the required
dependencies (e.g. the "perl" command) and compile the extensions:</p><div class="highlight"><pre class="highlight"><code>require <span class="s">'mkmf'</span>
<span class="c"># Stops the installation process if one of these commands is not found in</span>
<span class="c"># $PATH.</span>
find_executable(<span class="s">'perl'</span>)
find_executable(<span class="s">'make'</span>)
<span class="c"># Create a dummy extension file. Without this RubyGems would abort the</span>
<span class="c"># installation process. On Linux this would result in the file "wat.so"</span>
<span class="c"># being created in the current working directory.</span>
<span class="c">#</span>
<span class="c"># Normally the generated Makefile would take care of this but since we</span>
<span class="c"># don't generate one we'll have to do this manually.</span>
<span class="c">#</span>
File.touch(File.join(Dir.pwd, <span class="s">'wat.'</span> + RbConfig:<span class="ss">:CONFIG</span>[<span class="s">'DLEXT'</span>]))
directories_with_perl_code.each <span class="k">do</span> |directory|
Dir.chdir(directory) <span class="k">do</span>
sh <span class="s">'perl Makefile.PL PREFIX=path/to/local/installation LIB=path/to/local/lib'</span>
sh <span class="s">'make && make install && make clean'</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="c"># This is normally set by calling create_makefile() but we don't need that</span>
<span class="c"># method since we'll provide a dummy Makefile. Without setting this value</span>
<span class="c"># RubyGems will abort the installation.</span>
$makefile_created = <span class="k">true</span>
</code></pre></div><p>This takes care of ensuring our dependencies are there, the Perl code is
compiled, and RubyGems doesn't abort the installation process.</p><p>Next up we'll need to create a dummy Makefile. This Makefile goes in the same
directory as the extconf.rb file and looks pretty simple:</p><pre><code>all:
	true
install:
	true
</code></pre><p>The <code>true</code> commands are used to ensure that the commands run successfully;
once again, RubyGems would abort the installation if one of them failed.</p><p>This solution, as dirty as it may sound, was actually surprisingly elegant. Of
course you should not use this as an excuse to turn RubyGems into a universal
package manager. However, if you need to take care of some basic dependency
management or need to run arbitrary commands upon installation it's not even
that bad. And no, I did not do drugs while writing that.</p><p>After discussing this with <a href="https://github.com/whitequark/">Peter Zotov</a> it turns out that the
above process can be done a bit more easily by using a Rakefile instead of an
extconf.rb file. An example of a project using this approach is
<a href="https://github.com/ruby-llvm/ruby-llvm/blob/master/ruby-llvm.gemspec">ruby-llvm</a>. I haven't investigated this option myself so I can't
tell for certain though.</p><h2 id="using-a-rakefile">Using a Rakefile</h2><p>After writing this article it was discovered that the above process can be made
significantly easier by using a Rakefile. To be more exact, any file that does
not match the following pattern can be used without having to create the above
dummy files:</p><div class="highlight"><pre class="highlight"><code>/\A(extconf|makefile).rb\z/
</code></pre></div><p>This information is based on <a href="https://github.com/ruby/ruby/blob/34f5700a0947243198dea5461b80fa8be5ba19ea/lib/mkmf.rb#L2598-L2600">this</a> code. These particular lines of
code cause the installation process to fail (since mkmf exits with an
unsuccessful exit status) if the filename of an extension matches the above
pattern and the variable <code>$extmk</code> is set to <code>false</code>.</p><p>In our particular use case this meant that I could get rid of the dummy
Makefile and C extension file since it's actually mkmf that insists on these
files being created and not RubyGems. This in turn made the code considerably
smaller and much less of a hack.</p>
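<p>To sketch what this might look like (a hypothetical setup based on my reading
of the mkmf code above, not something I've actually shipped), the gemspec points
at a Rakefile and the work happens in its <code>default</code> task:</p><pre><code># wat.gemspec: point the extension at a Rakefile instead of extconf.rb.
Gem::Specification.new do |gem|
  gem.name       = 'wat'
  gem.extensions = ['ext/wat/Rakefile']
end

# ext/wat/Rakefile: RubyGems runs this file with Rake during installation,
# so the Perl compilation from before can happen here without any dummy
# files. directories_with_perl_code is again a placeholder, and sh() is
# available because this is a Rakefile.
task :default do
  directories_with_perl_code.each do |directory|
    Dir.chdir(directory) do
      sh 'perl Makefile.PL PREFIX=path/to/local/installation LIB=path/to/local/lib'
      sh 'make && make install && make clean'
    end
  end
end
</code></pre>https://yorickpeterse.com/articles/debugging-with-pry/Debugging With Pry2011-11-27T00:00:00Z2011-11-27T00:00:00Z<p>Pry is a REPL (Read Eval Print Loop) that was written as a better alternative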
to IRB. It comes with syntax highlighting, code indentation that actually works
and several other features that make it easier to debug code. I stumbled upon
Pry when looking for an alternative to both IRB and the way I was debugging my
code (placing <code>puts</code> all over the place, I think it's called "Puts Driven
Development").</p><p>Pry tries to do a lot of things and I was actually quite surprised how well it
does that. It might not stick to the Unix idea of only doing a single thing
(and doing that very well) but it makes my life (and the lives of others) so
much easier that it's easy to forgive.</p><p>Pry is primarily meant to be used as a REPL. There are a lot of things that
make Pry so much more pleasant to use than IRB. One of the things almost any
Ruby programmer will notice when using IRB is that its indentation support is a
bit clunky. Indenting itself works fine most of the time but it fails to
un-indent code properly as illustrated in the code snippet below (pasted
directly from an IRB session):</p><pre><code>ruby-1.9.3-p0 :001 > class User
ruby-1.9.3-p0 :002?> def greet
ruby-1.9.3-p0 :003?> puts "Hello world"
ruby-1.9.3-p0 :004?> end
ruby-1.9.3-p0 :005?> end
</code></pre><p>Luckily Pry handles this just fine, whether you're trying to indent a class or
a hash containing an array containing a proc and so on. Pry does this by
resetting the terminal output every time a new line is entered. The downside of
this approach is that it only works on terminals that understand ANSI escape
codes. In Pry the above example works as it should:</p><pre><code>[1] pry(main)> class User
[1] pry(main)* def greet
[1] pry(main)* puts "Hello world"
[1] pry(main)* end
[1] pry(main)* end
</code></pre><p>Besides indentation Pry does a lot more. A feature that I think is very cool is
the ability to show documentation and source code of methods right in your REPL
(sadly this feature doesn't work with classes or modules at the time of
writing). This means that you no longer have to use the <code>ri</code> command to
search documentation for methods. You also don't need to install the RDoc
documentation as Pry pulls it directly from the source code. Showing the source
code of a method or its documentation can be done by using the <code>show-method</code>
and <code>show-doc</code> commands. For example, invoking <code>show-method pry</code> in a Pry
session would give you the following output:</p><pre><code>[1] pry(main)> show-method pry
From: /path/trimmed/for/readability/lib/pry/core_extensions.rb @ line 19:
Number of lines: 3
Owner: Object
Visibility: public
def pry(target=self)
Pry.start(target)
end
</code></pre><p>Calling <code>show-doc pry</code> would instead show the following:</p><pre><code>[2] pry(main)> show-doc pry
From: /path/trimmed/for/readability/lib/pry/core_extensions.rb @ line 19:
Number of lines: 17
Owner: Object
Visibility: public
Signature: pry(target=?)
Start a Pry REPL.
This method differs from Pry.start in that it does not
support an options hash. Also, when no parameter is provided, the Pry
session will start on the implied receiver rather than on
top-level (as in the case of Pry.start).
It has two forms of invocation. In the first form no parameter
should be provided and it will start a pry session on the
receiver. In the second form it should be invoked without an
explicit receiver and one parameter; this will start a Pry
session on the parameter.
param [Object, Binding] target The receiver of the Pry session.
example First form
"dummy".pry
example Second form
pry "dummy"
example Start a Pry session on current self (whatever that is)
pry
</code></pre><p>You can also run these commands for code that was written in C. This requires
you to install the gem <code>pry-doc</code> (<code>gem install pry-doc</code>). Do note that this
only works for core C code, currently Pry does not support this for third party
extensions.</p><p>Another very cool feature is that Pry can be used as a debugging tool for your
code without having to manually jump into a session. By loading Pry, which can
be done by writing <code>require "pry"</code> or by using the option <code>-r pry</code> when
invoking Ruby you gain access to everything Pry has to offer. The most useful
tool is <code>binding.pry</code>. This method starts a Pry session and pauses the
script.</p><p>Lets say you have the following script and want to see the values of the
variables:</p><div class="highlight"><pre class="highlight"><code>language = <span class="s">'Ruby'</span>
number = <span class="mi">10</span>
<span class="c"># Do something awesome with the above variables.</span>
</code></pre></div><p>The typical approach would be to insert a puts statement above the comment
followed by an exit statement. Pry in a way can do a similar thing, it just
makes it a lot more awesome. If you modify the script as following you can
truly debug your code like a boss:</p><div class="highlight"><pre class="highlight"><code>language = <span class="s">'Ruby'</span>
number = <span class="mi">10</span>
binding.pry
<span class="c"># Do something awesome with the above variables.</span>
</code></pre></div><p>If you now run the script by calling <code>ruby -r pry file.rb</code> you get a fancy
Pry session:</p><pre><code>[yorickpeterse@Wifi-Ninja in ~]$ ruby -r pry file.rb
From: file.rb @ line 4 in Object#N/A:
1: language = 'Ruby'
2: number = 10
3:
=> 4: binding.pry
5:
6: # Do something awesome with the above variables.
[1] pry(main)>
</code></pre><p>A nice thing about starting Pry this way is that it starts in the context of
the call to <code>binding.pry</code> meaning you get access to data such as the local
variables. These can be displayed by calling <code>ls</code> or by simply typing their
name.</p><pre><code>[yorickpeterse@Wifi-Ninja in ~]$ ruby -r pry file.rb
From: file.rb @ line 4 in Object#N/A:
1: language = 'Ruby'
2: number = 10
3:
=> 4: binding.pry
5:
6: # Do something awesome with the above variables.
[1] pry(main)> ls
self methods: include private public to_s
locals: _ _dir_ _ex_ _file_ _in_ _out_ _pry_ language number
[2] pry(main)> number
=> 10
[3] pry(main)>
</code></pre><p>Moving out of the "breakpoint" (or moving to the next one if you have multiple
ones defined) can be done by hitting <code>^D</code> (Ctrl+D usually).</p><p>Besides the features mentioned in this article Pry has several more. For
example, long output is piped to Less. This can be quite useful if you're
trying to display a big hash using <code>pp</code>. The full list of features can be
found on the <a href="http://pry.github.com/">Pry website</a> as well as by invoking the <code>help</code>
command inside a Pry session. If you're in need of help or have any suggestions
you can join the IRC channel #pry on the Freenode network (irc.freenode.net).
The source code of Pry is hosted on <a href="http://github.com/pry/pry">Github</a>.</p>https://yorickpeterse.com/articles/use-bcrypt-fool/Use BCrypt Fool!2011-04-13T09:41:00Z2011-04-13T09:41:00Z<p>Almost any application will eventually need to store a collection of passwords
or another type of data that has to be stored using a hashing algorithm. Blogs,
forums, issue trackers, they all need to store user data and these passwords.
This article covers the common mistakes made when dealing with passwords and
what you should use instead. In order to fully understand this article some
basic knowledge of programming and computers is required, you should also know
a bit about the common hashing algorithms such as MD5 and SHA1.</p><h2 id="the-problem">The Problem</h2><p>When developing applications developers make the common mistake of thinking
they have a solid understanding of how hashing works. They think that by doing
X they're done and perfectly safe. Guess what, that's not the case (not even
close). The following mistakes are the most common:</p><ul><li>Using a broken algorithm (MD5, SHA1)</li><li>Hashing a password N times in the form of hash( hash(password) ) * N</li><li>Limiting the length of passwords to N characters</li></ul><p>We'll start with the first problem. Up until a few years ago MD5 was the most
common hashing algorithm used for passwords (and other data as well). MD5 was
considered to be pretty safe until a group of people managed to prove how weak
it really was: they were able to generate a set of collisions in a relatively
short amount of time (a few hours or so). This set off a chain reaction and
many more flaws were found.</p><p>Luckily MD5 isn't the only hashing algorithm out there, there's SHA1 and the
SHA2 family as well as a few other ones. SHA1-SHA2 are much strong than MD5 and
at the time of writing (April 2011) only SHA1 has been compromised. Technically
it would take serious amount of time to crack SHA1 but the idea of using an
algorithm that *can* be cracked before humanity is wiped out should be enough
for people to not use it for privacy related data.</p><p>So why are collisions bad? Can't we just use a very very long password or use
method X (insert your favorite counter measure)? Yes, you can. The problem
however isn't fixed, you're merely making the process slower rather than fixing
the actual root of the problem. Time for an example. Assuming we have a
hashing function called "hash" and two strings, A and B (where A and B are
unique), our hashing process of these strings would look like the following:</p><pre><code>pwd1 = hash(A)
pwd2 = hash(B)
</code></pre><p>In this case both pwd1 and pwd2 are unique. At this point a lot of people think
they're good to go as they assume nobody is willing to wait for a certain
period of time before they're able to crack the password, this is a *very*
stupid mistake. While trying to crack a password (by bruteforcing it for
example) may take a long time on a single computer most hackers can easily boot
up a few servers or even worse, use a botnet. All known hashing algorithms
(except BCrypt, more on that later) are affected by a single common problem:
<a href="https://secure.wikimedia.org/wikipedia/en/wiki/Moore's_law">Moore's law</a>. Moore's law states that every two years the amount
of transistors that can be put in a computer doubles. This means that the
faster computers get the quicker they're able to crack a password. A hacker
merely has to use N computers and the time required to retrieve the original
password will be greatly reduced.</p><p>Because of this problem developers try to come up with solutions. These
solutions don't actually solve the problem, they just make it harder and
require more time. A common "fix" is to hash a password N times and then save
it in the database. Developers do this for a few reasons:</p><ul><li>It's supposed to be slower</li><li>In order to retrieve the original password a hacker has to crack multiple
hashes instead of only one.</li></ul><p>The fun thing is that this entire process doesn't actually make the password
more secure. The first reason is pretty easy to bust: simply add more hardware
(or better hardware) and you're good to go. The second reason is a bit harder
to bust as it depends on the algorithm that is used. If we look back at our
hash() function the process of hashing a hash multiple times would look like
the following:</p><pre><code>hash = hash( hash(hash(A)) )
</code></pre><p>In this example there are 3 calls to the hashing function. If A was "yorick"
this would look a bit like the following:</p><pre><code>hash(yorick) -> j238103
hash(j238103) -> a9shda9
hash(a9shda9) -> 11s08j1
</code></pre><p>In this case "11s08j1" is the final hash that will be stored in our database.
At this point developers usually lay down their work and take a coffee or a tea
thinking they've done a good job and are hacker proof. Guess what, they're not.
What just happened is that the process of hashing A multiple times actually
increased the possibility of a hash collision. While we do have to crack the
hashing process N times for each call to hash() we don't actually have to start
at the very end (with "11s08j1"). The reason for this is that "11s08j1" isn't
directly based on "yorick" but on "a9shda9". This means that we merely have to
find the hash that results in "11s08j1" when using our hash function. If we
find a collision we can simply crack it again and we'd end up with our original
password.</p><p>In order to explain this properly I simplified the process of hashing A N times:</p><pre><code>password --> hash 1 --> hash 2 --> final hash
</code></pre><p>In order to retrieve the original password ("password") we'd have to find a
collision for "hash 2". We can't use hash 1 as it's source ("password") can be
considered totally random and would take more time. However, the source of hash
2 is much easier due one big issue: the entropy (the amount of possible
combinations) of the password has been decreased. If we look back at the
previous example we know the final hash is "11s08j1" and that the original
password is "yorick". Using various techniques (rainbow tables, bruteforcing,
etc) we can quickly identify the source of "final hash". The value of "hash 2"
is "a9shda9", while in our example this looks more random (it is) than the
original password common hashing algorithms only use regular characters
(letters and numbers) for their output. A good example of this is the following
Ruby example:</p><div class="highlight"><pre class="highlight"><code>require <span class="s">'digest'</span>
password = <span class="s">'as9(A*&SD&(@))'</span>
hash = Digest:<span class="ss">:SHA1</span>.new.hexdigest(password)
p hash <span class="c"># => "d4c36f9b1f003bee2e5dcafdf6b006110709dfb5"</span>
</code></pre></div><p>The hash of the password (which is just something I randomly typed on my
keyboard) may be longer but it only uses letters and numbers opposed to all the
gibberish in the original password. The same happens with our hash() function
and this allows us to quickly retrieve the original password. If we have the
original hash of "final hash" we can then simply continue reversing the process
until we end up at "yorick".</p><p>The reason why you can't initially find the source of "hash 2" is because you
can't find out what "hash 1" is because it's not stored somewhere while "final
hash" is.</p><p>To cut a long story short, hashing a hash N times doesn't make your passwords
more secure and can actually make it less secure as a hacker can quite easily
reverse the process by generating hash collisions.</p><h2 id="the-solution">The Solution</h2><p>It has already been mentioned before but the solution is to use an algorithm
called "BCrypt". BCrypt is a hashing algorithm based on <a href="http://en.wikipedia.org/wiki/Blowfish_(cipher)">Blowfish</a>
with a small twist: it keeps up with Moore's law. The idea of BCrypt is quite
simple, don't just use regular characters (and thus increasing the entropy) and
make sure password X always takes the same amount of time regardless of how
powerful the hardware is that's used to generate X. I'm not going to cover all
the technical details but basically BCrypt requires you to specify a
cost/workfactor in order to generate a password. This workfactor not only makes
the entire process slower but is also used to generate the end hash. This means
that if somebody were to change the workfactor the hash would also be
different. In other words, hackers, you're fucked. In order for a hacker to
gain the original password he must use the same workfactor and thus has to wait
N times longer than when not using a workfactor.</p><p>Time for an example in Ruby:</p><div class="highlight"><pre class="highlight"><code>require <span class="s">'benchmark'</span>
require <span class="s">'bcrypt'</span>
password = <span class="s">'yorick'</span>
amount = <span class="mi">100</span>
Benchmark.bmbm(<span class="mi">20</span>) <span class="k">do</span> |run|
run.report(<span class="s">"Cost of 5"</span>) <span class="k">do</span>
amount.times <span class="k">do</span>
hash = BCrypt:<span class="ss">:Password</span>.create(password, <span class="ss">:cost</span> => <span class="mi">5</span>)
<span class="k">end</span>
<span class="k">end</span>
run.report(<span class="s">"Cost of 10"</span>) <span class="k">do</span>
amount.times <span class="k">do</span>
hash = BCrypt:<span class="ss">:Password</span>.create(password, <span class="ss">:cost</span> => <span class="mi">10</span>)
<span class="k">end</span>
<span class="k">end</span>
run.report(<span class="s">"Cost of 15"</span>) <span class="k">do</span>
amount.times <span class="k">do</span>
hash = BCrypt:<span class="ss">:Password</span>.create(password, <span class="ss">:cost</span> => <span class="mi">15</span>)
<span class="k">end</span>
<span class="k">end</span>
<span class="k">end</span>
</code></pre></div><p>For the non Ruby people, this is a simple benchmark script that shows the time
it takes to hash "yorick" with BCrypt with a cost/workfactor of 5, 10 and 15 a
total of 100 times. The results of this benchmark would look like the
following:</p><pre><code>Rehearsal -------------------------------------------------------
Cost of 5 0.250000 0.000000 0.250000 ( 0.249723)
Cost of 10 7.740000 0.010000 7.750000 ( 7.879849)
Cost of 15 247.510000 0.460000 247.970000 (255.346897)
-------------------------------------------- total: 255.970000sec
user system total real
Cost of 5 0.250000 0.000000 0.250000 ( 0.272549)
Cost of 10 7.750000 0.030000 7.780000 ( 8.442511)
Cost of 15 247.530000 0.480000 248.010000 (254.815985)
</code></pre><p>The column we're really interested in is the "real" column. As you can see a
cost of 5 only takes about 250 miliseconds while a cost of 15 takes a whopping
250 seconds (around 4 minutes).</p><p>To cut another long story short: BCrypt adopts to Moore's law and makes it
impossible for a hacker to crack a password using rainbow tables or other
techniques.</p><h2 id="implementations">Implementations</h2><p>The BCrypt hashing algorithm is implemented in quite a few languages. I've
collected a list of resources for various languages so you can start using
BCrypt right away.</p><h3 id="php">PHP</h3><p>PHP allows you to use BCrypt passwords using the <a href="http://nl3.php.net/manual/en/function.crypt.php">crypt()</a> function.
This works as following:</p><div class="highlight"><pre class="highlight"><code><?php
$hash = crypt(<span class="s">'rasmuslerdorf'</span>, <span class="s">'$2a$07$usesomesillystringforsalt$'</span>);
</code></pre></div><h3 id="ruby">Ruby</h3><p>For Ruby there's a gem called "bcrypt-ruby" which can be installed using
Rubygems:</p><pre><code>$ gem install bcrypt-ruby
</code></pre><p>Once installed you can use it as following:</p><pre><code>require 'bcrypt'
hash = BCrypt::Password.create('yorick', :cost => 10)
</code></pre><h3 id="perl">Perl</h3><p>For Perl there's <a href="http://search.cpan.org/dist/Crypt-Eksblowfish/">Crypt::Eksblowfish</a> which works as following:</p><pre><code data-language="perl">use Crypt::Eksblowfish::Bcrypt qw(bcrypt_hash);
$salt = '1p23j1-9381-23';
$password = 'yorick';
$hash = bcrypt_hash({
key_nul => 1,
cost => 10,
salt => $salt,
}, $password);
</code></pre><h3 id="others">Others</h3><ul><li>Python has <a href="https://github.com/dlitz/pycrypto">The Python Cryptography Toolkit</a></li><li>Lua seems to have <a href="https://github.com/silentbicycle/lua-bcrypt">this</a> implementation</li><li>There's an <a href="https://github.com/skarab/erlang-bcrypt">Erlang implementation</a> as well</li></ul><h2 id="special-thanks">Special Thanks</h2><p>I'd like to thank the following IRC folks for helping me out (all of them can
be found on Freenode):</p><ul><li>squeeks from #forrst-chat</li><li>amr from #forrst-chat</li><li>dominikh from #ramaze</li></ul>