<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://engineering.qubecinema.com/feed.xml" rel="self" type="application/atom+xml" /><link href="https://engineering.qubecinema.com/" rel="alternate" type="text/html" /><updated>2025-04-01T08:19:11+00:00</updated><id>https://engineering.qubecinema.com/feed.xml</id><title type="html">Qube Cinema Engineering</title><subtitle>Sharing the technical challenges we go through in our day to day product development.</subtitle><entry><title type="html">Load Balancing iPerf3 Servers</title><link href="https://engineering.qubecinema.com/2020/08/08/load-balancing-iperf3-servers.html" rel="alternate" type="text/html" title="Load Balancing iPerf3 Servers" /><published>2020-08-08T00:00:00+00:00</published><updated>2020-08-08T00:00:00+00:00</updated><id>https://engineering.qubecinema.com/2020/08/08/load-balancing-iperf3-servers</id><content type="html" xml:base="https://engineering.qubecinema.com/2020/08/08/load-balancing-iperf3-servers.html"><![CDATA[<p>Recently I had to set up a TCP load balancer for iperf3 server to allow simultaneous tests from multiple iperf3 clients. <a href="https://iperf.fr/">iperf3</a> is a tool to measure network performance, and I used <a href="https://balance.inlab.net/">Balance</a> as the load balancer. This post describes the need and steps to run iperf3 servers behind a TCP load balancer running on the same host.</p>

<p>It all started with the need to measure 4G internet bandwidth at strategic locations across India. I quickly wrote a bash script to run iperf3 client against an iperf3 server hosted by us. The bandwidth measurement results are uploaded to another server in the cloud. A <code class="language-plaintext highlighter-rouge">systemd</code> timer periodically runs the script.</p>

<p>Everything worked as expected in the beginning. But soon we started seeing many occurrences of the error <code class="language-plaintext highlighter-rouge">iperf3: error - the server is busy running a test. try again later</code> all over our results. My investigation revealed that, by design, an iperf3 server doesn’t allow simultaneous tests from more than one client. The actual reason is unknown to me; it may be a historic design decision.</p>

<p>The conventional solution to such a problem is to run iperf3 servers on multiple machines/containers and expose them through a layer 4 load balancer like AWS NLB. But I didn’t choose that approach due to the additional cost and complexity. Instead, I chose to use a simple TCP load balancer called <a href="https://balance.inlab.net/">Balance</a>.</p>

<h2 id="the-setup">The setup</h2>

<p>The plan is simple: run multiple instances/processes of the iperf3 server on the same host but on different ports. Then we run Balance, specifying the load balancing port and the target hosts. In our case, the hostname of every target host is <code class="language-plaintext highlighter-rouge">localhost</code>, as both the iperf3 servers and the load balancer are running on the same machine. So the only difference between the target hosts is the port. The command lines for the setup are below.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
</pre></td><td class="rouge-code"><pre>
<span class="nv">$ </span>iperf3 <span class="nt">-s</span> <span class="nt">-p</span> 2101 &amp;
<span class="nv">$ </span>iperf3 <span class="nt">-s</span> <span class="nt">-p</span> 2102 &amp;
<span class="nv">$ </span>iperf3 <span class="nt">-s</span> <span class="nt">-p</span> 2103 &amp;
<span class="nv">$ </span>iperf3 <span class="nt">-s</span> <span class="nt">-p</span> 2104 &amp;
<span class="nv">$ </span>iperf3 <span class="nt">-s</span> <span class="nt">-p</span> 2105 &amp;
<span class="nv">$ </span>balance <span class="nt">-f</span> 2100 localhost:2101 localhost:2102 localhost:2103 localhost:2104 localhost:2105

</pre></td></tr></tbody></table></code></pre></div></div>

<p>Once the setup was ready, I tried testing with an iperf3 client running on my laptop. The client connected, but the test didn’t progress; the client simply hung. I could also see the client connection on the servers. But strangely, two servers reported the same client connection. After a bit of web searching, I found this <a href="https://github.com/esnet/iperf/issues/823">iperf3 issue</a>. It seems iperf3 uses two connections for its operation: one for control and one for data. In the load balancing setup, these two connections ended up on different iperf3 servers because they were balanced on a round-robin basis. This explains why two iperf3 servers reported the same client connection and why the test didn’t progress.</p>

<p>I know AWS load balancers allow us to configure sticky sessions for such use cases. Balance doesn’t seem to provide similar functionality. But it does support hash-based balancing, which uses a hash of the client address to determine the target host. Though it is not the same as a sticky session, it satisfies our need to use the same target host for all connections from a client. The command-line option to enable hash balancing is <code class="language-plaintext highlighter-rouge">%</code>. So the revised command line looks like the one below.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>...
<span class="nv">$ </span>balance <span class="nt">-f</span> 2100 localhost:2101 localhost:2102 localhost:2103 localhost:2104 localhost:2105 %
</pre></td></tr></tbody></table></code></pre></div></div>
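
<p>To make the difference between the two strategies concrete, here is a minimal Go sketch. It is not how Balance is implemented: the function names, the FNV-1a hash, and the sample client address are assumptions chosen purely for illustration. It only shows why round-robin can split a client's control and data connections across servers, while hashing the client address keeps them together.</p>

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// pickRoundRobin returns the backend for the n-th accepted connection,
// ignoring which client opened it.
func pickRoundRobin(backends []string, n int) string {
	return backends[n%len(backends)]
}

// pickByClientHash maps a client address to a backend. FNV-1a is an
// arbitrary choice for this sketch, not necessarily what Balance uses.
func pickByClientHash(backends []string, clientIP string) string {
	h := fnv.New32a()
	h.Write([]byte(clientIP))
	return backends[h.Sum32()%uint32(len(backends))]
}

func main() {
	backends := []string{"localhost:2101", "localhost:2102", "localhost:2103"}
	client := "203.0.113.7" // documentation-range address, purely illustrative

	// Round-robin: consecutive connections land on different servers.
	fmt.Println("round-robin control:", pickRoundRobin(backends, 0))
	fmt.Println("round-robin data:   ", pickRoundRobin(backends, 1))

	// Hash-based: every connection from the same client shares one server.
	fmt.Println("hashed control:", pickByClientHash(backends, client))
	fmt.Println("hashed data:   ", pickByClientHash(backends, client))
}
```

<p>One consequence of hashing on the client address is that all clients behind the same public address map to the same iperf3 server, so they can still collide with each other.</p>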

<h2 id="putting-it-together">Putting it together</h2>

<p>Though the above command lines do the job, they don’t make it easy to start and stop everything quickly. So I crafted the following bash script to run them with a single command and stop them with a <strong>Ctrl+C</strong> key press. I hope it is useful to someone.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
</pre></td><td class="rouge-code"><pre>
<span class="c">#!/bin/bash</span>

<span class="nb">trap</span> <span class="s2">"kill 0"</span> EXIT

<span class="nv">lb_port</span><span class="o">=</span>5100
<span class="nv">port_start</span><span class="o">=</span>5101
<span class="nv">port_end</span><span class="o">=</span>5105
<span class="nv">hosts</span><span class="o">=</span><span class="s2">""</span>

<span class="k">for</span> <span class="o">((</span> <span class="nv">port</span><span class="o">=</span><span class="nv">$port_start</span><span class="p">;</span> port&lt;<span class="o">=</span><span class="nv">$port_end</span><span class="p">;</span> port++ <span class="o">))</span>
<span class="k">do
    </span><span class="nb">echo</span> <span class="s2">"start iperf3 server on port </span><span class="nv">$port</span><span class="s2">..."</span>
    iperf3 <span class="nt">-s</span> <span class="nt">-p</span> <span class="nv">$port</span> &amp;
    
    <span class="nv">hosts</span><span class="o">=</span><span class="s2">"</span><span class="nv">$hosts</span><span class="s2"> localhost:</span><span class="nv">$port</span><span class="s2">"</span>
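<span class="k">done</span>
balance <span class="nt">-f</span> <span class="nv">$lb_port</span> <span class="nv">$hosts</span> %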
</pre></td></tr></tbody></table></code></pre></div></div>

<h2 id="links">Links</h2>

<ol>
  <li><a href="https://iperf.fr/">iperf3 project page</a></li>
  <li><a href="https://balance.inlab.net/">Balance project page</a></li>
  <li><a href="https://linux.die.net/man/1/balance">Balance man page</a></li>
</ol>]]></content><author><name>sivachandran</name></author><category term="linux" /><summary type="html"><![CDATA[Recently I had to set up a TCP load balancer for iperf3 server to allow simultaneous tests from multiple iperf3 clients. iperf3 is a tool to measure network performance, and I used Balance as the load balancer. This post describes the need and steps to run iperf3 servers behind a TCP load balancer running on the same host.]]></summary></entry><entry><title type="html">Go Omitempty Gotcha!</title><link href="https://engineering.qubecinema.com/2019/12/13/problem-of-zero-values-with-omitempty-json-tag.html" rel="alternate" type="text/html" title="Go Omitempty Gotcha!" /><published>2019-12-13T00:00:00+00:00</published><updated>2019-12-13T00:00:00+00:00</updated><id>https://engineering.qubecinema.com/2019/12/13/problem-of-zero-values-with-omitempty-json-tag</id><content type="html" xml:base="https://engineering.qubecinema.com/2019/12/13/problem-of-zero-values-with-omitempty-json-tag.html"><![CDATA[<p>This post is about the experience we had with the <code class="language-plaintext highlighter-rouge">omitempty</code> JSON tag while constructing JSON in Golang. <code class="language-plaintext highlighter-rouge">omitempty</code> drops a field from the output when it holds the zero value of its type. This causes problems when the zero value is itself a meaningful value for an attribute.</p>

<p>I started experiencing this problem when I was adding a new API. Our distribution platform archives movie content after N days. I was implementing an API that lists movies from the archives along with information like the size of each movie and the time taken for retrieving every movie from the archives.</p>

<h3 id="model">Model:</h3>

<p>This is the <code class="language-plaintext highlighter-rouge">struct</code> definition we use to list movies.</p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
</pre></td><td class="rouge-code"><pre><span class="k">type</span> <span class="n">Response</span> <span class="k">struct</span> <span class="p">{</span>
  <span class="n">Movies</span> <span class="p">[]</span><span class="n">Movie</span> <span class="s">`json:"movies"`</span>
<span class="p">}</span>

<span class="k">type</span> <span class="n">Movie</span> <span class="k">struct</span> <span class="p">{</span>
  <span class="n">ID</span>                    <span class="kt">string</span> <span class="s">`json:"id"`</span>
  <span class="n">AvailabilityInMinutes</span> <span class="kt">int</span>    <span class="s">`json:"availabilityInMinutes,omitempty"`</span>
  <span class="n">Size</span>                  <span class="kt">uint64</span> <span class="s">`json:"size,omitempty"`</span>
<span class="p">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Let us consider that this API returns 2 movies of which the first movie is available immediately while the other movie is not present in the archives.</p>

<h2 id="data-initialisation">Data Initialisation</h2>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
</pre></td><td class="rouge-code"><pre><span class="n">movies</span> <span class="o">:=</span> <span class="p">[]</span><span class="n">Movie</span><span class="p">{</span>
  <span class="p">{</span>
    <span class="n">ID</span><span class="o">:</span> <span class="s">"3f00c091-b29d-4b99-8136-5cb65c987250"</span><span class="p">,</span>
    <span class="n">AvailabilityInMinutes</span><span class="o">:</span> <span class="m">0</span><span class="p">,</span>
    <span class="n">Size</span><span class="o">:</span> <span class="m">54626741</span><span class="p">,</span>
  <span class="p">},</span>
  <span class="p">{</span>
    <span class="n">ID</span><span class="o">:</span> <span class="s">"04ea246b-7e7a-46a4-ba56-867117a90610"</span><span class="p">,</span>
  <span class="p">},</span>
<span class="p">}</span>

<span class="n">moviesJSON</span><span class="p">,</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">json</span><span class="o">.</span><span class="n">Marshal</span><span class="p">(</span><span class="n">Response</span><span class="p">{</span><span class="n">Movies</span><span class="o">:</span> <span class="n">movies</span><span class="p">})</span>
<span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
  <span class="n">handleErr</span><span class="p">(</span><span class="n">err</span><span class="p">)</span>
<span class="p">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Ideally, for the first movie, the API should list all the properties. The API should only list the <code class="language-plaintext highlighter-rouge">id</code> property for the second movie since the system doesn’t have any information about the movie.</p>

<h3 id="expected-response">Expected Response:</h3>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
</pre></td><td class="rouge-code"><pre><span class="p">{</span><span class="w">
  </span><span class="nl">"movies"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
    </span><span class="p">{</span><span class="w">
      </span><span class="nl">"id"</span><span class="p">:</span><span class="w"> </span><span class="s2">"3f00c091-b29d-4b99-8136-5cb65c987250"</span><span class="p">,</span><span class="w">
      </span><span class="nl">"availabilityInMinutes"</span><span class="p">:</span><span class="w"> </span><span class="mi">0</span><span class="p">,</span><span class="w">
      </span><span class="nl">"size"</span><span class="p">:</span><span class="w"> </span><span class="mi">54626741</span><span class="w">
    </span><span class="p">},</span><span class="w">
    </span><span class="p">{</span><span class="w">
      </span><span class="nl">"id"</span><span class="p">:</span><span class="w"> </span><span class="s2">"04ea246b-7e7a-46a4-ba56-867117a90610"</span><span class="w">
    </span><span class="p">}</span><span class="w">
  </span><span class="p">]</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></pre></td></tr></tbody></table></code></pre></div></div>

<h2 id="the-problem">The Problem</h2>

<p>However, the response that we got was:</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
</pre></td><td class="rouge-code"><pre><span class="p">{</span><span class="w">
  </span><span class="nl">"movies"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
    </span><span class="p">{</span><span class="w">
      </span><span class="nl">"id"</span><span class="p">:</span><span class="w"> </span><span class="s2">"3f00c091-b29d-4b99-8136-5cb65c987250"</span><span class="p">,</span><span class="w">
      </span><span class="nl">"size"</span><span class="p">:</span><span class="w"> </span><span class="mi">54626741</span><span class="w">
    </span><span class="p">},</span><span class="w">
    </span><span class="p">{</span><span class="w">
      </span><span class="nl">"id"</span><span class="p">:</span><span class="w"> </span><span class="s2">"04ea246b-7e7a-46a4-ba56-867117a90610"</span><span class="w">
    </span><span class="p">}</span><span class="w">
  </span><span class="p">]</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></pre></td></tr></tbody></table></code></pre></div></div>

<p>The availability field is missing for both movies!
For the first movie, <code class="language-plaintext highlighter-rouge">availabilityInMinutes</code> has been omitted since it holds <code class="language-plaintext highlighter-rouge">0</code>, which is the zero value (the "empty" value) for the <code class="language-plaintext highlighter-rouge">int</code> datatype. For the second movie, <code class="language-plaintext highlighter-rouge">availabilityInMinutes</code> has been omitted since the field was never set and therefore also holds the zero value.
If we remove the <code class="language-plaintext highlighter-rouge">omitempty</code> option from that field, then the second movie, which is not present in the system, will also have <code class="language-plaintext highlighter-rouge">availabilityInMinutes</code> set to <code class="language-plaintext highlighter-rouge">0</code>, which is incorrect.</p>

<p>After discussing with the team, we decided to use the pointer type.</p>

<h2 id="the-solution">The solution:</h2>

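<p>The fix is to use a pointer to the primitive type. The zero value of a pointer is <code class="language-plaintext highlighter-rouge">nil</code>, so <code class="language-plaintext highlighter-rouge">omitempty</code> omits the field only when the pointer is unset; a pointer to <code class="language-plaintext highlighter-rouge">0</code> is not empty and is still marshaled.</p>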

<h3 id="updated-model">Updated Model</h3>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
</pre></td><td class="rouge-code"><pre><span class="k">type</span> <span class="n">Movie</span> <span class="k">struct</span> <span class="p">{</span>
  <span class="n">ID</span>                    <span class="kt">string</span> <span class="s">`json:"id"`</span>
  <span class="n">AvailabilityInMinutes</span> <span class="o">*</span><span class="kt">int</span>   <span class="s">`json:"availabilityInMinutes,omitempty"`</span>
  <span class="n">Size</span>                  <span class="kt">uint64</span> <span class="s">`json:"size,omitempty"`</span>
<span class="p">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<h3 id="data">Data</h3>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
</pre></td><td class="rouge-code"><pre><span class="n">movie1Availability</span> <span class="o">:=</span> <span class="m">0</span>
<span class="n">movies</span> <span class="o">:=</span> <span class="p">[]</span><span class="n">Movie</span><span class="p">{</span>
  <span class="p">{</span>
    <span class="n">ID</span><span class="o">:</span> <span class="s">"3f00c091-b29d-4b99-8136-5cb65c987250"</span><span class="p">,</span>
    <span class="n">AvailabilityInMinutes</span><span class="o">:</span> <span class="o">&amp;</span><span class="n">movie1Availability</span><span class="p">,</span>
    <span class="n">Size</span><span class="o">:</span> <span class="m">54626741</span><span class="p">,</span>
  <span class="p">},</span>
  <span class="p">{</span>
    <span class="n">ID</span><span class="o">:</span> <span class="s">"04ea246b-7e7a-46a4-ba56-867117a90610"</span><span class="p">,</span>
  <span class="p">},</span>
<span class="p">}</span>

<span class="n">moviesJSON</span><span class="p">,</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">json</span><span class="o">.</span><span class="n">Marshal</span><span class="p">(</span><span class="n">Response</span><span class="p">{</span><span class="n">Movies</span><span class="o">:</span> <span class="n">movies</span><span class="p">})</span>
<span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
  <span class="n">handleErr</span><span class="p">(</span><span class="n">err</span><span class="p">)</span>
<span class="p">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>On marshaling the above data, we get the following JSON:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
</pre></td><td class="rouge-code"><pre>{
  "movies": [
    {
      "id": "3f00c091-b29d-4b99-8136-5cb65c987250",
      "availabilityInMinutes": 0,
      "size": 54626741
    },
    {
      "id": "04ea246b-7e7a-46a4-ba56-867117a90610"
    }
  ]
}
</pre></td></tr></tbody></table></code></pre></div></div>

<p>And the problem was solved!</p>
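
<p>For anyone who wants to try this out, the behaviour can be reproduced with a small self-contained program. This is a trimmed-down sketch rather than our production model: the type names <code class="language-plaintext highlighter-rouge">valueMovie</code> and <code class="language-plaintext highlighter-rouge">pointerMovie</code> and the helper <code class="language-plaintext highlighter-rouge">mustJSON</code> are made up for the demo, and the <code class="language-plaintext highlighter-rouge">size</code> field is left out.</p>

```go
package main

import (
	"encoding/json"
	"fmt"
)

// valueMovie uses a plain int, so 0 and "not set" are indistinguishable.
type valueMovie struct {
	ID                    string `json:"id"`
	AvailabilityInMinutes int    `json:"availabilityInMinutes,omitempty"`
}

// pointerMovie uses *int, so nil means "not set" and a pointer to 0 means 0.
type pointerMovie struct {
	ID                    string `json:"id"`
	AvailabilityInMinutes *int   `json:"availabilityInMinutes,omitempty"`
}

// mustJSON marshals v and panics on error, which is acceptable in a demo.
func mustJSON(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	zero := 0

	// Plain int holding 0: the field is dropped.
	fmt.Println(mustJSON(valueMovie{ID: "a", AvailabilityInMinutes: 0}))
	// prints {"id":"a"}

	// Pointer to 0: the field is kept.
	fmt.Println(mustJSON(pointerMovie{ID: "a", AvailabilityInMinutes: &zero}))
	// prints {"id":"a","availabilityInMinutes":0}

	// Nil pointer: the field is dropped, which is what we want for unknown movies.
	fmt.Println(mustJSON(pointerMovie{ID: "b"}))
	// prints {"id":"b"}
}
```

<p>The price of the pointer approach is that every reader of the field has to check for <code class="language-plaintext highlighter-rouge">nil</code> before dereferencing it.</p>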

<h3 id="tldr">TL;DR</h3>

<p>Using a pointer (whose zero value is <code class="language-plaintext highlighter-rouge">nil</code>) in place of the primitive type lets <code class="language-plaintext highlighter-rouge">omitempty</code> distinguish an unset field from one explicitly set to its zero value.</p>]]></content><author><name>sidharth</name></author><category term="json" /><category term="go" /><summary type="html"><![CDATA[This post is about the experience we had with the omitempty JSON tag while constructing JSON in Golang. omitempty drops a field from the output when it holds the zero value of its type. This causes problems when the zero value is itself a meaningful value for an attribute.]]></summary></entry><entry><title type="html">SQLite Database Schema Migration Using Golang</title><link href="https://engineering.qubecinema.com/2019/09/20/sqlite-database-schema-migration-using-golang.html" rel="alternate" type="text/html" title="SQLite Database Schema Migration Using Golang" /><published>2019-09-20T00:00:00+00:00</published><updated>2019-09-20T00:00:00+00:00</updated><id>https://engineering.qubecinema.com/2019/09/20/sqlite-database-schema-migration-using-golang</id><content type="html" xml:base="https://engineering.qubecinema.com/2019/09/20/sqlite-database-schema-migration-using-golang.html"><![CDATA[<p>SQLite is a nifty choice for client side database applications. It is serverless, works with easily accessible cross-platform database files, and does not need any installation. However, occasionally, there may arise the need to modify the database schema. For instance, one of the tables might need a new column. Such a modification must not, in any manner, tamper with or corrupt any data that already exists on the database. This is where SQLite has a slight disadvantage, discussed in detail later in this post.</p>

<p>The Go programming language has libraries for creation and migration of SQLite databases. This post covers the basics of SQLite DB creation and migration using Go, and provides guidelines on how to update the DB schema without loss or corruption of data. This is based on one of our applications, where we encountered this problem of schema migration.</p>

<p><strong>Note:</strong> If you already know how to create and set up migrations for databases using Go, please skip the <a href="#let-us-go"><strong>Let Us GO!</strong></a> section below.</p>

<h2 id="migration-scripts-and-versioning">Migration Scripts And Versioning</h2>

<p>A good way of shipping out a client side database application is to equip it with the mechanism to create the DB itself, when launched for the first time. This can be done by embedding a set of scripts containing SQL commands to do the same. We refer to these files as migration scripts. The naming of these files is done in a specific format: <code class="language-plaintext highlighter-rouge">&lt;version&gt;_&lt;description&gt;.&lt;up/down&gt;.sql</code>. This conveys the order in which these must be executed.</p>
<ul>
  <li>The <code class="language-plaintext highlighter-rouge">version</code> is the numerical value used to determine the schema version of the database.</li>
  <li>The <code class="language-plaintext highlighter-rouge">description</code> denotes the changes made by the script. This is ignored in the migration process.</li>
  <li>The <code class="language-plaintext highlighter-rouge">up/down</code> suffix determines whether the script will be used to upgrade or downgrade the database.</li>
</ul>

<p>If the database for the application does not exist (as would be the case for the first launch), the migration code shall run the <code class="language-plaintext highlighter-rouge">up</code> scripts to create it. The schema version of the database shall be set to the latest <code class="language-plaintext highlighter-rouge">version</code> specified among the scripts. For subsequent launches, the migration code searches for any scripts that have a higher <code class="language-plaintext highlighter-rouge">version</code>, and executes the SQL statements in those scripts. For instance, if the DB schema version is <code class="language-plaintext highlighter-rouge">6</code>, only the scripts with version <code class="language-plaintext highlighter-rouge">7</code> or higher, if available, shall be used to update the schema.</p>
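
<p>The selection rule described above can be sketched in a few lines of Go. This is only an illustration of the naming convention and the version comparison, not the code that golang-migrate runs internally; the function <code class="language-plaintext highlighter-rouge">pendingUp</code>, the regular expression, and the sample file names are assumptions made for the example.</p>

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strconv"
)

// scriptName matches <version>_<description>.<up|down>.sql.
var scriptName = regexp.MustCompile(`^(\d+)_(.+)\.(up|down)\.sql$`)

// pendingUp returns the "up" scripts whose version is higher than current,
// in ascending version order. Names that do not match are ignored.
func pendingUp(files []string, current uint64) []string {
	type script struct {
		version uint64
		name    string
	}
	var pending []script
	for _, f := range files {
		m := scriptName.FindStringSubmatch(f)
		if m == nil || m[3] != "up" {
			continue
		}
		v, err := strconv.ParseUint(m[1], 10, 64)
		if err != nil || v <= current {
			continue
		}
		pending = append(pending, script{v, f})
	}
	sort.Slice(pending, func(i, j int) bool {
		return pending[i].version < pending[j].version
	})

	names := make([]string, 0, len(pending))
	for _, s := range pending {
		names = append(names, s.name)
	}
	return names
}

func main() {
	files := []string{
		"8_add_index.up.sql", "6_create_users.up.sql",
		"7_add_email.up.sql", "7_add_email.down.sql",
	}
	// With the schema at version 6, only versions 7 and 8 are pending.
	fmt.Println(pendingUp(files, 6))
}
```

<p>Passing <code class="language-plaintext highlighter-rouge">0</code> as the current version models the first launch, where every <code class="language-plaintext highlighter-rouge">up</code> script is pending.</p>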

<h2 id="let-us-go">Let Us GO!</h2>

<p>This post shall utilize the following Go libraries and packages for creation and migration of the DB:</p>
<ul>
  <li><a href="https://golang.org/pkg/database/sql">database/sql</a>: Generic interface around SQL databases.</li>
  <li><a href="https://github.com/mattn/go-sqlite3">go-sqlite3</a>: Database driver for SQLite. This shall be used in conjunction with <code class="language-plaintext highlighter-rouge">database/sql</code>.</li>
  <li><a href="https://github.com/golang-migrate/migrate">golang-migrate</a>: Used for database migrations.</li>
  <li><a href="https://github.com/jteeuwen/go-bindata">go-bindata</a>: Used for embedding the migration scripts into the application.</li>
</ul>

<p>The process to equip migration into the application is threefold:</p>
<ol>
  <li>Convert the migration scripts into embeddable binary form</li>
  <li>Create the database in the application</li>
  <li>Run the relevant migration scripts on the database</li>
</ol>

<h4 id="a-using-go-bindata">A. Using go-bindata</h4>

<p>The following command shall generate the binary data:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>go-bindata -o &lt;path-to-datafile&gt;.go -prefix "&lt;migration-scripts-dir&gt;" -pkg &lt;package-name&gt; &lt;migration-scripts-dir&gt;
</pre></td></tr></tbody></table></code></pre></div></div>
<p>This shall convert the scripts into binary form and store this information in the specified Go file. The <code class="language-plaintext highlighter-rouge">-pkg</code> option specifies the package for the generated Go file. The path to this file, and the migration scripts directory, should be in the directory of the specified package.</p>

<h4 id="b-creating-the-database">B. Creating The Database</h4>

<p>The <code class="language-plaintext highlighter-rouge">database/sql</code> package, along with the <code class="language-plaintext highlighter-rouge">go-sqlite3</code> driver, can be used to create the DB. The following code snippet performs this task:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
</pre></td><td class="rouge-code"><pre>import (
	"database/sql"

	_ "github.com/mattn/go-sqlite3"
	"github.com/pkg/errors"
)

func NewDB(dbPath string) (*sql.DB, error) {
	sqliteDb, err := sql.Open("sqlite3", dbPath)
	if err != nil {
		return nil, errors.Wrap(err, "failed to open sqlite DB")
	}

	return sqliteDb, nil
}

</pre></td></tr></tbody></table></code></pre></div></div>

<h4 id="c-running-the-migration-scripts">C. Running the Migration Scripts</h4>

<p>The generated Go data file contains the migration scripts listed as <code class="language-plaintext highlighter-rouge">Assets</code>. We shall use this in the migration function, as follows:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
</pre></td><td class="rouge-code"><pre>import (
	"database/sql"
	"fmt"

	"github.com/golang-migrate/migrate"
	"github.com/golang-migrate/migrate/database/sqlite3"
	bindata "github.com/golang-migrate/migrate/source/go_bindata"
)

// RunMigrateScripts applies every pending "up" script embedded via go-bindata.
func RunMigrateScripts(db *sql.DB) error {
	driver, err := sqlite3.WithInstance(db, &amp;sqlite3.Config{})
	if err != nil {
		return fmt.Errorf("creating sqlite3 db driver failed: %w", err)
	}

	// AssetNames and Asset come from the Go file generated by go-bindata.
	res := bindata.Resource(AssetNames(),
		func(name string) ([]byte, error) {
			return Asset(name)
		})

	d, err := bindata.WithInstance(res)
	if err != nil {
		return fmt.Errorf("loading embedded migration scripts failed: %w", err)
	}

	m, err := migrate.NewWithInstance("go-bindata", d, "sqlite3", driver)
	if err != nil {
		return fmt.Errorf("initializing db migration failed: %w", err)
	}

	// ErrNoChange only means the schema is already at the latest version.
	err = m.Up()
	if err != nil &amp;&amp; err != migrate.ErrNoChange {
		return fmt.Errorf("migrating database failed: %w", err)
	}

	return nil
}
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">m.Up()</code> call executes the <code class="language-plaintext highlighter-rouge">up</code> scripts, whichever necessary. For downgrading the DB, this can be replaced by <code class="language-plaintext highlighter-rouge">m.Down()</code>.</p>

<h4 id="d-putting-it-all-together">D. Putting It All Together</h4>

<p>Assuming that we have a <code class="language-plaintext highlighter-rouge">db</code> package for the DB related code, the final product looks something like this:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
</pre></td><td class="rouge-code"><pre>package db

import (
	"database/sql"
	"fmt"

	"github.com/golang-migrate/migrate"
	"github.com/golang-migrate/migrate/database/sqlite3"
	bindata "github.com/golang-migrate/migrate/source/go_bindata"
	_ "github.com/mattn/go-sqlite3"
	"github.com/pkg/errors"
)

// NewDB opens a handle to the SQLite database file at dbPath.
func NewDB(dbPath string) (*sql.DB, error) {
	sqliteDb, err := sql.Open("sqlite3", dbPath)
	if err != nil {
		return nil, errors.Wrap(err, "failed to open sqlite DB")
	}

	return sqliteDb, nil
}

// RunMigrateScripts applies every pending "up" script embedded via go-bindata.
func RunMigrateScripts(db *sql.DB) error {
	driver, err := sqlite3.WithInstance(db, &amp;sqlite3.Config{})
	if err != nil {
		return fmt.Errorf("creating sqlite3 db driver failed: %w", err)
	}

	res := bindata.Resource(AssetNames(),
		func(name string) ([]byte, error) {
			return Asset(name)
		})

	d, err := bindata.WithInstance(res)
	if err != nil {
		return fmt.Errorf("loading embedded migration scripts failed: %w", err)
	}

	m, err := migrate.NewWithInstance("go-bindata", d, "sqlite3", driver)
	if err != nil {
		return fmt.Errorf("initializing db migration failed: %w", err)
	}

	err = m.Up()
	if err != nil &amp;&amp; err != migrate.ErrNoChange {
		return fmt.Errorf("migrating database failed: %w", err)
	}

	return nil
}
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The final step is using these two in the main application.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
</pre></td><td class="rouge-code"><pre>package main

import (
	"log"

	"&lt;db-package-import-path&gt;"
	_ "github.com/mattn/go-sqlite3"
)

func main() {
	sqliteDb, err := db.NewDB("&lt;path-to-db-file&gt;")
	if err != nil {
		log.Fatal(err)
	}

	defer sqliteDb.Close()

	err = db.RunMigrateScripts(sqliteDb)
	if err != nil {
		log.Fatal(err)
	}

	log.Println("successfully migrated DB..")

	// rest of application main
}
</pre></td></tr></tbody></table></code></pre></div></div>

<h2 id="guidelines-for-sqlite-db-schema-migration-scripts">Guidelines for SQLite DB Schema Migration Scripts</h2>

<p>As mentioned earlier, SQLite has a slight disadvantage compared to other database systems. On most DB systems, the <code class="language-plaintext highlighter-rouge">ALTER TABLE</code> statement can be used quite effectively to change the table structure in the database. In SQLite, however, this statement has long been limited to two operations:</p>
<ul>
  <li>Change the name of a table</li>
  <li>Add columns to a table</li>
</ul>

<p>Newer SQLite releases also support renaming a column (3.25.0 and later) and dropping a column (3.35.0 and later), but the SQLite version bundled with an application may be older. Other operations, such as changing the data type or constraints of a column, cannot be accomplished with the <code class="language-plaintext highlighter-rouge">ALTER TABLE</code> statement at all. For these operations, a four-step process needs to be followed:</p>
<ol>
  <li>Rename the existing table.</li>
  <li>Create the new table with the same name that the table originally had.</li>
  <li>Populate this newly created table with information from the old table.</li>
  <li>Drop the old table.</li>
</ol>

<p>The following is a template for <code class="language-plaintext highlighter-rouge">up</code> migration scripts that follow this process:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
</pre></td><td class="rouge-code"><pre>ALTER TABLE &lt;table_name&gt; RENAME TO _&lt;table_name&gt;_old;

CREATE TABLE &lt;table_name&gt;
(
	column1 datatype [NULL | NOT NULL],
	column2 datatype [NULL | NOT NULL],
	...
);

INSERT INTO &lt;table_name&gt; (column1, column2, ...)
	SELECT column1, column2, ...
	FROM _&lt;table_name&gt;_old;

DROP TABLE _&lt;table_name&gt;_old;
</pre></td></tr></tbody></table></code></pre></div></div>
<p>The corresponding <code class="language-plaintext highlighter-rouge">down</code> migration scripts shall do the inverse.</p>
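<p>As a concrete instance of the template, the sketch below builds the four statements for one table. The function name <code class="language-plaintext highlighter-rouge">rebuildTableStatements</code> and its parameters are hypothetical and not part of any migration library; identifiers are interpolated as-is, so they are assumed to come from trusted migration code rather than user input.</p>
<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code>package main

import (
	"fmt"
	"strings"
)

// rebuildTableStatements returns the rename, create, copy and drop
// statements for one table, in the order they must run.
// columnDefs is the body of the new CREATE TABLE statement and
// keptColumns lists the columns copied over from the old table.
func rebuildTableStatements(table, columnDefs string, keptColumns []string) []string {
	old := "_" + table + "_old"
	cols := strings.Join(keptColumns, ", ")

	return []string{
		fmt.Sprintf("ALTER TABLE %s RENAME TO %s;", table, old),
		fmt.Sprintf("CREATE TABLE %s (%s);", table, columnDefs),
		fmt.Sprintf("INSERT INTO %s (%s) SELECT %s FROM %s;", table, cols, cols, old),
		fmt.Sprintf("DROP TABLE %s;", old),
	}
}
</code></pre></div></div>
<p>For example, calling it with a hypothetical <code class="language-plaintext highlighter-rouge">users</code> table, the new column definitions, and the columns <code class="language-plaintext highlighter-rouge">id</code> and <code class="language-plaintext highlighter-rouge">name</code> yields the body of an <code class="language-plaintext highlighter-rouge">up</code> script that drops every other column while keeping the data in those two.</p>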

<h2 id="conclusion">Conclusion</h2>

<p>This post provides a detailed process for writing a client-side Go application with a database, without needing to install any DB system on the host machine. It also provides a safe, albeit slightly indirect, way of migrating the database schema without worrying about loss or corruption of the data in it.</p>]]></content><author><name>bipul</name></author><category term="golang" /><category term="sqlite" /><category term="migration" /><summary type="html"><![CDATA[SQLite is a nifty choice for client side database applications. It is serverless, works with easily accessible cross-platform database files, and does not need any installation. However, occasionally, there may arise the need to modify the database schema. For instance, one of the tables might need a new column. Such a modification must not, in any manner, tamper with or corrupt any data that already exists on the database. This is where SQLite has a slight disadvantage, discussed in detail later in this post.]]></summary></entry><entry><title type="html">The Pitfall of Using PostgreSQL Advisory Locks with Go’s DB Connection Pool</title><link href="https://engineering.qubecinema.com/2019/08/26/unlocking-advisory-locks.html" rel="alternate" type="text/html" title="The Pitfall of Using PostgreSQL Advisory Locks with Go’s DB Connection Pool" /><published>2019-08-26T00:00:00+00:00</published><updated>2019-08-26T00:00:00+00:00</updated><id>https://engineering.qubecinema.com/2019/08/26/unlocking-advisory-locks</id><content type="html" xml:base="https://engineering.qubecinema.com/2019/08/26/unlocking-advisory-locks.html"><![CDATA[<h2 id="we-have-a-problem">We have a problem!</h2>
<p>Imagine for a moment that you have a microservice written in the Go Programming Language that is deployed on more than one instance for reliability and performance reasons.  They all share the same underlying PostgreSQL database.  Perhaps you then want to limit certain functionality to only one instance at a time (e.g. background worker, queue consumption).  To achieve this synchronization, you decide to use PostgreSQL session-level advisory locks.  You initialize a session-level advisory lock, pass it to the concerned functionality and then wrap that functionality with a <em>lock</em> and <em>unlock</em> on the session-level advisory lock.</p>

<p>So far, things sound good.  Only one instance enters the critical section.  All other instances are blocked from entering the critical section till the first instance leaves it.  However, it turns out that once the first instance unlocks the session-level advisory lock and leaves the critical section, no instance (including the first instance) can enter the critical section again.  They all block when acquiring the advisory lock.</p>

<p>Sometimes, the critical section can be entered a second time, and sometimes even three or more times.  But the end result is the same.  Sooner or later, all instances block on acquiring the session-level advisory lock and none can ever enter the synchronized portion of the code again.</p>

<h2 id="the-offending-code">The offending code…</h2>
<p>This is the situation we found ourselves in.  We went through our code.  Locking and unlocking a <strong>pg_advisory_lock</strong> was done like below:</p>
<h5 id="locking">Locking</h5>
<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre><span class="n">db</span><span class="p">,</span> <span class="n">_</span> <span class="o">:=</span> <span class="n">sql</span><span class="o">.</span><span class="n">Open</span><span class="p">(</span><span class="s">"postgres"</span><span class="p">,</span> <span class="n">_</span><span class="o">&lt;</span><span class="n">url</span><span class="o">&gt;</span><span class="n">_</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">Exec</span><span class="p">(</span><span class="s">`SELECT pg_advisory_lock($1)`</span><span class="p">,</span> <span class="o">&lt;</span><span class="n">a</span> <span class="n">session</span> <span class="n">id</span><span class="o">&gt;</span><span class="p">)</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<h5 id="unlocking">Unlocking</h5>
<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="n">db</span><span class="o">.</span><span class="n">Exec</span><span class="p">(</span><span class="s">`SELECT pg_advisory_unlock($1)`</span><span class="p">,</span> <span class="o">&lt;</span><span class="n">the</span> <span class="n">same</span> <span class="n">session</span> <span class="n">id</span><span class="o">&gt;</span><span class="p">)</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<h2 id="finding-the-culprit">Finding the culprit!</h2>
<p>After consulting our in-house expert and digging a little deeper, we discovered a couple of facts:</p>
<ul>
  <li>A session-level <strong>pg_advisory_lock</strong> can only be released on the same database connection on which it was obtained.  For more details, see <a href="https://www.netguru.com/codestories/advisory-locks">here</a>.</li>
  <li>Go’s standard library <strong>sql</strong> package creates a pool of database connections by default.  See <a href="https://golang.org/pkg/database/sql/#DB">here</a>.  Each DB call is done on an arbitrary connection from the DB pool.</li>
</ul>

<p>So deducing from the above, the DB connection pool must not be returning the same connection for the <em>unlock</em> that was used for the <em>lock</em>.  So how do we enforce that the <em>lock</em> and the <em>unlock</em> are done on the same connection?  It turns out that a single connection can be <a href="https://golang.org/pkg/database/sql/#DB.Conn">obtained from</a> and <a href="https://golang.org/pkg/database/sql/#Conn.Close">released to</a> the DB pool.</p>

<h2 id="the-right-solution">The right solution.</h2>
<p>The proper way to <em>lock</em> and <em>unlock</em> session-level <strong>pg_advisory_locks</strong> is to first obtain a connection from the DB connection pool, store it, and use that same connection for both operations.</p>
<h5 id="locking-1">Locking</h5>
<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre><span class="n">db</span><span class="p">,</span> <span class="n">_</span> <span class="o">:=</span> <span class="n">sql</span><span class="o">.</span><span class="n">Open</span><span class="p">(</span><span class="s">"postgres"</span><span class="p">,</span> <span class="n">_</span><span class="o">&lt;</span><span class="n">url</span><span class="o">&gt;</span><span class="n">_</span><span class="p">)</span>
<span class="n">conn</span><span class="p">,</span> <span class="n">_err_</span> <span class="o">:=</span> <span class="n">db</span><span class="o">.</span><span class="n">Conn</span><span class="p">(</span><span class="n">context</span><span class="o">.</span><span class="n">Background</span><span class="p">())</span>
<span class="n">conn</span><span class="o">.</span><span class="n">ExecContext</span><span class="p">(</span><span class="n">context</span><span class="o">.</span><span class="n">Background</span><span class="p">(),</span> <span class="s">`SELECT pg_advisory_lock($1)`</span><span class="p">,</span> <span class="o">&lt;</span><span class="n">a</span> <span class="n">session</span> <span class="n">id</span><span class="o">&gt;</span><span class="p">)</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<h5 id="unlocking-1">Unlocking</h5>
<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre><span class="n">conn</span><span class="o">.</span><span class="n">ExecContext</span><span class="p">(</span><span class="n">context</span><span class="o">.</span><span class="n">Background</span><span class="p">(),</span> <span class="s">`SELECT pg_advisory_unlock($1)`</span><span class="p">,</span> <span class="o">&lt;</span><span class="n">the</span> <span class="n">same</span> <span class="n">session</span> <span class="n">id</span><span class="o">&gt;</span><span class="p">)</span>
<span class="n">conn</span><span class="o">.</span><span class="n">Close</span><span class="p">()</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<p>The size of the DB connection pool should be increased appropriately to compensate for the connections that are permanently taken away to be used in this way.</p>
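<p>The lock and unlock steps above could be bundled into one helper so that callers cannot accidentally use different connections. The following is a minimal sketch using only the standard <code class="language-plaintext highlighter-rouge">database/sql</code> package; the names <code class="language-plaintext highlighter-rouge">withAdvisoryLock</code>, <code class="language-plaintext highlighter-rouge">runLocked</code> and <code class="language-plaintext highlighter-rouge">execFunc</code> are hypothetical, and a PostgreSQL driver is assumed to be registered elsewhere.</p>
<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code>package main

import (
	"context"
	"database/sql"
)

// execFunc matches the signature of (*sql.Conn).ExecContext, so the method
// value of one pinned connection can be passed in directly.
type execFunc func(ctx context.Context, query string, args ...interface{}) (sql.Result, error)

// runLocked acquires the advisory lock, runs fn, and then attempts the
// unlock through the same exec function, that is, the same connection.
func runLocked(ctx context.Context, exec execFunc, key int64, fn func() error) error {
	if _, err := exec(ctx, `SELECT pg_advisory_lock($1)`, key); err != nil {
		return err
	}

	fnErr := fn()

	// Unlock even when fn failed; report fn's error first if there is one.
	_, unlockErr := exec(ctx, `SELECT pg_advisory_unlock($1)`, key)
	if fnErr != nil {
		return fnErr
	}
	return unlockErr
}

// withAdvisoryLock pins a single connection from the pool for the whole
// critical section and hands it back to the pool afterwards.
func withAdvisoryLock(ctx context.Context, db *sql.DB, key int64, fn func() error) error {
	conn, err := db.Conn(ctx)
	if err != nil {
		return err
	}
	defer conn.Close() // returns the connection to the pool

	return runLocked(ctx, conn.ExecContext, key, fn)
}
</code></pre></div></div>
<p>One caveat of this sketch: if the unlock statement itself fails (for example because the context was cancelled), the connection goes back to the pool while its session may still hold the lock, so a production version would need to decide how to handle that case.</p>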

<h2 id="conclusion">Conclusion</h2>
<p><em>Moral of the story</em>: Always unlock a PostgreSQL session-level advisory lock on the same connection that was used to lock it; otherwise, sooner or later, you will fail to unlock it and then be blocked forever.  Fortunately, this kind of problem shows up immediately rather than later.  Unfortunately, the solution is not straightforward, but a little subtle.  Hopefully, instead of rolling back the changes and avoiding or limiting the use of PostgreSQL session-level advisory locks, this post will help you solve the problem correctly.</p>]]></content><author><name>jayakumar</name></author><category term="postgres" /><category term="postgresql" /><category term="database" /><category term="go" /><summary type="html"><![CDATA[We have a problem! Imagine for a moment that you have a microservice written in the Go Programming Language that is deployed on more than one instance for reliability and performance reasons. They all share the same underlying PostgreSQL database. Perhaps you then want to limit certain functionality to only one instance at a time (e.g. background worker, queue consumption). To achieve this synchronization, you decide to use PostgreSQL session-level advisory locks. 
You initialize a session-level advisory lock, pass it to the concerned functionality and then wrap it with a lock and unlock on the session-level advisory lock.]]></summary></entry><entry><title type="html">Finding Network Routing Path in Golang</title><link href="https://engineering.qubecinema.com/2019/05/13/go-routing-package.html" rel="alternate" type="text/html" title="Finding Network Routing Path in Golang" /><published>2019-05-13T00:00:00+00:00</published><updated>2019-05-13T00:00:00+00:00</updated><id>https://engineering.qubecinema.com/2019/05/13/go-routing-package</id><content type="html" xml:base="https://engineering.qubecinema.com/2019/05/13/go-routing-package.html"><![CDATA[<p>This post shares our experience of finding a way to determine which network interface on a Linux machine provides the route to a particular remote machine. The use case would’ve been a non-issue if the host machine had a single network interface. But in our case, the host machine has multiple network interfaces bridging different networks. Though this problem seems trivial at first sight from the implementation point of view, the ease with which it is solved in Golang is laudable.</p>

<p><em>If you are in a rush to find the solution, scroll down all the way to the end.</em></p>

<p>We were developing a feature in which we had to configure the host machine’s IP address on a remote machine so that the remote machine could pull data at its convenience. As the host machine is connected to multiple networks, the challenge was to identify the network interface that is reachable by the remote machine and configure the corresponding IP address on the remote machine. If we failed to pass the right network interface to the receiver, the whole purpose of the feature would be defeated. So it turned out to be an unprecedented requirement for us to solve.</p>

<p>For the purpose of illustration, let us assume that we have two networks connected to the machine and one of the two networks is a public network and the other is a private network.</p>

<p><img src="https://engineering.qubecinema.com/assets/images/multinetwork.png" alt="multinetwork-setup" /></p>

<p>Given this setup, our goal was to find the network interface a packet takes to reach its destination.  A bit of schooling: a packet destined for a host in the private network takes <em>Network Interface 1</em> as its exit route; likewise, a packet takes <em>Network Interface 2</em> to reach a host in the public network.</p>

<p>Our first idea was to leverage the <a href="https://en.wikipedia.org/wiki/Routing_table">routing table</a>. Fortuitously, we quickly found the Linux command that queries the routing table for the desired information (the network interface).  By passing the destination host’s IP address to the command, we can figure out the network interface a packet takes to reach that host.  The <code class="language-plaintext highlighter-rouge">ip route</code> command looks like this:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>ip route get 172.217.160.132
</pre></td></tr></tbody></table></code></pre></div></div>
<p>That seems to be an easy-peasy solution to the problem, doesn’t it? Invoking the raw <code class="language-plaintext highlighter-rouge">ip route</code> command programmatically, as in the snippet below, solves the problem.</p>
<div class="language-golang highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
</pre></td><td class="rouge-code"><pre><span class="c">// cmd = "ip route get 172.217.160.132"</span>
<span class="k">func</span> <span class="n">exe_cmd</span><span class="p">(</span><span class="n">cmd</span> <span class="kt">string</span><span class="p">,</span> <span class="n">wg</span> <span class="o">*</span><span class="n">sync</span><span class="o">.</span><span class="n">WaitGroup</span><span class="p">)</span> <span class="p">{</span>
	<span class="n">parts</span> <span class="o">:=</span> <span class="n">strings</span><span class="o">.</span><span class="n">Fields</span><span class="p">(</span><span class="n">cmd</span><span class="p">)</span>
	<span class="n">head</span> <span class="o">:=</span> <span class="n">parts</span><span class="p">[</span><span class="m">0</span><span class="p">]</span>
	<span class="n">parts</span> <span class="o">=</span> <span class="n">parts</span><span class="p">[</span><span class="m">1</span><span class="o">:</span><span class="nb">len</span><span class="p">(</span><span class="n">parts</span><span class="p">)]</span>

	<span class="n">out</span><span class="p">,</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">exec</span><span class="o">.</span><span class="n">Command</span><span class="p">(</span><span class="n">head</span><span class="p">,</span> <span class="n">parts</span><span class="o">...</span><span class="p">)</span><span class="o">.</span><span class="n">Output</span><span class="p">()</span>
	<span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
		<span class="n">fmt</span><span class="o">.</span><span class="n">Printf</span><span class="p">(</span><span class="s">"%s"</span><span class="p">,</span> <span class="n">err</span><span class="p">)</span>
	<span class="p">}</span>
	
	<span class="n">fmt</span><span class="o">.</span><span class="n">Printf</span><span class="p">(</span><span class="s">"%s"</span><span class="p">,</span> <span class="n">out</span><span class="p">)</span>
	<span class="n">wg</span><span class="o">.</span><span class="n">Done</span><span class="p">()</span>
<span class="p">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<p>The output looks similar to the following.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>172.217.160.132 via 172.17.0.1 dev eth1  src 172.17.0.2
</pre></td></tr></tbody></table></code></pre></div></div>
<p>The caveat with this approach is that you have to parse the output string to extract the necessary information. Unfortunately, this becomes more painful in error cases, such as the network going down or the machine not being connected to the internet. Even if we could write a parser for every possible output, the result would fall short of our coding standards and raise the reviewers’ eyebrows.  We realized that we had to look for a cleaner solution.</p>
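<p>To make the parsing burden concrete, here is a minimal sketch of a parser for the happy-path output shown above. The function name <code class="language-plaintext highlighter-rouge">parseRouteDevice</code> is hypothetical, and the sketch assumes the interface name always follows a <code class="language-plaintext highlighter-rouge">dev</code> token; any other output, such as an error message from the command, is simply reported as an error.</p>
<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code>package main

import (
	"fmt"
	"strings"
)

// parseRouteDevice returns the token that follows "dev" in the output of
// `ip route get`. It knows nothing about the other fields or about the
// various error messages the command can print.
func parseRouteDevice(output string) (string, error) {
	fields := strings.Fields(output)
	for i, field := range fields {
		if field != "dev" {
			continue
		}
		if i+1 == len(fields) {
			return "", fmt.Errorf("no interface name after dev in %q", output)
		}
		return fields[i+1], nil
	}
	return "", fmt.Errorf("no dev field in %q", output)
}
</code></pre></div></div>
<p>Even this small function already has two failure branches, and it still says nothing about the gateway or source address, each of which would need similar handling.</p>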

<p>Further exploration led us to a solution that was convincing in every way. The <a href="https://godoc.org/github.com/google/gopacket/routing">routing</a> sub-package of <a href="https://godoc.org/github.com/google/gopacket">gopacket</a> gave us everything we found lacking in the previous approach.  It removed the need for string parsing logic in our code and spared us from damaging its readability.  As an added bonus, error handling became much easier with the <code class="language-plaintext highlighter-rouge">routing</code> package.</p>

<p>With nothing stopping us, we switched to the <code class="language-plaintext highlighter-rouge">routing</code> sub-package, leaving the Linux command approach as an ephemeral hero.</p>
<div class="language-golang highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
</pre></td><td class="rouge-code"><pre><span class="k">func</span> <span class="n">determineRouteInterface</span><span class="p">(</span><span class="n">serverAddr</span> <span class="kt">string</span><span class="p">)</span> <span class="kt">error</span> <span class="p">{</span>
	<span class="k">var</span> <span class="n">ip</span> <span class="n">net</span><span class="o">.</span><span class="n">IP</span>
	<span class="k">if</span> <span class="n">ip</span> <span class="o">=</span> <span class="n">net</span><span class="o">.</span><span class="n">ParseIP</span><span class="p">(</span><span class="n">serverAddr</span><span class="p">);</span> <span class="n">ip</span> <span class="o">==</span> <span class="no">nil</span> <span class="p">{</span>
		<span class="k">return</span> <span class="n">fmt</span><span class="o">.</span><span class="n">Errorf</span><span class="p">(</span><span class="s">"error as non-ip target %s is passed"</span><span class="p">,</span> <span class="n">serverAddr</span><span class="p">)</span>
	<span class="p">}</span>

	<span class="n">router</span><span class="p">,</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">routing</span><span class="o">.</span><span class="n">New</span><span class="p">()</span>
	<span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
		<span class="k">return</span> <span class="n">errors</span><span class="o">.</span><span class="n">Wrap</span><span class="p">(</span><span class="n">err</span><span class="p">,</span> <span class="s">"error while creating routing object"</span><span class="p">)</span>
	<span class="p">}</span>

	<span class="n">_</span><span class="p">,</span> <span class="n">gatewayIP</span><span class="p">,</span> <span class="n">preferredSrc</span><span class="p">,</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">router</span><span class="o">.</span><span class="n">Route</span><span class="p">(</span><span class="n">ip</span><span class="p">)</span>
	<span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
		<span class="k">return</span> <span class="n">errors</span><span class="o">.</span><span class="n">Wrapf</span><span class="p">(</span><span class="n">err</span><span class="p">,</span> <span class="s">"error routing to ip: %s"</span><span class="p">,</span> <span class="n">serverAddr</span><span class="p">)</span>
	<span class="p">}</span>

	<span class="n">fmt</span><span class="o">.</span><span class="n">Printf</span><span class="p">(</span><span class="s">"gatewayIP: %v preferredSrc: %v"</span><span class="p">,</span> <span class="n">gatewayIP</span><span class="p">,</span> <span class="n">preferredSrc</span><span class="p">)</span>
	<span class="k">return</span> <span class="no">nil</span>
<span class="p">}</span>

</pre></td></tr></tbody></table></code></pre></div></div>
<p>The output of the above code snippet looks similar to the following.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>gatewayIP: 172.17.0.1 preferredSrc: 172.17.0.2
</pre></td></tr></tbody></table></code></pre></div></div>
<p>Unsurprisingly, the output is neatly structured, eliminating the need for fancy string parsing logic.</p>

<p>At first, we were surprised to find a Golang package that queries the routing table on a Linux machine. We understood that the problem of determining the network interface a packet takes to reach its destination is rare and gets little attention on the web.  Hence, we decided to bring the <code class="language-plaintext highlighter-rouge">routing</code> package to light.  Indeed, this blog post is a result of our excitement at finding a package that fetches routing information from the underlying routing table on a Linux machine ;)</p>]]></content><author><name>ajith</name></author><category term="go" /><category term="routing" /><summary type="html"><![CDATA[This post shares our experience on finding a way to determine the network interfaces on a Linux Machine that provides the route to a particular remote machine. The use case would’ve been a non-issue if the host machine had a single network interface. But in our case, the host machine has multiple network interfaces bridging different networks. 
Though this problem seems to be a very trivial one on the first sight from the implementation point of view, the way the problem is solved easily in Golang is laudable.]]></summary></entry><entry><title type="html">Optimising Time Window Queries with Postgres Timestamp Range Data Types</title><link href="https://engineering.qubecinema.com/2019/05/04/time-window-queries-with-postgres-timestamp-range.html" rel="alternate" type="text/html" title="Optimising Time Window Queries with Postgres Timestamp Range Data Types" /><published>2019-05-04T00:00:00+00:00</published><updated>2019-05-04T00:00:00+00:00</updated><id>https://engineering.qubecinema.com/2019/05/04/time-window-queries-with-postgres-timestamp-range</id><content type="html" xml:base="https://engineering.qubecinema.com/2019/05/04/time-window-queries-with-postgres-timestamp-range.html"><![CDATA[<p>This post explains how we were able to improve the performance of a database query by replacing two individual <code class="language-plaintext highlighter-rouge">timestamptz</code> columns with a single Postgres <code class="language-plaintext highlighter-rouge">tstzrange</code> range column.</p>

<p>It all began when our operations team complained that a particular report was taking a long time to generate. Initial analysis revealed that the report spent most of its time executing a particular query in our Postgres database. Though the query inner-joins three tables, the join and where clauses are straightforward. We also confirmed that indexes exist for the columns involved in the join and where clauses. So we fired up <a href="https://github.com/ankane/pghero">pgHero</a> and generated the execution plan for the query. Loading the execution plan into <a href="https://tatiyants.com/pev">PEV</a> indicated that the where clause on the timestamp columns was not using the indexes and had to scan all rows.</p>

<p>To understand the problem deeper let us consider the following pseudo table models.</p>

<h4 id="theatres-table"><em>theatres</em> table</h4>

<table>
  <thead>
    <tr>
      <th>id</th>
      <th>name</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>1000</td>
      <td>Theatre 1</td>
    </tr>
    <tr>
      <td>1001</td>
      <td>Theatre 2</td>
    </tr>
  </tbody>
</table>

<h4 id="screens-table"><em>screens</em> table</h4>

<table>
  <thead>
    <tr>
      <th>id</th>
      <th>name</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>2000</td>
      <td>Screen 1</td>
    </tr>
    <tr>
      <td>2001</td>
      <td>Screen 2</td>
    </tr>
  </tbody>
</table>

<h4 id="licenses-table"><em>licenses</em> table</h4>

<table>
  <thead>
    <tr>
      <th>id</th>
      <th>screen_id</th>
      <th>theatre_id</th>
      <th>valid_from</th>
      <th>valid_till</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>3000</td>
      <td>2000</td>
      <td>1000</td>
      <td>2019-04-01T00:00:00+0000</td>
      <td>2019-05-15T23:59:59+0000</td>
    </tr>
    <tr>
      <td>3001</td>
      <td>2001</td>
      <td>1001</td>
      <td>2019-04-01T00:00:00+0000</td>
      <td>2019-05-15T23:59:59+0000</td>
    </tr>
    <tr>
      <td>3002</td>
      <td>2000</td>
      <td>1000</td>
      <td>2019-05-01T00:00:00+0000</td>
      <td>2019-05-31T23:59:59+0000</td>
    </tr>
    <tr>
      <td>3003</td>
      <td>2001</td>
      <td>1001</td>
      <td>2019-05-01T00:00:00+0000</td>
      <td>2019-05-31T23:59:59+0000</td>
    </tr>
  </tbody>
</table>

<p>The report lists all (theatre, screen) pairs that have any valid movie license within the specified date/time range. The offending query was written like below:</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
</pre></td><td class="rouge-code"><pre><span class="k">SELECT</span> <span class="k">DISTINCT</span> <span class="n">t</span><span class="p">.</span><span class="n">id</span> <span class="k">AS</span> <span class="n">theatre_id</span><span class="p">,</span> <span class="n">s</span><span class="p">.</span><span class="n">id</span> <span class="k">AS</span> <span class="n">screen_id</span>
<span class="k">FROM</span> <span class="n">theatres</span> <span class="n">t</span>
<span class="k">INNER</span> <span class="k">JOIN</span> <span class="n">licenses</span> <span class="n">l</span> <span class="k">ON</span> <span class="n">l</span><span class="p">.</span><span class="n">theatre_id</span> <span class="o">=</span> <span class="n">t</span><span class="p">.</span><span class="n">id</span>
<span class="k">INNER</span> <span class="k">JOIN</span> <span class="n">screens</span> <span class="n">s</span> <span class="k">ON</span> <span class="n">s</span><span class="p">.</span><span class="n">id</span> <span class="o">=</span> <span class="n">l</span><span class="p">.</span><span class="n">screen_id</span>
<span class="k">WHERE</span> <span class="p">(</span><span class="n">l</span><span class="p">.</span><span class="n">valid_from</span><span class="p">,</span> <span class="n">l</span><span class="p">.</span><span class="n">valid_till</span><span class="p">)</span> <span class="k">OVERLAPS</span> <span class="p">(</span><span class="s1">'2019-04-01T00:00:00+0000'</span><span class="p">,</span> <span class="s1">'2019-05-31T23:59:59+0000'</span><span class="p">)</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The above query took <strong>~450ms</strong> to select <strong>140</strong> rows from the tables <strong>theatres</strong>, <strong>screens</strong> and <strong>licenses</strong>, with row counts of <strong>37339</strong>, <strong>147850</strong> and <strong>350693</strong> respectively. The execution plan indicated that the query used the indexes on the various <code class="language-plaintext highlighter-rouge">id</code> columns but did not use the indexes on the <code class="language-plaintext highlighter-rouge">valid_from</code> and <code class="language-plaintext highlighter-rouge">valid_till</code> columns.</p>

<p>Suspecting the timestamp indexes, we rewrote the query as shown below to check whether the <code class="language-plaintext highlighter-rouge">OVERLAPS</code> operator was preventing the query from using the indexes.</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
</pre></td><td class="rouge-code"><pre><span class="k">SELECT</span> <span class="k">DISTINCT</span> <span class="n">t</span><span class="p">.</span><span class="n">id</span> <span class="k">AS</span> <span class="n">theatre_id</span><span class="p">,</span> <span class="n">s</span><span class="p">.</span><span class="n">id</span> <span class="k">AS</span> <span class="n">screen_id</span>
<span class="k">FROM</span> <span class="n">theatres</span> <span class="n">t</span>
<span class="k">INNER</span> <span class="k">JOIN</span> <span class="n">licenses</span> <span class="n">l</span> <span class="k">ON</span> <span class="n">l</span><span class="p">.</span><span class="n">theatre_id</span> <span class="o">=</span> <span class="n">t</span><span class="p">.</span><span class="n">id</span>
<span class="k">INNER</span> <span class="k">JOIN</span> <span class="n">screens</span> <span class="n">s</span> <span class="k">ON</span> <span class="n">s</span><span class="p">.</span><span class="n">id</span> <span class="o">=</span> <span class="n">l</span><span class="p">.</span><span class="n">screen_id</span>
<span class="k">WHERE</span> <span class="p">(</span><span class="n">l</span><span class="p">.</span><span class="n">valid_from</span> <span class="o">&gt;=</span> <span class="s1">'2019-04-01T00:00:00+0000'</span> <span class="k">AND</span> <span class="n">l</span><span class="p">.</span><span class="n">valid_from</span> <span class="o">&lt;=</span> <span class="s1">'2019-05-31T23:59:59+0000'</span><span class="p">)</span>
    <span class="k">OR</span> <span class="p">(</span><span class="n">l</span><span class="p">.</span><span class="n">valid_till</span> <span class="o">&gt;=</span> <span class="s1">'2019-04-01T00:00:00+0000'</span> <span class="k">AND</span> <span class="n">l</span><span class="p">.</span><span class="n">valid_till</span> <span class="o">&lt;=</span> <span class="s1">'2019-05-31T23:59:59+0000'</span><span class="p">)</span>
    <span class="k">OR</span> <span class="p">(</span><span class="n">l</span><span class="p">.</span><span class="n">valid_from</span> <span class="o">&lt;=</span> <span class="s1">'2019-04-01T00:00:00+0000'</span> <span class="k">AND</span> <span class="n">l</span><span class="p">.</span><span class="n">valid_till</span> <span class="o">&gt;=</span> <span class="s1">'2019-05-31T23:59:59+0000'</span><span class="p">)</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>To our surprise the rewritten query was fast: it took just <strong>~180ms</strong> to select the same <strong>140</strong> rows from the same dataset, and the execution plan showed that it used the timestamp indexes. At first we couldn’t understand why the modified query could use the indexes and ran faster, while the original query with the <code class="language-plaintext highlighter-rouge">OVERLAPS</code> operator couldn’t and ran slower. A quick search on the Internet didn’t turn up much, but thinking about it aloud we realized the rationale. The <code class="language-plaintext highlighter-rouge">OVERLAPS</code> operator checks whether two timestamp ranges overlap, but the indexes are created on the individual timestamp columns. It is understandable that the operator can’t take advantage of those indexes, as it needs a single timestamp range comprising both the start and end timestamps, i.e. <code class="language-plaintext highlighter-rouge">valid_from</code> and <code class="language-plaintext highlighter-rouge">valid_till</code>. So the query had to scan all rows, construct the range from <code class="language-plaintext highlighter-rouge">valid_from</code> and <code class="language-plaintext highlighter-rouge">valid_till</code> for each row, and check it against the specified timestamp range.</p>
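
<p>A comparison like this can be reproduced with <code class="language-plaintext highlighter-rouge">EXPLAIN (ANALYZE, BUFFERS)</code>, which runs the query and prints the actual plan and timings. The sketch below is only an illustration: it reduces the query to the <code class="language-plaintext highlighter-rouge">licenses</code> table alone, which is a simplification and not the full query from this post, and the plans you see will depend on your data and Postgres version.</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Simplified to the licenses table only (an assumption made for brevity).
-- Plan for the OVERLAPS form:
EXPLAIN (ANALYZE, BUFFERS)
SELECT l.id
FROM licenses l
WHERE (l.valid_from, l.valid_till)
      OVERLAPS (TIMESTAMPTZ '2019-04-01T00:00:00+0000', TIMESTAMPTZ '2019-05-31T23:59:59+0000');

-- Plan for the explicit comparison form:
EXPLAIN (ANALYZE, BUFFERS)
SELECT l.id
FROM licenses l
WHERE (l.valid_from &gt;= '2019-04-01T00:00:00+0000' AND l.valid_from &lt;= '2019-05-31T23:59:59+0000')
    OR (l.valid_till &gt;= '2019-04-01T00:00:00+0000' AND l.valid_till &lt;= '2019-05-31T23:59:59+0000')
    OR (l.valid_from &lt;= '2019-04-01T00:00:00+0000' AND l.valid_till &gt;= '2019-05-31T23:59:59+0000');
</code></pre></div></div>

<p>A sequential scan on <code class="language-plaintext highlighter-rouge">licenses</code> in the first plan, against an index or bitmap scan in the second, would correspond to the difference described above.</p>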

<p>Having realized that the separate indexes don’t help a time window overlap condition, we set out to find whether there is a way to create an index on time windows, i.e. timestamp ranges. We found that Postgres has builtin timestamp range data types: <code class="language-plaintext highlighter-rouge">tsrange</code> (without time zone info) and <code class="language-plaintext highlighter-rouge">tstzrange</code> (with time zone info). So we replaced the two columns <code class="language-plaintext highlighter-rouge">valid_from</code> and <code class="language-plaintext highlighter-rouge">valid_till</code> with a single column <code class="language-plaintext highlighter-rouge">validity</code> of type <code class="language-plaintext highlighter-rouge">tstzrange</code>. We also created a compatible index, GiST, on the new column. While migrating the data from the existing columns, we found invalid rows in which <code class="language-plaintext highlighter-rouge">valid_till</code> was less than <code class="language-plaintext highlighter-rouge">valid_from</code>. The <code class="language-plaintext highlighter-rouge">tstzrange</code> type’s builtin validation caught this invalid data and we were able to fix it.</p>
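
<p>A migration along these lines could look roughly like the sketch below. It is not the exact script we ran: the index name <code class="language-plaintext highlighter-rouge">licenses_validity_idx</code> is hypothetical, and the inclusive <code class="language-plaintext highlighter-rouge">'[]'</code> bounds are an assumption based on how the values are displayed in this post.</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- 1. Find rows the range constructor would reject (end before start)
--    so they can be fixed before the migration.
SELECT id, valid_from, valid_till
FROM licenses
WHERE valid_till &lt; valid_from;

-- 2. Add the range column and fill it from the existing columns.
--    '[]' makes both bounds inclusive (an assumption, see above).
ALTER TABLE licenses ADD COLUMN validity tstzrange;
UPDATE licenses SET validity = tstzrange(valid_from, valid_till, '[]');

-- 3. Create a GiST index on the range column (index name is hypothetical).
CREATE INDEX licenses_validity_idx ON licenses USING gist (validity);

-- 4. Drop the old columns once the new column has been verified.
ALTER TABLE licenses DROP COLUMN valid_from, DROP COLUMN valid_till;
</code></pre></div></div>

<p>Running step 1 first matters because the <code class="language-plaintext highlighter-rouge">UPDATE</code> in step 2 fails as a whole if even one row has its upper bound before its lower bound.</p>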

<h4 id="licenses-table-with-column-validity"><em>licenses</em> table with column <em>validity</em></h4>

<table>
  <thead>
    <tr>
      <th>id</th>
      <th>screen_id</th>
      <th>theatre_id</th>
      <th>validity</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>3000</td>
      <td>2000</td>
      <td>1000</td>
      <td>["2019-04-01T00:00:00+0000","2019-05-15T23:59:59+0000"]</td>
    </tr>
    <tr>
      <td>3001</td>
      <td>2001</td>
      <td>1001</td>
      <td>["2019-04-01T00:00:00+0000","2019-05-15T23:59:59+0000"]</td>
    </tr>
    <tr>
      <td>3002</td>
      <td>2000</td>
      <td>1000</td>
      <td>["2019-05-01T00:00:00+0000","2019-05-31T23:59:59+0000"]</td>
    </tr>
    <tr>
      <td>3003</td>
      <td>2001</td>
      <td>1001</td>
      <td>["2019-05-01T00:00:00+0000","2019-05-31T23:59:59+0000"]</td>
    </tr>
  </tbody>
</table>

<p>We rewrote the query as below to get the same result with the new <code class="language-plaintext highlighter-rouge">validity</code> column of type <code class="language-plaintext highlighter-rouge">tstzrange</code>.</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
</pre></td><td class="rouge-code"><pre><span class="k">SELECT</span> <span class="k">DISTINCT</span> <span class="n">t</span><span class="p">.</span><span class="n">id</span> <span class="k">AS</span> <span class="n">theatre_id</span><span class="p">,</span> <span class="n">s</span><span class="p">.</span><span class="n">id</span> <span class="k">AS</span> <span class="n">screen_id</span>
<span class="k">FROM</span> <span class="n">theatres</span> <span class="n">t</span>
<span class="k">INNER</span> <span class="k">JOIN</span> <span class="n">licenses</span> <span class="n">l</span> <span class="k">ON</span> <span class="n">l</span><span class="p">.</span><span class="n">theatre_id</span> <span class="o">=</span> <span class="n">t</span><span class="p">.</span><span class="n">id</span>
<span class="k">INNER</span> <span class="k">JOIN</span> <span class="n">screens</span> <span class="n">s</span> <span class="k">ON</span> <span class="n">s</span><span class="p">.</span><span class="n">id</span> <span class="o">=</span> <span class="n">l</span><span class="p">.</span><span class="n">screen_id</span>
<span class="k">WHERE</span> <span class="n">l</span><span class="p">.</span><span class="n">validity</span> <span class="o">&amp;&amp;</span> <span class="n">tstzrange</span><span class="p">(</span><span class="s1">'["2019-04-01T00:00:00+0000", "2019-05-31T23:59:59+0000"]'</span><span class="p">)</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Now the query is as fast as the previous modified version: it took <strong>~180ms</strong> to produce the same <strong>140</strong> rows from the same dataset. On top of that, using the <code class="language-plaintext highlighter-rouge">validity</code> column with the <code class="language-plaintext highlighter-rouge">tstzrange</code> type’s <code class="language-plaintext highlighter-rouge">&amp;&amp;</code> operator made the query succinct, and the <code class="language-plaintext highlighter-rouge">tstzrange</code> type brought in automatic validation of the timestamp range data.</p>
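
<p>To get a feel for the <code class="language-plaintext highlighter-rouge">&amp;&amp;</code> operator and the builtin validation, the standalone statements below can be run in <code class="language-plaintext highlighter-rouge">psql</code> without any tables. They are a small illustration with made-up date ranges, using inclusive <code class="language-plaintext highlighter-rouge">'[]'</code> bounds as an assumption.</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- The two ranges share the first half of May 2019: returns true.
SELECT tstzrange('2019-04-01T00:00:00+0000', '2019-05-15T23:59:59+0000', '[]')
    &amp;&amp; tstzrange('2019-05-01T00:00:00+0000', '2019-05-31T23:59:59+0000', '[]');

-- January does not touch the April-May window: returns false.
SELECT tstzrange('2019-01-01T00:00:00+0000', '2019-01-31T23:59:59+0000', '[]')
    &amp;&amp; tstzrange('2019-04-01T00:00:00+0000', '2019-05-31T23:59:59+0000', '[]');

-- Lower bound after upper bound: the constructor raises an error
-- instead of storing an invalid range.
SELECT tstzrange('2019-05-31T23:59:59+0000', '2019-04-01T00:00:00+0000', '[]');
</code></pre></div></div>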

<p>We were happy that we not only made the query fast but also learned something new in Postgres. We hope this experience and learning of ours will be useful to you.</p>

<h3 id="references">References</h3>
<p><a href="https://www.postgresql.org/docs/11/rangetypes.html">Postgres Range Types</a></p>]]></content><author><name>sivachandran</name></author><category term="postgres" /><category term="database" /><category term="sql" /><summary type="html"><![CDATA[This post explains how we were able to improve a database query performance by replacing two individual timestamptz columns with single Postgres’s tstzrange range column.]]></summary></entry></feed>